
5 useful #Devops lessons about flow you can learn while getting off a plane.

One of our favourite books here at DevOpsGuys is Donald Reinertsen’s “Principles of Product Development Flow”. It’s a compelling argument about the importance of “flow” – small batch sizes, frequent releases, rapid feedback, all the key tenets of continuous delivery and DevOps.

It is, sadly, not the sort of easily consumed management book that might appeal to your ADD-afflicted manager (should you be able to tear his/her attention away from his/her iPhone or iPad).

@TheDevMgr and I (@TheOpsMgr) were discussing this last week as we landed at Munich airport to attend the @Jetbrains partner conference (we’re huge fans of TeamCity for continuous integration).

As we went thru the process of disembarking from the flight we realised we were in the middle of a real-world analogy for the benefits of flow – an analogy that might be readily understandable to those senior managers we mentioned earlier.

Let’s walk thru it.

The plane lands and reaches the stand. Immediately everyone in the aisle seats un-clicks and stands up, congesting the aisle. Once the aisle is congested (=high utilisation) the ability of people to collect luggage from the overhead lockers is significantly reduced – everything becomes less efficient.

At high rates of utilisation of any resource, waiting times, thrashing between tasks and the potential for disruption are likely to go up. So that’s Useful lesson #1.
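To put rough numbers on lesson #1, here is a minimal sketch using the textbook M/M/1 queue (one server, random arrivals, random service times). That model is an assumption brought in for illustration, not something measured on the plane, and the function name is made up. It shows how the average wait grows as utilisation approaches 100%.

```python
def mean_wait_in_queue(utilisation: float, service_time: float = 1.0) -> float:
    """Mean time spent waiting (before service starts) in an M/M/1 queue."""
    if not 0 <= utilisation < 1:
        raise ValueError("utilisation must be in the range [0, 1)")
    # Standard M/M/1 result: Wq = rho / (1 - rho) * mean service time
    return utilisation / (1 - utilisation) * service_time


# Wait is expressed as a multiple of the time one unit of work takes to serve.
for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
    print(f"utilisation {rho:.0%}: wait = {mean_wait_in_queue(rho):.1f} x service time")
```

Under these assumptions the wait is equal to one service time at 50% utilisation but roughly nine service times at 90%, which is the crowded-aisle effect in miniature.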

All this activity and jamming of the aisle is, ultimately, futile because no-one is going anywhere until the stairs get there and the cabin doors are opened. This is the next constraint in the pipeline.

Useful lesson #2 – until you understand the constraints you might be just rushing headlong at the next bottleneck.

Eventually they arrive and we all start shuffling off like sheep (or zombies) and walk down the stairs… to the waiting buses on the tarmac.

Useful lesson #3 – if you’re trying to optimise flow you need to look beyond the immediate constraint (in this case, the cabin door & stairs) and look at the entire system (this is the essential message of systems thinking and the “1st way of DevOps”).

The buses were fairly large and held about 60+ people each (=large batch size), and everyone tries to cram onto the first bus… which then waits until the second bus is full before both buses head across the tarmac. When we reach the terminal the buses park up… and the second bus is actually closer to the door we need to enter.

Useful lesson #4 – don’t assume that the batch size is what you think it is (i.e. 1 bus load). It might be more (2 buses!). Also, just because something looks FIFO doesn’t mean it is…
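As a back-of-the-envelope illustration of why the real batch size matters, the sketch below assumes passengers board one at a time at a steady pace and that nothing moves until the last person in the batch is on board. The 2 seconds per passenger and the function name are invented for the example.

```python
def mean_wait_for_departure(batch_size: int, seconds_per_passenger: float = 2.0) -> float:
    """Average time a passenger waits on board for the batch to fill and depart."""
    # Passenger k (0-based) boards at k * s; the batch departs when the last
    # passenger boards, at (batch_size - 1) * s.
    waits = [(batch_size - 1 - k) * seconds_per_passenger for k in range(batch_size)]
    return sum(waits) / batch_size


for label, size in [("small bus", 20), ("one large bus", 60), ("two large buses", 120)]:
    print(f"{label} ({size} people): average wait {mean_wait_for_departure(size):.0f}s")
```

With these made-up numbers, treating two buses as a single batch roughly doubles the average wait compared with one bus, and a 20-seat bus cuts it to about a third.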

Once we enter the terminal we then hit another constraint – clearing Immigration control. Luckily we were able to go thru the EU Resident’s queue, which flowed at a fairly constant rate due to the minimal border control. But looking at the non-EU Residents queue, we could see that the flow was turbulent – some passengers went thru quickly but others took much longer to process due to their different nationality, visa requirements or whatever had caught the Border Control officer’s attention. Unfortunately, the people who could have been processed faster were stuck in the queue behind the “complex” processing.

Useful lesson #5 – If you can break your “unit of work” down to a relatively standard size the overall flow through the system is likely to improve. This is why most Scrum teams insist that a given user story shouldn’t take more than 2-3 days to complete, and if it would, then it should be split up into separate stories until each one fits.
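One way to see lesson #5 in numbers is the Pollaczek-Khinchine formula for a single-server queue with random arrivals (M/G/1). This is a standard queueing-theory result used here as an assumption about how the immigration desk behaves; the function and parameter names are illustrative. The `cv` parameter is the coefficient of variation of the processing time (standard deviation divided by mean), so `cv=0` means every passenger takes exactly the same time.

```python
def mean_wait_mg1(utilisation: float, service_time: float, cv: float) -> float:
    """Mean queueing delay for an M/G/1 queue (Pollaczek-Khinchine formula)."""
    if not 0 <= utilisation < 1:
        raise ValueError("utilisation must be in the range [0, 1)")
    # Wq = rho / (1 - rho) * (1 + cv^2) / 2 * mean service time
    return utilisation / (1 - utilisation) * (1 + cv ** 2) / 2 * service_time


# Same desk, same 80% utilisation, same average processing time of 1 minute;
# only the variability of the work items changes.
for cv in (0.0, 1.0, 2.0):
    print(f"cv = {cv}: average wait = {mean_wait_mg1(0.8, 1.0, cv):.1f} minutes")
```

Under these assumptions, moving from highly variable work (`cv = 2`) to uniformly sized work (`cv = 0`) cuts the average wait by a factor of five without adding any capacity.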

Luckily we avoided the queue for checked luggage as we only had carry-on and we were able to get in the queue for the taxis instead… so that’s the end of our analogy for now.

So let’s think of some theoretical optimisations to improve the flow.

Firstly, why not only let the people on ONE side of the aisle stand to collect their overhead luggage and prepare to disembark, thereby avoiding the congestion in the aisle? You can then alternate sides until everyone’s off.

Secondly, why not see if you can get a second set of stairs so you can disembark from the forward AND rear cabin doors, and alleviate that bottleneck?

Thirdly, why not have smaller buses, and dispatch them immediately they are full, and thereby reduce the batch size that arrives at Immigration?

Fourthly, why not have more agents at Border Control to alleviate that bottleneck, or create different queue classes to try to optimise flow e.g. EU Residents, “Other countries we vaguely trust and generally wave through” and “countries we generally give a hard time to”? You could even have a special queue for “dodgy looking people from whatever nationality that are about to spend some quality time with a rubber glove”. Or why not create totally new categories like “those with hand luggage” versus “those with checked luggage who are only going to have to wait at the luggage carousel anyway so you might as well wait here”?

These proposed optimisations might be simplistic. For example the reason the two buses leave together is probably because ground traffic at airports is strictly controlled (in fact there is a “ground traffic controller” just the same as an “air traffic controller”). So there are often constraints we might not be aware of until we do more investigation BUT the goal of any DevOps organisation should be to try and identify the constraints, and experiment with different ways to alleviate them. Try something, learn from it, iterate around again.

Hopefully by using a common, real-world analogy for product development flow you’ll be able to convince your Boss to let you apply these principles to your DevOps delivery pipeline and improve flow within your organisation!
Photo credit: aselundblad / Foter / Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Generic (CC BY-NC-SA 2.0)

DevOps Cardiff kicks off with a bang!

Awesome turnout and great community involvement at our first DevOps Cardiff meetup on Wednesday night!

The pizza and beers immediately after went down very well… as did the next few rounds at the excellent Urban Tap House (well, when I say a few, we headed back to the hotel, via Wok to Walk, at about 1am. I think).

@TheDevMgr leads off!

We’re super-excited that a number of people from the Cardiff DevOps community have already put their hands up to talk at future events and if you’ve got something you want to talk about, whether it’s a 5 min lightning demo or a full 50 min presentation, please contact us.

Thanks also to Dyn for sponsoring and FoundersHub for hosting!

The slides from the night are on Slideshare.

DevOps and the Knowledge Horizon

One of the things I love about SF novels is that they are always full of fascinating ideas that sometimes spark resonances in my own thinking (in this case, on DevOps).

Alastair Reynolds’ new novel “On the Steel Breeze” has this passage when discussing the rotation of Hyperion (one of Saturn’s moons):

“There’s a value in chaos theory, a number called the Lyapunov exponent, which tells you how to predict a chaotic system’s boundary – its knowledge horizon, if you like. Hyperion’s Lyapunov exponent is just forty days – we can’t predict this moon’s motion beyond the next forty days. That’s the maximum limit of our foreknowledge! If my life depended on this moon’s motion, I would still not be able to say a word about its state beyond forty days.”

My first thought when I read this was “Yep, sounds like every IT project I’ve ever worked on… pretty much turns to chaos after about 40 days…”.

Now, the mathematics of the Lyapunov Exponent are way beyond my long-forgotten basics of high school calculus so I can’t comment on whether the author’s usage of the exponent is valid but the concept – that there is a limit on our foreknowledge beyond some sort of chaos event horizon – really stuck with me.
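For the curious, here is a minimal sketch of the idea on a toy system rather than a moon: the logistic map, a standard textbook example of chaos whose Lyapunov exponent at r = 4 is known to be ln 2. Strictly speaking the “knowledge horizon” comes from the inverse of the exponent (usually called the Lyapunov time), scaled by how precisely the starting state is known. The function names, starting value and error sizes below are all illustrative assumptions.

```python
import math


def lyapunov_logistic(r: float = 4.0, x0: float = 0.2,
                      n: int = 100_000, burn_in: int = 1_000) -> float:
    """Estimate the Lyapunov exponent of the logistic map x -> r * x * (1 - x)."""
    x = x0
    for _ in range(burn_in):          # discard the initial transient
        x = r * x * (1 - x)
    total = 0.0
    for _ in range(n):
        # The exponent is the long-run average of log|f'(x)|, with f'(x) = r * (1 - 2x)
        total += math.log(abs(r * (1 - 2 * x)))
        x = r * x * (1 - x)
    return total / n


def prediction_horizon(exponent: float, initial_error: float, tolerance: float) -> float:
    """Steps until an initial error, growing as e^(exponent * t), reaches the tolerance."""
    return math.log(tolerance / initial_error) / exponent


lam = lyapunov_logistic()
print(f"estimated exponent: {lam:.3f} (theory: ln 2 = {math.log(2):.3f})")
print(f"horizon for a 1e-10 initial error and 0.1 tolerance: "
      f"{prediction_horizon(lam, 1e-10, 0.1):.0f} steps")
```

The second function is the useful part of the analogy: the horizon only grows with the logarithm of your initial precision, so a vastly better “requirements spec” buys just a little more foresight.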

So what’s this all got to do with DevOps?

Well, “classical” project management takes a rather linear, deterministic view of the world – that we can accurately define a set of initial conditions that are complete (the mythical “requirements spec”) and that we can define a series of steps (the “waterfall process”) that will result in us achieving a desired end state (a working system that matches the specification AND the customer’s intent).

But what happens if software development is “non-linear”, chaotic and subject to “the butterfly effect”?

If “small changes to the initial conditions” (i.e. the spec is a tiny bit wrong, or more likely the customer changes their mind) can have a major impact on the end state (not to mention any of the other factors that might turn a project plan to “chaos” in the common usage of the term) does any given software project have a “Lyapunov Exponent” that limits our ability to predict the future?

I’d argue that it might… and this is intuitively what DevOps seeks to address in the “2nd Way of DevOps”:

The Second Way is about creating the right to left feedback loops. The goal of almost any process improvement initiative is to shorten and amplify feedback loops so necessary corrections can be continually made.

By shortening the feedback cycle we, consciously or unconsciously, are seeking to bring back our project cycles within our “Lyapunov Exponent”. If we can carry out our planning cycles (Sprints in an Agile development world) within that “limit of our foreknowledge” we create a system where we can correct and tweak the “initial conditions” of our next cycle to seek to keep our project “on track”.

What I also find interesting is the idea that different projects have different “Lyapunov Exponents” depending on their degree of “non-linearity” and sensitivity to initial conditions etc.

Perhaps this is why different projects can have success (or failure) with different sprint durations.

Just because one-, two- or four-week sprints work for Project X doesn’t mean that’s the best duration for Project Y, because it has a different exponent. Part of our feedback cycle also involves finding the optimal value for our “knowledge horizon” – not so far out we descend into “chaos” but not so short we spend more time re-planning than we might need to otherwise do.

Sprint Duration < “Lyapunov Exponent” = DevOps Success?

Sadly I don’t think that there is any theoretical way to determine a project’s (or organisation’s) hypothetical “Lyapunov Exponent” when it comes to project planning… it has to be empirical, based on trial and error, but I’d be curious to know if, even anecdotally, for a given organisation or type of project the likelihood of a “successful project” falls off a cliff beyond some planning horizon. If there was some way to determine the “Lyapunov Exponent” for a project and optimise an organisation’s feedback cycle (or sprint duration) below that limit then I think that would be of significant benefit.

-TheOpsMgr

p.s. yes, I know I’ve played fairly fast and loose with the mathematical usage of the word “chaos” compared to its vernacular usage. I apologise to any mathematicians reading this blog but I hope you’ll forgive me in the spirit of exploring an interesting concept!

image source: - CC  -  donkeyhotey via Flickr -http://farm7.staticflickr.com/6077/6144165108_8758c2a5c5_d.jpg

DevOps, Death Marches and the Hero Hacker Myth

I’ve been thinking a lot about DevOps culture over the last few weeks which is probably why my ears pricked up when I overheard someone telling a story about an IT project they’d worked on where they did “10 weeks straight, 7 days a week” in order to get the project in on time.

These types of stories are a key part of any organisation’s culture – they form part of the mythology that defines the cultural norms and “the way things are done” within the organisation and can tell you a lot about its values.

In this case the speaker, judging by his tone of voice, was very proud of the effort they’d put in and the fact they’d managed to “pull it out of the bag” with such a super-human effort.

After all, the project was a success, right?

Death Marches

My perception, from the outside, was a bit different. To me it sounded like a “death march project”, which Ed Yourdon defines as:

“A death march project is one for which an unbiased, objective risk assessment (which includes an assessment of technical risks, personnel risks, legal risks, political risks, etc.) determines that the likelihood of failure is ≥ 50 percent.” – Edward Yourdon

Wikipedia goes on to say:

Often, the death march will involve desperate attempts to right the course of the project by asking team members to work especially grueling hours (14-hour days, 7-day weeks, etc) or by attempting to “throw (enough) bodies at the problem”, often causing burnout.
– Wikipedia, “Death march (project management)”

Sound familiar?

Hero Hacker Myths

Closely related to the “Death March” is the “Hero Hacker Myth”, neatly described here by Rob Mee:

“The myth of the hero hacker is one of the most pervasive pathologies to be found in Silicon Valley start-ups: the idea that a lone programmer, fuelled by pizza and caffeine, swaddled in headphones, works all hours of the night to build a complex system, all by himself. Time out. Software development, it turns out, is a team sport. All start-ups grow, if they experience any meaningful success. What works for a lone programmer will not work in a company of 10. And what’s worse, encouraging the hero mentality leads to corrosive dysfunction in software teams. Invariably the developers who do a yeoman’s 9-to-5, week after week, cranking out solid features that the business is built on, lose out to the grasping egomaniacs who stay up all night (usually just one night) looking to garner lavish praise. Rather than reward the hero, it’s better to cultivate a true esprit de corps.” – Rob Mee, Pivotal Labs

One sentence in the quote above is central to the message of this post, namely that “encouraging the hero mentality leads to corrosive dysfunction in software teams” – if your organisational culture mythologises “Death March”-type projects or the “Hero Hacker” then you have a dysfunctional culture that needs to be fixed.

Once the Death March mythology is ingrained into your organisational culture EVERY project will end up being a Death March. Why? Because if they’ve “succeeded” in the past (because of super-human effort on the part of the project team) with a project that was only allocated 50% of the time, resources or budget it needed to have a realistic chance of success then that 50% becomes the new norm.

What manager is going to scope the next project, of equivalent size, with TWICE the time, resources or budget of their last Death March? It would probably be career suicide.

Of course, this is only possible because the hidden costs of Death March projects like staff turnover, sickness, burn-out, divorce, poor quality / huge technical debt, etc. don’t appear on the project’s balance sheet. The narrow ROI view of the project budget doesn’t include these intangibles which are normally shouldered by the wider IT Department budget, or borne solely by the individuals whose health, marriages and families are harmed in the process.

Similarly the Hero Hacker Myth is an enabler for the Death March by mythologising the long hours and hero mentality that any Death March project requires to have any chance of success.

In addition, as Rob Mee’s post points out, the Hero Hacker Myth is corrosive to the idea of teamwork and shared responsibility that methodologies like Agile and DevOps seek to engender. Almost by definition the “Hero Hacker” is anti-social in his work practices and is likely to hoard his knowledge in a way that hurts the team’s success.

DevOps

So what does this have to do with DevOps?

Well, if we view the Death March and the Hero Hacker through the lens of the “Three ways of DevOps” we can see that DevOps should stop Death Marches after a few miles and show the Hero Hacker the door.

The “First Way” emphasises “Systems Thinking” – seeing the big picture about how value is delivered to the organisation and the customer.

Systems thinking should encompass the intangibles discussed above and see that you can’t deliver sustainable long-term value on a foundation of projects that have a 50% chance of failure and that require super-human effort to have any chance to succeed.

It’s like tossing a coin and expecting it to come down heads time and time again – eventually it will come down tails and you’ll have a massive project failure that will probably destroy any value generated by your earlier “successful” Death March projects.
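The coin-toss arithmetic is easy to check. The sketch below assumes each Death March is an independent 50/50 bet, which is a simplification, and the function name is made up for the example.

```python
def chance_all_succeed(n_projects: int, p_fail: float = 0.5) -> float:
    """Probability that n independent projects in a row all avoid failure."""
    return (1 - p_fail) ** n_projects


for n in (1, 2, 3, 5):
    print(f"{n} Death March(es) in a row all succeeding: {chance_all_succeed(n):.1%}")
```

On those assumptions, the odds of getting away with five Death Marches in a row are about 3%.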

The “Second Way” says “Amplify Feedback Loops” – and the key way to do this is to shorten the feedback cycle by iterating faster. This is the essence of the “fail fast, fail often” mantra which is the antithesis of the long, destructive Death March. With short feedback loops the message that the goal is likely to be unachievable should be received faster.

Of course, just because you get rapid feedback that your project is turning into a Death March (or that your Hero Hacker is on his 3rd all-nighter and is running out of Jolt Cola) doesn’t necessarily mean anyone will heed the message and take corrective action.

That’s where the Third Way of DevOps comes in – foster a “Culture of Continual Experimentation and Learning”. DevOps should foster a culture of learning from your mistakes, whether it’s your previous Death March or the fast feedback on your current project that you receive from amplifying your feedback loops.

“Those who do not learn from history are doomed to repeat it” and the essence of the Third Way is to create the “virtuous spiral” by taking the learnings and lessons from the previous iteration and applying them to the next cycle.

Similarly a culture of learning is ipso facto a culture of sharing which means the knowledge hoarding of the Hero Hacker should be anathema to a DevOps culture.

In summary, listen to the stories your organisation mythologises and see if those stories are taking you down the path of DevOps learning or yet another destructive Death March waiting for a Hero Hacker to save it…