DevOps and the Knowledge Horizon

One of the things I love about SF novels is that they are always full of fascinating ideas that sometimes spark resonances in my own thinking (in this case, on DevOps).

Alastair Reynolds’ new novel “On the Steel Breeze” has this passage when discussing the rotation of Hyperion (one of Saturn’s moons):

“There’s a value in chaos theory, a number called the Lyapunov exponent, which tells you how to predict a chaotic system’s boundary – its knowledge horizon, if you like. Hyperion’s Lyapunov exponent is just forty days – we can’t predict this moon’s motion beyond the next forty days. That’s the maximum limit of our foreknowledge! If my life depended on this moon’s motion, I would still not be able to say a word about its state beyond forty days.”

My first thought when I read this was “Yep, sounds like every IT project I’ve ever worked on… pretty much turns to chaos after about 40 days…”.

Now, the mathematics of the Lyapunov Exponent are way beyond my long-forgotten basics of high school calculus so I can’t comment on whether the author’s usage of the exponent is valid but the concept – that there is a limit on our foreknowledge beyond some sort of chaos event horizon – really stuck with me.
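For anyone who wants to see a "knowledge horizon" in action, here is a minimal Python sketch. It is my own illustration, not anything from the novel: it uses the logistic map (a standard textbook toy for chaos, assumed here with r = 4), starts two runs that differ by a tiny error, and counts the steps until they disagree by more than a chosen tolerance. The function names, the starting value 0.2 and the tolerance of 0.1 are all arbitrary assumptions.

```python
def logistic_map(x, r=4.0):
    """One step of the logistic map, a standard toy chaotic system at r = 4."""
    return r * x * (1.0 - x)


def knowledge_horizon(x0, error=1e-10, tolerance=0.1, max_steps=200):
    """Return the first step at which two nearly identical starting points
    differ by more than `tolerance`, or None if that never happens
    within `max_steps`."""
    a, b = x0, x0 + error
    for step in range(1, max_steps + 1):
        a, b = logistic_map(a), logistic_map(b)
        if abs(a - b) > tolerance:
            return step
    return None


if __name__ == "__main__":
    # Shrinking the initial error only buys a few extra steps of foresight.
    for error in (1e-6, 1e-10, 1e-14):
        horizon = knowledge_horizon(0.2, error)
        print(f"initial error {error:g}: forecast useless after step {horizon}")
```

The thing to notice is that improving the initial measurement by many orders of magnitude pushes the horizon out by only a modest number of steps, which is the intuition behind a limit on foreknowledge.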

So what’s this all got to do with DevOps?

Well, “classical” project management takes a rather linear, deterministic view of the world – that we can accurately define a set of initial conditions that are complete (the mythical “requirements spec”) and that we can define a series of steps (the “waterfall process”) that will result in us achieving a desired end state (a working system that matches the specification AND the customer’s intent).

But what happens if software development is “non-linear”, chaotic and subject to “the butterfly effect”?

If “small changes to the initial conditions” (i.e. the spec is a tiny bit wrong, or more likely the customer changes their mind) can have a major impact on the end state (not to mention any of the other factors that might turn a project plan to “chaos” in the common usage of the term) does any given software project have a “Lyapunov Exponent” that limits our ability to predict the future?

I’d argue that it might… and this is intuitively what DevOps seeks to address in the “2nd Way of DevOps”

The Second Way is about creating the right to left feedback loops. The goal of almost any process improvement initiative is to shorten and amplify feedback loops so necessary corrections can be continually made.

By shortening the feedback cycle we, consciously or unconsciously, are seeking to bring back our project cycles within our “Lyapunov Exponent”. If we can carry out our planning cycles (Sprints in an Agile development world) within that “limit of our foreknowledge” we create a system where we can correct and tweak the “initial conditions” of our next cycle to seek to keep our project “on track”.

What I also find interesting is the idea that different projects have different “Lyapunov Exponents” depending on their degree of “non-linearity” and sensitivity to initial conditions etc.

Perhaps this is why different projects can have success (or failure) with different sprint durations.

Just because one-, two- or four-week sprints work for Project X doesn’t mean that’s the best duration for Project Y, because it has a different exponent. Part of our feedback cycle also involves finding the optimal value for our “knowledge horizon” – not so far out that we descend into “chaos”, but not so short that we spend more time re-planning than we need to.

Sprint Duration < “Lyapunov Exponent” = DevOps Success?

Sadly I don’t think that there is any theoretical way to determine a project’s (or organisation’s) hypothetical “Lyapunov Exponent” when it comes to project planning… it has to be empirical, based on trial and error. I’d be curious to know, even anecdotally, whether for a given organisation or type of project the likelihood of a “successful project” falls off a cliff beyond some planning horizon. If there was some way to determine the “Lyapunov Exponent” for a project and optimise an organisation’s feedback cycle (or sprint duration) below that limit then I think that would be of significant benefit.
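If you wanted to try the empirical route, one rough way is to bucket past planning cycles by their planned length and see where the success rate drops. The sketch below assumes you have some record of (planned length in days, whether the cycle stayed on track). The function names, the 70% threshold and the sample records are all invented purely for illustration, and are not real data.

```python
from collections import defaultdict


def success_rate_by_horizon(history):
    """Group past cycles by planned length (days) and return
    {length: fraction that stayed on track}, ordered by length."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for planned_days, on_track in history:
        totals[planned_days] += 1
        if on_track:
            wins[planned_days] += 1
    return {days: wins[days] / totals[days] for days in sorted(totals)}


def estimate_horizon(rates, threshold=0.7):
    """Longest planned length reached before the success rate first
    falls below `threshold`; None if even the shortest length fails."""
    best = None
    for days in sorted(rates):
        if rates[days] < threshold:
            break
        best = days
    return best


# Made-up records for illustration only: (planned days, stayed on track?)
history = [
    (7, True), (7, True),
    (14, True), (14, True), (14, True), (14, False),
    (28, True), (28, False), (28, False),
    (56, False), (56, False),
]

rates = success_rate_by_horizon(history)
print(rates)
print("estimated planning horizon (days):", estimate_horizon(rates))
```

With a handful of records this is anecdote rather than evidence, but it does give the "falls off a cliff" question a concrete shape to test against your own history.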

-TheOpsMgr

p.s. yes, I know I’ve played fairly fast and loose with the mathematical usage of the word “chaos” compared to its vernacular usage. I apologise to any mathematicians reading this blog but I hope you’ll forgive me in the spirit of exploring an interesting concept!

Image source: CC, donkeyhotey via Flickr, http://farm7.staticflickr.com/6077/6144165108_8758c2a5c5_d.jpg

DevOps – how to find the constraints in your IT services? Part I

One of the key quotes in “The Phoenix Project” was, for me, “Any improvement not made at the constraint is an illusion”.

The logic is clear – any improvement in delivery upstream from the constraint just causes more work to queue up at the bottleneck, and any improvement downstream just means more idle time for the downstream resources as they wait for work to be released from the bottleneck.

So, how can you identify the constraint(s) in your IT services?

Well, in the Phoenix Project it is obvious – the constraint is “Brent”, the uber-geek with his hands in every project and too much un-documented knowledge in his head. However, in your organisation the constraints might be much more subtle and may need a more methodical investigation to uncover.

I am sure that the business process re-engineering and Six-Sigma experts out there have a wealth of methodologies to discover these constraints, but the purpose of this article is to outline a pragmatic approach, derived from first principles, that you can use to get started. (p.s. if you ARE a business process re-engineering and Six-Sigma expert please feel free to point us to relevant techniques and models via the comments!).

Where should you start? Clearly, your focus needs to be on those IT services (people, process and technology) that are key to your business success and contribute the most to your organisation’s strategic objectives. (BTW one of the other key lessons from the Phoenix Project was how many “mission critical projects” actually had no real linkage to the company’s strategic objectives or business plans.)

So, step #1 is to dig out your organisation’s annual report and most recent strategy presentation and make sure that YOU know clearly what the organisation’s goals are. This is your “BS filter” for the rest of the process… if someone complains about an IT service but can’t link that service or project back to a key business objective then put that one at the bottom of the pile!

Step #2 is to do a bit of exploratory research with your key business users (note, business users, not IT staff!) which should quickly provide you with a list of the “Top Ten IT services that get in the way of business success”. This list might include services that you DON’T currently deliver, e.g. “if only we had a more flexible and responsive laptop support team, like a ‘genius bar’, where our mobile sales teams could get their laptop problems fixed fast and get back on the road selling to our customers”.

Step #3 is to try to validate these subjective opinions with some objective data.

Now, assuming you have some type of helpdesk logging system you should have a ready source of data about the performance of your key IT services. If you DON’T have some form of logging tool then find a way to start logging your work, immediately, even if it’s just one big shared Excel spreadsheet. Personally, I like Service-now.com and have used that very successfully at a number of sites but the key point is “you can’t manage what you can’t measure” and to have any hope of demonstrating improvement you need to be able to measure the before & after impact of whatever changes you make.

What sort of things can you look at? Well, things like: which tasks/services/processes are used the most? Which take the longest? Which breach their SLAs the most (days overdue)? Who in your team has the most items assigned to them? Which systems/services have the lowest availability? All of these (and many others) should start pointing you in the right direction for things that need improvement.
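As a rough illustration of those questions, here is a small Python sketch that summarises a helpdesk export. It assumes each ticket can be reduced to a service name, an assignee, the hours taken to resolve it and the SLA target in hours. Those field names and the sample tickets are hypothetical, not the schema of any particular tool, so you would need to map them onto whatever your own logging system exports.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class Ticket:
    # Hypothetical fields; rename to match your own helpdesk export.
    service: str
    assignee: str
    hours_to_resolve: float
    sla_hours: float


def summarise(tickets):
    """Answer the basic questions: busiest services, slowest services,
    most SLA breaches, and who carries the most tickets."""
    hours = defaultdict(list)
    for t in tickets:
        hours[t.service].append(t.hours_to_resolve)
    return {
        "volume": Counter(t.service for t in tickets),
        "workload": Counter(t.assignee for t in tickets),
        "breaches": Counter(
            t.service for t in tickets if t.hours_to_resolve > t.sla_hours
        ),
        "mean_hours": {s: sum(h) / len(h) for s, h in hours.items()},
    }


# Made-up tickets for illustration only.
tickets = [
    Ticket("laptop support", "brent", 30.0, 8.0),
    Ticket("laptop support", "brent", 6.0, 8.0),
    Ticket("email", "alice", 2.0, 4.0),
    Ticket("vpn", "brent", 12.0, 8.0),
]

report = summarise(tickets)
print("busiest services:", report["volume"].most_common(3))
print("most loaded people:", report["workload"].most_common(3))
print("SLA breaches:", report["breaches"].most_common(3))
print("mean hours to resolve:", report["mean_hours"])
```

Even a crude summary like this, run against the shared spreadsheet, is enough to check whether the complaints from the business line up with what the data says.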

So, now you have a list of the stuff that’s most important to the business, backed up with objective data about how frequently the issues occur and how many users are impacted. That should enable you to make a first pass at what’s important and where to start.

So, write them up on cards, stick them up on your Kanban Board, and move your top pick to “in progress” and start working on the detail of finding and fixing the constraints in that service.

How we’ll do that step will be in Part II!

-TheOpsMgr

Photo: Flickr/bjornmeansbear

Are you really doing DevOps? 8 prerequisites you must consider

A recent “2013 State of DevOps” report from PuppetLabs indicated that “52% of [over 4,000] respondents said that they’ve been doing DevOps for > 12 months”. Our own, less formal, survey found very much the same – 60% of respondents indicated that they were currently doing DevOps.

PuppetLabs Survey Results

The question is: are you really doing DevOps?

As we pointed out in our earlier DevOpsDays wrap-up blog there are at least 7 different definitions of what DevOps is, judging by the presentations given on the day:

DevOps … is a movement, a philosophy, a way of thinking.
DevOps … is a person who can perform both Dev and Ops roles.
DevOps … means cross skilling people.
DevOps … is continuous delivery.
DevOps … is a team of developers and operation staff.
DevOps … is a culture movement.
DevOps … is monitoring.

So… what’s the “minimum viable product” (MVP) for DevOps? What core things should you be doing before you can truly say you are “doing DevOps”?

In the whitepaper “The Top 11 Things You Need To Know About DevOps” Gene Kim emphasizes the “3 Ways”:

(1) “Emphasize the performance of the entire system” – a holistic viewpoint from requirements all the way through to Operations
(2) “Creating feedback loops” – to ensure that corrections can continually be made. A TQM philosophy, basically.
(3) “Creating a culture that fosters continual experimentation and understanding that repetition and practice are the pre-requisites to mastery”

These are excellent guidelines at a high level, but we’d like to see a more operational definition. So we’ve made up our own list!

As a starter, we propose that:

  1. You must have identified executive sponsors / stakeholders who you are actively working with to promote the DevOps approach.
  2. You must have developed a clear understanding of your organisation’s “value chain” and how value is created (or destroyed) along that chain.
  3. You must have organizationally re-structured your development and operations teams to create an integrated team – otherwise you’re still in Silos.
  4. You must have changed your team incentives (e.g. bonus incentives) to reinforce that re-alignment – without shared Goals you’re still in Silos.
  5. You must be seeking repeatable standardized processes for all key activities along the value chain (the “pre-requisite to mastery”)
  6. You must be leveraging automation where possible – including continuous integration, automated deployments and “infrastructure as code”
  7. You must be adopting robust processes to measure key metrics – PuppetLabs’ report focuses on improvement in 4 key metrics – Change Frequency, Change Lead Time, Change Failure Rate and MTTR. We suggest Availability, Performance and MTBF should be in there too.
  8. You must have identified well-defined feedback mechanisms to create continuous improvement.
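To make item 7 a little more concrete, here is a minimal Python sketch of how the metrics named there might be calculated from a change log and an outage log. The definitions used are common simple ones that we have assumed for illustration (for example MTBF as uptime divided by number of outages); they are not taken from the PuppetLabs report, and the record layout and sample numbers are made up. Both functions assume at least one change and at least one outage in the period.

```python
from dataclasses import dataclass


@dataclass
class Change:
    # Hypothetical record: hours from commit to running in production,
    # and whether the change caused a failure.
    lead_time_hours: float
    failed: bool


def delivery_metrics(changes, period_days):
    """Change Frequency, Change Lead Time and Change Failure Rate."""
    n = len(changes)
    return {
        "change_frequency_per_day": n / period_days,
        "mean_lead_time_hours": sum(c.lead_time_hours for c in changes) / n,
        "change_failure_rate": sum(c.failed for c in changes) / n,
    }


def reliability_metrics(outage_hours, period_hours):
    """MTTR, MTBF and Availability from a list of outage durations."""
    downtime = sum(outage_hours)
    uptime = period_hours - downtime
    n = len(outage_hours)
    return {
        "mttr_hours": downtime / n,            # mean time to repair
        "mtbf_hours": uptime / n,              # mean time between failures
        "availability": uptime / period_hours,
    }


# Made-up numbers for a 28-day period, for illustration only.
changes = [Change(24, False), Change(48, True), Change(12, False), Change(36, False)]
print(delivery_metrics(changes, period_days=28))
print(reliability_metrics([1.0, 3.0], period_hours=28 * 24))
```

Whatever definitions you settle on, the important thing is to keep them fixed so that the before and after numbers are comparable.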

As mentioned above, this is just a starter list – feel free to agree/disagree in the comments and suggest additions or alterations.

We’ll be writing more about “DevOps Incentives” in an upcoming post, and we’ll revisit the “Are you doing DevOps?” topic once we’ve consolidated your feedback.

-TheOpsMgr