Tag Archives: Culture


The secret of #DevOps success isn’t in the IT literature (yet)!

Can you find “DevOps Success” by reading only IT literature?

The answer is “mostly No, but a little bit Yes”, for a number of reasons.

The main reason is that many of the blogs, whitepapers and webinars around DevOps are ultimately about technology and toolchains. Whilst they might reference the DevOps C.A.L.M.S. model in passing, the conversation is generally focussed on the A for Automation and the M for Metrics.

CALMS Model image
Culture, Automation, Lean, Metrics, Sharing

The seminal Phoenix Project did talk about organisational culture, as did Mandi Walls’ O’Reilly ebook on Building a DevOps Culture.

But what both of those books have in common is that they drew extensively from non-IT business literature.

Goldratt’s “The Goal” and the Theory of Constraints, Systems Thinking, Lean manufacturing, Deming and Kanban were major influences on The Phoenix Project, while Mandi’s e-book draws on business-centric cultural/organisational change literature.

“Searching on the Harvard Business Review website for “cultural change” will get you 60+ publications going back nearly 30 years” – Mandi Walls

“Lean Manufacturing”, “Strategic Alignment”, “Organisational Change”, “Culture”, “Business Transformation” and many other topics have been staples of the MBA curriculum in business schools for many years, and there is a wealth of resources available online to explore. Reading beyond the IT literature and exploring the wider business context for your DevOps Transformation will, we believe, significantly increase your chances of getting business buy-in and having a successful outcome to your DevOps change programme.

In order to make this easier for you, we (@DevOpsGuys) will be publishing a weekly blog post exploring an area of business literature and how it can be applied to DevOps.

We’re calling it the #DevOpsMBA :-)

So please subscribe to our blog (link on right), follow us on Twitter, or search for the #DevOpsMBA hashtag to keep informed!

Why companies are investing in DevOps and Continuous Delivery

A thought-provoking infographic – with some interesting data points – shows how companies are reaping real rewards from investing in agile software delivery processes. Check out the graphic – from Zend – for more on how DevOps and Continuous Delivery are bridging the speed and innovation gap between business demand and IT.

Continuous Delivery Infographic by Zend Technologies.



DevOps and the Product Owner

In a previous post we talked a lot about the “Product-centric” approach to DevOps but what does this mean for the role of the Agile “Product Owner”?

So what is the traditional role of the Product Owner? Agile author Mike Cohn of Mountain Goat Software defines it thus:

“The Scrum product owner is typically a project’s key stakeholder. Part of the product owner responsibilities is to have a vision of what he or she wishes to build, and convey that vision to the scrum team. This is key to successfully starting any agile software development project. The agile product owner does this in part through the product backlog, which is a prioritized features list for the product.

The product owner is commonly a lead user of the system or someone from marketing, product management or anyone with a solid understanding of users, the market place, the competition and of future trends for the domain or type of system being developed” – Mike Cohn

The definition above is very “project-centric” – the Product Owner’s role appears to be tied to the existence and duration of the project and their focus is on the delivery of “features”.

DevOps, conversely, asks us (in the “First Way of DevOps”) to use “Systems Thinking” and focus on the bigger picture (not just “feature-itis”) and the “Product-centric” approach says we need to focus on the entire lifecycle of the product, not just the delivery of a project/feature/phase.

Whilst decomposing the “big picture” into “features” is something we completely agree with, as features should be the “unit of work” for your Scrum teams or “Agile Software Development Factory”, it needs to be within the context of the Product Lifecycle (and the “feature roadmap”).

So the key shift here then is to start talking about the “Product Lifecycle Owner”, not just the Product Owner, and ensure that Systems Thinking is a critical skill for that role.

The second big shift with DevOps is that “Non-Functional Requirements” proposed by Operations as being critical to the manageability and stability of the product across its full lifecycle “from inception to retirement” must be seen as equally important as the functional requirements proposed by the traditional Product Owner role.

In fact, we’d like to ban the term “Non-Functional Requirements” (NFRs) completely, as the name itself seems to carry an inherent “negativity” that we feel contributes to the lack of importance placed on NFRs in many organisations.

We propose the term “Operational Requirements” (ORs) as we feel that this conveys the correct “product lifecycle-centric” message about why these requirements are in the specification – “This is what we need to run and operate this product in Production across the product’s lifecycle in order to maximise the product’s likelihood of meeting the business objectives set for it”.


For the slightly more pessimistic or combative amongst you, the “OR” in Operational Requirements can stand for “OR this doesn’t get deployed into Production…”.

The unresolved question is: do we need an “Operational Product Owner”, or does the role of the traditional, business-focussed Product Owner extend to encompass the operational requirements?

You could argue that the “Operational Product Owner” already partly exists as the “Service Delivery Manager” (SDM) within the ITIL framework, but SDMs rarely get involved in the software development lifecycle as they are focussed on the “delivery” part at the end of the SDLC. However, their role could be extended to include driving Operational Requirements into the SDLC as part of the continual service improvement (CSI) process.

That said, having two Product Owners might be problematic and confusing from the Agile development team’s perspective, so it would probably be preferable if the traditional business Product Owner was also responsible for the operational requirements as well as the functional requirements. This may require the Product Owner to have a significantly deeper understanding of technology and operations than previously; otherwise, trying to understand why “loosely-coupled session state management” is important to “horizontal scalability” might result in some blank faces!

So in summary a “DevOps Product Owner” needs to:

  • Embrace “Systems Thinking” and focus on the “Product Lifecycle”, not just projects or features
  • Understand the “Operational Requirements” (and just say “No to NFRs”!)
  • Ensure that the “ORs” are treated as equally important as the “Functional Requirements” in the Product roadmap, and champion their implementation

In future posts we’ll examine the impact of DevOps on other key roles in the SDLC & Operations. We’d love to get your opinions in the comments section below!

-TheOpsMgr

image source: – CC  – CannedTuna via Flickr – http://www.flickr.com/photos/cannedtuna/7348317140/sizes/m/


DevOps and the “Product-centric” approach

When doing the research for our BrightTalk Webinar on DevOps I came across this quote from Jez Humble on “products not projects”, which really struck a chord with our thinking about what we call the “product-centric” approach to DevOps.

Products not Projects quotation

Figure 1 – DevOpsGuys BrightTalk Webinar

One of the key elements of DevOps is to ensure that IT strategy is directly linked with Business strategy.

One of the critical scenes in “The Phoenix Project” is where our hero Bill gets to meet with the CFO and they work through the list of “pet IT projects” in the organisation and find that many of them can’t really be tied back to any business strategy or organisational goal.

This is, we feel, one of the major problems with the “project-centric” view of IT and why we need to push for a “product-centric” view.

The “project-centric” view, whilst important for mobilising resources and organising activities, can easily get disconnected from the original business objectives. Any “Project”, just like the “Phoenix Project” in the novel, runs the risk of becoming its own “raison d’etre” as it becomes more about getting “the Project” over the line than whatever business benefits were originally proposed.

In contrast a “Product-centric” viewpoint is focussed on the Product (or Service) that you are taking to market. By keeping the focus on “the Product” you are constantly reminded that you are building a product, for customers, as part of an organisational objective that ties back to over-arching business strategy.

For example if you were in the travel sector you might be adding a new “Online Itinerary Manager” product to your website to enable your customers to view (and possibly update) their itinerary online as part of your business strategy to both empower customers online and reduce the number of “avoidable contacts” to your call centre (and hence reduce costs).

One of the other benefits of the “product-centric” view is also highlighted in Jez’s quote above – “From Inception to Retirement”.

The “product-centric” view reminds us to think about the “Product lifecycle” and not just the “software development lifecycle”.


You need to understand how this product is going to be deployed, managed, patched, upgraded, enhanced and ultimately retired… and that means you need close cooperation between the business, the developers and operations (= DevOps!).

So how do you introduce the “Product-centric” view into an organisation that might already have an existing website that offers products/services to customers?

Well, firstly, you need to stop referring to it as “the website”.

A “website” is a platform and a channel to market, it’s not a “product”.

A product is a good or service that you offer to the market in order to meet a perceived market need, in the hope that you will in return receive an economic reward.

For some “websites” there might be a single product, e.g. Dominos sells pizza online (food products), and for others there might be multiple products, e.g. theAA.com sells membership (breakdown services), Insurance, Financial Services, Driver Services (mostly training) and Travel Services. If you’re ever in doubt about the products your website sells, your top navigation menu will probably give you a pretty good indication!

What Products does the AA Sell?

Figure 2 – what products does the AA sell?

It’s worth mentioning that the AA has another product too – the “RoutePlanner”. Although the route planner is “free”, it has value to the organisation both indirectly (by drawing traffic to the site that you might then cross-sell to) and intangibly (by offering a valuable service for “free” it enhances the brand).

Secondly, you need to identify the “product owners” for each of the core products that use your website as a channel to market, so in our AA example you’d probably have separate product owners for Membership, Insurance, Finance etc.

If you’re ever in doubt about who is, or isn’t, the right product owner then there is a simple test – “Do you have a significant financial incentive (bonus) for Product “X” to succeed in the market?”.

The “Product Owner” Test:
“Do you have a significant financial incentive (bonus) for Product “X” to succeed in the market?”

If the answer is “no” keep going up the hierarchy until you find someone who says “yes”. A product owner who doesn’t have any “skin in the game” regarding the success of their product line is a bad idea!

Thirdly, you need to work with the product owner to map out the product lifecycle of that product, and then identify the IT dependencies and deliverables at every stage along that product lifecycle. Within your product lifecycle you might also want to map out the “feature roadmap” if your product devolves into multiple features that will be released over time. Creating the “big picture” can be vital in motivating your teams and helping them to understand what you’re trying to achieve, and this in turn helps them to make better decisions.

Fourthly, you need to “sense check” your product lifecycle and feature roadmap against your business strategy and organisational goals. If they don’t align you either need to re-work the plan or you might decide to drop the product altogether and re-deploy those resources to a product that *IS* part of your core strategy.

Lastly, you need to re-organise your DevOps teams around these products and align your delivery pipeline and processes with the product lifecycle (and feature roadmap). Your DevOps teams are responsible for the “inception to retirement” management of that product. (Top tip – just like your “product owner”, it might be a great idea to incentivise your DevOps teams with some “product success” (business) metrics in their bonus, not just technical metrics like “on-time delivery” or “system availability”. It never hurts for them to have some “skin in the game” to promote a sense of ownership in what they are delivering!)

So, to summarise, the key elements in a “product-centric approach” are:

(1) Break down your “website” into the key “Products” (that generate business value)

(2) Identify “Product Owners” with “skin in the game”

(3) Map out the product lifecycle (and ideally the feature roadmap) for each product with them

(4) Sense check this product strategy against your organisational strategy

(5) Align your DevOps teams with the core products (and incentivise them accordingly!)

So next time you’re in a meeting and someone proposes a new “Project” see if you can challenge them to create a new “Product” instead!

 

image source: - CC  - jeremybrooks via Flickr - http://www.flickr.com/photos/jeremybrooks/5401494223/sizes/s/

Is DevOps your defence against Shadow IT?

We’ve been evangelising DevOps quite a bit lately amongst customers and partners, and one of the arguments that seems to resonate most with people about why the current paradigm of IT Development and Operations is “broken” is the rise of “Shadow IT”.

“Shadow IT is a term often used to describe IT systems and IT solutions built and used inside organizations without organizational approval. It is also used, along with the term “Stealth IT,” to describe solutions specified and deployed by departments other than the IT department.” – Wikipedia

“Shadow IT” is nothing new – it’s been around pretty much since the invention of the PC and client/server computing – but what is new is the speed and ease at which “Shadow IT” can be deployed, and the performance, reliability and stability of that “Shadow IT” solution.

“Sally from Marketing”, armed with nothing more than a company credit card, can instantiate an arbitrary number of servers, of varying capacity and specification, from a wide variety of Cloud hosting providers. Depending on the quality of the internal IT and her choice of Cloud provider, it’s quite possible that she will get better uptime and performance from her Shadow IT solution than she might get internally.

She can then find a third-party software house to write some bespoke software (and then someone like DevOpsGuys to deploy and manage it… Ooopps!) and her “Shadow IT solution” is up and running (not to mention the many other SaaS solutions that she could consume).


Figure 1 – DevOpsGuys – BrightTalk Webinar

In our recent BrightTalk webinar we spoke about how Gartner predicts that “Shadow IT” is expected to grow, and that there is some evidence (the PwC survey) that there is a negative correlation between “IT control” and organisational performance.

So, in summary, the traditional silo-mentality model of IT clearly isn’t meeting the customer’s needs for flexibility, innovation and time-to-market, and Cloud Computing is enabling the growth of “Shadow IT” on a scale never seen before.

To take a military analogy this is like the invention of highly mobile, mechanised “manoeuvre warfare” during WW II. The entrenched positions of the Maginot Line (think “traditional IT departments”) were rendered irrelevant by the “blitzkrieg” tactics (think “Shadow IT”) of the Wehrmacht as they simply bypassed the fixed fortifications with their more manoeuvrable, dare I say “agile”, mechanised infantry.

What is particularly fascinating, if Wikipedia can be believed, is that “blitzkrieg”, contrary to popular belief, was never a formal warfighting “doctrine” of the German Army (emphasis mine):

“Naveh states, “The striking feature of the blitzkrieg concept is the complete absence of a coherent theory which should have served as the general cognitive basis for the actual conduct of operations”. Naveh described it as an “ad hoc solution” to operational dangers, thrown together at the last moment”  – Wikipedia

“An ad-hoc solution to operational dangers, thrown together at the last moment” is probably a pretty good definition of “Shadow IT” too, but the important fact to remember is that Blitzkrieg worked (whether it was a formal doctrine or not). It crushed the opposition and subsequently became the cornerstone of modern military “combined arms” doctrine.

So, what’s this got to do with DevOps?

Well, clearly Gartner think that “Shadow IT” is working well too, and will continue to “outflank” traditional IT Departments.

Our view is that DevOps can be seen as the perfect defence to “Shadow IT” as it co-opts many of the key “manoeuvre warfare” concepts to provide the user with the speed, flexibility and time-to-market they want, but still within the control of IT to ensure standards, security and compliance.

DevOps, by breaking down the silos between Development and Operations, seeks to create unified cross-functional teams organised around specific objectives (ideally specific products that generate value for your organisation).

Compare this to the Wikipedia definition of “combined arms” doctrine (bold emphasis mine):

“Combined arms is an approach to warfare which seeks to integrate different combat arms of a military to achieve mutually complementary effects (for example, using infantry and armor in an urban environment, where one supports the other, or both support each other). Combined arms doctrine contrasts with segregated arms where each military unit is composed of only one type of soldier or weapon system. Segregated arms is the traditional method of unit/force organisation, employed to provide maximum unit cohesion and concentration of force in a given weapon or unit type.”

Let’s paraphrase this for DevOps…

“[DevOps] is an approach to [IT service delivery] which seeks to integrate different [technical silos] of an [IT Department] to achieve mutually complementary effects (for example, using [Development] and [Operations] in an [e-commerce] environment, where one supports the other, or both support each other). [DevOps] doctrine contrasts with [Traditional IT] where each [IT Team] is composed of only one type of [technical specialist] or [technology] system. [Traditional IT] is the traditional method of [team/department] organisation, employed to provide maximum [team] cohesion and concentration of [technical expertise] in a given [technology] or [team] type.”

That seems to be a pretty good definition of DevOps doctrine, to me!

The only true defence against “Shadow IT” is to offer a level of service that meets internal customers’ needs for speed, flexibility and time-to-market. If they can get it “in-house” then the impetus to build a “Shadow IT” organisation is reduced.

The best way to deliver this level of service is, in our view, to adopt the lessons of Blitzkrieg and “combined arms” doctrine as embodied within DevOps by “integrating different teams… to achieve mutually complementary effects” and leveraging new technologies (like Cloud, APM and continuous delivery) to ensure agility AND stability.

image source: - CC  - faungg's via Flickr - http://www.flickr.com/photos/44534236@N00/2461920698

The Economics of DevOps Part II – Ronald Coase and Transaction costs

In an earlier blog post we talked about the classical economic theories of Adam Smith and how they apply to DevOps (particularly the role of “generalists” and “specialists” within DevOps teams).

In this post we’re going to talk about the work of Ronald Coase on transaction costs and “The Nature of the Firm”, and why this is relevant to how and why the DevOps approach works.

“In order to carry out a market transaction it is necessary to discover who it is that one wishes to deal with, to conduct negotiations leading up to a bargain, to draw up the contract, to undertake the inspection needed to make sure that the terms of the contract are being observed, and so on.” – Ronald Coase

One of the key tenets of DevOps is to try and remove the “silos” that exist in “traditional” IT departments, but in order to remove these “silos” we need to understand how and why they exist in the first place.

In 1937 Coase published “The Nature of the Firm”, which examined some of these issues in the context of why certain activities take place within the boundaries of an organisation as opposed to being provided by external suppliers (and also why certain activities need to be undertaken by governments in order for the market to work effectively).

Where do we “draw the line” between what’s economically viable to do “in-house” as opposed to “subcontracting out”? In IT terms this can be seen as analogous to the “buy vs build” decision when it comes to COTS software.

Coase postulated three “transaction costs” that influenced this decision, and where you drew the “boundary of the firm” depended on your desire to minimise these supply costs for that particular service or product (and hence to maximise YOUR profits for the output of the value-added activity built on top of these base costs).

  • Search and information costs are costs such as those incurred in determining that the required good is available on the market, which has the lowest price, etc.
  • Bargaining costs are the costs required to come to an acceptable agreement with the other party to the transaction, drawing up an appropriate contract and so on.
  • Policing and enforcement costs are the costs of making sure the other party sticks to the terms of the contract, and taking appropriate action (often through the legal system) if this turns out not to be the case.

Source – Wikipedia

If we examine many of the current trends in both business and IT it’s clear that these costs are (consciously or unconsciously) the target for many of the new products and services we see in the market.

  • Google’s entire business model is based on LOWERING search and information costs (even more so with services like Google Shopper, which enable you to very quickly find information on products and prices). Lowering the “information asymmetry” between buyer and seller generally optimises the price.
  • eBay and Etsy also directly attack these costs – they collect items into a market (which lowers search and information costs), set very clear policies (“contracts”) regarding prices etc., offer policing and enforcement mechanisms in the case of disputes, and generally guarantee payment via the use of escrow systems.
  • Virtual workforce websites like elance.com or 99designs.com do the same for services – they create an efficient market where you can find the skills you require within a framework that lowers your risk and minimises these transaction costs.

So how does this apply to silos within an IT organisation and a DevOps transformation?

Well, if we shrink our definition of the “firm” down to the IT department we can see how silos can form.

For example, if you want some DBA expertise it lowers search costs if you can just say “go see the DBA team over there” (silo). You can define clear processes to request and interact with that silo (lowering bargaining costs), and all the DBAs report to a common line manager, so when it comes to “policing and enforcement” you have a clear route to complain. It also makes life easier for the DBAs in that they lower their internal “search costs” when it comes to sharing knowledge, information, workload etc.

So what is DevOps trying to change?

Well, DevOps postulates that creating silos based on expertise is ultimately inefficient when seen from the perspective of building a customer product. It might be efficient for the DBAs, but these silos don’t scale very well.

In order to build a “product” I need DBAs, systems administrators, business analysts, developers (of various flavours), testers etc. If all of these exist in silos, surrounded by processes, my transaction costs increase to the point where it makes sense to redraw the “boundary of the firm” from the perspective of the product team and bring those skills “in-house”.

This is the essence of DevOps – we re-draw the boundaries to create cross-functional teams to reduce transaction costs within that (product) team.

Search and Information costs trend to zero because it’s not some remote person in another silo… the expertise I need is across the desk from me, interacting with me every day via our Scrum or Kanban processes.

Bargaining costs become personal, not abstract – as a team we are committing to our shared goals without the need for reams of formal processes.

Policing and enforcement likewise become social and transparent – if someone isn’t contributing or delivering on their promises then this rapidly becomes apparent to the rest of the DevOps team, and social pressures start to come to bear to align their behaviour with the group’s agreed goals.

Similarly, from the “external” perspective of our “DevOps firm”, if we have a problem with Product X it’s clear where the responsibility lies – with the DevOps team that owns that product – and not diffused across multiple silos (Dev, Ops, etc.) who can “point the finger” at the other team as the “root cause”. For many organisations, the time wasted in defining and executing “problem management” processes to solve this “enforcement cost” problem – to find out who is responsible for fixing a problem – is immense.
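As a toy sketch of the Coasean trade-off discussed above (every number below is invented purely for illustration, not taken from Coase or from any real organisation): a skill stays “in the market” (another silo, an external supplier) only while its price plus the transaction costs of dealing with it is lower than the cost of having it inside the team.

```python
# Toy Coasean "boundary of the firm" decision. All numbers are invented
# for illustration: compare the internal cost of a capability with the
# market price plus the transaction costs of buying it across a boundary.

def total_market_cost(price, search, bargaining, policing):
    """Price of the service plus Coase's three transaction costs."""
    return price + search + bargaining + policing

def in_house_or_market(internal_cost, price, search, bargaining, policing):
    market = total_market_cost(price, search, bargaining, policing)
    return "in-house" if internal_cost < market else "market"

# Cheap to buy, but the silo walls (processes, hand-offs, disputes)
# add heavy transaction costs -> bring the skill into the team:
print(in_house_or_market(internal_cost=120, price=100,
                         search=15, bargaining=10, policing=10))  # in-house

# Same price with near-zero transaction costs -> leave it outside:
print(in_house_or_market(internal_cost=120, price=100,
                         search=2, bargaining=1, policing=1))     # market
```

The DevOps cross-functional team is, in this toy model, what happens when the transaction costs of the silo boundaries grow large enough to flip the decision to “in-house”.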

Viewing DevOps from an economics perspective starts to give us a framework for understanding how the empirical benefits seen in organisations that have adopted DevOps principles arise, which ultimately gives us clearer insight into how and why to implement DevOps.

image source: - CC  - classicsintroian via Flickr -http://farm6.staticflickr.com/5003/5305503329_b3663df5bc_d.jpg

DevOps, Adam Smith and the legend of the Generalist

Is DevOps at risk of being misinterpreted as “everyone should be able to do everything”? 

That somehow DevOps means you should be able to switch seamlessly between Development and Operations tasks, writing some Java application code in the morning, re-configuring the SAN in the afternoon and finishing off with a bit of light firewall maintenance in the evening ?

Take this great quote:

I’ll tell you EXACTLY what DevOps means.

DevOps means giving a shit about your job enough to not pass the buck. DevOps means giving a s**t about your job enough to want to learn all the parts and not just your little world.

Developers need to understand infrastructure. Operations people need to understand code. People need to f***ing work with each other and not just occupy space next to each other. –  John E. Vincent (@Lusis)

Whilst we wholeheartedly agree with the sentiments in the quote, I think there is a risk that the phrase “need to understand” Development or Operations will be misinterpreted by some as “we need generalist staff who can do it all”.

Sadly, this would be a tragic mistake for several reasons, but first and foremost because division of labour exists for a reason.

There are literally hundreds of years of economic theory, dating back to Adam Smith’s “Wealth of Nations” in 1776 and beyond, showing how and why division of labour increases overall productivity.

Whether it’s making pins, cars or software applications, correctly co-ordinated activity between a sequence of specialists will massively increase overall productivity. In the context of making pins, Smith quoted an increase in productivity of 24,000% (in comparison to one person performing all of the tasks required to make a pin).
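As a quick sanity check of that 24,000% figure, here is a back-of-the-envelope calculation using the numbers usually quoted from Smith’s pin-factory anecdote (ten workers producing about 48,000 pins a day between them, versus perhaps 20 pins a day for one person working alone; these numbers are from the anecdote as commonly cited, not from this post):

```python
# Back-of-the-envelope check of Smith's pin-factory productivity figure.
# Assumed inputs (commonly quoted from the Wealth of Nations anecdote):
workers = 10
pins_per_day_factory = 48_000   # total daily output with division of labour
pins_per_day_alone = 20         # generous estimate for one person doing it all

per_worker_factory = pins_per_day_factory / workers    # 4,800 pins each
improvement = per_worker_factory / pins_per_day_alone  # 240x per worker

print(f"{improvement:.0f}x per worker = {improvement * 100:.0f}%")
# -> 240x per worker = 24000%
```

So the oft-quoted 24,000% is simply the 240-fold per-worker improvement expressed as a percentage.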

Note that there is a critical caveat in the paragraph above – “correctly coordinated”.

Poorly coordinated activities between specialists impede productivity and create bottlenecks, mistakes and re-work. Reinertsen’s ideas of “flow” in product development, Goldratt’s Theory of Constraints (TOC) and the whole Lean movement are about ensuring that activities in the “value chain” are correctly aligned and flow smoothly from one stage (or specialism) to another, so that maximal productivity and value are achieved.

So what does this mean in terms of DevOps and the quote above?

Coordination requires effective communication, so I think that the quote above could be re-stated as “need to know enough about infrastructure/software development in order to be able to communicate effectively with other specialists”.

For example, developers need to know enough about infrastructure and web operations to understand why a scalable user session state management solution is critical when running a distributed system, because “just running it all on one box like we do in my dev environment” isn’t a good solution. Conversely, Operations need to know enough about the language stack they support to be able to read a stack trace as part of their troubleshooting workflows, to help identify the root cause of a problem.
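To make the session-state example concrete, here’s a minimal pure-Python sketch. The `AppServer` class and the plain dict standing in for a shared session store are illustrative inventions (in production the shared store would be something like Redis or memcached); no particular framework is implied:

```python
# Why externalised session state matters for horizontal scaling:
# two "app servers" behind a round-robin load balancer.

class AppServer:
    def __init__(self, name, shared_store=None):
        self.name = name
        # Fall back to a private dict = session state kept "on one box"
        self.sessions = shared_store if shared_store is not None else {}

    def login(self, user):
        self.sessions[user] = {"logged_in": True}

    def is_logged_in(self, user):
        return self.sessions.get(user, {}).get("logged_in", False)

# Scenario 1: in-memory sessions. The login lands on server a, but the
# load balancer sends the next request to server b, which knows nothing.
a, b = AppServer("a"), AppServer("b")
a.login("sally")
print(b.is_logged_in("sally"))   # False - the session is trapped on a

# Scenario 2: a shared session store (stand-in for Redis etc.) means
# any server can handle any request - the basis of horizontal scaling.
store = {}
a2, b2 = AppServer("a2", store), AppServer("b2", store)
a2.login("sally")
print(b2.is_logged_in("sally"))  # True
```

The only difference between the two scenarios is where the session lives, which is exactly the “loosely-coupled session state management” conversation the developer and the ops engineer need to be able to have with each other.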

So, whilst hopefully agreeing that DevOps means neither the end of specialists performing specialised skills nor the creation of hybrid generalists, it’s worth mentioning that Adam Smith did have some warnings about the downsides of over-specialisation that have some resonance for DevOps:

“The man whose whole life is spent in performing a few simple operations, of which the effects are perhaps always the same, or very nearly the same, has no occasion to exert his understanding or to exercise his invention in finding out expedients for removing difficulties which never occur. He naturally loses, therefore, the habit of such exertion, and generally becomes as stupid and ignorant as it is possible for a human creature to become. The torpor of his mind renders him not only incapable of relishing or bearing a part in any rational conversation, but of conceiving any generous, noble, or tender sentiment, and consequently of forming any just judgment concerning many even of the ordinary duties of private life” – Adam Smith (Wealth of Nations)

Alexis de Tocqueville agreed with Smith:

“Nothing tends to materialize man, and to deprive his work of the faintest trace of mind, more than extreme division of labour.” – Alexis de Tocqueville

Smith warned that workers “become ignorant and insular as their working lives are confined to a single repetitive task” which is the silo-oriented mental state that @Lusis rails against in the opening quote when he says DevOps “means giving a shit about your job enough to not pass the buck”.

DevOps “First Way” emphasises “Systems Thinking” – looking at the larger system as a whole not just your narrow silo view – so the risks raised by Smith and De Tocqueville are valid concerns for DevOps.

So there is a tension here… we need the productivity of specialisation but we don’t want the insular worldview of “extreme division of labour”.

Scrum in Agile software development addresses similar concerns by emphasising cross-functional teams, “Individuals and interactions over processes and tools” and generally focussing on high-bandwidth communication as a way to tear down silos.

DevOps can (and should) emphasise the same solutions to resolve this tension.

hacker

DevOps, Death Marches and the Hero Hacker Myth

I’ve been thinking a lot about DevOps culture over the last few weeks, which is probably why my ears pricked up when I overheard someone telling a story about an IT project they’d worked on where they did “10 weeks straight, 7 days a week” in order to get the project in on time.

These types of stories are a key part of any organisation’s culture – they form part of the mythology that defines the cultural norms and “the way things are done” within the organisation, and can tell you a lot about its values.

In this case the speaker, judging by his tone of voice, was very proud of the effort they’d put in and the fact they’d managed to “pull it out of the bag” with such a super-human effort.

After all, the project was a success, right?

Death Marches

My perception, from the outside, was a bit different. To me it sounded like a “death march project”, which Ed Yourdon defines as:

“A death march project is one for which an unbiased, objective risk assessment (which includes an assessment of technical risks, personnel risks, legal risks, political risks, etc.) determines that the likelihood of failure is ~ 50 percent.” – Edward Yourdon

Wikipedia goes on to say:

Often, the death march will involve desperate attempts to right the course of the project by asking team members to work especially grueling hours (14-hour days, 7-day weeks, etc.) or by attempting to “throw (enough) bodies at the problem”, often causing burnout.
– Wikipedia, “Death march (project management)”

Sound familiar?

Hero Hacker Myths

Closely related to the “Death March” is the “Hero Hacker Myth”, neatly described here by Rob Mee:

“The myth of the hero hacker is one of the most pervasive pathologies to be found in Silicon Valley start-ups: the idea that a lone programmer, fuelled by pizza and caffeine, swaddled in headphones, works all hours of the night to build a complex system, all by himself. Time out. Software development, it turns out, is a team sport. All start-ups grow, if they experience any meaningful success. What works for a lone programmer will not work in a company of 10. And what’s worse, encouraging the hero mentality leads to corrosive dysfunction in software teams. Invariably the developers who do a yeoman’s 9-to-5, week after week, cranking out solid features that the business is built on, lose out to the grasping egomaniacs who stay up all night (usually just one night) looking to garner lavish praise. Rather than reward the hero, it’s better to cultivate a true esprit de corps.” – Rob Mee, Pivotal Labs

The emphasis in the quote above is mine, because that sentence is central to the message of this post – if your organisational culture mythologises “Death March”-type projects or the “Hero Hacker” then you have a dysfunctional culture that needs to be fixed.

Once the Death March mythology is ingrained into your organisational culture EVERY project will end up being a Death March. Why? Because if they’ve “succeeded” in the past (thanks to super-human effort on the part of the project team) with a project that was only allocated 50% of the time, resources or budget it needed to have a realistic chance of success, then that 50% becomes the new norm.

What manager is going to scope the next project, of equivalent size, with TWICE the time, resources or budget of their last Death March? It would probably be career suicide.

Of course, this is only possible because the hidden costs of Death March projects like staff turnover, sickness, burn-out, divorce, poor quality / huge technical debt, etc. don’t appear on the project’s balance sheet. The narrow ROI view of the project budget doesn’t include these intangibles which are normally shouldered by the wider IT Department budget, or borne solely by the individuals whose health, marriages and families are harmed in the process.

Similarly the Hero Hacker Myth is an enabler for the Death March by mythologizing the long hours and hero mentality that any Death March project requires to have any chance of success.

In addition, as the emphasised section of Rob Mee’s quote makes clear, the Hero Hacker Myth is corrosive to the idea of teamwork and shared responsibility that methodologies like Agile and DevOps seek to engender. Almost by definition the “Hero Hacker” is anti-social in his work practices and is likely to hoard his knowledge in a way that hurts the team’s success.

DevOps

So what does this have to do with DevOps?

Well, if we view the Death March and the Hero Hacker through the lens of the “Three ways of DevOps” we can see that DevOps should stop Death Marches after a few miles and show the Hero Hacker the door.

The “First Way” emphasises “Systems Thinking” – seeing the big picture about how value is delivered to the organisation and the customer.

Systems thinking should encompass the intangibles discussed above and see that you can’t deliver sustainable long-term value on a foundation of projects that have a 50% chance of failure and that require super-human effort to have any chance to succeed.

It’s like tossing a coin and expecting it to come down heads time and time again – eventually it will come down tails and you’ll have a massive project failure that will probably destroy any value generated by your earlier “successful” Death March projects.

The “Second Way” says “Amplify Feedback Loops” – and the key way to do this is to shorten the feedback cycle by iterating faster. This is the essence of the “fail fast, fail often” mantra which is the antithesis of the long, destructive Death March. With short feedback loops the message that the goal is likely to be unachievable should be received faster.

Of course, just because you get rapid feedback that your project is turning into a Death March (or that your Hero Hacker is on his 3rd all-nighter and is running out of Jolt Cola) doesn’t necessarily mean anyone will heed the message and take corrective action.

That’s where the Third Way of DevOps comes in – foster a “Culture of Continual Experimentation and Learning”. DevOps should foster a culture of learning from your mistakes, whether it’s your previous Death March or the fast feedback on your current project that you receive from amplifying your feedback loops.

“Those who do not learn from history are doomed to repeat it” – and the essence of the Third Way is to create the “virtuous spiral” by taking the lessons from the previous iteration and applying them to the next cycle.

Similarly, a culture of learning is ipso facto a culture of sharing, which means the knowledge hoarding of the Hero Hacker should be anathema to a DevOps culture.

In summary, listen to the stories your organisation mythologises and see if those stories are taking you down the path of DevOps learning, or towards yet another destructive Death March waiting for a Hero Hacker to save it…

DevOpsFragileBorg

DevOps, Antifragility and the Borg Collective

Whilst researching how to reconcile ITIL with DevOps I came across an interesting blog post from the IT Skeptic entitled “Kamu: a unified theory of IT management – reconciling DevOps and ITSM/ITIL”. This led me to Jez Humble’s post “On Antifragility in Systems and Organizational Architecture”, referencing Nassim Nicholas Taleb’s book “Antifragile”, and generally led to a lot of intense cogitation on fragility versus robustness versus antifragility.

The IT Skeptic (Rob England) expands on his thoughts in this presentation, which introduces the diagram below.

Kamu: reconciling DevOps and ITSM/ITIL

However, I struggled to mentally conceptualise the differences between the 3 points of the triangle until I came up with the following analogies (and please bear with me while I explain my thoughts behind them!):

  • Fragile = Humpty Dumpty
  • Robust = A medieval castle
  • Anti-fragile = The Borg collective

Fragile

“Fragile” systems are those (often legacy) systems that you really, really don’t want to touch if you don’t have to! Like Humpty Dumpty, one good push and all the King’s horses and all the King’s Men and a 24 hour round-the-clock marathon from the Ops team won’t get that pile of crap system up and running again.

It’s a snowflake – not documented properly, there are dependencies you can’t trace, the hardware’s out of warranty, the platform is 3 versions behind and can’t be upgraded because of some customisation that no-one understands, the code is spaghetti and the guy that wrote it retired last year.

We all know what fragile looks like!

Robust

“Robust” systems are those that have been through the ITIL life-cycle and, for most of us, they are probably our pride & joy. Monitored, instrumented and well documented, with their own run book and wiki pages, they are highly available with redundancy at every level; we “know” they can withstand the slings and arrows of outrageous fortune.

Just like a medieval castle they are impregnable. The very essence of robust!

And then along comes the “black swan” event… something we haven’t anticipated, a failure mode we couldn’t have foreseen, a cascade of errors that we did not plan for.

Just as our predecessor, the medieval castle owner, didn’t foresee the invention of gunpowder and cannons that reduced his impregnable castle to rubble. Just as the builders of the Maginot Line didn’t anticipate the invention of Blitzkrieg and mechanised warfare, nor the defenders of the Ruhr Valley dams a bomb that bounces.

This is the key message of Taleb’s book and Jez’s post – that the “robustness” mindset often leads to a resistance to change. As Jez explains in the context of organisations:

“The problem with robust organizations is that they resist change. They aren’t quickly killed by changes to their environment, but they don’t adapt to them either – they die slowly.” – Jez Humble

A castle is robust… but it’s fixed, immobile, and its very robustness to “normal” assaults reduces the incentives to change and adapt.

Anti-fragile

Contrast these to the “anti-fragile” system (or organisation) typified by the Borg Collective. The Borg seek out new life and new civilisations to assimilate into the Collective in order to improve.

With each change and adaptation the system (the Collective) becomes more resilient – it improves as the result of the external stress (the essence of an adaptive, evolutionary system).

Anti-fragile organisations seek out and embrace change – they are inherently “outward-focused” and seek to be continually learning, adapting and assimilating (not hiding behind the walls of their castle, content in their robust impregnability).

Likewise, DevOps seeks to be “anti-fragile” by embracing change (and disorder a la Chaos Monkey) whilst incorporating feedback mechanisms (the “3rd Way” of DevOps) to ensure that learning is correctly assimilated.

The DevOps mindset encourages continual learning through experimentation and collaboration, seeking to improve the current system, as opposed to the codified, fixed “one way of doing things” implied by a formulaic, rigid ITIL worldview.

In this way DevOps encourages what Schumpeter called “creative destruction” – clearing out the old to make way for the new (and hopefully improved) system.

Summary

I’ve summarised these 3 points of the triangle in the following table:

|                     | Fragile            | Robust                  | Anti-Fragile                    |
|---------------------|--------------------|-------------------------|---------------------------------|
| Icon                | Humpty Dumpty      | Medieval Castle         | The Borg                        |
| Methodology         | “Spaghetti”        | ITIL                    | DevOps                          |
| Attitude to change  | Fear Change        | Resist Change           | Embrace Change                  |
| Response to change  | Break              | Repel                   | Adapt                           |
| Rate of Change      | Ideally never!     | Slow                    | Rapid                           |
| Change initiated by | Needs CEO approval | Change Management Board | User-initiated (via automation) |
| Focuses on          | Survival           | Process*                | Business Value                  |

* Yes, I know that ITIL v3 in particular *IS* in theory very focused on business value and benefits realisation, BUT in my experience the end result of an “ITIL implementation” is often the triumph of process over outcome.

If anyone has any ideas for more rows to add to the table please let us know on the comments!

-TheOpsMgr

Are you really doing DevOps? 8 prerequisites you must consider

A recent “2013 State of DevOps” report from PuppetLabs indicated that “52% of [over 4,000] respondents said that they’ve been doing DevOps for > 12 months”. Our own, less formal, survey found very much the same – 60% of respondents indicated that they were currently doing DevOps. Take part in our survey by clicking here.

PuppetLabs Survey Results

The question is: are you really doing DevOps?

As we pointed out in our earlier DevOpsDays wrap-up blog there are at least 7 different definitions of what DevOps is, judging by the presentations given on the day:

DevOps … is a movement, a philosophy, a way of thinking.
DevOps … is a person who can perform both Dev and Ops roles.
DevOps … means cross skilling people.
DevOps … is continuous delivery.
DevOps … is a team of developers and operation staff.
DevOps … is a culture movement.
DevOps … is monitoring.

So… what’s the “minimum viable product” (MVP) for DevOps? What core things should you be doing before you can truly say you are “doing DevOps”?

In the whitepaper “The Top 11 Things You Need To Know About DevOps” Gene Kim emphasizes the “3 Ways”:

(1) “Emphasize the performance of the entire system” – a holistic viewpoint from requirements all the way through to Operations.
(2) “Creating feedback loops” – to ensure that corrections can continually be made. A TQM philosophy, basically.
(3) “Creating a culture that fosters continual experimentation and understanding that repetition and practice are the pre-requisites to mastery”.

These are excellent guidelines at a high level, but we’d like to see a more operational definition. So we’ve made up our own list!

As a starter – we propose that;

  1. You must have identified executive sponsors / stake holders who you are actively working with to promote the DevOps approach.
  2. You must have developed a clear understanding of your organisation’s “value chain” and how value is created (or destroyed) along that chain.
  3. You must have organizationally re-structured your development and operations teams to create an integrated team – otherwise you’re still in Silos.
  4. You must have changed your team incentives (e.g. bonus incentives) to reinforce that re-alignment – without shared Goals you’re still in Silos.
  5. You must be seeking repeatable, standardized processes for all key activities along the value chain (the “pre-requisite to mastery”).
  6. You must be leveraging automation where possible – including continuous integration, automated deployments and “infrastructure as code”.
  7. You must be adopting robust processes to measure key metrics – PuppetLab’s report focuses on improvement in 4 key metrics – Change Frequency, Change Lead Time, Change Failure Rate and MTTR. We suggest Availability, Performance and MTBF should be in there too.
  8. You must have identified well-defined feedback mechanisms to create continuous improvement.
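For prerequisite 7, the report’s metrics can be derived from ordinary deployment records. Here is a minimal Python sketch, under the assumption that each deployment is logged with a timestamp, a failure flag, and (for failures) a restoration time – the field names are illustrative, not from any particular tool:

```python
from datetime import datetime, timedelta

# Hypothetical deployment log for a one-week period; field names are assumptions.
deployments = [
    {"at": datetime(2013, 9, 2, 10), "failed": False, "restored_at": None},
    {"at": datetime(2013, 9, 3, 15), "failed": True,
     "restored_at": datetime(2013, 9, 3, 17)},       # 2h outage
    {"at": datetime(2013, 9, 5, 9),  "failed": False, "restored_at": None},
    {"at": datetime(2013, 9, 6, 11), "failed": True,
     "restored_at": datetime(2013, 9, 6, 11, 30)},   # 30min outage
]

period_days = 7
failures = [d for d in deployments if d["failed"]]

# Change Frequency: deployments per day over the measured period
change_frequency = len(deployments) / period_days

# Change Failure Rate: fraction of deployments that caused a failure
change_failure_rate = len(failures) / len(deployments)

# MTTR: mean time from failed deployment to restoration of service
mttr = sum((d["restored_at"] - d["at"] for d in failures),
           timedelta()) / len(failures)

print(change_frequency)     # ~0.57 deploys/day
print(change_failure_rate)  # 0.5
print(mttr)                 # 1:15:00
```

Change Lead Time would need one more timestamp per change (when the work was committed or requested), so it is omitted here; the point is simply that these metrics fall out of data most teams already have.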

As mentioned above, this is just a starter list – feel free to agree/disagree in the comments and suggest additions or alterations.

We’ll be writing more about “DevOps Incentives” in an upcoming post, and we’ll revisit the “Are you doing DevOps?” topic once we’ve consolidated your feedback.

-TheOpsMgr