A Tale of DevOps and Derring-Do

Image credit: http://www.flickr.com/photos/chodhound/7178026433

I recently read Phil Thompson’s great post on the value of storytelling, ‘Story telling in a business context’. Reading it called to mind my favourite DevOps leadership story, but on a cautionary note Phil also talks about the dilemma of creating a tale to suit one’s own point. He recommends a subtle caveat such as “It’s like that story where…”. As it is so hard to authenticate tales from the past, I’ll take that recommendation: it’s like a story of derring-do in the heyday of the industrial revolution…

The Avon Gorge in Bristol is a wide, one-and-a-half-mile-long valley cutting through craggy limestone on the edge of the city; in places it is almost a hundred metres deep. In the 1800s the engineer Isambard Kingdom Brunel was commissioned to build a suspension bridge across it, at a chosen location no less than seventy-five metres high. Already the commission doubted Brunel’s calculations, forcing him to build two massive piers to narrow the span to two hundred and fourteen metres. Huge towers were constructed on either side of the river, then ropes were strung across the gorge, followed by cables. After building sufficient strength, the plan was simple: a basket could be hung from an iron bar, allowing workers to cross with more cables and equipment. This approach would save long, expensive boat and land trips from side to side.

Everything went to plan until someone was needed to pilot the basket across the windy gorge. This was a time when trades had little protection, injury rates were high, and workers were almost considered expendable. So when the workforce was concerned about the safety of the basket, who took the job? It was Brunel himself. He made the first crossing, showing utmost faith in the design and in the tradespeople who built it. In so doing Brunel set an example, demonstrated commitment, and earned the trust of the people he worked with.

I like this story because it demonstrates the bold steps leaders (by which I mean anyone wishing to inspire or encourage, regardless of role or title) may take to gain support for new ideas and ways of working. This is particularly true of DevOps, Agile, Lean and similar learning methods – at first they might look different, maybe even as precarious as Brunel’s basket, but there are significant benefits to be found. Teams united by a purpose and principles are likely to discover those benefits, and learn, more quickly than teams that are divided or fearful of change. Visible acts of leadership, and the stories that follow them, are as powerful now as they were hundreds of years ago.

This post was first published on the Kainos blog.

Anti-Pigeonholing, or Why Leaders Should Consider Capabilities, Not Job Titles

Often during a DevOps or agile transformation, we demonstrate the potential of fresh ways of working with a single, pioneering team. Generally these teams produce solid results, and there is a strong desire to scale the approach to more teams. This moment is something of a tipping point for the department: successful scaling leads to successful teams, which lead to successful projects. How people are picked for those teams is crucial. A team’s make-up is just as important as the practices it adopts; personalities, skills, experience and enthusiasm will all determine the drive, output and diligence of the team.

So how are teams created in your organisation? The easy way is to ‘do what we always do’: gather people finishing a project, or copy the template from the last team created, or maybe teams don’t change at all! Slightly more adventurous is to replicate the list of job titles in the pioneering team. These cookie-cutter approaches may be successful, but only if the department is already populated by talented, flexible individuals. However, a little consideration will often yield better results, not just for the business but for the individuals in the team.

…for the rest of the post see the Skelton Thatcher blog

The Things I Learnt about DevOps When My Car Was Engulfed by Flames


This is a true story, based on a talk from DevOpsDays London 2016:

It was a gorgeous sunny spring day, and my family and I were driving through my home town of Bristol, ready for another weekend adventure. We were cruising along when my wife said quietly: “I can smell smoke”. Now, I’m a good mechanic; I used to restore classic cars. The car we were driving was modern and recently serviced. I checked the instruments: everything normal. “It must be outside”, I declared confidently. Two minutes later tentacles of smoke were curling around my ankles and my shins were getting remarkably warm.

We can learn something relevant to DevOps, and many other disciplines, from this experience… read the rest on InfoQ

 

The Entirely Random DevOpsDays London Write-Up

There are some great posts out there on the superlative DevOpsDays London 2016 conference. Of particular note are the write-ups from Manuel Pais and the DevOpsGuys, and Helen Beal’s ‘DevOpsDays London: Making Happy’. These are well-structured, balanced pieces which neatly represent each talk along with insight from experienced authors. I hate to disappoint, but this post is not one of those. It’s a random collection of the things that piqued my interest at DevOpsDays London.


What is Legacy?
There was lots of talk of legacy, and the Bimodal debate rolled on. Much as in the sporadic agile wars, many of the detractors use outdated definitions as convenient ammunition; Gartner’s understanding of the challenges seems to have evolved away from its original strategy towards an exploit-and-explore approach for recent and legacy systems respectively. Defining legacy is challenging, with considerations beyond just the age of the software or system. There were a few definitions I really liked:
“Legacy is code you can’t iterate on as quickly as you need to” – Casey West
“Legacy is code you don’t have automated tests for.” – Michael Feathers
“Legacy is where your customer money lives” – Bridget Kromhout


The merits of really reading
I keep having the same conversation, like being in an endless loop; it goes like this:

Person: “Yeah, I know all about Conway’s law, it is super insightful”
Me: “Exactly, that example about teams building compilers, that’s a light bulb moment”
Person: “….the compilers?”
Me: (Thinks) “Have you really taken time to understand what you’re advising people to do, or are you just reciting tweets?”
Me: (Says) “Check out the article, there’s lots more in it”

It is like this with so many topics: REST, OODA, learning. That’s why I was so pleased to hear the ever-eloquent Gareth Rushgrove call out the value of reading academic papers in his talk. These days we are so prone to snacking on sound bites that we seldom get the satisfaction of a full reading meal, yet our brains cry out for this kind of nourishment. I believe papers, and source material in general, are the best way to gain a firm understanding of a topic, particularly because they build a picture of what motivated the author, not just what they did. Much as with software patterns, understanding the intent and motivation is key to successful application.


Burnout
A couple of talks touched on burnout, as did a well-attended and lively open space. What surprised me was how many people had direct experience of it; it remains an issue in the industry despite raised awareness and talk of sustainable, humane ways of working. Oliver Wood talked about his experience of working so many hours that he slipped a disc. Keen that others might avoid the same fate, he created GoodCoderBadPosture. During his ignite he reminded us “you are ephemeral, you are not highly available”. My talk (Things I Learnt About DevOps When My Car Caught Fire) used the analogy of looking through the windscreen of my burnt-out car: all the instruments you normally rely upon to sense the world are warped and confused, and your view is fogged and distorted. If only we could see mental strain as readily as we can bad posture.

I noticed much of the burnout open space was concerned with what management and organisations can do to prevent burnout, and to recognise its occurrence. This is a reasonable standpoint given that our behaviours are generally shaped by the systems we work within. However, in the spirit of DevOps we should also note that it isn’t a problem for affected individuals and their managers to tackle alone. While organisations take action, we might also ask ourselves:

1. “How would I know if one of my colleagues was suffering from burnout?”
2. “How would I help someone I thought was on the verge of burnout?”
3. “How do my own behaviours affect the likelihood of burnout in my colleagues?”

There was a nice note in Jeromy Carriere’s talk, and a potential answer to question 3: “Work hard to make every alert exciting”. This has implications for burnout: exciting alerts imply only being disturbed for hard problems, not simple switch-flicking exercises.

Change
There was plenty of talk of change, particularly the danger of not evolving and experimenting. Change strategies were discussed, including the value of heading into conversations well armed with data. It was clear from their talk that Microsoft are changing in places, for instance setting up open team rooms, or neighbourhoods. I rate this approach; it appears to balance team privacy, open communication and the distractions of a fully open plan.

“It is not necessary to change, survival is not mandatory” – Deming (Who wasn’t present!)
“The riskiest thing is not to change” – Joanne Molesky

The change theme included the importance of investing time in the most valuable activities, and how to discover them. It highlighted that many of those valuable activities are operational features, not just shiny new toys to please users. If you’re in the mood for self-reflection, you might give some thought to this quote from Bridget Kromhout:
“When evaluating yourself don’t forget to look at the value you are adding”

A conference, with a culture
The thing I love about DevOpsDays, and the way it’s organised, is that it still feels like a community event. Sure, it’s scaled, but the level of friendliness, inclusion and support is almost as it was the first time I spoke, in Goteborg in 2011. The story of this scaling, and the principles behind it, was told by Kris Buytaert; it’s surprising how many of the early adopters are still active. The conference manages to short-circuit a lot of anchoring and groupthink by giving almost 50% of the time to open spaces. Taking responsibility for the schedule out of the committee’s hands and into the delegates’ ensures that topics are relevant to attendees, right then and there. The willingness of speakers to stay and participate in these sessions is key to their success and makes for some great learning. Not bad for a movement that still can’t agree what it’s about.

Inflicting Trust


I often introduce teams to the notion of ‘inflicting help’: those well-meaning activities which deprive the recipient of an opportunity to learn, or to practise. For example, if I always helpfully facilitate retrospectives for a team, they will miss the experience of running their own. If someone helpfully handles all the build work for the team, ‘because they are best at it’, the team will not learn how to deploy for themselves. This notion is of particular importance for agile and DevOps teams, as they cross-skill and share responsibility.

I also believe it is possible to ‘inflict trust’, that is, to give so much trust that it is detrimental to the recipient. This idea may not be popular: how can it be plausible when the agile and DevOps communities talk so much about building trust?

Consider the following example. It is day one for a new employee in the fictional organisation Great Western Widgets. At Great Western Widgets we deploy a few times each day, and we chase the Continuous Delivery nirvana of making deployments boring. The new employee makes some code changes and is then asked to deploy to production, following the steps in the build book. “You’ll be fine, I trust you” says the senior engineer. Except things don’t go fine; the load balancer doesn’t route traffic away from the node that’s being deployed to, alerts don’t fire, and eventually end users report multiple errors.
The new start suffers a massive dent in their confidence. They are now far less trusting of their mentors, and apprehensive about battling a reputation for being ‘the one who broke production on their first day’.

In this instance the senior engineer was too trusting of the new start: they inflicted trust. In other, more severe situations, you may recognise this lack of support as a form of negligence. As leaders and mentors, that’s something we should be wary of. Beyond good intentions, it is often convenient (in time or in politics) to use the mantra of ‘high trust’ to expect others to do things, perhaps even things we wouldn’t risk doing ourselves. Being ready to support, and if necessary rescue, those we are placing trust in is critical to creating an environment of safety, in which people are willing to challenge themselves. It is this feeling of safety that makes teams comfortable taking calculated risks, going fast and innovating. These traits are often seen to lead to high individual and team performance, not to mention a more pleasant work environment.

When encouraging learning and granting more trust, it’s often useful to consider the various likely outcomes, and if, when and how to step in. Doing the thing for the person is not an option; being ready, and available, to avert disaster is mandatory.

One example is helping a child learn to carry a tray of drinks to the table. At some point they are just going to have to get on and try it, unless their parents want to follow them to every dining hall, café and pub they visit in their lifetime. So in that moment some steps are taken, almost without thinking: don’t use the best china, don’t overload the tray (you’ve just reduced the consequences of failure), and be ready with a tissue and some wise words if anything does spill (to encourage reflection and restore confidence). The converse, ‘inflicting trust’, would be to load up the tray with boiling-hot drinks and fragile, expensive crockery, put it into the hands of the wide-eyed child and say “Off you go, I trust you”.

The key question to ask yourself is: when you suggest someone tries something for the first time, which style are you using – supporting, or inflicting trust?

An introduction to ChatOps

I first heard of something resembling ChatOps about five years ago, when I had the good fortune to share a beer with Scott Chacon, one of GitHub’s co-founders. While I ranted about Deming, he talked enthusiastically about their fledgling organization. Surprisingly, one of the things he talked about with most passion was Hubot, a sort of robot butler who hung around in GitHub chat rooms serving useful data and whimsical content with equal aplomb. It seemed a great concept, simple and powerful: it improved operability whilst increasing knowledge sharing and encouraging collaboration.

I often wonder why ChatOps doesn’t garner much attention, especially as it appears to have played an important part in GitHub’s success. Perhaps that’s because everyone is gazing adoringly at Docker, or perhaps because ChatOps sits discreetly and indistinctly on the boundary between Culture and Tools.

By way of introduction, ChatOps combines three key technologies: Asynchronous Chat, Robot Assistants and Automation; let’s spend a moment looking into each. (The pictorially minded may prefer to spin through my spring DevOps Summit talk, where ChatOps was one of my ‘Collaboration Catalysts’.)

Asynchronous Chat
Asynchronous chat needs no introduction: it allows people to congregate in a virtual space to view and post messages and media. These apps are a good way for a distributed team to collaborate, but there are more subtle advantages – chats can be saved, enabling search and reference. Chats allow broadcast without the publisher having to manage their audience. You’ll understand the value of this if you have ever tried to chase down a gnarly production issue with your manager over one shoulder and a project manager over the other, both asking for updates. Oddly, the speed of work does not increase with the frequency of update requests; quite the opposite, in fact.
In this situation chat could be used to broadcast progress without having to manage a distribution list. When people monitor chat, the originator doesn’t get distracted, and may even get proactive support.

In the context of ChatOps it is chat apps which can be readily extended that really matter. That’s because many of the operations performed will be specific to an organisation and its systems, processes and integration requirements. HipChat, Flowdock, Slack and Campfire are popular options, and choices are often driven by the lingua franca of the development team.

Robot Assistants
Robot assistants lurk in chat rooms waiting to do the user’s bidding.  They may wait to be summoned by a specific command, or step in when they think it’s needed.  Assistants may grab things, like logs from production, or find out who is on call.  This reduces the interruption cost for a user, who is already thinking and collaborating in chat.
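
As a rough illustration of the idea, here is a minimal sketch of a Hubot-style responder that answers “who is on call?” by querying a rota service. The ONCALL_API_URL endpoint and the response shape are assumptions made up for the example, not a feature of any particular bot or organisation.

```typescript
// Minimal Hubot-style script (sketch). The on-call endpoint and the
// response payload are illustrative assumptions, not a real service.
module.exports = (robot: any) => {
  // Triggered by "@bot who is on call?"
  robot.respond(/who is on call\??/i, (res: any) => {
    const url = process.env.ONCALL_API_URL || "https://oncall.example.internal/current";
    robot.http(url).get()((err: any, _resp: any, body: string) => {
      if (err) {
        res.reply("Sorry, I couldn't reach the on-call rota.");
        return;
      }
      // Assumed payload: { "primary": "asha", "secondary": "tom" }
      const rota = JSON.parse(body);
      res.reply(`Primary: ${rota.primary}, secondary: ${rota.secondary}`);
    });
  });
};
```

Because the answer lands in the room, the whole team sees it – nobody has to break off from the conversation to go and look up the rota.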

A good bot also recognizes the value of play; amusing features are almost mandatory, from adding a mustache to a photo, through meme generation, to playing tunes. A useful side effect is that this encourages folks to hang out in chat rooms; humor keeps people engaged, and engaged people are generally more productive and ready to innovate. Notable bots include Lita, Hubot, Err and StackStorm. Iron Man’s J.A.R.V.I.S. is similar in concept, but somewhat less likely to inundate you with pictures of small, miserable-faced dogs.

By way of an aside, Terry Winograd, who later went on to mentor one Larry Page, pondered the utility of robot assistants as early as 1970. Perhaps he had a premonition of Clippy when he wrote:

“I should reiterate that good programming systems do not demand a super-intelligent program. We can get by with a moderately stupid assistant as long as he doesn’t make mistakes. The degree of AI needed is much less than that needed for a full-fledged natural language or vision system”

Automate, mate.
The third component is Automation. Hooking the bot up to automation, and to other deployment and operations tools, is where things get really interesting. If a bot can integrate with search engines and meme generators, why not link it to development environments, perhaps even production? Then, if people are discussing a thorny deployment problem, they can call in logs, graphs and pertinent data. The chat room becomes the war room: distributed, observable and documented for later learning.

Perhaps the pinnacle of ChatOps is allowing deployment orchestration through chat. As Jesse Newland puts it succinctly in his highly recommended ChatOps at GitHub talk, “Chat becomes the primary control surface for ops”. Not only is it convenient, but a chat client is more portable. Chat can also serve as a layer of abstraction over the underlying tech, enabling it to change and evolve independently of the commands driving it. This abstraction opens an opportunity for training, enabling production commands to be executed against a sandbox. Of course, there is some risk to be considered, and it is possible to restrict commands to particular people or rooms.
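
To sketch what that restriction might look like, here is a hedged, Hubot-style example of a deploy command that only runs when invoked by named users from a named room. The deploy-tool CLI, the room name and the user list are assumptions for illustration, not GitHub’s actual implementation.

```typescript
// Sketch of a chat-driven deploy with a simple room/user restriction.
// The "deploy-tool" CLI, the room name and the user names are illustrative.
const { exec } = require("child_process");

const DEPLOY_ROOM = "ops";                  // only this room may trigger deploys
const DEPLOYERS = new Set(["asha", "tom"]); // only these users may trigger deploys

module.exports = (robot: any) => {
  // Triggered by "@bot deploy widgets to production"
  robot.respond(/deploy (\S+) to (\S+)/i, (res: any) => {
    const [, app, environment] = res.match;
    const room = res.message.room;
    const user = res.message.user.name;

    if (room !== DEPLOY_ROOM || !DEPLOYERS.has(user)) {
      res.reply("Deploys can only be run from the ops room, by the ops rota.");
      return;
    }

    res.send(`Deploying ${app} to ${environment}…`);
    exec(`deploy-tool ${app} ${environment}`, (err: any, stdout: string) => {
      // Report the outcome in the room either way, so the whole team sees it
      // and the chat history doubles as a deployment record.
      res.send(err ? `Deploy failed: ${err.message}` : `Deploy finished:\n${stdout}`);
    });
  });
};
```

Pointing the same command at a sandbox rather than production gives new starts a safe place to practise before they are trusted with the real thing.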

Still not impressed? In the same talk Jesse outlines a scenario where he makes a deployment, observes a problem, orchestrates load balancers, fixes and redeploys. Impressively, it all happens in chat, all while keeping his team updated, and leaving a record, with minimal extra effort.

More than just tools?
Looking beyond tools, ChatOps brings more to teams than mere efficiencies. ChatOps liberates institutionalized knowledge once locked in the heads of key, time-challenged individuals. Once in the open, ways of doing things can be inspected and built upon. This isn’t necessarily a threat to those people; it often frees them up to tackle more challenging problems.

ChatOps can be an excellent training tool. Like the gallery trainee doctors use to observe a surgeon at work, chats can be reviewed and replayed for education. Need to know how something is done? Check the archives and look at the commands used last time, or ask in chat – someone can demonstrate directly, and show everyone else at the same time.

Reflections
Having written this, I realize I have to some extent answered my own question: Why don’t we hear more about ChatOps?

Effective ChatOps requires maturity of culture and tools. Even small things, like knowing that more senior or experienced people are able to see, and potentially respond to, every comment, take some getting used to for both parties. The organisation’s culture must encourage the openness which allows productivity to thrive in the chat. As such, striving towards ChatOps may provide a useful mechanism to highlight organisational and cultural impediments. Making operational features available in chat requires not just trust, but investment in tech; safely connecting all those moving parts is not trivial. To the many organizations who struggle to deploy once a month, ChatOps must seem like a distant nirvana.

Despite the necessary investment, ChatOps can bring many benefits, and can do so unobtrusively, at a pace of change that suits the community. Using chat as a gateway to operations, adding capabilities when it is considered safe to do so, is an excellent way to introduce and observe new ideas. ChatOps invites collaboration, and not just because it’s novel. If all the engineers, regardless of title, hang out and work in the same space, it helps build an appreciation of others’ challenges and responsibilities, not to mention their attitude and sense of humour.

 

Five Gators Of The Apocalypse?

Wacky gators arcade machine

I generally dislike war and military metaphors for teams and their activities. Admittedly IT has a lot to learn from the military in terms of teams and scale, but in the wrong hands these metaphors seem to encourage unproductive conflict and counter-collaborative behaviours. This strikes me as odd because, although prepared for conflict, the military spend much of their time avoiding or minimising it. However, I do need to call upon a slightly violent metaphor to describe the relationship between the constraints encountered when building a continuous delivery capability in an organisation.

The process of change reminds me of the nineties arcade game Whacky Gators, where a number of cheeky gators poke their heads out of caves and you biff them with a large hammer, your hands, or another appendage, depending on personal preference. You never know which gator will appear, or when, and more than one might show up at once.

When encouraging continuous delivery (and by extension DevOps) those gators might be named: Culture, Tools, Organisation, Process, and Architecture.

These five are interdependent constraints, each affecting the others. However, while inside Whacky Gators there is a fairly simple randomiser determining which gator will surface, behind the scenes our organisations look more like a hideous H.R. Giger meets Heath Robinson mash-up. We can’t readily inspect them to determine what to change.

My theory is that when one constraint is eased it will reveal a new constraint in a different area. This is a tenet of most agile and learning methods – surface a significant issue, deal with it, see what surfaces next. Often a method, and our expertise, focuses on just a couple of areas; we’re well versed in solving problems with technical solutions, or in improving our own team’s capability in isolation.

A good continuous delivery capability involves the whole engineering organisation (a great one involves the entire organisation). This means it is crucial to consider all five constraints, and when there is a problem, be ready to shift focus and look for the solution in one of the other areas. In fact, this simple shifting may lead to the root of a problem. Do reams of process indicate a risk-averse culture? The solution may not be more process, but a different attitude. Are those tools covering up or compensating for some thorny, unspoken issue no one dared to face? When trying to improve delivery capability there may be a temptation to replace an old tool with an improved version – but maybe the need for that tool (and its associated overheads) can disappear with an ounce of process and a healthy dollop of collaboration?

Returning to our Whacky Gators metaphor, the big question is: how are you playing? Do you simply wait for that same familiar gator to return – the one you’re most comfortable dealing with? Do you hover where you are comfortable while other opportunities pass by, or are you nimble and brave enough to tackle each constraint as it appears?

Footnote:
While I was looking up Whacky Gators, I couldn’t resist a peek in the machine’s service manual, where I found this uncanny quote on success, as applicable in the game as it is in change:
“The player does not score extra points by hitting harder; a light touch will activate the circuits and will lead to higher scores.”