An introduction to ChatOps

I first heard of something resembling ChatOps about five years ago, when I had the good fortune to share a beer with Scott Chacon, one of GitHub’s co-founders.  While I ranted about Deming, he talked enthusiastically about their fledgling organization.  Surprisingly, one of the things he talked about with most passion was Hubot, a sort of robot butler who hung around in GitHub chat rooms serving useful data and whimsical content with equal aplomb.  It seemed a great concept, simple and powerful: it improved operability whilst increasing knowledge sharing and encouraging collaboration.

I often wonder why ChatOps doesn’t garner much attention, especially as it appears to have played an important part in GitHub’s success.  Perhaps that’s because everyone is gazing adoringly at Docker, or perhaps because ChatOps sits discreetly and indistinctly on the boundary between Culture and Tools.

By way of introduction, ChatOps combines three key technologies: Asynchronous Chat, Robot Assistants and Automation; let’s spend a moment looking into each.  (The pictorially minded may prefer to spin through my spring DevOps Summit talk, where ChatOps was one of my ‘Collaboration Catalysts’.)

Asynchronous Chat
Asynchronous Chat needs no introduction: it allows people to congregate in a virtual space to view and post messages and media.  These apps are a good way for a distributed team to collaborate, but there are more subtle advantages.  Chats can be saved, enabling search and later reference.  Chats also allow broadcast, without the publisher having to manage their audience.  You’ll understand the value of this if you have ever tried to chase down a gnarly production issue with your manager over one shoulder and a Project Manager over the other, both asking for updates.  Oddly, the speed of work does not increase with the frequency of update requests; quite the opposite, in fact.
In this situation chat could be used to broadcast progress without having to manage a distribution list.  When people monitor chat instead of asking, the originator doesn’t get distracted, and may even get proactive support.

In the context of ChatOps it is chat apps which can be readily extended that really matter.  That’s because many of the operations performed will be specific to an organisation and its systems, processes and integration requirements.  HipChat, Flowdock, Slack and Campfire are popular options, and choices are often driven by the lingua franca of the development team.

Robot Assistants
Robot assistants lurk in chat rooms waiting to do the user’s bidding.  They may wait to be summoned by a specific command, or step in when they think it’s needed.  Assistants may grab things, like logs from production, or find out who is on call.  This reduces the interruption cost for a user, who is already thinking and collaborating in chat.
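
To make the two behaviours concrete, here is a minimal sketch in Python of a bot that either waits to be summoned by name or steps in when it overhears something.  It is not the code of any real bot: the ChatBot class, the "butler" name and the on-call data are all made up for illustration, and the respond/hear split is only loosely modelled on the way Hubot distinguishes messages addressed to it from messages it merely overhears.

```python
import re
from typing import Callable, List, Optional, Pattern, Tuple

Handler = Callable[["re.Match[str]"], str]


class ChatBot:
    """Toy robot assistant (hypothetical): routes chat messages to handlers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.commands: List[Tuple[Pattern[str], Handler]] = []   # need "<name> ..."
        self.listeners: List[Tuple[Pattern[str], Handler]] = []  # match any message

    def respond(self, pattern: str) -> Callable[[Handler], Handler]:
        """Register a handler that only fires when the bot is addressed by name."""
        def register(handler: Handler) -> Handler:
            self.commands.append((re.compile(pattern, re.IGNORECASE), handler))
            return handler
        return register

    def hear(self, pattern: str) -> Callable[[Handler], Handler]:
        """Register a handler that fires on any message containing the pattern."""
        def register(handler: Handler) -> Handler:
            self.listeners.append((re.compile(pattern, re.IGNORECASE), handler))
            return handler
        return register

    def handle(self, message: str) -> Optional[str]:
        """Return the bot's reply to a message, or None if it stays quiet."""
        prefix = self.name.lower() + " "
        if message.lower().startswith(prefix):
            body = message[len(prefix):].strip()
            for regex, handler in self.commands:
                match = regex.fullmatch(body)
                if match:
                    return handler(match)
        for regex, handler in self.listeners:
            match = regex.search(message)
            if match:
                return handler(match)
        return None


bot = ChatBot("butler")

# Made-up rota; a real bot would query the on-call system instead.
ON_CALL = {"payments": "alice", "search": "bob"}


@bot.respond(r"who is on call for (\w+)")
def on_call(match: "re.Match[str]") -> str:
    team = match.group(1).lower()
    person = ON_CALL.get(team)
    if person is None:
        return f"I have no rota for {team}."
    return f"{person} is on call for {team}."


@bot.hear(r"\boutage\b")
def suggest_on_call(match: "re.Match[str]") -> str:
    return "Heard 'outage': try 'butler who is on call for <team>'."
```

Typing "butler who is on call for payments" in the room gets an answer without anyone leaving the conversation, while a stray mention of an outage prompts the bot to offer help unasked.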

A good bot also recognizes the value of play.  Amusing features are almost mandatory, from adding a mustache to a photo and generating memes to playing tunes.  A useful side effect of this is that it encourages folks to hang out in chat rooms; humor keeps people engaged, and generally engaged people are more productive and ready to innovate.  Notable bots include Lita, Hubot, Err and StackStorm.  Iron Man’s J.A.R.V.I.S. is similar in concept, but somewhat less likely to inundate you with pictures of small miserable-faced dogs.

By way of an aside, Terry Winograd, who later went on to mentor one Larry Page, pondered the utility of robot assistants as early as 1970.  Perhaps he had a premonition of Clippy when he wrote:

“I should reiterate that good programming systems do not demand a super-intelligent program. We can get by with a moderately stupid assistant as long as he doesn’t make mistakes. The degree of AI needed is much less than that needed for a full-fledged natural language or vision system.”

Automate, mate.
The third component is Automation.  Hooking the bot up to automation, and to other deployment and operations tools, is where things get really interesting.  If a bot can integrate with search engines and meme generators, why not link it to development environments, perhaps even production?  Then, if people are discussing a thorny deployment problem, they can call in logs, graphs and pertinent data.  The chat room becomes the war room: distributed, observable and documented for later learning.

Perhaps the pinnacle of ChatOps is allowing deployment orchestration through chat.  As Jesse Newland describes it succinctly in his highly recommended ChatOps at GitHub talk, “Chat becomes the primary control surface for ops”.  Not only is it convenient, but a chat client is more portable.  Chat can also serve as a layer of abstraction over the underlying tech, enabling it to change and evolve independently of the commands driving it.  This abstraction opens an opportunity for training, enabling production commands to be executed against a sandbox.  Of course, there is some risk to be considered, and it is possible to restrict commands to people or rooms.
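
As a rough illustration of those last two points, the sketch below (Python again) puts a deploy command in front of interchangeable backends and checks who is asking, and from where, before touching production.  Everything here is hypothetical: the Policy, RecordingBackend and DeployCommand names, the users and the rooms are invented, and the backend merely records calls where a real one would drive the actual deployment tooling.  Requiring both an allowed user and an allowed room for production is one possible rule, chosen only to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Policy:
    """Who may run a command, and from which rooms (hypothetical rule)."""
    allowed_users: FrozenSet[str]
    allowed_rooms: FrozenSet[str]

    def permits(self, user: str, room: str) -> bool:
        return user in self.allowed_users and room in self.allowed_rooms


class RecordingBackend:
    """Stand-in for real deployment tooling: it only records what was asked."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.log: List[Tuple[str, str]] = []

    def deploy(self, app: str, version: str) -> str:
        self.log.append((app, version))
        return f"[{self.label}] deployed {app} {version}"


class DeployCommand:
    """The same chat command for every target; only the backend differs."""

    def __init__(self, policy: Policy, backends: Dict[str, RecordingBackend]) -> None:
        self.policy = policy
        self.backends = backends

    def run(self, user: str, room: str, app: str, version: str,
            target: str = "sandbox") -> str:
        backend = self.backends.get(target)
        if backend is None:
            return f"Unknown target {target}."
        # Anyone may practise against the sandbox; production is restricted.
        if target == "production" and not self.policy.permits(user, room):
            return f"Sorry {user}, production deploys are not allowed from #{room}."
        return backend.deploy(app, version)


# Made-up wiring for the example.
sandbox = RecordingBackend("sandbox")
production = RecordingBackend("production")
deploy = DeployCommand(
    Policy(allowed_users=frozenset({"alice"}), allowed_rooms=frozenset({"ops"})),
    {"sandbox": sandbox, "production": production},
)
```

Because the command talks to a backend rather than to the deployment tooling directly, a trainee can rehearse the exact production command against the sandbox, and the tooling behind either backend can be swapped without changing what people type.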

Still not impressed?  In the same talk Jesse outlines a scenario where he makes a deployment, observes a problem, orchestrates load balancers, fixes and redeploys.  Impressively, it all happens in chat, while keeping his team updated and leaving a record, with minimum extra effort.

More than just tools?
Looking beyond tools, ChatOps brings more to teams than mere efficiencies.  ChatOps liberates institutionalized knowledge once locked in the heads of key, time-challenged individuals.  Once in the open, ways of doing things can be inspected and built upon.  This isn’t necessarily a threat to those people; it often frees them up to tackle more challenging problems.

ChatOps can be an excellent training tool.  Like the gallery trainee doctors use to observe a surgeon at work, chats can be reviewed and replayed for education.  Need to know how something is done?  Check the archives and look at the commands used last time, or ask in chat: someone can demonstrate directly, and show everyone else at the same time.

Reflections
Having written this, I realize I have to some extent answered my own question: Why don’t we hear more about ChatOps?

Effective ChatOps requires maturity of culture and tools.  Even small things, like knowing that more senior or experienced people are able to see, and potentially respond to, every comment, take some getting used to for both parties.  The organisation’s culture must encourage the openness which allows productivity to thrive in the chat.  As such, striving towards ChatOps may provide a useful mechanism to highlight organisational and cultural impediments.  Making operational features available in chat requires not just trust but investment in tech; safely connecting all those moving parts is not trivial.  To the many organisations who struggle to deploy once a month, ChatOps must seem like a distant Nirvana.

Despite the necessary investment, ChatOps can bring many benefits, and can do it unobtrusively, at a pace of change that suits the community.  Using chat as a gateway to operations, adding capabilities when it is considered safe to do so, is an excellent way to introduce and observe new ideas.  ChatOps invites collaboration, and not just because it’s novel.  If all the engineers, regardless of title, hang out and work in the same space, it helps build an appreciation of others’ challenges and responsibilities, not to mention attitude and sense of humour.

 
