Last week I received a very honest piece of advice from the CTO of a large SaaS company headquartered in the Midwest. It was spot on, and I am really grateful for it. It was a great reminder of the processes and cultures that actually prevail in most organizations. It also triggered the observation that a lot of people still love manual processes, for a variety of reasons, so I am taking the opportunity to share the anecdote along with some related thoughts and observations.

First, The Conversation I Am Referring To

The context of the discussion was that I was asking a very experienced CTO for feedback on what we are working on. The executive in question, let’s call him Dave, had spent time in the Valley earlier in his career. He liked our (still in stealth mode) solution very much, but he had something to say about the way I was talking about certain trends as if they were universal truths. Although my pitch is very focused on our product, it still touches upon some typical DevOps themes like breaking down the silos, metrics and visibility, automated testing… He waited patiently for me to finish, and very candidly said that, although what we are working on is very useful, I was using the wrong language to present it to companies outside Silicon Valley. His feedback was that most of his peers in his region “know about DevOps, may say that they want DevOps, but will never get DevOps in a million years”.

DevOps, in his opinion, is a Silicon Valley trend that was successful because of its grassroots origins. It was successfully implemented in growing companies in highly competitive environments, where the imperative for velocity permeated from the top down. When he was working in the Valley, “they had to continuously innovate, which means that they had to put new code in front of customers as fast as possible. It was a lot easier, from a culture standpoint, to question internal development, release and maintenance processes on a regular basis and implement change if need be”. Some of the companies that started doing this grew to become massive tech companies, giving DevOps a lot more legitimacy, which in turn helped fuel the growth of the movement. However, it is still implemented in a relatively small number of companies. His point was that DevOps is meant to change a culture that is still very much prevalent in a large proportion of enterprises around the US and in other countries.

At his current company, for instance, the willingness to change is shared by “maybe 20% of professionals dealing with technical operations…The folks in the site reliability team don’t necessarily get along well with the folks in the datacenter team, or in the network team, or the database team, or the helpdesk team, and vice versa and so on and so forth, so the willingness to share data and centralize processes is not there”. In his opinion, DevOps requires changing a type of structure that a lot of people “have a vested interest in preserving”. Since these teams don’t coordinate very well around technologies and processes, they end up relying on a lot of manual processes, which create unnecessary cycles and delays. For instance, valid changes to the code regularly get overwritten with older versions from other development branches, the database may not always be in sync with the version control repository, and there are lots of time-consuming manual interruptions to the continuous delivery cycle. However, the reason these processes are maintained is that “that’s how people feel needed and busy…”.

Arguments for Manual Processes

I have heard this warning from several folks in Tech Ops, mostly in management or executive positions: people who have been operating in reactive mode and through manual processes for years will not be very interested in changing their ways. I tend to agree, because I have met a few SREs who validated that opinion. On the topic of reducing noise from too many alerts and metrics across a variety of siloed tools, for instance, one such SRE argued that “it may be counter-intuitive, but actually the so-called noise is a good thing: when an issue in production occurs, we want the noise, we want abundance of information because it helps us immediately start correlating the various pieces of information in our head and get to the root cause, which ultimately is our job”. Here lies a subtle flaw in reasoning: if SREs see their job as fixing things, they will find it perfectly normal that the processes for identifying and remediating issues depend on them, with all the limitations that entails. Their need for manual processes is built in right there, and they seem relatively happy about it, at least up to a certain level in the organization. When I asked whether it wouldn’t be preferable to eliminate at least a portion of the noise, like superfluous or repetitive data, the same person replied that “it’s comforting to see the alerting system doing its job, it’s better to get too much data than to miss something important”. The noise means that there is activity, that something is happening, which he finds reassuring. That company, by the way, still had a way to go, by their own admission, before hitting their goal of a 99.9% (“3 9s”) SLA: “we’ll try to get there by 2018”. Sigh.

I was very tempted to start arguing that the job is not defined by what it entails today, but rather by its goal: if the mission is to increase uptime and achieve higher efficiency, it’s fair game to challenge every process in use and every tool implemented, as long as progress toward the goal is made in a sustainable way. But I kept my mouth shut.

I am speculating that at least part of the affinity of tech ops folks, managers and hands-on alike, for manual processes stems from a longing for the simpler ways of the manual worker. Because manual work consists of converting materials from one form to another, the results of a manual process are tangible. Because knowledge work consists of converting information from one form to another, the results of knowledge work are frequently intangible. “The essence of the knowledge organizations is that work is done in the head” (Zand, 1981). This means that working (and work while it is in process) can’t be seen. From the perspective of a supervisor or an engineer, the visibility of working is therefore high for a manual worker and low for a knowledge worker.

I pointed out in a previous post that during my first ever outage, I had the impression that the technical operations team at my company viewed an all-nighter as kind of a badge of honor. The morning after, they always made sure to let everyone know it had happened, at the water cooler or through the status update email. Manual processes make people visibly necessary and important: they are needed, and without them things fall apart.

So lots of people have been used to operating in reactive mode for years, with lots of manual processes, and that’s how they like it. Whether it is because they don’t know any better, think it works well enough (the “why change it?” mentality), need to be seen as busy and necessary, or are not given incentives to take things to the next level doesn’t matter. It’s a reality, and it was important for me to get that direct feedback so that I walk in with the right context and avoid making the wrong assumptions.

I observed another variation of the “manual enthusiast” at a company in a different context, this one being more of a Silicon Valley example. This was a hot, fast-growing payment company that was already a unicorn and is still going strong; they clearly will be huge. I sat down with several members of the TechOps team, and their statements were something along the lines of: “the same team that handles technical operations during the day, handles it at night, we do not have a NOC team and frankly think that we don’t need…whenever we have an issue, everyone jumps in, we have this enhanced Google Doc format where everybody consolidates all the information related to the incident, it’s a long thread of contributions from a number of different people and it is going in parallel to breakout chat ops sessions, and through this fairly manual process we typically figure things out eventually”. Everyone was nodding, I am guessing due to peer pressure and potentially also buoyed by the expectation of an eventual large stock options payout. Having gone through a similar phase at a previous company, I could foresee the eventual burnout, the team churn, and the “come to Jesus” realization that this process is not scalable and that they will need to change.

We Believe in People First

Now, obviously the flip side of stating that manual processes don’t scale is to advocate for automation. We do that, but only to a certain extent. Please don’t get me wrong: we really think that people will always be at the helm in technical operations, and therefore companies will always rely, ultimately, on human brains. The problem with manual processes arises when they are so prevalent that they become an impediment to people achieving their full potential, and to companies getting the most out of their teams. In the first example I described, people’s ability to deal with the task of maintaining a complex system is stretched to the limit: ultimately, too much noise leads to time sinks, mistakes and fatigue, in addition to lots of downtime. As for the always-on team in the second example, well, it’s easy to see how that is unsustainable: sooner or later team members will say that this is no life, that it’s not worth it, and bail out for a better job.

But I think the main point is that data centers have been around for 50 years, and the vast majority of the 10 million professionals who maintain them do not yet have the benefit of the latest equipment, techniques and forward-thinking DevOps ideas we take for granted in the Valley. They operate in very different environments and under rather traditional cultures, and it’s important to be aware of their particular circumstances when pitching your “next generation” solution. I recognized, through that CTO’s feedback, that some people like being at the center of the process, and that their view of the world is unlikely to change all that fast. So whatever our pitch is, it has to fit their very immediate needs, and it has to be seen as incremental rather than as a complete revolution in the way they have been working for years.

So, armed with the fresh feedback from the CTO of that Midwest-based company, I proceeded to meet other folks in the organization. Instead of advocating for a full system approach, I went very granular and broke down the value for each individual team: this is why the database team wants something like this, this is the benefit for the technical operations team… And it seems to be working. Ultimately they’ll realize that they are all using the same platform and that it is easier to connect the dots through a full system approach, but let’s take it one step at a time.

So what are your thoughts on manual processes? Feel free to discuss them here.

Follow me on Twitter @SignifAICEO

Cheers,

JP Marcos