Security Orchestration and Incident Response

Last month at the RSA Conference, I saw a lot of companies selling security incident response automation. Their promise was to replace people with computers ­– sometimes with the addition of machine learning or other artificial intelligence techniques ­– and to respond to attacks at computer speeds.

While this is a laudable goal, there’s a fundamental problem with doing this in the short term. You can only automate what you’re certain about, and there is still an enormous amount of uncertainty in cybersecurity. Automation has its place in incident response, but the focus needs to be on making the people effective, not on replacing them ­ security orchestration, not automation.

This isn’t just a choice of words ­– it’s a difference in philosophy. The US military went through this in the 1990s. What was called the Revolution in Military Affairs (RMA) was supposed to change how warfare was fought. Satellites, drones and battlefield sensors were supposed to give commanders unprecedented information about what was going on, while networked soldiers and weaponry would enable troops to coordinate to a degree never before possible. In short, the traditional fog of war would be replaced by perfect information, providing certainty instead of uncertainty. They, too, believed certainty would fuel automation and, in many circumstances, allow technology to replace people.

Of course, it didn’t work out that way. The US learned in Afghanistan and Iraq that there are a lot of holes in both its collection and coordination systems. Drones have their place, but they can’t replace ground troops. The advances from the RMA brought with them some enormous advantages, especially against militaries that didn’t have access to the same technologies, but never resulted in certainty. Uncertainty still rules the battlefield, and soldiers on the ground are still the only effective way to control a region of territory.

But along the way, we learned a lot about how the feeling of certainty affects military thinking. Last month, I attended a lecture on the topic by H.R. McMaster. This was before he became President Trump’s national security advisor-designate. Then, he was the director of the Army Capabilities Integration Center. His lecture touched on many topics, but at one point he talked about the failure of the RMA. He confirmed that military strategists mistakenly believed that data would give them certainty. But he took this change in thinking further, outlining the ways this belief in certainty had repercussions in how military strategists thought about modern conflict.

McMaster’s observations are directly relevant to Internet security incident response. We too have been led to believe that data will give us certainty, and we are making the same mistakes that the military did in the 1990s. In a world of uncertainty, there’s a premium on understanding, because commanders need to figure out what’s going on. In a world of certainty, knowing what’s going on becomes a simple matter of data collection.

I see this same fallacy in Internet security. Many companies exhibiting at the RSA Conference promised to collect and display more data and that the data will reveal everything. This simply isn’t true. Data does not equal information, and information does not equal understanding. We need data, but we also must prioritize understanding the data we have over collecting ever more data. Much like the problems with bulk surveillance, the “collect it all” approach provides minimal value over collecting the specific data that’s useful.

In a world of uncertainty, the focus is on execution. In a world of certainty, the focus is on planning. I see this manifesting in Internet security as well. My own Resilient Systems ­– now part of IBM Security –­ allows incident response teams to manage security incidents and intrusions. While the tool is useful for planning and testing, its real focus is always on execution.

Uncertainty demands initiative, while certainty demands synchronization. Here, again, we are heading too far down the wrong path. The purpose of all incident response tools should be to make the human responders more effective. They need both the ability and the capability to exercise it effectively.

When things are uncertain, you want your systems to be decentralized. When things are certain, centralization is more important. Good incident response teams know that decentralization goes hand in hand with initiative. And finally, a world of uncertainty prioritizes command, while a world of certainty prioritizes control. Again, effective incident response teams know this, and effective managers aren’t scared to release and delegate control.

Like the US military, we in the incident response field have shifted too much into the world of certainty. We have prioritized data collection, preplanning, synchronization, centralization and control. You can see it in the way people talk about the future of Internet security, and you can see it in the products and services offered on the show floor of the RSA Conference.

Automation, too, is fixed. Incident response needs to be dynamic and agile, because you are never certain and there is an adaptive, malicious adversary on the other end. You need a response system that has human controls and can modify itself on the fly. Automation just doesn’t allow a system to do that to the extent that’s needed in today’s environment. Just as the military shifted from trying to replace the soldier to making the best soldier possible, we need to do the same.

For some time, I have been talking about incident response in terms of OODA loops. This is a way of thinking about real-time adversarial relationships, originally developed for airplane dogfights, but much more broadly applicable. OODA stands for observe-orient-decide-act, and it’s what people responding to a cybersecurity incident do constantly, over and over again. We need tools that augment each of those four steps. These tools need to operate in a world of uncertainty, where there is never enough data to know everything that is going on. We need to prioritize understanding, execution, initiative, decentralization and command.

At the same time, we’re going to have to make all of this scale. If anything, the most seductive promise of a world of certainty and automation is that it allows defense to scale. The problem is that we’re not there yet. We can automate and scale parts of IT security, such as antivirus, automatic patching and firewall management, but we can’t yet scale incident response. We still need people. And we need to understand what can be automated and what can’t be.

The word I prefer is orchestration. Security orchestration represents the union of people, process and technology. It’s computer automation where it works, and human coordination where that’s necessary. It’s networked systems giving people understanding and capabilities for execution. It’s making those on the front lines of incident response the most effective they can be, instead of trying to replace them. It’s the best approach we have for cyberdefense.

Automation has its place. If you think about the product categories where it has worked, they’re all areas where we have pretty strong certainty. Automation works in antivirus, firewalls, patch management and authentication systems. None of them is perfect, but all those systems are right almost all the time, and we’ve developed ancillary systems to deal with it when they’re wrong.

Automation fails in incident response because there’s too much uncertainty. Actions can be automated once the people understand what’s going on, but people are still required. For example, IBM’s Watson for Cyber Security provides insights for incident response teams based on its ability to ingest and find patterns in an enormous amount of freeform data. It does not attempt a level of understanding necessary to take people out of the equation.

From within an orchestration model, automation can be incredibly powerful. But it’s the human-centric orchestration model –­ the dashboards, the reports, the collaboration –­ that makes automation work. Otherwise, you’re blindly trusting the machine. And when an uncertain process is automated, the results can be dangerous.

Technology continues to advance, and this is all a changing target. Eventually, computers will become intelligent enough to replace people at real-time incident response. My guess, though, is that computers are not going to get there by collecting enough data to be certain. More likely, they’ll develop the ability to exhibit understanding and operate in a world of uncertainty. That’s a much harder goal.

Yes, today, this is all science fiction. But it’s not stupid science fiction, and it might become reality during the lifetimes of our children. Until then, we need people in the loop. Orchestration is a way to achieve that.

This essay previously appeared on the Security Intelligence blog.

Posted on March 29, 2017 at 6:16 AM • 59 Comments