Automation in security operations and incident response is receiving more attention than ever before. This is driven by the ongoing cybersecurity skills shortage, an escalation in the volume and sophistication of cyber threats, and the growing adoption of automation by threat actors themselves.

Manual processes cannot achieve the velocity required to respond effectively to attacks that are often not detected until a threat actor has almost completed the kill chain. Attacks such as ransomware and phishing place particular stress on manual incident response, frequently resulting in disaster recovery rather than threat containment.

The consistent feedback is that automation is highly desirable, at least to security teams, who struggle daily with work overload and alert fatigue. However, the desire to implement effective automation is often inhibited by doubt and fear: doubt about the accuracy of threat detection, and fear of the consequences of automating containment or mitigation responses and of the damage that could result from getting this wrong.

Automation is not new, and enterprises have been promised automated containment capabilities before. Premature earlier attempts, such as antispam and intrusion prevention systems that lacked the ability to reliably identify anomalies and attacks, have left IT operations and some executive management teams reluctant to hand such powers to machines. This is despite detection capabilities having improved dramatically in recent years, especially through behavioural modelling and machine learning driven approaches.

Three Common Automation Challenges

Now let’s take a look at the three basic challenges that security teams face when considering automation and how they can be overcome so automation can be successfully implemented.

The SecOps team can assess the impact of the risk, but not the impact on production

Not all decisions can always be completely automated

IT operations lack trust in automation

SecOps Can Assess the Impact of the Risk, but Not the Impact on Production

Security operations teams are often focused purely on the risk and impact of the threat and, working in their own silo, struggle to build and maintain an awareness of what is going on in production and who may be affected. For example, is the affected system mission critical, unstable, or a legacy system? Is it currently being used to process critical internal financial reports, or is a customer using it and being affected while paying for a service you should be providing? A seemingly harmless user account may actually be used to run critical processes, so disabling it could halt them. Dependencies, complexities and unknowns are the bane of automation.

These are all data points that most security operations teams either lack or hold only in outdated form, yet they can have a huge impact on how the incident response or remediation process must be conducted. The incidents or vulnerabilities should still be addressed, but doing so may require additional time, extra tasks and a specific approach, and this is likely to vary from organization to organization. Regardless, it is important for departments to be interlinked as much as possible and for processes, procedures and related documents to be updated regularly so that the critical information kept on file is always correct.
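As a minimal sketch of how production context could gate an automated response, the example below assumes a hypothetical asset inventory maintained with IT operations. The `AssetContext` fields, the 90-day freshness threshold and the `containment_mode` function are illustrative assumptions, not part of any particular product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class AssetContext:
    """Hypothetical inventory record; all field names are assumptions."""
    name: str
    mission_critical: bool
    legacy: bool
    runs_critical_processes: bool  # e.g. a user account that runs batch jobs
    last_reviewed: date            # when IT operations last confirmed the record


# Assumed threshold: records older than this are treated as out of date.
MAX_RECORD_AGE = timedelta(days=90)


def containment_mode(asset: Optional[AssetContext], today: date) -> str:
    """Return 'automatic' or 'manual_review' for a containment action."""
    if asset is None:
        # Unknown asset: no production context at all, so do not automate.
        return "manual_review"
    if today - asset.last_reviewed > MAX_RECORD_AGE:
        # Stale context is treated the same as missing context.
        return "manual_review"
    if asset.mission_critical or asset.legacy or asset.runs_critical_processes:
        return "manual_review"
    return "automatic"
```

The design choice here is to fail towards manual review whenever the context is missing or stale, so that gaps in the inventory never silently widen the scope of automation.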

Not All Decisions Can Always Be Completely Automated

The actual containment or remediation response is not the only thing that can be automated. We can automate a wide variety of tasks, including prioritizing an incident, fetching additional information and context, or more simply notifying stakeholders and creating tasks for them.

By using automation we can make people more efficient and take away some of their more menial and repetitive tasks. We can even use machine learning to compute an analysis that would take a human a millennium to do manually, or that could not be done manually at all because of its complexity. But somewhere along the line a human analyst may well still be required to make a decision.

The more we automate the easy tasks, the more complex and demanding the remaining tasks will be. Even so, we can still automate the actions that follow a decision, provided they have been manually vetted. Analysts will be able to spend more time handling and vetting these more complex decisions, rather than wasting their valuable time carrying out laborious, mundane and repetitive tasks.

Gartner recommends: “Rather than to seek full automation of all SOC activities, enterprises should seek ‘automatability’ - the capability of being automated as higher levels of confidence are achieved.”

In the simplest scenario, this means sending a notification to the IT operations team that outlines the issue: what the problem is, the potential impact, and what action is required to resolve it. The notification asks them either to confirm that the action can be executed automatically, or to reject the automated action and carry it out manually.

We can therefore automate the action without automating the decision, as and when required, based on the level of automation we are comfortable with in our operations processes and workflows. That level is also open to change over time as experience and knowledge grow.
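The separation of action and decision described above can be sketched as follows. This is an illustrative outline under stated assumptions: the three `AutomationLevel` values, the `ProposedAction` fields and the `handle` function are hypothetical names chosen for the example, and the returned strings stand in for whatever notification or ticketing mechanism an organization actually uses.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class AutomationLevel(Enum):
    """Assumed levels; a real deployment may define more or fewer."""
    NOTIFY_ONLY = 1       # humans decide and act
    APPROVE_THEN_RUN = 2  # humans decide, the machine acts
    FULL_AUTO = 3         # the machine decides and acts


@dataclass
class ProposedAction:
    problem: str                 # what the problem is
    potential_impact: str        # what could go wrong if the action runs
    description: str             # what action is required to resolve it
    execute: Callable[[], str]   # the automated action itself


def handle(action: ProposedAction,
           level: AutomationLevel,
           approved: Optional[bool] = None) -> str:
    """Run, hold or hand over an action depending on the automation level.

    `approved` is None while IT operations has not yet answered.
    """
    if level is AutomationLevel.FULL_AUTO:
        return action.execute()
    if level is AutomationLevel.APPROVE_THEN_RUN:
        if approved is None:
            return f"pending approval: {action.description}"
        if approved:
            return action.execute()
        return f"rejected, manual handling required: {action.description}"
    # NOTIFY_ONLY: the action is never executed by the machine.
    return f"notified: {action.problem} ({action.potential_impact})"
```

Because the level is a parameter rather than something built into the action, the same action can start at `NOTIFY_ONLY` and be moved up as confidence grows, without rewriting the action.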

IT Operations Lack Trust in Automation

The downside to getting IT operations to vet an action is that IT operations teams are frequently overloaded, so the handoff from SecOps to IT Ops often comes with a long delay in response. In the case of incidents such as ransomware, this delay can mean the difference between containment and disaster recovery, and between an incident and a full-blown breach. The security operations team can help to alleviate this by building trust and confidence.

This can be achieved by keeping track of which actions are carried out manually, including how many times the same action was taken by a human instead of a machine, and working out the difference in time and effort between the two. The idea is that if someone receives the same notification for similar incidents requiring the same manual actions many times over, SecOps can demonstrate to them that this could have been safely automated. There will also be an audit trail to prove it and the data to build a business case if required. More importantly, the team will be able to gather data on which tasks can be automated safely and which cannot, along with the potential consequences. The level of automation can then be expanded as needed as trust and confidence increase.
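One way to turn that record keeping into evidence is sketched below. It assumes each human decision is logged as a hypothetical `Decision` record; the thresholds `min_count` and `min_approval_rate` are arbitrary illustrative defaults that an organization would set for itself.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Decision:
    """Hypothetical audit-trail entry for one human decision."""
    action: str          # e.g. "isolate_host"
    approved: bool       # did IT operations confirm the proposed action?
    wait_minutes: float  # time between the request and the human decision


def automation_candidates(decisions: Iterable[Decision],
                          min_count: int = 20,
                          min_approval_rate: float = 1.0) -> Dict[str, float]:
    """Return actions that humans approved often enough to consider automating.

    The value for each action is the total number of minutes spent waiting
    for a human decision, which is the delay automation could have avoided.
    """
    grouped: Dict[str, List[Decision]] = defaultdict(list)
    for decision in decisions:
        grouped[decision.action].append(decision)

    candidates: Dict[str, float] = {}
    for action, items in grouped.items():
        approval_rate = sum(d.approved for d in items) / len(items)
        if len(items) >= min_count and approval_rate >= min_approval_rate:
            candidates[action] = sum(d.wait_minutes for d in items)
    return candidates
```

Actions that were rejected even once fall below the default approval rate and stay manual, which also surfaces the cases where automation would not have been safe.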

Conclusion

An automated action may be safe in one business unit but unacceptable in another, so safely automating means selectively automating. To accommodate this, processes must support granularity, both in gathering metrics and in applying the automations themselves. Whenever automation technology is used, this approach needs the right level of support from all teams involved, including SecOps, IT Ops and other potentially affected teams within the business, to ensure its success. Technology can help to build trust, but when all is said and done, that trust has to be earned through the experience of the people you expect to trust you.
