In the Fall of 2019, I joined the Splunk Global Security organization to build Splunk’s internal threat hunting program. Over a few months, we went from an organization with no defined hunting mission or program to one that can do full-scale, high-value hunting. This post is one in a short series that describes how we manage our threat hunting program.

Measuring Success

Before we operationalized our hunting program, I knew that metrics tracking needed to be part of everything we did. Metrics are essential for describing the performance and success of any program or team, and they’re especially crucial for hunting programs because threat hunting is often misunderstood, even among practiced cybersecurity operators. We needed to answer the question, “What does a successful hunting program look like, and which metrics properly describe that success?”

Defining Success

Before we could measure our program, we first had to define what a successful program looks like. Ours is built on a few core principles that we believe make us successful:

Hunting is intelligence-driven

Hunting focuses on what we can’t detect and produces new detections

Hunting is an activity that the entire team, including non-hunters, can participate in

As long as our operations align with these principles, we’re setting ourselves, the organization, and the business up for success. It’s worth noting that these principles may not fit every hunting program, cybersecurity organization, or business model. For example, an enterprise hunting program will get a higher return on investment (ROI) by focusing on producing new detections; in contrast, an MSSP hunting program will likely get a higher ROI by focusing on identifying new incidents for its customers. To some, this distinction may seem meaningless, but it has a significant impact on operations and on how leadership judges the value of the program.

Defining Metrics

With our principles in mind, we needed our metrics to address two needs:

Provide the program with data-driven feedback about what is working and not working

Share our success and struggles with the team, organization, and leaders

We used these needs to create operational metrics (OMs) and key performance indicators (KPIs): OMs keep the program on track, while KPIs describe our success or struggles.

These are the OMs we measure:

Number of Requested / Submitted Hunts

Number of Backlogged Hunts

Number of Completed Hunts

Completed Hunts Mapped to the Business Environment (corporate, cloud, etc.)

Completed Hunts Mapped to the Kill Chain

Number of Discovered Hunt Findings

Hunt Findings Mapped to Utilized Datasets (network, host, etc.)

Hunt Findings Mapped to Utilized Techniques (searching, frequency analysis, visualizations, etc.)

Hunt Findings Mapped to the Type of Finding (detection, security event, risk, etc.)

Hunt Findings Mapped to the Team Impacted by the Finding

Hours Taken to Complete a Hunt Task

These OMs are updated in real time through automation and are accessible to everyone in the organization (including leaders).
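To make the OM list above concrete, here is a minimal sketch of how such counts and groupings could be derived from hunt records. The record fields (`status`, `environment`, `findings`) and sample values are hypothetical, not Splunk’s actual schema; in practice this aggregation would be automated against the real tracking system.

```python
from collections import Counter

# Hypothetical hunt records; field names and values are illustrative only.
hunts = [
    {"status": "completed", "environment": "corporate", "findings": 3},
    {"status": "completed", "environment": "cloud", "findings": 1},
    {"status": "backlogged", "environment": "corporate", "findings": 0},
]

completed = [h for h in hunts if h["status"] == "completed"]

# A few of the OMs described above, computed from the records.
oms = {
    "submitted_hunts": len(hunts),
    "backlogged_hunts": sum(1 for h in hunts if h["status"] == "backlogged"),
    "completed_hunts": len(completed),
    "completed_by_environment": Counter(h["environment"] for h in completed),
    "hunt_findings": sum(h["findings"] for h in completed),
}
```

The same pattern extends to the other mappings (kill chain, dataset, technique, finding type) by adding a field per dimension and another `Counter` per mapping.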

As the hunting program lead, these OMs help me in two areas:

They ensure that the program is staying true to our principles and mission

They help identify bias in our program and allow us to question why our operations are producing specific outcomes

These are the KPIs we measure (consider their relevance to our core principles):

Number of Reported Hunt Findings (detections tracked as KPI)

Completed Hunts Mapped to the ATT&CK framework

Percentage of Non-Hunters Participating in Hunts

The main thing to take away from our KPIs is that we keep them light: we only care about measuring our success against our core principles and mission. It doesn’t matter how many hunts we complete as long as we are producing relevant, impactful findings; it doesn’t matter how people hunt as long as they are participating and improving their analytical skills.

Like the OMs, the KPIs are updated in real time and accessible to everyone. Notably, we do not track a KPI for intelligence-driven hunts; when we started the program we had a metric for this, but we quickly found it wasn’t worth the overhead of tracking because our hunts were always intelligence-driven.
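As an illustration of the participation KPI, the sketch below computes the percentage of non-hunters who have joined at least one hunt. The roster, role names, and participation set are invented for the example.

```python
# Hypothetical team roster and participation records; all names are illustrative.
team = {"alice": "hunter", "bob": "analyst", "carol": "engineer", "dave": "analyst"}
participants = {"alice", "bob", "carol"}  # joined at least one hunt

# KPI: percentage of non-hunters participating in hunts.
non_hunters = {name for name, role in team.items() if role != "hunter"}
participating_non_hunters = non_hunters & participants
pct_non_hunter_participation = 100 * len(participating_non_hunters) / len(non_hunters)
```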

Making Metrics Accessible

Earlier I said that everyone can see our metrics, so I’ll share a heavily redacted version of our metrics dashboards.

The screenshot below shows our “thousand-foot view” of the hunting program; it contains the majority of our OMs and KPIs: