Security Operations: Moving to a Narrative-Driven Model

An Alert-based Security Operations Model Does Not Scale to Meet Today’s Complex Threat Landscape and Enterprise Demands.

The current security operations model is an alert-driven one. Alerts are generated by various technologies and then sent to the work queue. Analysts and incident responders proceed through the work queue according to priority. Analysis and forensics are performed manually, to the extent required, for each alert, which allows a more complete picture of what occurred -- the narrative -- to be built around the alert.

Alerts contain a snapshot of a moment in time, while narratives tell the story of what unfolded over a period of time -- the attack kill chain. Ultimately, only the narrative provides the required context and detail to allow an organization to make an educated decision regarding whether or not incident response is required, and if so, at what level.

The incident handling life cycle contains six stages: detection, analysis, containment, remediation, recovery, and lessons learned.  

When performing incident response, an organization will proceed through all six stages by following its incident response process. Although all six stages are important, when an enterprise is attacked, the highest priority quickly becomes moving rapidly from detection to containment.

Timely and accurate detection is a challenge in its own right, but since it is outside the scope of this column, I will set it aside for the moment. For the purposes of this column, let’s assume that good detection is in place, or that a third party has notified the organization that a breach has occurred. The organization needs to move quickly to understand what occurred, when it occurred, how it occurred, what has been impacted, what information has been taken, and what requires containment.

The analysis stage provides the information required to answer these and other important incident response questions.

The enterprise faces two main obstacles that prevent it from proceeding rapidly from detection, through analysis, and on to containment:

1) Alerting technologies are too noisy and show too little context, preventing enterprises from properly understanding which alerts to focus on and in what context they fired.

2) Forensics technologies perform too slowly to allow enterprises to rapidly assemble a detailed picture of the narrative and identify what needs to be contained.

Addressing these issues will require changing the way we think about the security operations model. The security community needs a paradigm shift from an alert-driven security operations model to a narrative-driven security operations model. In other words, analysts need to be presented with complete narratives in their work queue, rather than alerts. This is, of course, easier said than done. How would one go about this paradigm shift? There are likely several approaches, but one approach involves the fusion of network, endpoint, and malware data.

There are typically three families of data most relevant to security operations -- network, endpoint, and malware. Each one provides its own perspective and visibility into activity within the organization, and each one provides unique insight and adds value to the others.

For example, consider the scenario where malware is detected entering an organization. The malware can be analyzed to understand the details of its network and endpoint behavior. That information can subsequently be leveraged against the network and endpoint data. The network data can be studied to understand which endpoint(s) the malware was destined for, whether additional malicious activity consistent with the malware analysis is present, and whether any data was taken from the organization. The endpoint data can be analyzed to understand whether the malware successfully infected the system, whether it was able to persist, and what activity was observed on the endpoint during the infection.

The network and endpoint data may also uncover additional malware that can subsequently be analyzed to identify additional network and endpoint clues. As we see, there is a virtuous feedback loop here that feeds upon itself. The more we learn, the more we find. The more we find, the sooner we are able to respond -- before serious damage has occurred.
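
To make the feedback loop concrete, here is a minimal Python sketch of how it might be automated. Everything in it is an assumption made for illustration: the record layouts, the field names (`domains`, `host`, `file_hash`), the `build_narrative` function, and the sample data are hypothetical and do not come from any particular product. It starts from one detected sample, pivots into network data to find affected endpoints, pivots into endpoint data to find further known-malicious files, and repeats until nothing new turns up.

```python
from dataclasses import dataclass, field


@dataclass
class Narrative:
    """Accumulated story around one initial detection (hypothetical structure)."""
    malware_hashes: set = field(default_factory=set)
    endpoints: set = field(default_factory=set)
    events: set = field(default_factory=set)  # (time, source, host, detail)


def build_narrative(seed_hash, malware_reports, network_log, endpoint_log):
    """Pivot between malware, network, and endpoint data until nothing new is found."""
    narrative = Narrative()
    pending = [seed_hash]
    while pending:
        sample = pending.pop()
        if sample in narrative.malware_hashes:
            continue  # this sample was already analyzed
        narrative.malware_hashes.add(sample)
        domains = set(malware_reports.get(sample, {}).get("domains", []))

        # Network view: which endpoints contacted infrastructure tied to this sample?
        for rec in network_log:
            if rec["domain"] in domains:
                narrative.endpoints.add(rec["host"])
                narrative.events.add((rec["time"], "network", rec["host"], rec["domain"]))

        # Endpoint view: which known-malicious files appear on the affected endpoints?
        for rec in endpoint_log:
            if rec["host"] in narrative.endpoints and rec["file_hash"] in malware_reports:
                narrative.events.add((rec["time"], "endpoint", rec["host"], rec["file_hash"]))
                if rec["file_hash"] not in narrative.malware_hashes:
                    pending.append(rec["file_hash"])  # newly found malware feeds the loop
    return narrative


# Hypothetical sample data: malware analysis results, network records, endpoint records.
malware_reports = {
    "hashA": {"domains": ["bad.example"]},
    "hashB": {"domains": ["worse.example"]},
}
network_log = [
    {"time": 1, "host": "pc-1", "domain": "bad.example"},
    {"time": 5, "host": "pc-2", "domain": "worse.example"},
    {"time": 6, "host": "pc-3", "domain": "news.example"},
]
endpoint_log = [
    {"time": 2, "host": "pc-1", "file_hash": "hashA"},
    {"time": 3, "host": "pc-1", "file_hash": "hashB"},
    {"time": 7, "host": "pc-3", "file_hash": "hashZ"},
]

narrative = build_narrative("hashA", malware_reports, network_log, endpoint_log)
timeline = sorted(narrative.events)  # events in time order: the story of what unfolded
```

In this toy data set, the first sample leads to one endpoint, that endpoint reveals a second sample, and the second sample leads to a second endpoint that the original alert alone would not have surfaced. A real implementation would need far richer indicators than domains and file hashes.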

If you think this sounds like something that can be dropped into a repeatable process, you are correct. In fact, most organizations with a mature security operations function already have a mature incident response process that they follow. Given that, it is surprising to me how much of this important work is still performed manually.

There are always incidents that do not fit the mold, of course, but if the vast majority of malicious code incidents are similar, why not take advantage of every efficiency possible? At a high level, what we are aiming to do when investigating an alert is to build the narrative around it. The narrative gives us a more complete picture of the story surrounding what happened. It allows us to make an educated decision regarding if and how to respond.

We have learned that in some of the breaches that have been reported in the media of late, alerts fired, but were not properly handled. There are undoubtedly many reasons why this may have happened, and in fact, I have discussed some of them on my personal blog.  

One of the reasons is that when alerts fire, they contain a snapshot of a moment in time. Analysts are forced to either make a split-second decision with little supporting evidence, or to spend precious cycles constructing the narrative. If the volume of alerts is too high, a sufficient narrative cannot be constructed for enough of them. Consequently, some true positives will be missed or overlooked.

If we change the paradigm and present the entire narrative (or at least most of it) to the work queue, the chance of a misjudgment decreases significantly. Do I expect this paradigm shift to occur overnight?  Certainly not. But, given my experience, I don’t see any other way that organizations can keep pace with threats and data volumes in the near future.
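
As a rough illustration of what a narrative-oriented work queue could look like, the Python sketch below groups individual alerts into per-host bundles and ranks the bundles rather than the alerts. The grouping rule (same host, gaps of at most one hour), the `group_alerts` function, and the alert fields are assumptions chosen for brevity. Grouping by host and time is only a crude stand-in for a true narrative, which would also draw on the network, endpoint, and malware data discussed earlier.

```python
from collections import defaultdict


def group_alerts(alerts, window_seconds=3600):
    """Bundle alerts per host; start a new bundle when the time gap exceeds the window."""
    by_host = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a["time"]):
        by_host[alert["host"]].append(alert)

    bundles = []
    for host, host_alerts in by_host.items():
        current = [host_alerts[0]]
        for alert in host_alerts[1:]:
            if alert["time"] - current[-1]["time"] <= window_seconds:
                current.append(alert)
            else:
                bundles.append({"host": host, "alerts": current})
                current = [alert]
        bundles.append({"host": host, "alerts": current})

    # Rank bundles by their most severe alert, highest first.
    bundles.sort(key=lambda b: max(a["severity"] for a in b["alerts"]), reverse=True)
    return bundles


# Hypothetical alerts; times are seconds since an arbitrary start.
alerts = [
    {"time": 0, "host": "pc-1", "severity": 2, "name": "suspicious download"},
    {"time": 600, "host": "pc-1", "severity": 5, "name": "beacon to known bad host"},
    {"time": 90000, "host": "pc-1", "severity": 1, "name": "policy violation"},
    {"time": 100, "host": "pc-2", "severity": 3, "name": "port scan"},
]

queue = group_alerts(alerts)
```

Here four alerts become three queue items, and the analyst who opens the first item sees the download and the later beacon on the same host side by side instead of as two unrelated snapshots.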

With today’s volume of data and the sophistication of attacks, enterprises face the challenge of rapidly reconstructing the narrative surrounding an alert, quickly assessing damage, and responding appropriately within minutes, rather than hours or days. The alert-based security operations model does not scale to meet today’s demands, and it will certainly not scale to meet tomorrow’s. Although it will take some effort as a community, shifting the paradigm to the narrative-driven model will help meet the challenges of today and tomorrow head-on.

Joshua Goldfarb (Twitter: @ananalytical) is an experienced information security leader with broad experience building and running Security Operations Centers (SOCs). Josh is currently Co-Founder and Chief Product Officer at IDRRA. Prior to joining IDRRA, Josh served as VP, CTO - Emerging Technologies at FireEye and as Chief Security Officer for nPulse Technologies until its acquisition by FireEye. Prior to joining nPulse, Josh worked as an independent consultant, applying his analytical methodology to help enterprises build and enhance their network traffic analysis, security operations, and incident response capabilities to improve their information security postures. He has consulted and advised numerous clients in both the public and private sectors at strategic and tactical levels. Earlier in his career, Josh served as the Chief of Analysis for the United States Computer Emergency Readiness Team (US-CERT) where he built from the ground up and subsequently ran the network, endpoint, and malware analysis/forensics capabilities for US-CERT.