Many Security Operations Centers (SOCs) find themselves overwhelmed by large volumes of false positives, non-actionable alerts, and noise. People often ask me how they can address this situation and improve their signal-to-noise ratio.
A full answer to this question would require a longer discussion than this piece permits. Further, the discussion is somewhat complex, and much of it depends on each individual organization’s circumstances.
On my personal blog, I have written about various aspects of this discussion. In this piece, I will elaborate on one particular aspect that I have only briefly touched on in the past.
Despite the complexity of this topic, there is an interesting approach that many organizations can consider to help address this issue. This may sound radical, but it would be helpful to many organizations to turn off and throw out the default rule set on their alerting technologies and within their SIEM.
Not all security technologies are alert driven, but for those that are, there is huge potential value in turning off the default rule set. How could I possibly suggest this? Allow me to explain.
To begin with, let’s examine what a default rule set is. Most alerting technologies, and in fact, most SIEMs as well, ship with a default set of rules designed to produce some alerting right out of the box. The intention here is good – to provide customers with something they can use right out of the box without needing to develop specialized expertise. What is the issue with that? There are many issues, but chief among them is that those default rules were not designed for the specific risks and threats to your organization. Each organization is unique, and each organization will have unique risks and threats to identify, contemplate, address, and manage. Are there specific rules that ship with products that match some of your organization’s needs? Absolutely, and those rules should be retained selectively.
Herein lies the rub. The unintended consequence of the noble intention of the default rule set is a storm of false positives. Retaining the default rule set, or even certain “profiles” of rule sets that were not designed for the threats and risks specific to your organization, generally leads to a deluge of alerts.
Without the context of your organization-specific requirements, those rules don’t stand much of a chance of producing reliable, actionable, high-fidelity alerting. Many, if not most, of the alerts turn out to be non-actionable, and with alert volumes in the hundreds of thousands or millions per day, most of them are never looked at. Remind me again what the purpose of having an alert is if no one ever looks at it? There really is none. It defeats the purpose of the alert entirely. Further, those non-actionable alerts drown out the signal we are interested in: the true positives. The signal-to-noise ratio simply drops too low.
If I turn off all (or most) of the default rules, how do I then build a rule set to drive my alerting? That is an excellent question, but another question comes first. Before we can dive into any technology and begin building a solution, we need to understand what problem we are solving. How do we do that? We first need to take a step back to assess and understand the risks and threats to the organization and use that information to build a series of use cases. There may be many use cases, and that’s okay. Nowhere is it written that we are only permitted to have a certain number of rules. What matters more is the volume (reasonable) and quality (high) of the alerting they produce. If the organization cannot assess all of the risks and threats that confront it, outside help can be enlisted to assist with this assessment. It is a worthwhile investment when done correctly.
Once the scope of the problem is better understood, the organization can begin to prioritize the risks and threats that it would like to address first. From there, the organization works through each use case, identifies the appropriate data sources required for visibility into the use case, and creates logic that will identify the use case and alert when it is encountered. This logic should be thoroughly tested and tuned to ensure that false positive (noise) rates are low, while ensuring that true positives (signal) are identified in a timely manner. The logic can and should mature and be tuned over time as attackers modify their tactics or if false positive rates begin to increase.
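As a rough illustration of the testing and tuning step, the sketch below replays a handful of labeled historical events through one candidate rule and counts true positives, false positives, and misses. The event schema, the `many_failed_logins` rule, the threshold, and the sample data are all hypothetical and exist only to show the idea. Real rule logic would be written in the query language of your alerting technology or SIEM.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class LabeledEvent:
    """A historical event with an analyst verdict attached (hypothetical schema)."""
    source: str
    failed_logins: int
    malicious: bool  # ground truth assigned by an analyst during review


def many_failed_logins(event: LabeledEvent, threshold: int = 10) -> bool:
    """Hypothetical rule: alert when failed logins reach the threshold."""
    return event.failed_logins >= threshold


def evaluate(rule: Callable[[LabeledEvent], bool],
             events: Iterable[LabeledEvent]) -> dict:
    """Replay labeled events through a rule and count the outcomes."""
    true_pos = false_pos = missed = 0
    for event in events:
        fired = rule(event)
        if fired and event.malicious:
            true_pos += 1
        elif fired:
            false_pos += 1
        elif event.malicious:
            missed += 1
    alerts = true_pos + false_pos
    # Share of alerts that were worth an analyst's time; None if nothing fired.
    precision: Optional[float] = true_pos / alerts if alerts else None
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "missed": missed,
        "precision": precision,
    }


# Made-up sample data, for illustration only.
sample = [
    LabeledEvent("vpn", 25, True),
    LabeledEvent("vpn", 12, False),
    LabeledEvent("mail", 3, False),
    LabeledEvent("mail", 8, True),
]

result = evaluate(many_failed_logins, sample)
print(result)
```

Rerunning the same evaluation with a different threshold, or after the labeled sample has been refreshed, is one simple way to notice when a rule's false positive rate starts to drift.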
Of course, building a good rule set to supply the work queue with reliable, actionable, high-fidelity alerting is eternally a work in progress. Threats, risks, data sources, technologies, and business requirements will continually change and will continually drive an ever-evolving set of use cases. Because of this, it is necessary to revisit this process in a cyclical manner. This goes both for the creation of new rules and for revisiting rules that have already been created. As a part of this, it is helpful to keep an organized inventory of rules, with the following information (along with any other relevant information) recorded for each rule:
• Rule name
• Incident category
• Pseudo-code logic
• What use case the rule addresses
• What data sources the rule relies on
• Links to corresponding incident handling/incident response processes
• Notes, including a log of modifications that were made to the rule
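One way to keep such an inventory is as structured records rather than free text. The following is a minimal sketch, assuming the inventory lives in code or can be exported to it. The `RuleRecord` class, its field names, and the example values (including the playbook reference) are hypothetical and simply mirror the list above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RuleRecord:
    """One inventory entry per alerting rule (hypothetical structure)."""
    name: str
    incident_category: str
    pseudo_code: str
    use_case: str
    data_sources: List[str]
    response_process_links: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)  # includes the modification log

    def log_change(self, note: str) -> None:
        """Append an entry to the rule's modification log."""
        self.notes.append(note)


# Illustrative entry; every value here is a placeholder.
rule = RuleRecord(
    name="Excessive failed VPN logins",
    incident_category="Credential abuse",
    pseudo_code="failed_logins >= 10 for one account within 5 minutes",
    use_case="Detect password guessing against remote access",
    data_sources=["vpn_logs"],
    response_process_links=["<link to the relevant incident response playbook>"],
)
rule.log_change("Raised threshold from 5 to 10 after a false positive review")

print(rule.name, rule.notes)
```

A spreadsheet or wiki table with the same columns would serve equally well. What matters is that every rule has an entry and that the entry is kept current.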
This documentation not only helps to track what alerts are configured and why, but also assists with metrics and reporting. For example, from time to time, management may wish to understand the quantity of rules for each incident category. Or, alternatively, management may wish to understand the true positive and false positive volumes per data source or per use case. With proper documentation of the rule set, these questions are straightforward to answer. Without that documentation, these questions become much more difficult to address.
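To illustrate how a documented rule set can answer these questions, the sketch below counts rules per incident category and tallies true and false positives per data source. The rule entries, alert verdicts, and field names are invented for the example and are not drawn from any real product or deployment.

```python
from collections import Counter, defaultdict

# Hypothetical inventory entries, reduced to the fields needed for reporting.
rules = [
    {"name": "Excessive failed VPN logins",
     "incident_category": "Credential abuse",
     "data_sources": ["vpn_logs"]},
    {"name": "Login from new country",
     "incident_category": "Credential abuse",
     "data_sources": ["vpn_logs", "mail_logs"]},
    {"name": "Outbound traffic to rare domain",
     "incident_category": "Command and control",
     "data_sources": ["proxy_logs"]},
]

# Hypothetical analyst verdicts: (rule name, outcome) for each closed alert.
alert_outcomes = [
    ("Excessive failed VPN logins", "true_positive"),
    ("Excessive failed VPN logins", "false_positive"),
    ("Login from new country", "false_positive"),
    ("Outbound traffic to rare domain", "true_positive"),
]

# How many rules exist for each incident category?
rules_per_category = Counter(r["incident_category"] for r in rules)

# True and false positive volumes per data source. An alert counts toward
# every data source that its rule relies on.
sources_by_rule = {r["name"]: r["data_sources"] for r in rules}
outcomes_per_source = defaultdict(Counter)
for rule_name, verdict in alert_outcomes:
    for source in sources_by_rule[rule_name]:
        outcomes_per_source[source][verdict] += 1

print(dict(rules_per_category))
for source, verdicts in sorted(outcomes_per_source.items()):
    print(source, dict(verdicts))
```

The same grouping could be done per use case by adding that field to each entry and keying on it instead of the data source.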
The result of this process will be a far more finely tuned, reliable, customized, and well-documented rule set than what came out of the box by default. It sounds radical, but throwing out the default rule set may be one of the best things you can do to improve and mature your security operations program. If you find yourself grappling with a low signal-to-noise ratio and high volumes of false positives, consider this approach to help address that issue.