Don't Become a Cybersecurity Data Pack Rat

Enterprise Security Teams Must Think More About How to Reduce Big Data Into Real-time Answers

Security teams are always looking for new and efficient ways to find threats, and the emerging field of security analytics is proving to be one of the most promising areas of innovation. Security analytics encompasses a wide range of analytical techniques which can be performed on an equally diverse set of data sources, such as network traffic, host-based indicators, or virtually any type of event log.

In many ways this description sounds like a version of big data analytics – the analysis of very large data sets to find unexpected correlations. However, while big data is obviously a powerful tool, it is not a silver bullet for every problem. When it comes to finding active attacks, too much data can actually overwhelm staff to the point that threats get lost in the noise. Without a clear notion of how to use the data, a big-data security analytics project can turn IT teams into the cybersecurity version of a pack rat, with data piled up to the point that it becomes unusable and paralyzes the organization.

A few lessons from the past and present

We don’t have to look back very far for a lesson on how more data doesn’t always mean more value. Since the 1990s, SIEM and log management vendors have posited that a central collection point for all enterprise logs could be used to answer virtually any enterprise question. And while SIEMs have obviously proven essential to many organizations, they have fallen well short of becoming the all-knowing oracle of IT. Organizations have learned the hard way that mountains of data don’t magically turn into insight.

Human expertise is typically at the heart of a successful SIEM project. Specialists are required in order to understand the different types of data and to build highly complex rules to interpret the data. Human analysts are typically required to ask the SIEM the right set of questions. This often leads to highly bespoke operations that can be very brittle and hard to change, and heavily dependent on human care and feeding. In short, simply collecting the data is the easy part. Making use of that mountain of data can be far more challenging. 

Security teams actually need data reduction

The big data approach to security analytics is poised to replicate many of the same problems that plagued SIEMs for years, albeit with much more data and, by extension, much more complexity. To avoid the pitfalls of the previous generation, we need to avoid the magical thinking that says, “if we just collect enough data, the answer will reveal itself.” The burden of this thinking almost invariably falls on the shoulders of human analysts, who must sift through the many alerts and anomalies in search of the point that matters.

The fundamental issue is that the more data we collect, the greater the need for automated data reduction. By data reduction I mean the ability to quickly reduce the many figurative haystacks down to the few points that matter. Today, we are creating a situation where the generation of haystacks is automated, but the process of finding the needles remains manual. Staff can spend all of their time investigating events that are “unusual” but may not be actual threats. This can lead to the pack rat scenario where everything is kept in the hope that it will be useful, but the resulting pile actually makes normal operation impossible.
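To make the idea of automated data reduction concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any product: the event types, the weights, the `reduce_events` function and the `min_score` threshold are all hypothetical. The point is simply that many raw events are collapsed into a short, ranked list of hosts worth an analyst's time.

```python
from collections import defaultdict

# Hypothetical severity weights per event type (illustrative values only).
WEIGHTS = {"port_scan": 1, "odd_login_time": 1, "c2_beacon": 5, "data_exfil": 8}


def reduce_events(events, top_n=3, min_score=5):
    """Collapse raw (host, event_type) pairs into a short ranked shortlist."""
    scores = defaultdict(int)
    kinds = defaultdict(set)
    for host, kind in events:
        scores[host] += WEIGHTS.get(kind, 0)  # unknown event types add nothing
        kinds[host].add(kind)
    # Keep only hosts whose combined score crosses the threshold,
    # highest score first (ties broken by host name for stable output).
    ranked = sorted(
        (h for h in scores if scores[h] >= min_score),
        key=lambda h: (-scores[h], h),
    )
    return [(h, scores[h], sorted(kinds[h])) for h in ranked[:top_n]]


# Six "unusual" events across three hosts...
events = [
    ("10.0.0.5", "port_scan"),
    ("10.0.0.5", "c2_beacon"),
    ("10.0.0.5", "data_exfil"),
    ("10.0.0.9", "odd_login_time"),
    ("10.0.0.7", "port_scan"),
    ("10.0.0.7", "odd_login_time"),
]

# ...reduced to the hosts that cross the (hypothetical) threshold.
shortlist = reduce_events(events)
print(shortlist)
```

In this toy data set only one host accumulates enough correlated evidence to reach the shortlist; the isolated oddities on the other two hosts stay in the repository but do not demand an analyst's attention.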

As a result, security analytics projects need to be evaluated in terms of how the data gets turned into intelligence. How is analysis automated? When an issue is detected, how conclusive is it? How much additional investigation and verification is required by staff, and how much time does it take? Once again, collecting the data is relatively easy – the value of security analytics solutions will rest in how well they reduce that data into answers.

Of course, keeping a repository of all data is not a bad thing in itself. In fact, it can prove to be very valuable when used in a forensic context. In such a case, the security teams have a very good sense that something has gone wrong, and a complete data set can allow them to go spelunking for answers. However, this is a very different use case than proactively finding and stopping an active attack in progress. Both approaches have their place. But frankly, the industry is not lacking in forensic tools that can verify and analyze a known attack. The piece that most organizations are missing is the ability to reveal the attacks that they don’t already know about. This requires us to think more about how we reduce big data into real-time answers.

Wade Williamson is Director of Product Marketing at Vectra Networks. Prior to joining Vectra, he was a Senior Threat Researcher at Shape Security. He has extensive industry experience in intrusion prevention, malware analysis, and secure mobility. He also has extensive speaking experience, having delivered the keynote for the EICAR malware conference and led the Malware Researcher Peer Discussion at RSA. Prior to joining Shape, he was Sr. Security Analyst at Palo Alto Networks, where he led the monthly Threat Review Series and authored the Modern Malware Review. He has also led the product management team at AirMagnet, where he helped to develop a variety of security and network analysis tools targeted to WiFi networks. He has been a steady and active researcher of new threats and techniques used to compromise enterprise networks and end-users.