Solving security’s big data problem is about prioritized data flow, continuously processing data for analysis and translating and exporting it to create a single security infrastructure
Typically, when someone says “security is a big data problem” they’re referring to the overwhelming amount of internal threat and event data produced from sources like your SIEM, logs, ticketing and case management systems. The volume of alerts these sources generate causes many security professionals to suffer from “alert fatigue.” Compounding the fatigue are the millions of external threat data points that analysts are bombarded with every day from the multiple sources they subscribe to – commercial, open source, government, industry, existing security vendors – as well as frameworks like MITRE ATT&CK.
And the problem is getting worse. As business models change, adversaries take advantage of new attack vectors – like IoT devices, operational technology (OT) and the multiple personal and work devices users now move between. They also leverage human vulnerabilities, impersonating trusted colleagues and third parties to infiltrate organizations. Layering new solutions and subscribing to more feeds in an attempt to close security gaps generates new types and formats of data to be collected, and in huge volumes.
Beyond alerts and feeds
However, that’s just one side of the big data problem – the data ingestion side, if you will. The other is the data export side. This aspect has garnered less attention because we typically don’t think through how data from a feed or solution works with the systems and processes we have in place. Case in point: you want a threat data feed to help you understand how threat actors operate and what to look for within your environment, but how are you going to put it to use? Apply threat data directly to the SIEM and the result is many false positives.
Another example: Security Orchestration, Automation and Response (SOAR) platforms and tools have emerged to accelerate response by automating processes. But you can’t just focus on defining a process and automating the steps needed to complete it. You also need to determine the right criteria and triggers for the process. And in a dynamic and variable environment, the operational reality is that you need to continuously ensure you have the right data to focus on what really matters to your organization, and the right processes to take the right actions, faster. To truly address SOAR use cases, we need to move from a process-driven to a data-driven approach that prioritizes data and connects systems with that data. Automating and orchestrating noisy data just amplifies the noise.
The latest newcomer to the security arsenal, Extended Detection and Response (XDR), is gaining a lot of traction as a way to enable detection and response across the enterprise. XDR requires all tools and all teams to work in concert, but the challenge is that organizations typically protect themselves using many different security technologies, in the cloud and on-premises, from different vendors. Not to mention all the third-party data and intelligence sources they connect to for context. Silos make it extremely difficult to share data between tools or teams in any real way and end up creating an obstacle course for the attacker. A movement around open XDR promotes an open and extensible architecture focused on enabling integration and data flow across the infrastructure for prevention, detection and response.
It’s about prioritized data flow
Security teams are indeed grappling with a big data problem that checks the classic “4 Vs”: massive volume, variety, velocity and veracity of data to be ingested and exported. To solve this big data problem, we need a data-driven approach to security operations. With a platform that can get data in different formats and languages from different vendors, systems and sources to work together, we can create a continual, meaningful and usable data flow, as follows:
• The flow starts by ingesting, normalizing and correlating data to identify relationships and enrich the data with context.
• Then the data must be prioritized, ideally in an automated fashion, to reduce noise and allow security analysts to focus on what matters most to their organization.
• Now that data can be acted on. To do this, the data needs to be translated back into a usable format and language for consumption by the tools and teams that need to utilize it.
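To make the three steps concrete, here is a minimal Python sketch of such a flow. Everything in it is an illustrative assumption rather than any vendor's implementation: the two feed formats, the field names, the `INTERNAL_SIGHTINGS` set, the scoring weights and the comma-separated export format are all hypothetical.

```python
from collections import defaultdict

# Hypothetical raw records from two sources that use different field names.
FEED_A = [{"ioc": "203.0.113.7", "confidence": 80, "source": "commercial"}]
FEED_B = [
    {"indicator_value": "203.0.113.7", "score": 0.6, "provider": "open-source"},
    {"indicator_value": "198.51.100.9", "score": 0.3, "provider": "open-source"},
]
# Assumed: indicators that have already appeared in internal logs or cases.
INTERNAL_SIGHTINGS = {"203.0.113.7"}


def normalize(record):
    """Step 1a: map a source-specific record onto one common schema (confidence 0-100)."""
    if "ioc" in record:
        return {"indicator": record["ioc"],
                "confidence": record["confidence"],
                "source": record["source"]}
    return {"indicator": record["indicator_value"],
            "confidence": round(record["score"] * 100),
            "source": record["provider"]}


def correlate(records):
    """Step 1b: group normalized records that describe the same indicator."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["indicator"]].append(rec)
    return grouped


def prioritize(grouped, sightings):
    """Step 2: score each indicator; the weights here are arbitrary placeholders."""
    ranked = []
    for indicator, recs in grouped.items():
        score = max(r["confidence"] for r in recs)
        score += 10 * (len(recs) - 1)   # corroborated by more than one source
        if indicator in sightings:
            score += 20                 # relevant to this organization's environment
        ranked.append({"indicator": indicator,
                       "priority": min(score, 100),
                       "sources": sorted(r["source"] for r in recs)})
    return sorted(ranked, key=lambda item: item["priority"], reverse=True)


def export_for_siem(ranked, threshold=50):
    """Step 3: translate high-priority items into a (hypothetical) watchlist line format."""
    return [f'{item["indicator"]},{item["priority"]}'
            for item in ranked if item["priority"] >= threshold]


records = [normalize(r) for r in FEED_A + FEED_B]
ranked = prioritize(correlate(records), INTERNAL_SIGHTINGS)
print(export_for_siem(ranked))
```

In this sketch only the indicator that is corroborated by two sources and already sighted internally clears the threshold, so the low-confidence, unsighted one never reaches the SIEM. In practice the scoring criteria would be set by each organization according to what matters most to it.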
Security is a big data problem. Solving it is all about prioritized data flow, continuously processing data for analysis and translating and exporting it to create a single security infrastructure. A data-driven approach to security operations is the only way to truly close security gaps with an integrated defense, and not just create yet another, albeit more difficult, obstacle course for attackers to navigate.