Beyond the Hype of Data Science

With RSA Conference on the horizon, odds are that if you make it to the exhibit floor, you will hear a lot about data science and machine learning.

Wade Williamson

February 22, 2016

Security vendors old and new are touting the powers of data science to solve security problems. And while these technologies have real value, the terms are rapidly becoming empty marketing buzzwords.

To keep our collective heads above water, it is important to understand the realities behind these technologies so we can separate the truth from the hype and make well-informed security decisions.

A quick intro to data science and why it matters

The world of data science can be hard to navigate, not simply because it involves lots of hard math, but also because it spans an enormously broad set of disciplines. Data science is concerned with the many ways that knowledge can be extracted from data, drawing on mathematics, statistics, machine learning, and a variety of analytics, to name just a few.

A subset of data science, machine learning enables software to iteratively learn from data and adapt without being explicitly programmed. For example, machine learning can reveal low-level traits that command-and-control messages have in common, or flag unusual employee behavior that may signal an impending data theft. These characteristics might be unknown beforehand, but machine-learning models can learn to recognize them from the data.

These examples illustrate critical concepts that make data science and machine learning important to security professionals.

First, the intelligence we extract from very large data sets tends to be fairly long-lived. Instead of chasing every URL a command-and-control server uses, we can learn its core underlying behavior and recognize it wherever it goes. This allows our security detections to stay well ahead of attackers.
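
As a rough illustration of learning behavior rather than chasing indicators, the sketch below scores how regular the gaps between a host's outbound connections are. Regular "check-in" timing is one commonly cited trait of command-and-control traffic, and it does not change when the server moves to a new URL. The function names, the 0.1 threshold, and the sample timestamps are all hypothetical, chosen only for the example; a real detection model would weigh many more traits than timing.

```python
from statistics import mean, pstdev

def interval_regularity(timestamps):
    """Return the coefficient of variation of the gaps between
    consecutive connections (lower means more regular), or None
    if there is too little data to judge."""
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if len(gaps) < 2:
        return None
    avg = mean(gaps)
    if avg == 0:
        return None
    return pstdev(gaps) / avg

def looks_like_beacon(timestamps, max_cv=0.1, min_events=5):
    """Flag connection patterns whose timing is suspiciously regular.
    max_cv and min_events are illustrative values, not tuned ones."""
    if len(timestamps) < min_events:
        return False
    cv = interval_regularity(timestamps)
    return cv is not None and cv <= max_cv

# Hypothetical connection times, in seconds, for two hosts.
periodic_host = [0, 60, 120, 181, 240, 300]   # roughly once a minute
browsing_host = [0, 5, 7, 90, 400, 410]       # bursty, human-driven

print(looks_like_beacon(periodic_host))
print(looks_like_beacon(browsing_host))
```

Nothing in this check refers to a domain, IP address, or URL, which is the point: the trait being tested belongs to the behavior, not to any one piece of attacker infrastructure.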

Second, machine learning extends intelligence to the local environment. An intelligence feed will never be able to tell you when one of your employees starts behaving abnormally. It’s the sort of thing that must be learned locally, and is often the essential context needed to find a live threat.
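
A minimal sketch of what "learned locally" can mean in practice: each user is compared against his or her own history, so the same observation can be routine for one person and alarming for another. The z-score test, the threshold of 3.0, and the upload figures below are assumptions made for illustration, not a description of any particular product.

```python
from statistics import mean, stdev

def is_abnormal(history, today, threshold=3.0):
    """Compare today's value with this user's own baseline.
    Returns True when it sits more than `threshold` standard
    deviations from the user's historical mean."""
    if len(history) < 2:
        return False  # not enough local history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > threshold

# Hypothetical daily upload volumes in MB for two employees.
analyst_history = [100, 120, 90, 110, 105]
backup_admin_history = [1800, 2100, 1900, 2200, 2000]

# The same 2000 MB day, judged against each person's own baseline.
print(is_abnormal(analyst_history, 2000))
print(is_abnormal(backup_admin_history, 2000))
```

An external intelligence feed has no way of knowing either baseline, which is why this kind of context has to come from the local environment.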

Not all data is equal

Data science models inherently depend on the quality of data they consume. The better the data, the more you will be able to learn. An entire industry has been spawned by analyzing logs and events generated by other systems. While this approach may help connect the dots between observed events, it rarely finds hidden threats that go undetected in the first place.

By nature, logs are a secondary source of data that briefly summarize an event. Information that is not contained in the log is lost and unavailable for further analysis. Equally important, logs are only as good as the systems that generated them. If an upstream firewall or security device fails to detect a threat, there will be no log.

This is a fundamental issue. It is the job of a cybersecurity solution to detect threats that slip by standard layers of defense. Data science and machine learning can be applied to any data source, not just log data. Direct analysis of traffic, files or devices allows us to detect what was previously invisible.

Focus on answers, not data

Having looked at the inputs to a data science detection model, we can now turn our attention to what these models actually deliver. And this is where things can get a little dicey if you’re not careful.

While the promises may sound enticing, the vast majority of security and analytics solutions require a significant amount of human effort and attention in order to deliver value. Needless to say, most security organizations don’t have the luxury of extra time or staff.

As a precaution, be sure to evaluate whether a prospective solution makes life easier or harder for your staff. Many products generate mountains of anomalies that require a human analyst to investigate, and this bottleneck will severely limit the real-world value you get from the product.

It is critically important for security products to actually deliver high-confidence detections and answers. Of course, analysts will always need solid evidence to validate that a threat is real. But this should be an effort of verification and not require the analyst to do the heavy lifting of intensive analysis and diagnosis.

These are, of course, not the only factors to consider when evaluating data science and machine learning solutions. However, they can provide some context to cut through the hype and find the data science solutions that are most likely to deliver real value.
