Endpoint Security

Machine Learning CrowdStrike Joins VirusTotal

Kevin Townsend

Published

August 26, 2016

On May 4, VirusTotal (VT) made two specific changes to its policies that were at the time seen as particularly aimed at the next-gen (machine-learning, signature-less) endpoint security vendors. The first required all companies wishing to make full use of the VT API to register their detection engines with the public-facing VT interface, while the second effectively insisted on membership of the Anti-Malware Testing Standards Organization (AMTSO). CrowdStrike has now become the first next-gen vendor to comply with both requirements.

The move by Google-owned VT was welcomed by the original signature-based anti-malware industry, which felt it was being unfairly treated by aggressive next-gen marketing. That marketing was often based on public VT results and wrongly claimed that the traditional vendors are simply blacklist signature detection systems.

There are some signs that this aggressive marketing by the next-gen vendors is being toned down, but it continues today. Nevertheless, this move by CrowdStrike is a positive sign that at least one next-gen vendor is willing to integrate into the overall anti-malware market for the benefit of all users.

By ‘joining’ VirusTotal, CrowdStrike has committed to the VT policies, which include “VirusTotal should not be used to generate comparative metrics between different antivirus products” and “VirusTotal should not be used as deceptive means to discredit or to validate claims for or against a legitimate participant in the anti-malware industry.”

“We are the only machine learning signature-less vendor right now. We expect and encourage others to join,” CrowdStrike’s chief scientist Dr. Sven Krasser told SecurityWeek. “We want to work with the community to contribute to community standards. As a vendor offering next-gen AV solutions and advanced threat prevention, we should (and we are) also granting access to our data.”

A second criticism from the traditional anti-malware vendors is that next-gen vendors have been reluctant to submit their products to independent third-party testing. CrowdStrike is also leading a change in attitude over testing.

Simon Edwards, director of third-party testing company SE Labs, told SecurityWeek, “Since the beginning of the year I’ve noticed a much greater interest in testing coming from these companies.” CrowdStrike is an example of this: it submitted itself for testing by SE Labs under AMTSO guidelines in July 2016. It did rather well, achieving 100% malware detection for both known and unknown samples, with a 0% false positive rate.

All of this raises a major question: if a next-gen endpoint security vendor can integrate its machine learning detection system into VT, why can’t the traditional vendors do the same? After all, traditional anti-malware companies have employed machine learning techniques for many years.

The answer would seem to be that traditional vendors employ machine learning to train logic bundles that run on the client system. These bundles are designed, said F-Secure’s Andrew Patel, “to detect suspiciousness based on the structure of a file or its behavior.” The logic bundles are then delivered to the client by regular updates, and it is this process that cannot easily be replicated on VirusTotal. It’s “not only super-resource intensive,” said Patel, “it’s hell to maintain; especially when you consider that VT’s systems already contain over 50 products. Even if VT had the infrastructure available to do this for 300,000 samples times 50 vendors per day, they’d still need to hire people to maintain the environment and products.”

So the big difference between the two models is that next-gen vendors design the algorithms and turn them loose on the customer, while the traditional vendors keep the machine learning at the back end, largely, said one vendor, “because we believe that machine learning still requires a degree of human oversight.”

Early machine learning generated a large number of false positives, but it has improved dramatically over recent years (as shown by CrowdStrike’s 0% false positive result from SE Labs). It might be time for the traditional vendors to overhaul their marketing philosophy. CrowdStrike has agreed not to use VT results to promote its own ‘scores’ above other VT results (because that is misleading). But consumers have always made such comparisons themselves, and they will continue to do so. CrowdStrike is effectively playing by the traditional vendors’ rules to its own advantage.

CrowdStrike was involved in the incident response effort following the DNC hack, and discovered evidence of two separate Russian intelligence-gathering actors: CozyDuke and Fancy Bear.