Security Researchers Have Proposed a New and Effective Way to Detect Credential Spearphishing Attacks in the Enterprise
A new research paper, ‘Detecting Credential Spearphishing Attacks in Enterprise Settings’, was awarded the Facebook Internet Defense Prize at the 26th USENIX Security Symposium in Vancouver, BC, August 16-18, 2017. The paper proposes and evaluates a methodology for effectively detecting credential spear phishing attacks in corporate networks while achieving a very low number of false positives.
The paper (PDF) was authored by Grant Ho, University of California, Berkeley; Aashish Sharma, Lawrence Berkeley National Laboratory; Mobin Javed, University of California, Berkeley and International Computer Science Institute; and Professor Vern Paxson, University of California, Berkeley and International Computer Science Institute.
The paper is important because it promises an effective mitigation for one of cybersecurity’s most pernicious threats: credential spear phishing. Spearphishing that relies on a malicious attachment carries a payload that increasingly sophisticated security controls can look for and detect; a credential spear phishing email contains nothing but a link to a URL that probably has a good reputation.
Credential spear phishing is, furthermore, an issue that does not lend itself to a machine learning (ML) solution — the difficulty is that there are too few known anomalies in any given dataset from which the algorithm can successfully learn. Since machine learning ‘learns’ from past behaviors, any previously unseen attacker is to some extent invisible to machine learning algorithms because there is no prior history from which to learn.
Using a dataset of 4 years of emails — about 370 million — supplied by the Lawrence Berkeley National Laboratory (LBNL), the researchers first analyzed the different stages of an attack, and then developed a new anomaly detection technique called DAS. The dataset used contained 19 known spearphishing campaigns.
“With such a small number of known spearphishing instances, standard machine learning approaches seem unlikely to succeed: the training set is too small and the class imbalance too extreme,” notes the paper.
But by breaking down the taxonomy of credential spear phishing, the researchers demonstrate that enterprises can develop their own form of reputation monitoring from enterprise traffic monitoring. For example, a spearphishing email might seek to persuade a user to visit a particular URL.
The researchers suggest that the traditional reputation of a URL is not as important as how often, if ever, this user (or all users) have visited the same URL. Such information is usually already available from existing controls, such as network intrusion detection systems (NIDS). Because this approach works with data that is generally already available, certainly within an enterprise environment, implementing DAS should not be overly expensive. “Our work draws on the SMTP logs, NIDS logs, and LDAP logs from LBNL,” point out the researchers.
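The idea of building reputation from an enterprise’s own traffic can be illustrated with a minimal sketch. It is not the paper’s implementation: the `Visit` record, its `user` and `domain` fields, and the `VisitHistory` class are hypothetical names chosen for illustration, and the sketch assumes that visit records have already been extracted from logs such as those a NIDS produces.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class Visit(NamedTuple):
    user: str    # hypothetical field: who made the request
    domain: str  # hypothetical field: host name taken from the requested URL


class VisitHistory:
    """Counts prior visits per domain, overall and per user."""

    def __init__(self, visits: Iterable[Visit] = ()) -> None:
        self.by_domain: Counter = Counter()
        self.by_user_domain: Counter = Counter()
        for visit in visits:
            self.record(visit)

    def record(self, visit: Visit) -> None:
        """Add one observed visit to the history."""
        self.by_domain[visit.domain] += 1
        self.by_user_domain[(visit.user, visit.domain)] += 1

    def familiarity(self, user: str, domain: str) -> Tuple[int, int]:
        """Return (visits by anyone, visits by this user) seen so far."""
        return self.by_domain[domain], self.by_user_domain[(user, domain)]


history = VisitHistory([
    Visit("alice", "intranet.example.org"),
    Visit("bob", "intranet.example.org"),
    Visit("alice", "intranet.example.org"),
])

# A domain the enterprise visits routinely versus one nobody has visited.
print(history.familiarity("alice", "intranet.example.org"))  # (3, 2)
print(history.familiarity("alice", "login.example.net"))     # (0, 0)
```

A link in an email that points to a domain with a count of (0, 0) is not necessarily malicious, but under this sketch’s assumptions it is the kind of rarely seen destination that would merit a closer look.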
The bane of all anomaly detection systems is the number of false positives. While anomaly detection systems — especially those designed to detect malware — are frequently analyzed by third-party testing organizations, there are few statistics specifically around the false positive rate (FPR) for spear phishing detection.
SecurityWeek asked Simon Edwards, director at independent testing organization SE Labs Ltd (and chairman of the board at the Anti-Malware Testing Standards Organization), for his view. Although he had no relevant statistics, he has personal experience of the false positive problem in anomaly detection.
“Whereas you and I might expect a product to block an installation and alert the user,” Edwards told SecurityWeek, “what I’ve found is that the legitimate software appears to install correctly but then crashes or otherwise fails at some point in the future.”
He installed a new scanner on Windows 10. “Everything appeared to be fine until I tried to actually scan something. I received an error and no output. After a long time troubleshooting I eventually checked the logs of the next-gen product I had running on the same system. Lo and behold, a DLL had been quarantined. I marked it as clean, reinstalled and all was well again. Very annoying!”
This made him wonder what else had been quarantined. “The logs also showed all sorts of other legitimate components (rarely the full app) had been quarantined. Most of this was for the rubbish you see pre-installed on Lenovo laptops, so I’d not experienced any problems.” The point, however, is clear: false positives are a major problem for machine learning detectors.
Every one of the false positives needs to be triaged by the security team. A commonly quoted figure is an FPR of between 1% and 10%. The paper’s researchers point out that this is not acceptable. “Although quite low, an FPR of even 1% is too high for practical enterprise settings; our dataset contains over 250,000 emails per day, so an FPR of 1% would lead to 2,500 alerts each day.” If an average of 5 minutes were spent on each of these alerts, it would require more than 200 hours of labor every day.
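The arithmetic behind those figures is easy to check. The short sketch below uses only the numbers quoted above (250,000 emails per day, a 1% FPR, and 5 minutes per alert); the function name `daily_alert_load` is an illustrative choice, not something taken from the paper.

```python
from typing import Tuple


def daily_alert_load(emails_per_day: int,
                     false_positive_rate: float,
                     minutes_per_alert: float) -> Tuple[float, float]:
    """Return (alerts per day, analyst hours per day) for a given FPR."""
    alerts = emails_per_day * false_positive_rate
    hours = alerts * minutes_per_alert / 60
    return alerts, hours


alerts, hours = daily_alert_load(250_000, 0.01, 5)
print(f"{alerts:.0f} alerts/day, {hours:.0f} analyst-hours/day")
# 2500 alerts/day, 208 analyst-hours/day
```

At roughly 208 hours of triage per day, a 1% FPR would occupy about 26 analysts working eight-hour shifts on false alarms alone.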
“In contrast,” the paper claims, “our detector can detect real-world attacks, including those from a previously unseen attacker, with a budget of 10 alerts per day.” While 10 alerts per day was the target, the achieved figure varied. Across a random selection of 100 days, DAS generated between zero and 19 alerts per day, with a median of 7 (well below the target of 10).
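An alert budget implies ranking: rather than alerting on everything that crosses a threshold, the security team sees only the most suspicious events up to a fixed number. The sketch below shows that selection step only. It assumes each event already carries a numeric suspicion score; how DAS actually computes its scores is described in the paper and is not reproduced here, and the function name, message IDs, and scores are hypothetical.

```python
import heapq
from typing import List, Tuple

ScoredEvent = Tuple[float, str]  # (suspicion score, hypothetical message ID)


def select_alerts(scored_events: List[ScoredEvent],
                  budget: int = 10) -> List[ScoredEvent]:
    """Return at most `budget` events, highest suspicion score first."""
    return heapq.nlargest(budget, scored_events, key=lambda event: event[0])


# Illustrative scores only; real scores would come from the detector.
todays_events = [(0.2, "msg-a"), (0.9, "msg-b"), (0.5, "msg-c")]
print(select_alerts(todays_events, budget=2))
# [(0.9, 'msg-b'), (0.5, 'msg-c')]
```

With a fixed budget, the analyst workload is capped no matter how many emails arrive on a given day.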
Of course, such figures are meaningless if the alerts are false positives, and real spearphishing attempts are missed. However, since the dataset used to develop and test the technique was historical data supplied by LBNL, the incidence of spearphishing was largely already known. The researchers’ tests discovered all but one of the known spearphishing attacks in the dataset, and also uncovered two previously undiscovered spearphishing attacks against LBNL.
“Ultimately,” conclude the researchers, “our detector’s ability to identify both known and novel attacks, and the low volume and burden of alerts it imposes, suggests that our approach provides a practical path towards detecting credential spearphishing attacks.”
DAS works. The only remaining question is whether this is simply theoretical research, or something that realistically can be implemented. One real-life implementation already exists. “Because of our approach’s ability to detect a wide range of attacks, including previously undiscovered attacks, and its low false positive cost, LBNL has implemented and deployed a version of our detector.”
One of the authors, Professor Vern Paxson, is also co-founder and chief scientist at Corelight (a network visibility company). “We’re looking at DAS to see whether it can complement our existing products,” he told SecurityWeek. “But,” he added, “our research is free and publicly available, and we hope that other vendors will take it up.”