Microsoft, Intel Introduce ‘STAMINA’ Approach to Malware Detection

Microsoft and Intel have been working together on a new approach to malware detection that involves deep learning and the representation of malware as images.

Referred to as STAtic Malware-as-Image Network Analysis (STAMINA), the research leverages Intel’s previous work on static malware classification through deep transfer learning and applies it to a real-world dataset from Microsoft to determine its practical value.

The approach is based on the inspection of malware binaries plotted as grayscale images, which has revealed that there are textural and structural similarities between binaries from the same malware families, and differences between different families or between malware and benign software.

In their whitepaper on STAMINA, researchers from Intel (Li Chen and Ravi Sahita) and Microsoft (Jugal Parikh and Marc Marino) argue that the classic malware detection approach that relies on signature matching is becoming less straightforward due to the rapid increase in signatures, while static and dynamic approaches might not be accurate or time-efficient.

STAMINA, the researchers explain, consists of four steps: preprocessing (image conversion), transfer learning, evaluation, and interpretation.

Preprocessing involves three operations: pixel conversion, in which each byte's value, between 0 and 255, is used directly as a pixel intensity to create a pixel stream; reshaping, in which the one-dimensional pixel stream is turned into a two-dimensional image whose width and height are determined by the file size after conversion; and resizing (“to 224 or 299 so that the image models trained on ImageNet can be used for fine tuning on the images”).
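The three preprocessing operations can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the function names are hypothetical, the width rule (roughly the square root of the byte count) is an assumption because the article only says the dimensions depend on file size, and nearest-neighbor sampling is assumed for the resize.

```python
import math


def bytes_to_pixels(data: bytes):
    """Pixel conversion: each byte (0-255) is used directly as a grayscale intensity."""
    return list(data)


def reshape(pixels, width=None):
    """Reshaping: turn the 1-D pixel stream into rows of `width` pixels.

    Assumption: when no width is given, use ceil(sqrt(n)) so the image is
    roughly square. The article only says the size depends on the file size.
    """
    n = len(pixels)
    if n == 0:
        raise ValueError("cannot build an image from an empty file")
    if width is None:
        width = math.isqrt(n - 1) + 1  # ceil(sqrt(n)) for n >= 1
    height = math.ceil(n / width)
    padded = list(pixels) + [0] * (width * height - n)  # zero-pad the last row
    return [padded[r * width:(r + 1) * width] for r in range(height)]


def resize_nearest(image, size):
    """Resizing: sample the image onto a size x size grid (nearest neighbor)."""
    h, w = len(image), len(image[0])
    return [[image[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def binary_to_image(data: bytes, size: int = 224):
    """Full pipeline: bytes -> pixel stream -> 2-D image -> size x size image."""
    return resize_nearest(reshape(bytes_to_pixels(data)), size)
```

Calling `binary_to_image(open(path, "rb").read(), size=299)` on a hypothetical `path` would produce the larger of the two input sizes quoted above.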

Next, transfer learning is employed to train a classifier for static malware classification. The step is performed on the malware and benign images generated during the preprocessing step, as the researchers note that, in practice, it would be difficult to train an entire deep neural network from scratch because the available datasets are limited.

“What has been done in the computer vision space is that, for specific tasks, models pre-trained on a large number of images are used, and transfer learning is conducted on target tasks,” the researchers note.
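The idea behind transfer learning can be shown with a deliberately tiny toy: a fixed feature extractor plays the role of the pre-trained layers, and only a small classification head is trained on the target task. Everything here is an assumption made for illustration, including the hand-written `frozen_features` function, which merely stands in for a network pre-trained on ImageNet. It is not the model used in STAMINA.

```python
import math


def frozen_features(image):
    """Stand-in for frozen pre-trained layers (hypothetical, never updated).

    Returns two features scaled to [0, 1]: mean intensity and mean absolute
    difference between horizontally adjacent pixels (a crude texture cue).
    """
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / (255 * len(pixels))
    diffs = [abs(row[i + 1] - row[i]) for row in image for i in range(len(row) - 1)]
    texture = sum(diffs) / (255 * len(diffs)) if diffs else 0.0
    return [mean, texture]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(features, labels, epochs=500, lr=0.5):
    """Train only the new head: logistic regression fitted by gradient descent."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def predict(w, b, x):
    """Return 1 (malware) or 0 (benign) for one feature vector."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0
```

In a real pipeline the frozen part would be an image model pre-trained on ImageNet and the head would be fine-tuned on the malware and benign images. The toy only shows the division of labor between the two parts.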

During the evaluation step, the researchers look at the accuracy of their method, “false positive rate, precision, recall, F1 score, and area under the receiver operating curve (ROC).” The study was performed on a Microsoft dataset that included 2.2 million malware binary hashes, along with 10 columns of associated data, and was split 60:20:20 into training, validation, and test sets.

“In particular, per feedback from malware analysis practitioners, we also reported recall at 0.1%–10% false positive rate via ROC,” the whitepaper reads.
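Recall at a fixed false positive rate can be read off the ROC curve by sweeping the decision threshold from the highest score downward and keeping the best recall seen while the false positive rate stays within budget. The sketch below assumes a classifier that outputs one score per file (higher means more likely malware) and labels of 1 for malware and 0 for benign. The function name is hypothetical and the whitepaper's own tooling is not described in the article.

```python
def recall_at_fpr(scores, labels, max_fpr):
    """Highest recall reachable by any score threshold with FPR <= max_fpr."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one malware and one benign sample")

    # Walk thresholds from the highest score down, handling tied scores together.
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / neg > max_fpr:
            break  # FPR only grows as the threshold drops further
        best = max(best, tp / pos)
    return best
```

For the range quoted above, the function would be called with `max_fpr` values between `0.001` and `0.10`.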

The tests revealed that STAMINA can achieve a 99.07% accuracy with a false positive rate at 2.58% (precision is at 99.09% and recall at 99.66%).
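For readers who want to see how such figures relate to one another, the following sketch derives accuracy, false positive rate, precision, recall and F1 from the four confusion-matrix counts. The function name is hypothetical, malware is treated as the positive class, and any counts passed in are the caller's own. The article does not give the counts behind the reported percentages.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp: malware flagged as malware      fp: benign flagged as malware
    tn: benign passed as benign         fn: malware passed as benign
    Assumes every denominator below is non-zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "false_positive_rate": fp / (fp + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }
```

Because the false positive rate depends only on how benign files are handled, a classifier can score very high on accuracy, precision and recall while still misflagging a noticeable share of clean software.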

However, the approach is only effective when applied to small applications. For larger software, STAMINA is less effective, as the software cannot convert “billions of pixels into JPEG images” and then resize them, making metadata-based methods more advantageous in such circumstances.

“For future work, we would like to evaluate hybrid models of using intermediate representations of the binaries and information extracted from binaries with deep learning approaches – these datasets are expected to be bigger but may provide higher accuracy. We also will continue to explore platform acceleration optimizations for our deep learning models so we can deploy such detection techniques with minimal power and performance impact to the end-user,” the researchers conclude.

Written By

Ionut Arghire is an international correspondent for SecurityWeek.
