A team of Microsoft researchers has been working on improving fuzzing techniques by using deep neural networks, and initial tests have shown promising results.
Fuzzing is used to find software vulnerabilities – particularly memory corruption bugs – by injecting malformed or semi-malformed data into the targeted application. If the software crashes or behaves unexpectedly, it could indicate the presence of a security flaw.
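To illustrate the idea, here is a minimal sketch of a mutation-based fuzzer in Python. It is not taken from the Microsoft research: the target `toy_parser`, its deliberate bug, and the helper names `mutate` and `fuzz` are all hypothetical, and a real fuzzer would typically run a native program and watch for actual crashes rather than Python exceptions.

```python
import random


def toy_parser(data: bytes) -> int:
    """Hypothetical target: a length-prefixed record parser with a deliberate bug."""
    if len(data) < 2:
        raise ValueError("input too short")  # graceful rejection, not a bug
    length = data[0]
    payload = data[1:]
    # Deliberate flaw: the length byte is trusted without checking the payload size.
    return payload[length - 1]


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte of the seed with a random value."""
    buf = bytearray(seed)
    pos = rng.randrange(len(buf))
    buf[pos] = rng.randrange(256)
    return bytes(buf)


def fuzz(target, seed: bytes, iterations: int, rng: random.Random):
    """Feed mutated inputs to the target and collect the ones that make it fail."""
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except ValueError:
            pass  # the target rejected the input cleanly
        except Exception as exc:  # unexpected failure: a possible bug
            crashes.append((candidate, type(exc).__name__))
    return crashes


crashes = fuzz(toy_parser, b"\x03abc", 500, random.Random(0))
```

In this sketch, any input whose first byte claims more payload than is actually present triggers an unhandled `IndexError`, which stands in for the memory corruption a native parser might suffer in the same situation.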
There are three main types of fuzzing: whitebox fuzzing, which analyzes source or disassembled code; blackbox fuzzing, which requires no access to source code or knowledge of program internals; and greybox fuzzing, which is similar to blackbox fuzzing but uses feedback from previous executions, such as code coverage, to guide later ones.
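The feedback loop that sets greybox fuzzing apart can be sketched in a few lines of Python. This is an illustrative simplification, not AFL's actual algorithm: `toy_target` is a hypothetical instrumented program that reports which branches it executed, and the branch names are invented for the example.

```python
import random


def toy_target(data: bytes) -> set:
    """Hypothetical instrumented target: returns the IDs of the branches it took."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add("magic_1")
        if len(data) > 1 and data[1] == ord("U"):
            covered.add("magic_2")
            if len(data) > 2 and data[2] == ord("Z"):
                covered.add("magic_3")
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte with a random value."""
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] = rng.randrange(256)
    return bytes(buf)


def greybox_fuzz(target, seed: bytes, iterations: int, rng: random.Random):
    """Keep any mutated input that reaches code no earlier input has reached."""
    corpus = [seed]
    seen = set(target(seed))
    for _ in range(iterations):
        parent = rng.choice(corpus)
        child = mutate(parent, rng)
        coverage = target(child)
        if not coverage <= seen:  # the child reached at least one new branch
            seen |= coverage
            corpus.append(child)  # feedback: later mutations can build on it
    return corpus, seen


corpus, seen = greybox_fuzz(toy_target, b"\x00\x00\x00\x00", 50000, random.Random(1))
```

Because inputs that reach new branches are kept and mutated further, the loop can work toward the nested check one byte at a time. A blackbox fuzzer that only ever mutates the original seed would have to guess all three bytes independently.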
Experts at Microsoft have attempted to improve this feedback loop using a type of machine learning called deep neural networks (DNNs). Neural networks, a set of algorithms loosely modeled on the human brain, are designed to recognize patterns and are commonly used to classify and cluster data.
The researchers train the networks on patterns observed in previous fuzzing iterations and use what the networks learn to guide future iterations.
“The neural models learn a function to predict good (and bad) locations in input files to perform fuzzing mutations based on the past mutations and corresponding code coverage information,” the researchers said.
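The general idea of learning where to mutate can be sketched without a neural network. The Python example below swaps in a much simpler technique, per-position success statistics with smoothing, in place of the researchers' deep neural models. Unlike their models, it ignores the content of the input file. The class name `PositionScorer` and its methods are hypothetical and do not come from the research project or from AFL.

```python
import random


class PositionScorer:
    """Simplified stand-in for a learned model: scores each byte offset by how
    often past mutations at that offset produced new code coverage."""

    def __init__(self, input_len: int):
        self.input_len = input_len
        self.tries = [0] * input_len
        self.hits = [0] * input_len

    def record(self, position: int, new_coverage: bool) -> None:
        """Log one past mutation and whether it led to new coverage."""
        self.tries[position] += 1
        if new_coverage:
            self.hits[position] += 1

    def scores(self) -> list:
        """Smoothed success rate per offset; untried offsets score 0.5."""
        return [(h + 1) / (t + 2) for h, t in zip(self.hits, self.tries)]

    def pick_position(self, rng: random.Random) -> int:
        """Sample an offset to mutate, favouring offsets that scored well."""
        return rng.choices(range(self.input_len), weights=self.scores(), k=1)[0]


scorer = PositionScorer(input_len=3)
for _ in range(10):
    scorer.record(0, new_coverage=False)  # offset 0 never paid off
    scorer.record(2, new_coverage=True)   # offset 2 always paid off
position_scores = scorer.scores()
next_position = scorer.pick_position(random.Random(0))
```

After this history, offset 2 scores highest, the untried offset 1 sits in the middle, and offset 0 scores lowest, so subsequent mutations are steered toward the locations that have been productive while still occasionally exploring the others.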
The method has been implemented in American Fuzzy Lop (AFL), a popular open source fuzzer developed by Google researcher Michal Zalewski. Tests were conducted against parsers for the ELF, PDF, PNG and XML file formats.
The tests showed that the neural version of AFL significantly outperformed the original AFL in terms of code coverage, unique code paths and crashes for every format except PDF, whose files experts believe may be too large.
The team behind the project believes this approach can be applied to any fuzzer, not just AFL.
“We believe our neural fuzzing research project is just scratching the surface of what can be achieved using deep neural networks for fuzzing,” explained Microsoft’s William Blum. “Right now, our model only learns fuzzing locations, but we could also use it to learn other fuzzing parameters such as the type of mutation or strategy to apply. We are also considering online versions of our machine learning model, in which the fuzzer constantly learns from ongoing fuzzing iterations.”
Blum is the lead of the engineering team for Microsoft Security Risk Detection, a recently launched cloud-based fuzzing service that uses artificial intelligence to find bugs and vulnerabilities in applications. The results of the research into the use of neural networks for fuzzing could help improve this service.
Another recently launched Microsoft tool designed for finding memory corruption bugs, VulnScan, might also be added to the Security Risk Detection service.