Microsoft this week announced the open source availability of Python code for “CyberBattleSim,” a research toolkit that supports simulating complex computer systems.
An experimental research project intended to help advance artificial intelligence and machine learning, it was built to aid in the analysis of how “autonomous agents operate in a simulated enterprise environment using high-level abstraction of computer networks and cybersecurity concepts.”
CyberBattleSim supports the training of automated agents through a Python-based OpenAI Gym interface. In the simulated environments, defenders can apply reinforcement learning algorithms and set up various cybersecurity challenges.
Reinforcement learning, Microsoft explains, is a type of machine learning that teaches autonomous agents to make decisions based on their interaction with the environment: agents improve their strategies through repeated experience, much like playing a video game over and over to become better at it.
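As a rough illustration of learning through repeated experience, the minimal Python sketch below trains an agent with tabular Q-learning on a toy task. The five-cell corridor, the reward values and the function names are all assumptions made for this example; none of it comes from CyberBattleSim.

```python
import random

# Hypothetical toy task (not part of CyberBattleSim): a corridor of 5 cells.
# The agent starts in cell 0 and earns a reward of 1.0 on reaching cell 4.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore at random with probability epsilon, or when the two
            # action values are tied (as they are before anything is learned).
            explore = rng.random() < epsilon or q[state][0] == q[state][1]
            if explore:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action index for each non-terminal cell."""
    return [0 if left > right else 1 for left, right in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Early episodes amount to a random walk; as rewards propagate back through the value table, the agent increasingly heads straight for the goal, which is the "play it again to get better" effect described above.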
In software security, reinforcement learning involves the use of agents that play the role of attackers and defenders, and the analysis of their evolution in the simulated environment. The attacker seeks to steal information, while the defender focuses on blocking the attacker or mitigating their actions.
CyberBattleSim employs OpenAI Gym for building interactive environments, and focuses on the lateral movement phase of a cyber-attack. The project simulates a fixed network with predefined vulnerabilities that the attacker model can exploit for lateral movement, while a defender agent seeks to detect the attacker and contain the intrusion.
“The simulation Gym environment is parameterized by the definition of the network layout, the list of supported vulnerabilities, and the nodes where they are planted. The simulation does not support machine code execution, and thus no security exploit actually takes place in it,” Microsoft explains.
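To make the idea of a parameterized simulation more concrete, here is a minimal Python sketch of a Gym-style environment with `reset` and `step` methods. It is not the CyberBattleSim API: the class name `ToyNetworkEnv`, its fields, the node names, the vulnerability names and the reward values are all hypothetical. As in the description above, nothing is executed; a successful "exploit" only adds a node to a set.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNetworkEnv:
    """Hypothetical Gym-style environment; not the CyberBattleSim API."""

    layout: dict          # node name -> list of directly reachable nodes
    planted: dict         # node name -> set of vulnerabilities planted there
    supported: frozenset  # vulnerability names the simulation knows about
    start: str            # node the attacker controls at the beginning
    owned: set = field(default_factory=set)

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.owned = {self.start}
        return frozenset(self.owned)

    def step(self, action):
        """Take action (source, target, vulnerability).

        Returns (observation, reward, done, info), following the classic
        Gym convention.
        """
        source, target, vuln = action
        valid = (
            vuln in self.supported
            and source in self.owned
            and target in self.layout.get(source, ())
            and target not in self.owned
            and vuln in self.planted.get(target, ())
        )
        if valid:
            # Pure bookkeeping: no code runs on any machine.
            self.owned.add(target)
        reward = 1.0 if valid else -0.1  # assumed values for illustration
        done = self.owned == set(self.layout)
        return frozenset(self.owned), reward, done, {"valid": valid}


# Example parameters: a three-node network with two planted vulnerabilities.
env = ToyNetworkEnv(
    layout={"client": ["web"], "web": ["db"], "db": []},
    planted={"web": {"weak_password"}, "db": {"leaked_token"}},
    supported=frozenset({"weak_password", "leaked_token"}),
    start="client",
)
env.reset()
obs, reward, done, info = env.step(("client", "web", "weak_password"))
```

Changing the three parameters (layout, supported vulnerabilities, and where they are planted) yields a different challenge without touching the environment logic.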
The simulated computer network consists of systems running multiple platforms and aims to illustrate how using the latest operating systems and keeping them updated can deliver improved protection. Using the Gym interface, defenders can instantiate automated agents and then analyze their evolution in the environment.
“To perform well, agents now must learn from observations that are not specific to the instance they are interacting with. They cannot just remember node indices or any other value related to the network size. They can instead observe temporal features or machine properties,” the tech giant explains.
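One way to picture observations that do not depend on a particular network instance is to describe a node by its properties and by timing, never by its index. The Python sketch below is only an assumption about what such an encoding could look like; the function name, the property names and the recency formula are hypothetical and are not taken from CyberBattleSim.

```python
def node_features(node_properties, steps_since_last_discovery, known_properties):
    """Encode a node by what it is, not by where it sits in the network."""
    # One flag per known machine property, in a fixed order.
    flags = [1.0 if p in node_properties else 0.0 for p in known_properties]
    # Temporal feature in (0, 1]: recent discoveries score closer to 1.
    recency = 1.0 / (1.0 + steps_since_last_discovery)
    return flags + [recency]


# Hypothetical property vocabulary shared across all simulated networks.
PROPERTIES = ("Windows", "Linux", "SQLServer")

features = node_features({"Windows", "SQLServer"}, 0, PROPERTIES)
```

The resulting vector has the same length whether the network has five nodes or five hundred, so a strategy learned on one network can at least be applied to another.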
Microsoft says CyberBattleSim is highly abstract and cannot be applied to real-world systems, which provides protection against the nefarious use of the trained automated agents.