Microsoft on Monday published a summary of its artificial intelligence (AI) red teaming efforts, and shared guidance and resources that can help make AI safer and more secure.
The tech giant said its red teaming journey started more than two decades ago, but it did not launch a dedicated AI Red Team until 2018. The team has since been developing AI security resources that the whole industry can use.
The company has now shared five key lessons learned from its red teaming efforts. The first is that AI red teaming has become an umbrella term for probing both security and responsible AI (RAI) outcomes. On the security side, this can include finding vulnerabilities and securing the underlying model, while on the RAI side the red team focuses on identifying harmful content and fairness issues, such as stereotyping.
Microsoft also pointed out that AI red teaming focuses not only on potential threats from malicious actors, but also on how AI systems can generate harmful or otherwise problematic content when users interact with them.
AI systems evolve and change at a faster pace than traditional software systems, which is why it is important to conduct multiple rounds of red teaming and to automate measurement and monitoring of the system.
This is also needed because AI systems are probabilistic — the same input can generate different outputs. Conducting multiple red teaming rounds in the same operation can reveal issues that a single attempt may not identify.
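To illustrate the point, the short Python sketch below runs the same probe many times and measures how often the output gets flagged, something a single attempt could easily miss. It is a minimal illustration rather than Microsoft's tooling; query_model and is_harmful are hypothetical stand-ins for a model endpoint and a content classifier.

```python
import random

def query_model(prompt: str) -> str:
    # Hypothetical stand-in for a call to a generative model endpoint.
    # Real models sample tokens, so the same prompt can yield different outputs.
    return random.choice(["benign answer", "borderline answer", "harmful answer"])

def is_harmful(output: str) -> bool:
    # Hypothetical stand-in for a harmful-content classifier.
    return "harmful" in output

def red_team_rounds(prompt: str, rounds: int = 20) -> float:
    # Run the same probe repeatedly and measure how often it trips the check;
    # a single attempt could easily miss a failure that only shows up across rounds.
    flagged = sum(is_harmful(query_model(prompt)) for _ in range(rounds))
    return flagged / rounds

if __name__ == "__main__":
    rate = red_team_rounds("probe prompt goes here", rounds=50)
    print(f"Flagged outputs across rounds: {rate:.0%}")
```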
Lastly, Microsoft highlighted that, just as in traditional security, mitigating AI failures requires a defense-in-depth approach. This can include using classifiers to flag potentially harmful content, leveraging a metaprompt to guide the model's behavior, and limiting conversational drift.
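As a rough illustration of that layering (a minimal sketch under assumptions, not Microsoft's implementation; generate and classify_harm are hypothetical placeholders), the Python snippet below chains a guiding metaprompt, an output classifier, and a cap on conversation length.

```python
METAPROMPT = "You are a helpful assistant. Refuse requests for harmful content."
MAX_TURNS = 10  # cap session length to limit conversational drift

def generate(metaprompt: str, history: list[str], user_msg: str) -> str:
    # Hypothetical placeholder for a model call steered by the metaprompt.
    return f"[model response, guided by metaprompt, to: {user_msg}]"

def classify_harm(text: str) -> bool:
    # Hypothetical placeholder for a classifier that flags harmful content.
    return "harmful" in text.lower()

def respond(history: list[str], user_msg: str) -> str:
    # Layer 1: refuse to continue once the conversation gets too long.
    if len(history) >= MAX_TURNS:
        return "Conversation limit reached; please start a new session."
    # Layer 2: the metaprompt steers the model's behavior at generation time.
    candidate = generate(METAPROMPT, history, user_msg)
    # Layer 3: a classifier screens the output before it reaches the user.
    if classify_harm(candidate):
        return "I can't help with that."
    history.append(user_msg)
    return candidate

if __name__ == "__main__":
    history: list[str] = []
    print(respond(history, "Hello"))
```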
Microsoft has shared several resources that could be useful to various groups of individuals interested in AI security. These resources include a guide to help Azure OpenAI model application developers create an AI red team, a bug bar for triaging attacks on machine learning (ML) systems for incident responders, and an AI risk assessment checklist for ML engineers.
The resources also include threat modeling guidance for developers, ML failure mode documentation for policymakers and engineers, and enterprise security and governance guidance for Azure ML customers.
Microsoft's guidance and resources come just a few weeks after Google introduced its own AI Red Team, which is tasked with carrying out complex technical attacks on artificial intelligence systems.
Related: Now’s the Time for a Pragmatic Approach to New Technology Adoption
Related: ChatGPT Hallucinations Can Be Exploited to Distribute Malicious Code Packages
Related: AntChain, Intel Create New Privacy-Preserving Computing Platform for AI Training

Eduard Kovacs (@EduardKovacs) is a managing editor at SecurityWeek. He worked as a high school IT teacher for two years before starting a career in journalism as Softpedia’s security news reporter. Eduard holds a bachelor’s degree in industrial informatics and a master’s degree in computer techniques applied in electrical engineering.