Artificial Intelligence

Microsoft AI Researchers Expose 38TB of Data, Including Keys, Passwords and Internal Messages

Exposed data includes disk backups of employees’ workstations, secrets, private keys, passwords, and over 30,000 internal Microsoft Teams messages.

Ryan Naraine

September 18, 2023

Researchers at Wiz have flagged another major security misstep at Microsoft that caused the exposure of 38 terabytes of private data during a routine open source AI training material update on GitHub.

The exposed data includes a disk backup of two employees’ workstations, corporate secrets, private keys, passwords, and over 30,000 internal Microsoft Teams messages, Wiz said in a note documenting the discovery.

Wiz, a cloud data security startup founded by ex-Microsoft software engineers, said the issue was discovered during routine internet scans for misconfigured storage containers. “We found a GitHub repository under the Microsoft organization named robust-models-transfer. The repository belongs to Microsoft’s AI research division, and its purpose is to provide open-source code and AI models for image recognition,” the company explained.

To share the files, Microsoft used an Azure feature called SAS (Shared Access Signature) tokens, which allows data sharing from Azure Storage accounts. Although the access level can be limited to specific files, Wiz found that the link was configured to share the entire storage account, including another 38TB of private files.

“This URL allowed access to more than just open-source models. It was configured to grant permissions on the entire storage account, exposing additional private data by mistake,” Wiz noted.

“Our scan shows that this account contained 38TB of additional data — including Microsoft employees’ personal computer backups. The backups contained sensitive personal data, including passwords to Microsoft services, secret keys, and over 30,000 internal Microsoft Teams messages from 359 Microsoft employees,” it added.

In addition to what it describes as an overly permissive access scope, Wiz found that the token was also misconfigured to allow “full control” permissions instead of read-only, giving attackers the power to delete and overwrite existing files.
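The two misconfigurations described above, scope and permissions, are both visible in a SAS URL’s query string. The following is a minimal sketch, not Wiz’s scanner or a Microsoft tool, that reads the standard SAS parameters `sr`/`srt` (scope) and `sp` (permissions) using only Python’s standard library. The function name `inspect_sas_url`, the example hostnames, and the placeholder signatures are hypothetical and chosen for illustration only.

```python
from urllib.parse import urlparse, parse_qs

# Permission letters that only allow reading or listing data.
READ_ONLY_LETTERS = {"r", "l"}


def inspect_sas_url(url: str) -> dict:
    """Summarize the scope and permissions encoded in a SAS URL's query string."""
    params = {k: v[0] for k, v in parse_qs(urlparse(url).query).items()}
    permissions = params.get("sp", "")

    # An account SAS carries "srt" (resource types); a service SAS carries
    # "sr", where "c" means a whole container and "b" a single blob.
    if "srt" in params:
        scope = "account"
    elif params.get("sr") == "c":
        scope = "container"
    elif params.get("sr") == "b":
        scope = "blob"
    else:
        scope = "unknown"

    return {
        "scope": scope,
        "permissions": permissions,
        "read_only": bool(permissions) and set(permissions) <= READ_ONLY_LETTERS,
    }


# Hypothetical URLs; the account names and signatures are placeholders.
narrow = inspect_sas_url(
    "https://example.blob.core.windows.net/models/model.ckpt"
    "?sv=2021-08-06&sr=b&sp=r&sig=PLACEHOLDER"
)
broad = inspect_sas_url(
    "https://example.blob.core.windows.net/"
    "?sv=2021-08-06&ss=b&srt=sco&sp=rwdlacup&sig=PLACEHOLDER"
)
print(narrow)
print(broad)
```

A link like the first one shares a single file read-only, while one like the second grants write and delete rights across the whole account, which is the kind of combination the article describes.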

“An attacker could have injected malicious code into all the AI models in this storage account, and every user who trusts Microsoft’s GitHub repository would’ve been infected by it,” Wiz warned.


The repository’s primary function compounds the security concerns. It supplies AI models for training in the ‘ckpt’ format, which is produced by the widely used TensorFlow library and serialized with Python’s pickle formatter. Wiz notes that this format can allow arbitrary code execution when a file is loaded.
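To illustrate why a pickle-based file can run code when loaded, here is a minimal sketch using only Python’s standard `pickle` module. It is not Wiz’s proof of concept and does not touch TensorFlow. The `Payload` class and the `safe_loads` helper are hypothetical names, and the payload is deliberately harmless (it evaluates `6 * 7`). The second half follows the restricted-unpickler pattern from the Python documentation, which refuses any pickle that tries to import a callable.

```python
import io
import pickle


class Payload:
    """Stand-in for a tampered file; __reduce__ chooses what runs on load."""

    def __reduce__(self):
        # Harmless here, but an attacker could name any importable callable.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload object; it calls eval("6 * 7").
result = pickle.loads(blob)
print(result)


class SafeUnpickler(pickle.Unpickler):
    """Unpickler that rejects every attempt to look up a global."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def safe_loads(data: bytes):
    """Load plain containers and scalars only; refuse anything callable."""
    return SafeUnpickler(io.BytesIO(data)).load()


try:
    safe_loads(blob)
except pickle.UnpicklingError as exc:
    print("blocked:", exc)

# Plain data still loads through the restricted unpickler.
print(safe_loads(pickle.dumps({"weights": [0.1, 0.2]})))
```

Blocking all globals is stricter than most real model formats can tolerate; the sketch only shows that the danger comes from the loader, not from the data itself.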

According to Wiz, Microsoft’s security response team invalidated the SAS token within two days of initial disclosure in June this year. The token was replaced on GitHub a month later.

Microsoft has published its own blog post to explain how the data leak occurred and how such incidents can be prevented.

“No customer data was exposed, and no other internal services were put at risk because of this issue. No customer action is required in response to this issue,” the tech giant noted.

*updated with link to Microsoft’s blog post

Written By Ryan Naraine

Ryan Naraine is Editor-at-Large at SecurityWeek and host of the popular Security Conversations podcast series. He is a security community engagement expert who has built programs at major global brands, including Intel Corp., Bishop Fox and GReAT. Ryan is a founding-director of the Security Tinkerers non-profit, an advisor to early-stage entrepreneurs, and a regular speaker at security conferences around the world.

