Artificial Intelligence

Microsoft AI Researchers Expose 38TB of Data, Including Keys, Passwords and Internal Messages

Exposed data includes backups of employees' workstations, secrets, private keys, passwords, and over 30,000 internal Microsoft Teams messages.

Researchers at Wiz have flagged another major security misstep at Microsoft that caused the exposure of 38 terabytes of private data during a routine update of open source AI training material on GitHub.

The exposed data includes a disk backup of two employees’ workstations, corporate secrets, private keys, passwords, and over 30,000 internal Microsoft Teams messages, Wiz said in a note documenting the discovery.

Wiz, a cloud data security startup founded by ex-Microsoft software engineers, said the issue was discovered during routine internet scans for misconfigured storage containers. “We found a GitHub repository under the Microsoft organization named robust-models-transfer. The repository belongs to Microsoft’s AI research division, and its purpose is to provide open-source code and AI models for image recognition,” the company explained.

To share the files, Microsoft used an Azure feature called SAS tokens, which allows data to be shared from Azure Storage accounts. Although the access level can be limited to specific files, Wiz found that the link was configured to share the entire storage account, including another 38TB of private files.

“This URL allowed access to more than just open-source models. It was configured to grant permissions on the entire storage account, exposing additional private data by mistake,” Wiz noted.

“Our scan shows that this account contained 38TB of additional data — including Microsoft employees’ personal computer backups. The backups contained sensitive personal data, including passwords to Microsoft services, secret keys, and over 30,000 internal Microsoft Teams messages from 359 Microsoft employees,” it added. 

In addition to what it describes as an overly permissive access scope, Wiz found that the token was also misconfigured to grant "full control" permissions instead of read-only, giving attackers the power to delete and overwrite existing files.
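To illustrate the difference, here is a minimal sketch using the azure-storage-blob Python SDK; the account name, key and blob names are hypothetical, and this is not the configuration Microsoft actually used. It contrasts a narrowly scoped, read-only token for a single blob with an account-level token carrying write and delete permissions, the kind of over-broad grant described above.

```python
# Sketch: narrow blob-level SAS vs. broad account-level SAS.
# Assumes the azure-storage-blob package; credentials below are placeholders.
from datetime import datetime, timedelta, timezone

from azure.storage.blob import (
    AccountSasPermissions,
    BlobSasPermissions,
    ResourceTypes,
    generate_account_sas,
    generate_blob_sas,
)

ACCOUNT_NAME = "examplestorageacct"   # hypothetical storage account
ACCOUNT_KEY = "<account-key>"         # hypothetical key
EXPIRY = datetime.now(timezone.utc) + timedelta(hours=1)

# Narrow scope: read-only access to one blob.
blob_sas = generate_blob_sas(
    account_name=ACCOUNT_NAME,
    container_name="models",
    blob_name="robust_resnet50.ckpt",
    account_key=ACCOUNT_KEY,
    permission=BlobSasPermissions(read=True),
    expiry=EXPIRY,
)

# Broad scope: the entire account, with write/delete rights.
account_sas = generate_account_sas(
    account_name=ACCOUNT_NAME,
    account_key=ACCOUNT_KEY,
    resource_types=ResourceTypes(service=True, container=True, object=True),
    permission=AccountSasPermissions(read=True, write=True, delete=True, list=True),
    expiry=EXPIRY,
)
```

A URL signed with the second kind of token exposes every container in the account and allows files to be overwritten, which matches the misconfiguration Wiz describes.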

“An attacker could have injected malicious code into all the AI models in this storage account, and every user who trusts Microsoft’s GitHub repository would’ve been infected by it,” Wiz warned.

The repository’s primary purpose compounds the security concerns. It supplies AI models stored in the ‘ckpt’ format, which is produced by the widely used TensorFlow library and serialized with Python’s pickle module. Wiz notes that this format alone can be a gateway for arbitrary code execution.
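The risk is easy to demonstrate. The following is a minimal, deliberately harmless sketch (the file name and payload are hypothetical, not taken from the incident) of how a pickle-serialized checkpoint can execute code the moment it is loaded:

```python
# Sketch: unpickling can call any callable named in the payload.
# The "payload" here only echoes a string; a real attacker would do worse.
import os
import pickle


class PoisonedCheckpoint:
    def __reduce__(self):
        # pickle.load() will call os.system() with this argument.
        return (os.system, ("echo 'code executed while loading the model'",))


# Attacker writes the poisoned "model" to shared storage.
with open("model.ckpt", "wb") as f:
    pickle.dump(PoisonedCheckpoint(), f)

# Victim simply loads the checkpoint, and the payload runs.
with open("model.ckpt", "rb") as f:
    pickle.load(f)
```

Loading checkpoints only from trusted sources, or preferring serialization formats that do not execute code on load, mitigates this class of attack.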

According to Wiz, Microsoft’s security response team invalidated the SAS token within two days of initial disclosure in June this year. The token was replaced on GitHub a month later.

Microsoft has published its own blog post to explain how the data leak occurred and how such incidents can be prevented.

“No customer data was exposed, and no other internal services were put at risk because of this issue. No customer action is required in response to this issue,” the tech giant noted.

*updated with link to Microsoft’s blog post

Related: Microsoft Puts ChatGPT to Work on Automating Security

Related: OpenAI Using Security to Sell ChatGPT Enterprise

Related: Wiz Says 62% of AWS Environments Exposed to Zenbleed

Related: Microsoft Hack Exposed More Than Exchange, Outlook Emails
