SecurityWeek

Artificial Intelligence

Claude AI APIs Can Be Abused for Data Exfiltration

An attacker can inject indirect prompts to trick the model into harvesting user data and sending it to the attacker’s account.

Attackers can use indirect prompt injections to trick Anthropic’s Claude into exfiltrating data the AI model’s users have access to, a security researcher has discovered.

The attack, Johann Rehberger of Embrace The Red explains, abuses Claude’s Files API and is only possible if the AI model has network access, a feature that is enabled by default on certain plans and is meant to let Claude reach certain resources, such as code repositories and Anthropic APIs.

The attack is relatively straightforward: an indirect prompt injection payload can be used to read user data and store it in a file in Claude Code Interpreter’s sandbox, and then to trick the model into interacting with the Anthropic API using a key provided by the attacker.

The code in the payload instructs Claude to upload the file from the Code Interpreter sandbox but, because the attacker’s API key is used, the file is uploaded to the attacker’s account.
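The reason the file lands in the attacker’s account can be illustrated with a toy model. The sketch below is not Anthropic’s implementation and makes no network calls; `ToyFileService`, the key strings, and the file name are hypothetical names chosen for illustration. It only shows the general property that a hosted file service attributes an upload to whoever owns the API key presented, regardless of where the uploading code runs.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFileService:
    """Hypothetical stand-in for a hosted file API (assumption, not a real service)."""

    key_owner: dict[str, str]  # maps API key -> account name
    storage: dict[str, list[str]] = field(default_factory=dict)

    def upload(self, api_key: str, filename: str) -> str:
        # The service sees only the key, so the key alone decides
        # which account receives the file.
        account = self.key_owner[api_key]
        self.storage.setdefault(account, []).append(filename)
        return account


service = ToyFileService(key_owner={"key-user": "user", "key-attacker": "attacker"})

# The code runs inside the user's sandbox, but the injected prompt
# supplied the attacker's key, so the attacker's account gets the file.
destination = service.upload(api_key="key-attacker", filename="harvested.md")
print(destination)
```

In this model the user’s own account never receives a copy, which is consistent with the article’s description of the file appearing only on the attacker’s side.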

“With this technique an adversary can exfiltrate up to 30MB at once according to the file API documentation, and of course we can upload multiple files,” Rehberger explains.

Although the first attempt succeeded, Claude later refused the payload, particularly when it contained the API key in plain text, so Rehberger had to mix benign code into the prompt injection to convince Claude that the request was not malicious.

The attack starts when the user loads a malicious document received from the attacker into Claude for analysis. The exploit code hijacks the model, which follows the malicious instructions to harvest the user’s data, save it to the sandbox, and then call the Anthropic Files API to send it to the attacker’s account.

According to the researcher, the attack can be used to exfiltrate the user’s chat conversations, which are saved by Claude using the newly introduced ‘memories’ feature. The attacker can view and access the exfiltrated file in their console.

The researcher disclosed the attack to Anthropic via HackerOne on October 25, but the report was closed with the explanation that this was a model safety issue and not a security vulnerability.

However, after he published information on the attack, Anthropic notified Rehberger that the data exfiltration vulnerability is in scope for reporting.

Anthropic’s documentation underlines the risks associated with Claude having network access and of potential attacks carried out via external files or websites leading to code execution and information leaks. It also provides recommended mitigations against such attacks.
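The article does not list the documented mitigations, so the following is only one possible approach and is not taken from Anthropic’s documentation. It sketches an egress check that lets a sandbox reach the API host only when the request carries the key that belongs to the current session. The function `allow_request`, the `session_key` parameter, and the idea that a proxy can inspect request headers are all assumptions; `x-api-key` is the header the Anthropic API uses for authentication.

```python
from urllib.parse import urlsplit

# Assumed allowlisted host for the sandbox (illustrative).
ALLOWED_HOST = "api.anthropic.com"


def allow_request(url: str, headers: dict[str, str], session_key: str) -> bool:
    """Allow an outbound request only if it targets the allowed host
    and authenticates with the session's own API key."""
    if urlsplit(url).hostname != ALLOWED_HOST:
        return False
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    return normalised.get("x-api-key") == session_key


url = "https://api.anthropic.com/v1/files"
own = allow_request(url, {"X-Api-Key": "user-key"}, session_key="user-key")
foreign = allow_request(url, {"x-api-key": "attacker-key"}, session_key="user-key")
print(own, foreign)
```

A check like this would block an upload that uses a key supplied by an injected prompt, though it assumes the proxy can see request headers, which in practice requires terminating TLS at the proxy.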

SecurityWeek has emailed Anthropic to inquire whether the company plans to devise a mitigation for such attacks.

Written By

Ionut Arghire is an international correspondent for SecurityWeek.
