Artificial Intelligence

Decades-Old Bash Tricks Expose AI Coding Agents to Supply Chain Attacks

Decades-old Bash shell tricks can bypass safeguards in most open source AI coding agents, potentially turning malicious repositories into supply chain attack vectors.

Kevin Townsend

June 30, 2026 (9:00 AM ET)

Bash (Bourne Again SHell), the 1989 GNU replacement for the original Unix Bourne shell, can still cause problems more than three decades later through its long-standing shell tricks. Adversa AI has discovered a structural security flaw in multiple open source AI agents. It’s not a specific bug but a process by which malicious Bash instructions can be ingested into the agent, and from there into whatever the agent does – typically with the operator’s approval.

Adversa calls this structural issue GuardFall.

“We tested eleven popular open source agents, including Hermes, OpenCode, Roo-code, and others,” explains Omer Ben Simon, lead researcher at Adversa AI. “Ten leave the gap open in one of four ways; and only one closes it.”

The ‘gap’ is a failure to guard the agent against decades-old Bash shell tricks, such as quote removal and $IFS spacing. Since these agents run with a developer’s full account authority, this can escalate into a major supply chain risk.
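To make the two named tricks concrete, here is a minimal Python sketch of a pattern-based guard being sidestepped. It is not taken from any of the surveyed agents: the one-pattern denylist, the function names, and the simplified emulation of Bash word handling are all assumptions made for illustration.

```python
import re
import shlex

# Hypothetical one-pattern denylist standing in for a pattern-based guard.
DENYLIST = [re.compile(r"\brm\s+-rf\b")]


def raw_text_guard(command: str) -> bool:
    """Return True if the raw command text matches a denylisted pattern."""
    return any(pattern.search(command) for pattern in DENYLIST)


def emulate_bash_words(command: str) -> list[str]:
    """Rough approximation of what Bash ends up executing.

    Assumption: only ${IFS}/$IFS expansion (with the default IFS) and quote
    removal are emulated; real Bash performs many more expansions.
    """
    expanded = command.replace("${IFS}", " ").replace("$IFS", " ")
    return shlex.split(expanded)


plain = "rm -rf build"
quoted = "r''m -rf build"          # quote removal turns r''m into rm
spaced = "rm${IFS}-rf${IFS}build"  # ${IFS} expands to whitespace

if __name__ == "__main__":
    for cmd in (plain, quoted, spaced):
        print(cmd, "| blocked:", raw_text_guard(cmd), "| runs as:", emulate_bash_words(cmd))
```

Only the plain form is caught by the pattern, yet all three resolve to the same argument list once the shell has processed them.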

“If an engineer uses a vulnerable agent to read a poisoned README or Makefile from a malicious repository,” continues Ben Simon, “the agent can be tricked into silently executing commands that exfiltrate AWS credentials or wipe whole dev environments – especially in CI pipelines where ‘auto-yes’ modes are default.”

The full Adversa report explains, “We call the pattern GuardFall: bypasses against pattern-based shell guards in agentic coding tools, where Bash unwinds the obfuscation after the guard has let the command through.”


The trigger for the research was finding a NousResearch/hermes-agent approval gate bypass via shell rewrites against a 30-pattern regex denylist. This prompted Adversa to survey and examine the most popular open-source coding agents and computer use agents as of May 2026, based on GitHub star count and community activity.

Not all of the agents failed all of the Bash tricks used by Adversa, but the bottom line is that only one of the 11 tested agents blocked all of the tricks. The tricks are described under five ‘classes’ (A through E) within the report. Class E, the most successful, is described as “Alternative argv shapes for the same destructive effect.”

“Class E survives the most guards, including the strongest tokenized guard in our survey,” explains the report, “because per-flag reasoning requires knowing, for each binary, which flag combinations flip it from benign to destructive.”
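A rough sketch of what per-flag reasoning involves is shown below. The report’s own Class E cases are not reproduced in this article, so the two entries here (find with -delete, and git clean with -f) are illustrative assumptions, as are the table layout and the classify function.

```python
import shlex

# Hypothetical per-binary table: argument combinations that flip a normally
# benign binary into a destructive one. Real coverage would need far more.
DESTRUCTIVE_FLAGS = {
    "find": [{"-delete"}],
    "git": [{"clean", "-f"}],
}


def classify(command: str) -> str:
    """Return "ask" if the argv looks destructive, otherwise "allow"."""
    argv = shlex.split(command)
    if not argv:
        return "allow"
    binary, args = argv[0], set(argv[1:])
    if binary == "rm":
        return "ask"
    for combo in DESTRUCTIVE_FLAGS.get(binary, []):
        if combo <= args:  # every argument in the combination is present
            return "ask"
    return "allow"


if __name__ == "__main__":
    for cmd in ("find . -name '*.log'", "find . -delete", "git clean -f", "git clean -fd"):
        print(cmd, "->", classify(cmd))
```

Note that git clean -fd slips past this sketch because combined short flags are not decomposed, which hints at why the long tail of per-binary knowledge is hard to close.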

However, just as bugs can exist but be exploitable only under certain conditions, so these guard bypasses rely on their own preconditions. For example, they only work if the language model cooperates.

If you ask the AI model directly to “run this: rm” (where rm is a command that deletes files), the model will typically refuse, recognizing it as dangerous. But with indirect or disguised requests, perhaps contained within a Makefile target, the command is more likely to be accepted without objection.

The research examines whether commands embedded by an attacker in content that is ingested by the agent (from a malicious MCP server, a fetched web page, or any of several other sources) will be acted on by the agent. Too often, the answer is yes. The agent then emits a destructive shell command that runs with the operator’s authority – but only if auto-execute mode is on, or a sandbox is switched to local mode.

Exploiting GuardFall is a complex process, but complexity hasn’t stopped bad actors in the past. For the sake of their users, open source agent maintainers should close off such Bash tricks rather than rely on the difficulty of exploiting them.

Continue was the only agent able to maintain a guard against Adversa’s tests. “Of 21 bypass cases submitted to the evaluator, 0 reach allowedWithoutPermission, and all 12 canonical-destructive cases are correctly downgraded,” say the researchers. “The design is not perfect – Class C inside a quoted argument and the full long tail of Class E (per-argv-flag reasoning) remain open – but it is the only agent in our survey that closes the structural majority of the surface.”

The researchers studied how this was achieved, built on it, and developed their own set of recommendations to stop GuardFall and keep invisible Bash trickery out of the supply chain. Several of these involve guards placed around the agent.

For example, “Run agents from a scoped shell with $HOME redirected. A one-line wrapper (HOME=$HOME/.agent-sandbox-$RANDOM agent …) keeps the project directory but removes ~/.ssh/, ~/.aws/, shell history, and the other secrets in $HOME: the largest credential-exfiltration surface. This is the strongest stopgap because it is always-on and has no documented one-flag opt-out.”
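The same idea can be sketched in Python for teams that launch agents from a script rather than a shell alias. This is an assumption-laden illustration, not Adversa’s wrapper: the function names are hypothetical, and the throwaway directory is created in the system temp location by default rather than under the real $HOME.

```python
import os
import subprocess
import sys
import tempfile


def scoped_env(base_env=None, parent=None):
    """Copy the environment and point HOME at a fresh, empty directory.

    Assumption: `parent` is where the throwaway directory is created; None
    means the system temp directory.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HOME"] = tempfile.mkdtemp(prefix=".agent-sandbox-", dir=parent)
    return env


def run_scoped(argv, cwd=None):
    """Run a command in the project directory with the redirected HOME."""
    return subprocess.run(
        argv, env=scoped_env(), cwd=cwd,
        capture_output=True, text=True, check=False,
    )


if __name__ == "__main__":
    # Stand-in for launching an agent: print the HOME the child process sees.
    result = run_scoped([sys.executable, "-c", "import os; print(os.environ['HOME'])"])
    print(result.stdout.strip())
```

This sketch copies every other environment variable unchanged, so secrets exported as variables would still be visible to the child process, and programs that look up the home directory through the system account database rather than $HOME are not affected by it.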

Other options include disabling auto-yes modes, auditing repo-shipped configs, and blocking agent execution on fork PRs. In the end, however, these are all only stopgap solutions. “A guard inspects raw text, while system shell (Bash) expands, unquotes, and rewrites text before running it.” So, there is a mismatch between what the agent may think it is running, and what Bash actually runs. This is the structural gap exploited by Adversa’s Bash tricks.

The only long-term solution is for the open source agent maintainers to implement a Continue-style tokenize‑and‑canonicalize evaluator guard inside the agent itself.
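What such an evaluator might look like, in very reduced form, is sketched below in Python. It is not Continue’s implementation. The list of destructive binaries, the conservative rule of refusing to auto-run anything containing an expansion character, and the policy name allowedWithPermission are assumptions for illustration; only allowedWithoutPermission appears in the researchers’ description.

```python
import os
import shlex

ALLOW = "allowedWithoutPermission"
ASK = "allowedWithPermission"  # hypothetical name for "needs operator approval"

# Hypothetical short list; a real evaluator would need a far larger one.
DESTRUCTIVE_BINARIES = {"rm", "dd", "mkfs", "shred"}


def tokenize(command: str) -> list[str]:
    """Split into words the way a POSIX shell would, removing quotes and
    separating operators such as ; | & from the words around them."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


def evaluate(command: str) -> str:
    # Anything Bash would still expand after the check cannot be judged here.
    if any(ch in command for ch in "$`"):
        return ASK
    try:
        tokens = tokenize(command)
    except ValueError:  # for example, an unterminated quote
        return ASK
    # Deliberately conservative: a destructive binary name in any position
    # downgrades the command, at the cost of some false positives.
    for token in tokens:
        if os.path.basename(token) in DESTRUCTIVE_BINARIES:
            return ASK
    return ALLOW


if __name__ == "__main__":
    for cmd in ("ls -la", "r''m -rf build", "rm${IFS}-rf${IFS}build", "ls;rm -rf build"):
        print(cmd, "->", evaluate(cmd))
```

The design choice is to judge the words the shell will actually see rather than the raw text, and to fall back to asking the operator whenever the command cannot be fully resolved ahead of time.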


Written By Kevin Townsend

Kevin Townsend is a Senior Contributor at SecurityWeek. He has been writing about high tech issues since before the birth of Microsoft. For the last 15 years he has specialized in information security; and has had many thousands of articles published in dozens of different magazines – from The Times and the Financial Times to current and long-gone computer magazines.
