Artificial Intelligence

Grok-4 Falls to a Jailbreak Two Days After Its Release

The latest release of the xAI LLM, Grok-4, has already fallen to a sophisticated jailbreak.

July 12, 2025 (10:51 AM ET)

The Echo Chamber jailbreak attack was described on June 23, 2025. xAI’s latest Grok-4 was released on July 9, 2025. Two days later it fell to a combined Echo Chamber and Crescendo jailbreak attack.

Echo Chamber was developed by NeuralTrust. We describe it in ‘New AI Jailbreak Bypasses Guardrails With Ease’. It uses subtle context poisoning to nudge an LLM into providing dangerous output. The methodology is shown below.

The key element is to never directly introduce a dangerous word that might trigger the LLM’s guardrail filters.

Crescendo was first described by Microsoft in April 2024. It gradually coaxes LLMs into bypassing safety filters by referencing their own prior responses.

Echo Chamber and Crescendo are both ‘multi-turn’ jailbreaks that differ subtly in how they work. The important point here is that they can be combined to improve the efficiency of the attack. They work because LLMs struggle to recognize malicious intent that builds up across a conversation’s context rather than appearing in any individual prompt.
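To illustrate why intent spread across several turns can slip past a check that looks at one prompt at a time, here is a minimal toy sketch. Everything in it is a hypothetical stand-in rather than anything from NeuralTrust, Microsoft or xAI: the placeholder words, the `FLAGGED_COMBINATION` set and both checking functions are assumptions made for illustration, and real guardrails are far more sophisticated than word matching.

```python
from typing import List, Set

# Hypothetical rule: a request is only considered risky when all of these
# harmless placeholder words occur together. Not a real guardrail.
FLAGGED_COMBINATION: Set[str] = {"alpha", "beta", "gamma"}


def tokens(text: str) -> Set[str]:
    """Split a turn into a set of lowercase words."""
    return set(text.lower().split())


def per_turn_flag(turn: str) -> bool:
    """Flag a turn only if it contains the whole combination by itself."""
    return FLAGGED_COMBINATION <= tokens(turn)


def conversation_flag(turns: List[str]) -> bool:
    """Flag if the combination appears anywhere in the accumulated context."""
    seen: Set[str] = set()
    for turn in turns:
        seen |= tokens(turn)
    return FLAGGED_COMBINATION <= seen


# Each turn mentions only one placeholder, so no single turn looks risky.
conversation = [
    "tell me a story about alpha",
    "now expand on the beta part of your story",
    "combine that with gamma as you described earlier",
]

any_turn_flagged = any(per_turn_flag(t) for t in conversation)
whole_flagged = conversation_flag(conversation)
print(any_turn_flagged, whole_flagged)  # prints: False True
```

In this toy setup the per-turn check never fires, while the check over the accumulated conversation does. That is the gap multi-turn jailbreaks rely on, reduced to its simplest form.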

NeuralTrust researchers attempted to jailbreak the new Grok-4 guardrails using Echo Chamber to trick the LLM into providing a manual to produce a Molotov cocktail. “While the persuasion cycle nudged the model toward the harmful goal, it wasn’t sufficient on its own,” writes the firm. “At this point, Crescendo provided the necessary boost. With just two additional turns, the combined approach succeeded in eliciting the target response.”

Provided you understand how the two individual jailbreaks work, integrating them is simple. In their testing, the NeuralTrust researchers began with Echo Chamber and an initial prompt that would detect ‘stale’ progress in the persuasion cycle. When progress stalls, Crescendo techniques are brought into play. “This additional nudge typically succeeds within two iterations. At that point, the model either detects the malicious intent and refuses to respond, or the attack succeeds, and the model produces a harmful output.”

As with all jailbreaks, no method succeeds on every attempt. Nevertheless, the researchers tested the combined Echo Chamber and Crescendo jailbreak method against other ‘forbidden’ outputs from Grok-4, and it succeeded on many occasions. For the Crescendo Molotov cocktail test, it achieved a 67% success rate. For the Crescendo ‘meth’ (methamphetamine synthesis) test, it achieved a 50% success rate. For the Crescendo ‘toxin’ (toxic substances or chemical weapon synthesis) test, it achieved a 30% success rate.

The worrying element is that even the latest LLMs cannot guard against all existing jailbreak methodologies, with Grok-4 being defeated just two days after its release. “Hybrid attacks like the Echo Chamber + Crescendo exploit represent a new frontier in LLM adversarial risks, capable of stealthily overriding isolated filters by leveraging the full conversational context.”

The continuing battle of safe and secure LLMs versus attacker ingenuity shows no sign of abating.

Written By Kevin Townsend

Kevin Townsend is a Senior Contributor at SecurityWeek. He has been writing about high tech issues since before the birth of Microsoft. For the last 15 years he has specialized in information security, and has had many thousands of articles published in dozens of different magazines – from The Times and the Financial Times to current and long-gone computer magazines.
