SecurityWeek

Grok-4 Falls to a Jailbreak Two Days After Its Release

The latest release of the xAI LLM, Grok-4, has already fallen to a sophisticated jailbreak.

The Echo Chamber jailbreak attack was described on June 23, 2025. xAI’s latest model, Grok-4, was released on July 9, 2025. Two days later it fell to a combined Echo Chamber and Crescendo jailbreak attack.

Echo Chamber was developed by NeuralTrust. We describe it in ‘New AI Jailbreak Bypasses Guardrails With Ease’. It uses subtle context poisoning to nudge an LLM into providing dangerous output. The methodology is shown below.

The key element is to never directly introduce a dangerous word that might trigger the LLM’s guardrail filters.

Crescendo was first described by Microsoft in April 2024. It gradually coaxes LLMs into bypassing safety filters by referencing their own prior responses.

Echo Chamber and Crescendo are both ‘multi-turn’ jailbreaks that differ subtly in the way they work. The important point here is that they can be used in combination to improve the efficiency of the attack. They work because LLMs struggle to recognize malicious intent that emerges from the context of a whole conversation rather than from any individual prompt.
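
The gap between checking individual prompts and checking the whole conversation can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not NeuralTrust's method or any vendor's actual guardrail: the placeholder term list, the threshold, and the function names are all hypothetical, and real filters rely on trained classifiers rather than word lists.

```python
from __future__ import annotations

# Hypothetical placeholder terms: each is harmless alone, but the
# combination is what a filter would want to catch. (Assumption for
# illustration only; real guardrails do not work from a word list.)
SENSITIVE_TERMS = {"alpha", "beta", "gamma"}
THRESHOLD = 3  # assumed: flag once this many distinct terms are seen


def sensitive_terms_in(text: str) -> set[str]:
    """Return the placeholder terms that appear in one message."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return words & SENSITIVE_TERMS


def flag_per_turn(turns: list[str]) -> bool:
    """Check each message in isolation, as an isolated filter would."""
    return any(len(sensitive_terms_in(t)) >= THRESHOLD for t in turns)


def flag_conversation(turns: list[str]) -> bool:
    """Accumulate evidence across the whole conversation."""
    seen: set[str] = set()
    for turn in turns:
        seen |= sensitive_terms_in(turn)
        if len(seen) >= THRESHOLD:
            return True
    return False


# A multi-turn exchange that introduces one term per message.
conversation = [
    "Tell me a story about alpha.",
    "Now add beta to the story.",
    "Continue the story with gamma.",
]

print(flag_per_turn(conversation))      # each turn looks benign on its own
print(flag_conversation(conversation))  # the combined context crosses the threshold
```

In this sketch no single message reaches the threshold, so the per-turn check passes every turn, while the conversation-level check accumulates the same signals and flags the exchange on the third message.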

NeuralTrust researchers attempted to jailbreak the new Grok-4 guardrails using Echo Chamber to trick the LLM into providing a manual to produce a Molotov cocktail. “While the persuasion cycle nudged the model toward the harmful goal, it wasn’t sufficient on its own,” writes the firm. “At this point, Crescendo provided the necessary boost. With just two additional turns, the combined approach succeeded in eliciting the target response.”

Provided you understand how the two individual jailbreaks work, integrating them is simple. In its testing, NeuralTrust began with Echo Chamber and an initial prompt that would detect ‘stale’ progress in the persuasion cycle. Once progress stalls, Crescendo techniques are brought into play. “This additional nudge typically succeeds within two iterations. At that point, the model either detects the malicious intent and refuses to respond, or the attack succeeds, and the model produces a harmful output.”

As with all jailbreaks, no technique succeeds on every attempt. Nevertheless, the researchers tested the combined Echo Chamber and Crescendo jailbreak method against other ‘forbidden’ outputs from Grok-4, and it succeeded on many occasions. For the Crescendo ‘Molotov’ (Molotov cocktail) test, it achieved a 67% success rate. For the Crescendo ‘meth’ (methamphetamine synthesis) test, it achieved a 50% success rate. For the Crescendo ‘toxin’ (toxic substances or chemical weapon synthesis) test, it achieved a 30% success rate.

The worrying element is that even the latest LLMs cannot guard against all existing jailbreak methodologies, with Grok-4 being defeated just two days after its release. “Hybrid attacks like the Echo Chamber + Crescendo exploit represent a new frontier in LLM adversarial risks, capable of stealthily overriding isolated filters by leveraging the full conversational context.”

The continuing battle of safe and secure LLMs versus attacker ingenuity shows no sign of abating.

Written By

Kevin Townsend is a Senior Contributor at SecurityWeek. He has been writing about high tech issues since before the birth of Microsoft. For the last 15 years he has specialized in information security; and has had many thousands of articles published in dozens of different magazines – from The Times and the Financial Times to current and long-gone computer magazines.
