DeepSeek Compared to ChatGPT, Gemini in AI Jailbreak Test

Cisco has compared DeepSeek’s susceptibility to jailbreaks with that of other popular AI models, including those from Meta, OpenAI and Google.

Researchers at Cisco and Robust Intelligence, the AI security firm acquired by the tech giant last year, have tested DeepSeek and other popular AI models to determine how susceptible each is to jailbreaking and to compare the results.

The analysis, conducted in collaboration with the University of Pennsylvania, targeted DeepSeek R1, Meta’s Llama 3.1 405B, OpenAI’s GPT-4o and o1 (ChatGPT), Google’s Gemini 1.5 Pro, and Anthropic’s Claude 3.5 Sonnet.

The models were tested using the HarmBench benchmark, which covers hundreds of behaviors across seven categories: cybercrime, misinformation, chemical weapons, copyright violations, harassment, illegal activities, and general harm. Cisco ran an automatic jailbreaking algorithm on 50 prompts from HarmBench.

The tests showed that DeepSeek was the only model with a 100% attack success rate: every jailbreak attempt succeeded against the Chinese company’s model. In contrast, OpenAI’s o1 model saw a success rate of only 26%.

The attack success rate for the remaining AI models ranged between 36% and 96%.
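
For readers unfamiliar with the metric, the attack success rate is the share of jailbreak attempts that elicit the targeted harmful behavior. The Python sketch below is illustrative only and is not Cisco's evaluation code: the function name and the per-prompt outcome lists are hypothetical, and the counts are back-calculated from the reported percentages on the assumption that each of the 50 prompts is counted once.

```python
def attack_success_rate(outcomes):
    """Return the percentage of jailbreak attempts that succeeded.

    `outcomes` is a list of booleans, one per prompt (True = jailbroken).
    """
    if not outcomes:
        raise ValueError("need at least one attempt")
    return 100.0 * sum(outcomes) / len(outcomes)


# Hypothetical outcome lists, reconstructed from the reported percentages
# under the assumption of 50 prompts counted once each.
deepseek_r1 = [True] * 50                # 50 of 50 attempts succeeded
openai_o1 = [True] * 13 + [False] * 37   # 13 of 50 attempts succeeded

print(f"DeepSeek R1: {attack_success_rate(deepseek_r1):.0f}%")  # DeepSeek R1: 100%
print(f"OpenAI o1: {attack_success_rate(openai_o1):.0f}%")      # OpenAI o1: 26%
```

Under the same assumption, the 36% and 96% figures reported for the other models would correspond to 18 and 48 successful attempts out of 50.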

“Our findings suggest that DeepSeek’s claimed cost-efficient training methods, including reinforcement learning, chain-of-thought self-evaluation, and distillation may have compromised its safety mechanisms. Compared to other frontier models, DeepSeek R1 lacks robust guardrails, making it highly susceptible to algorithmic jailbreaking and potential misuse,” Cisco said.

DeepSeek’s AI model has been found to outperform its competitors in some areas. On the security side, however, several cybersecurity firms have reported in recent days that the model is susceptible to known jailbreak methods, including long-known techniques that have already been addressed in other models.

Researchers also demonstrated a few days ago that they were able to obtain DeepSeek’s full system prompt, which defines a model’s behavior, limitations, and responses, and which chatbots typically do not disclose in response to regular prompts. DeepSeek patched this exposure after being notified.

Written By

Eduard Kovacs (@EduardKovacs) is a managing editor at SecurityWeek. He worked as a high school IT teacher for two years before starting a career in journalism as Softpedia’s security news reporter. Eduard holds a bachelor’s degree in industrial informatics and a master’s degree in computer techniques applied in electrical engineering.
