A researcher has shown how malicious actors could create custom GPTs that can phish for user credentials and exfiltrate the stolen data to an external server.
Researchers Johann Rehberger and Roman Samoilenko independently discovered in the spring of 2023 that ChatGPT was vulnerable to a prompt injection attack that involved the chatbot rendering markdown images.
They demonstrated how an attacker could leverage image markdown rendering to steal potentially sensitive information from a user’s conversation with ChatGPT by getting the victim to paste apparently harmless but malicious content from the attacker’s website. The attack also works by asking ChatGPT to summarize the content from a website hosting specially crafted code. In both cases, the markdown image processed by the chatbot — which can be an invisible single-pixel image — is hosted on the attacker’s site.
ChatGPT creator OpenAI was informed about the attack method at the time, but said it was a feature that it did not plan on addressing.
Rehberger said similar issues were found in chatbots such as Bing Chat, Google’s Bard and Anthropic Claud, whose developers released fixes.
The researcher noticed this week that OpenAI has also started taking action to tackle the attack method. The mitigations have apparently only been applied to the web application — the attack still works on mobile apps — and they don’t completely prevent attacks. However, the researcher described it as a “step in the right direction”.
On December 12, before OpenAI started rolling out mitigations, Rehberger published a blog post describing how the image markdown injection issue can be exploited in combination with custom versions of ChatGPT.
OpenAI announced in November that Plus and Enterprise users of ChatGPT would be allowed to create their own GPT, which they can customize for specific tasks or topics.
Rehberger created a GPT named ‘The Thief’ that attempts to trick users into handing over their email address and password and then exfiltrates the data to an external server controlled by the attacker without the victim’s knowledge.
This GPT claims to play a game of Tic-tac-toe against the user and requires an email address for a ‘personalized experience’ and the user’s password as part of a ‘security process’. The provided information is then sent to the attacker’s server.
The researcher also showed how an attacker may be able to publish such a malicious GPT on the official GPTStore. OpenAI has implemented a system that prevents the publishing of GPTs that are obviously malicious.
SecurityWeek has reached out to OpenAI for comment on the security research and will update this article if the company responds.