Google has brought its AI assistant Gemini to millions of Workspace users worldwide, but indirect prompt injection flaws could enable phishing and chatbot takeover attacks, HiddenLayer says.
Indirect injections rely on delivering the prompt injection through channels such as documents, emails, and other assets the LLM has access to, with the purpose of taking over the chatbot or language model.
Gemini for Workspace, now integrated in the sidebars of Gmail, Meet, and the Drive suite, can help users on the fly with their queries, allowing them to search emails, summarize content, write replies, create slides, and overall streamline workflows.
While providing numerous advantages to users, Gemini for Workspace also exposes them to additional risks, including phishing, HiddenLayer argues.
Using indirect prompt injections delivered via emails, slides, and files on Drive, the AI security firm was able to manipulate the LLM to behave in certain ways, such as displaying a phishing message with a link.
To exploit Gemini in Gmail, HiddenLayer used control tokens to hijack the model’s output and forged an injection containing instructions and reminders that forced the LLM to display the phishing message.
The security firm also created an indirect prompt injection that could tamper with how Gemini parses slides, by asking the AI assistant to create a presentation and then injecting the payload in speaker notes on each slide.
The payload was designed to override any summarization of the document, and it was triggered as soon as Gemini was asked to summarize the presentation. HiddenLayer also notes that “Gemini in Slides attempts to summarize the document automatically the moment it is opened”.
After noticing that the Slides payloads would carry over to the Drive Gemini sidebar, the AI security firm discovered that Gemini in Drive acted as a typical Retrieval-Augmented Generation (RAG) instance and that it would fall to an indirect prompt injection attack carried out using documents.
HiddenLayer reported the findings to Google, but was informed that they were intended behavior and that no fixes were planned.
According to the security firm, however, this behavior is a major risk, as the assistant can be manipulated under certain conditions to produce misleading or unreliable responses.
“Through multiple proof-of-concept examples, we’ve demonstrated that attackers can manipulate Gemini for Workspace’s outputs in Gmail, Google Slides, and Google Drive, allowing them to perform phishing attacks and manipulate the chatbot’s behavior. While Google classifies these as ‘Intended Behaviors’, the vulnerabilities explored highlight the importance of being vigilant when using LLM-powered tools,” HiddenLayer says.
Related: The AI Wild West: Unraveling the Security and Privacy Risks of GenAI Apps
Related: Microsoft Details ‘Skeleton Key’ AI Jailbreak Technique
Related: Tech Companies Want to Build Artificial General Intelligence
Related: Vector Embeddings – Antidote to Psychotic LLMs and a Cure for Alert Fatigue?