Taking an unusual approach to a thorny security problem, Google plans to add a second AI model specifically to babysit the first one. Chrome’s new Gemini-powered assistant might soon browse the web on your behalf. But there’s a catch: artificial intelligence models are remarkably easy to trick. Google’s solution? Fight AI with more AI.
The tech giant recently added a Gemini-powered chat feature to Chrome, with plans to let it control your browser autonomously. Sounds convenient? Just tell the AI what you need, and it does the browsing for you. But this convenience opens a dangerous door that security experts have been warning about for months.
The danger is a threat known as “indirect prompt injection,” which arises when AI agents browse the internet independently. Here is how it works: a malicious website embeds hidden instructions that override the agent’s safety guidelines. Imagine your bot stumbling upon a rigged page and following its instructions to ignore your original request, siphon your bank account, or steal your passwords.
Google’s Solution to the “Primary New Threat Facing All Agentic Browsers”
According to Nathan Parker, a Chrome security engineer, indirect prompt injection is “the primary new threat facing all agentic browsers.” An attack can be launched from anywhere: malicious websites, embedded content, or even user reviews that seem perfectly innocuous. And once launched, it can have the AI perform financial transactions or leak sensitive information without your knowledge.
The threat is serious enough that Gartner, a major IT consultancy, recently advised companies to block AI browsers entirely. For Google, which has poured billions into AI infrastructure, that’s not exactly the response they were hoping for.
Rather than dial back, Google is doubling down with a weird solution: they’re adding a second AI model whose sole job is keeping the first one honest. They call it the “User Alignment Critic.”
Think of it as an AI supervisor. After the main Gemini agent has decided what it wants to do, the Critic checks each step before it goes to execution. Its mission is simple: does this action actually help accomplish what the user asked for? If not, the Critic vetoes it.
Google designed this oversight model explicitly so that attackers cannot corrupt it through malicious content. It is kept separate from the messy web pages the main agent visits, which in theory makes it immune to the same tricks that could fool the browsing AI.
This method of having one AI moderate another has recently taken off. Developer Simon Willison first floated the idea in 2023; Google DeepMind codified it in a research paper earlier this year under the technical name “CaMeL” (Capabilities for Machine Learning).
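To make the idea concrete, here is a minimal Python sketch of the check-before-execute loop described above. It is an illustration only, not Google’s implementation: the names Task, ProposedAction, critic_approves, and run_plan are hypothetical, a simple rule check stands in for the real critic model, and the sketch assumes the critic sees only the user’s goal and metadata about each proposed action, never the page content itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """What the user asked for (hypothetical shape, not Chrome's data model)."""
    goal: str
    allowed_kinds: frozenset     # action types the goal plausibly needs
    allowed_origins: frozenset   # sites the goal plausibly involves


@dataclass(frozen=True)
class ProposedAction:
    """Metadata about one step the browsing agent wants to take."""
    kind: str            # e.g. "navigate", "click", "submit_payment"
    target_origin: str   # e.g. "https://shop.example"


def critic_approves(task: Task, action: ProposedAction) -> bool:
    """Stand-in for the critic model: a rule check instead of an AI.

    It receives only the task and the action metadata, never page text,
    so instructions hidden in a web page cannot reach it.
    """
    return (action.kind in task.allowed_kinds
            and action.target_origin in task.allowed_origins)


def run_plan(task: Task, plan: list) -> tuple:
    """Review every step before execution; return (executed, vetoed)."""
    executed, vetoed = [], []
    for action in plan:
        if critic_approves(task, action):
            executed.append(action)   # a real agent would act here
        else:
            vetoed.append(action)     # assumption: vetoed steps are skipped
    return executed, vetoed


task = Task(
    goal="Add a phone charger to my cart on shop.example",
    allowed_kinds=frozenset({"navigate", "click"}),
    allowed_origins=frozenset({"https://shop.example"}),
)
plan = [
    ProposedAction("navigate", "https://shop.example"),
    ProposedAction("submit_payment", "https://evil.example"),  # injected step
]
executed, vetoed = run_plan(task, plan)
print(f"executed: {len(executed)}, vetoed: {len(vetoed)}")
```

In this toy run the navigation step passes and the injected payment step is vetoed, because it serves neither the action types nor the sites implied by the user’s request.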
Google’s AI Defenses: Origin Isolation, Transparency, and User Approval in Chrome
The Critic is not Google’s only defense. Chrome is also extending its origin-isolation technology to AI interactions. Web security depends on keeping each website’s data separate, and Chrome enforces that with Site Isolation. Google has adapted that principle for AI agents: a mechanism called Agent Origin Sets prevents the AI from mixing data across different sites at will.
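The principle can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Chrome’s actual mechanism: the class AgentOriginSet and the helper origin_of are hypothetical names, and the rule shown here (the agent may only touch origins that belong to the current task) is a simplification of whatever Agent Origin Sets really enforce.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> str:
    """Reduce a URL to its origin: scheme, host, and non-default port."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(scheme):
        return f"{scheme}://{host}"
    return f"{scheme}://{host}:{port}"


class AgentOriginSet:
    """Hypothetical allow-list of origins tied to one user task."""

    def __init__(self, task_urls):
        self._origins = frozenset(origin_of(u) for u in task_urls)

    def permits(self, url: str) -> bool:
        """True if the URL's origin belongs to the current task."""
        return origin_of(url) in self._origins

    def may_transfer(self, source_url: str, destination_url: str) -> bool:
        """Data may move only between origins that are both in the set."""
        return self.permits(source_url) and self.permits(destination_url)


origins = AgentOriginSet(["https://shop.example/cart"])
print(origins.permits("https://shop.example/checkout"))
print(origins.may_transfer("https://shop.example/cart",
                           "https://evil.example/collect"))
```

The second check fails because evil.example was never part of the task, which is the kind of cross-site mixing the article describes the feature as preventing.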
Another important element is transparency. Before taking you to financial sites, health sites, or other places that handle confidential information, the AI will first request approval.
If it’s able to use Google Password Manager to sign you in automatically, it will also ask for confirmation. In high-stakes situations, like completing a purchase or sending an actual message, the agent will need to get your explicit approval or simply return control to you for that last step.
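A confirmation gate of this kind might look like the following Python sketch. The action labels, the SENSITIVE_ACTIONS set, and the perform function are all hypothetical and chosen only to mirror the cases named above; how Chrome actually classifies actions and prompts the user is not described in this article.

```python
from typing import Callable

# Assumed labels for the cases the article lists; not Chrome identifiers.
SENSITIVE_ACTIONS = frozenset({
    "visit_sensitive_site",       # financial, health, or similar sites
    "password_manager_sign_in",
    "complete_purchase",
    "send_message",
})


def perform(action: str, ask_user: Callable[[str], bool]) -> str:
    """Run routine actions directly; gate sensitive ones on user approval.

    ask_user is a callback that shows a prompt and returns True if the
    user approves. When approval is refused, control goes back to the user.
    """
    if action in SENSITIVE_ACTIONS and not ask_user(action):
        return "handed_back_to_user"
    return "executed"


def always_decline(action: str) -> bool:
    """Test double for a user who refuses every prompt."""
    return False


print(perform("scroll_page", always_decline))
print(perform("complete_purchase", always_decline))
```

Routine steps never reach the prompt, so the user is only interrupted for the small set of actions that carry real consequences.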
Security Challenges Drive Google’s New VRP
Google knows it needs outside help to stress-test these security measures. It has refreshed its Vulnerability Rewards Program to cover AI security vulnerabilities, offering as much as $20,000 to researchers who can demonstrate serious breaches of the system’s security boundaries.
The Chrome developers have already integrated some origin isolation features in today’s browser builds, with more agentic capabilities coming online in future releases.
But what Google is proposing exposes, in equal measure, both the promise and the peril of AI agents. Automating complex browsing tasks genuinely could make our digital lives a lot easier. But the security challenges are unprecedented. We’re essentially teaching software to act on our behalf while simultaneously trying to prevent that same software from being hijacked by bad actors.
Whether adding a second AI to supervise the first one proves effective remains to be seen. Researchers will, without question, continue to probe such systems for weaknesses, and the $20,000 bounty suggests Google expects them to find some. For the time being, Chrome users can simply stand back and watch this AI arms race unfold, hopefully at a safe distance.




