Google has opened a bounty program[1] focused solely on flaws that let artificial intelligence systems take unsafe or unwanted actions.
Google first added AI-related issues to its broader Vulnerability Reward Program (VRP) in October 2023. Over the past two years, researchers have earned more than $430,000 in AI-related rewards. The new dedicated AI Vulnerability Reward Program builds on these efforts with clearer rules and a focus on high-impact exploits.
The company will pay up to $30,000 for high‑impact reports that expose exploits capable of commanding AI agents to perform rogue actions, such as unlocking a smart lock indirectly or exfiltrating private data via an assistant’s summarization feature. The program sets a $20,000 base award for bugs that affect flagship services like Search, Gemini Apps, Gmail and Drive, and it offers bonuses for high quality reports and novel findings that can push the payout to $30,000.
Google said the program narrows in on a particular class of problem it calls AI bugs. Those are not just content errors or hallucinations. They are behaviors that use a large language model or generative system to harm users, corrupt accounts, change data, or otherwise exploit system tooling. The company explicitly separates model output that is only objectionable content from exploits that let an attacker change a user’s environment or siphon secrets. Content-based issues, including jailbreaks, direct prompt injections, or alignment problems, are intentionally out-of-scope for the AI VRP. Google encourages researchers to report these in-product, where teams can analyze model context, trends, and user metadata to implement long-term improvements.
The announcement came alongside a new AI agent named CodeMender that Google says can suggest patches for vulnerable code. According to Google, CodeMender has been used, under human review, to help deliver dozens of security fixes to open source projects. The company also pointed to the roughly $430,000 it has already awarded researchers since opening AI submissions within its broader vulnerability program two years ago.
Why security researchers say the money may not be enough
Security teams and independent researchers welcomed the focus on AI exploits. Still, some in the community argue the top reward does not match the market value of serious exploit chains. One reader pointed out that $30,000 will not always deter a small but motivated subset of finders from selling a working exploit to criminal buyers instead of reporting it. That critique reflects a wider tension. High end, chainable attacks that can turn assistant context into an active attack often require deep work and carry outsized real world value.
Past research shows exactly how dangerous chained inputs can be. In July[2] and again in October[3], independent write ups documented techniques that hide malicious instructions inside otherwise normal inputs, and researchers demonstrated how summarization and log features can be abused to deliver invisible prompts. One analysis found an attack that combined instructions injected into logs with crafted search history entries and the assistant’s browsing tools to siphon user data quietly to an attacker controlled server. Another report showed that hidden HTML and styling tricks placed inside emails can cause an assistant to repeat buried instructions inside a trusted summary. Those are the classes of flaws Google says it now wants researchers to bring forward.
What Google can do next
A focused bounty program is a step forward, but if the goal is to shrink the incentive to resell exploits, more will likely be needed. Below are practical moves Google could take to make the program more effective and safer.
- Raise maximum rewards for public reporting of chainable exploits that allow remote control or data exfiltration.
- Shorten triage timelines and publish a transparent pay and fix cadence so researchers see faster, reliable outcomes.
- Offer expanded safe harbor and legal clarity for researchers who follow program rules.
- Fund and run targeted hackathons with careful disclosure rules to surface complex multi-step attack chains.
- Publish anonymized case studies after fixes so defenders and other vendors can learn from real incidents.
Quick take
Google’s new AI bounty clarifies what the company cares about and signals that it is treating model tooling as a security frontier. The move also intersects with real incidents that have shown how summarization, logs, personalization and browsing can be abused when models treat hidden text or telemetry as commands. Whether the program will keep exploits out of criminal markets depends on money, speed, and trust. Until those pieces line up, some researchers will weigh private sale against public disclosure.
Notes: This post was edited/created using GenAI tools. Image: DIW-Aigen.
Read next: Instagram Revamps Map Feature with Full Control Over Location Sharing[4]
References
- ^ a bounty program (bughunters.google.com)
- ^ July (www.digitalinformationworld.com)
- ^ October (www.digitalinformationworld.com)
- ^ Instagram Revamps Map Feature with Full Control Over Location Sharing (www.digitalinformationworld.com)