
Your AI Chats Were Sent to Strangers on Google Cloud

Google Cloud's Vertex AI had a critical flaw that sent AI responses to wrong users. CVE-2025-11915 affected Claude, Llama & more.


A critical security vulnerability in Google Cloud Platform's Vertex AI service allowed users' AI-generated responses to be routed to the wrong recipients, exposing potentially sensitive data across multiple enterprise customers for nearly a week before being discovered and patched.

Google disclosed the incident, revealing that a technical flaw in their Vertex AI API caused responses from third-party AI models—including Anthropic's Claude, Meta's Llama, and various OpenAI models—to be misrouted between users when streaming requests were made. 

The vulnerability, now tracked as CVE-2025-11915 with a medium severity rating, stemmed from improper handling of HTTP requests containing the "Expect: 100-continue" header.

The issue is a classic HTTP desynchronization (desync) flaw: internal proxies failed to correctly process a specific HTTP header, breaking the one-to-one pairing between streaming requests and responses. When one user's request included the problematic header, the response intended for that request could be delivered to the next user in the queue instead, a condition security researchers call "response queue poisoning."
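The failure mode can be sketched as a toy simulation (a hypothetical model, not Google's actual proxy code): the proxy pairs backend responses with waiting clients strictly first-in-first-out, so a single mis-parsed request that produces a stray response shifts every subsequent pairing by one.

```python
from collections import deque

def proxy_dispatch(requests, backend_handles_expect=True):
    """Toy model of a reverse proxy matching backend responses to clients FIFO.

    Each request is (client, payload, has_expect_header). The backend emits
    one response per request. If it mishandles 'Expect: 100-continue' and
    emits an extra interim reply that the proxy counts as a full response,
    the queue shifts: later clients receive answers meant for earlier ones
    ("response queue poisoning").
    """
    waiting = deque(client for client, _, _ in requests)  # clients awaiting replies
    responses = deque()                                   # backend's output stream
    for client, payload, has_expect in requests:
        if has_expect and not backend_handles_expect:
            # Stray interim reply the proxy mistakes for a final response.
            responses.append("100-continue-fragment")
        responses.append(f"response-for-{client}")

    delivered = {}
    while waiting and responses:
        delivered[waiting.popleft()] = responses.popleft()  # blind FIFO pairing
    return delivered

reqs = [("alice", "prompt-A", True), ("bob", "prompt-B", False)]
ok = proxy_dispatch(reqs, backend_handles_expect=True)
bad = proxy_dispatch(reqs, backend_handles_expect=False)
print(ok["bob"])   # response-for-bob
print(bad["bob"])  # response-for-alice -- bob sees alice's response
```

The key point is that the proxy never inspects response contents; once the counting is off by one, every later user in the connection's queue silently receives the previous user's data.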

How widespread was the exposure? While Google described the impact as affecting "a limited amount of responses," the vulnerability affected multiple popular AI models across Vertex AI's Model-as-a-Service offerings. 

Critically, Google's own Gemini models were not impacted, as the flaw only affected third-party models accessed through public endpoints.

The remediation timeline reveals the complexity of the issue: Anthropic's Claude models were fixed first, followed by open-source models like DeepSeek and Llama, and finally all remaining affected services including Mistral and self-deployed models. Notably, customers using dedicated or private endpoints were never vulnerable.

Why HTTP desync attacks are so dangerous

Security expert James Kettle from PortSwigger has warned that HTTP/1.1's inherent design flaws around request boundaries make these attacks particularly insidious. The "Expect: 100-continue" header, designed to optimize large file uploads by letting servers signal readiness before receiving data, becomes a weapon when proxies and backend servers interpret it differently.
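To see why this header invites disagreement between hops, it helps to look at the two-phase exchange it creates. The sketch below (illustrative only, using raw sockets rather than any real proxy stack) shows a client sending headers first, waiting for the server's interim "100 Continue", and only then transmitting the body; a proxy that miscounts that interim reply as a final response is exactly the desync condition described above.

```python
import socket
import threading

def tiny_server(sock):
    """Accept one connection and perform the Expect: 100-continue dance."""
    conn, _ = sock.accept()
    with conn:
        buf = b""
        while b"\r\n\r\n" not in buf:      # read headers up to the blank line
            buf += conn.recv(1024)
        if b"Expect: 100-continue" in buf:
            # Interim (non-final) response: "go ahead, send the body."
            conn.sendall(b"HTTP/1.1 100 Continue\r\n\r\n")
        conn.recv(1024)                    # the body arrives only now
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]
threading.Thread(target=tiny_server, args=(srv,), daemon=True).start()

cli = socket.create_connection(("127.0.0.1", port))
cli.sendall(
    b"POST /upload HTTP/1.1\r\nHost: example\r\n"
    b"Content-Length: 5\r\nExpect: 100-continue\r\n\r\n"
)
interim = cli.recv(1024)              # interim 100 Continue
cli.sendall(b"hello")                 # body sent only after the go-ahead
final = cli.recv(1024)                # the actual final response
print(interim.split(b"\r\n")[0])      # b'HTTP/1.1 100 Continue'
print(final.split(b"\r\n")[0])        # b'HTTP/1.1 200 OK'
```

A proxy sits between these two messages: if it treats the interim `100 Continue` as the final response to the request, its internal response queue is now off by one for every request that follows on the same backend connection.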

Google has implemented comprehensive fixes including proper header handling, extensive testing protocols, and real-time monitoring systems to detect any recurrence. However, the incident highlights critical risks in cloud AI infrastructure, where data isolation failures can expose confidential conversations, proprietary prompts, or sensitive business information to unauthorized parties.

What should users do? Google states no customer action is required, as all fixes have been deployed server-side. However, organizations using Vertex AI during the affected period (September 23-28, 2025) should review their security logs and assess whether any sensitive information may have been exposed through streaming API requests.
