How We Hacked McKinsey’s AI Platform: Autonomous Agent Finds SQL Injection in Lilli

An autonomous offensive security agent discovered critical SQL injection vulnerabilities in McKinsey’s internal AI platform Lilli, gaining full database access to 46.5 million chat messages and compromising the prompt layer that controls AI behavior. This case study demonstrates how AI-powered security testing can find vulnerabilities that traditional scanners miss.

McKinsey & Company built Lilli, an internal AI platform serving 43,000+ employees. The system processes 500,000+ prompts monthly, handling chat, document analysis, and AI-powered search across 100,000+ internal documents. Within two hours, our autonomous agent achieved full read-write access to the production database without credentials or human intervention.

The Attack Vector

The agent mapped McKinsey’s attack surface and discovered publicly exposed API documentation covering 200+ endpoints. While most required authentication, 22 endpoints remained unprotected.

One unprotected endpoint wrote user search queries to the database. The system safely parameterized values but concatenated JSON keys—the field names—directly into SQL statements. When the agent found JSON keys reflected in database error messages, it recognized a SQL injection vulnerability that standard tools missed.

The agent executed 15 blind iterations, each error message revealing more about the query structure. When live production data began flowing back, the agent’s logs showed: “WOW!” followed by “This is devastating” as the full scope became clear.

Compromised Data

The SQL injection exposed:

46.5 million chat messages containing strategy discussions, client engagements, financials, and M&A activity
728,000 files including 192,000 PDFs, 93,000 Excel spreadsheets, and 93,000 PowerPoint presentations
57,000 user accounts covering every employee on the platform
384,000 AI assistants and 94,000 workspaces revealing organizational AI usage patterns

Beyond the database, the agent discovered:

95 AI model configurations across 12 model types, exposing system prompts and guardrails
3.68 million RAG document chunks containing decades of proprietary McKinsey research
1.1 million files flowing through external AI APIs, including 266,000+ OpenAI vector stores

Prompt Layer Compromise

The SQL injection provided write access to system prompts—the instructions controlling AI behavior. These prompts defined how Lilli answered questions, enforced guardrails, and cited sources.

An attacker could modify these prompts through a single UPDATE statement, enabling:

Poisoned advice: Altering financial models or strategic recommendations
Data exfiltration: Embedding confidential information in AI responses
Guardrail removal: Stripping safety instructions to expose internal data
Silent persistence: Modifying AI behavior without leaving log trails

Unlike compromised servers, modified prompts create no file changes or process anomalies. The AI simply behaves differently until someone notices the damage.

Why Traditional Security Failed

McKinsey invested significantly in security infrastructure, yet their internal scanners missed this vulnerability. SQL injection represents one of the oldest attack vectors, but the agent succeeded because it doesn’t follow checklists.

Autonomous agents map, probe, chain, and escalate like skilled attackers—continuously and at machine speed. They recognize patterns that automated scanners miss and adapt their approach based on error messages and system responses.

The New Threat Landscape

AI prompts represent the new crown jewel assets. Organizations secure code, servers, and supply chains but rarely treat prompts with equivalent protection. Prompts lack access controls, version history, or integrity monitoring despite controlling output that employees trust and clients receive.

As AI systems become central to business operations, the prompt layer becomes a high-value target. Attackers who compromise prompts can influence decision-making, extract data, and maintain persistent access without traditional detection methods.

Disclosure and Response

McKinsey responded quickly to responsible disclosure:

February 28: Agent identifies vulnerability and documents findings
March 1: Disclosure email sent to McKinsey security team
March 2: McKinsey patches all unauthenticated endpoints and blocks public API documentation
March 9: Public disclosure

This case demonstrates that even well-resourced organizations with strong security teams remain vulnerable to AI-powered attacks. Autonomous agents represent the future of offensive security testing—finding vulnerabilities that traditional tools miss through adaptive, intelligent reconnaissance.

Organizations must evolve their security strategies to address AI-specific attack vectors, particularly prompt injection and manipulation. The threat landscape is shifting as AI agents autonomously select and attack targets, making continuous, intelligent security testing essential for modern defense.

How We Hacked McKinsey's AI Platform: Autonomous Agent Finds SQL Injection in Lilli