
Mercor Breach: 4 TB of Biometric Data You Can't Rotate

Mercor, the $10B AI startup training models for OpenAI and Anthropic, fell to the LiteLLM supply chain attack. Lapsus$ claims video interviews, face scans, and passports from 30,000+ contractors.


Ricardo Argüello

CEO & Founder

AI & Automation · 10 min read

Six days ago, we wrote about the LiteLLM attack. We explained how TeamPCP compromised a security scanner to poison an AI API key proxy with 97 million monthly downloads. We said: audit your trust chain. Map every intermediary.

Today we know the name of the first $10B victim.

Mercor confirmed on March 31 that it was hit by the LiteLLM supply chain attack. According to Neowin, the company said it was “one of thousands of companies” affected. The Lapsus$ group claims 4 TB of stolen data.

We already covered the attack mechanism. What was missing was seeing the real consequences — and understanding that there’s a type of stolen data with no remediation path.

What Mercor Has That Most Companies Don’t

Mercor is an AI recruiting startup founded in 2023 by three 22-year-olds. In October 2025, it raised a $350 million Series C led by Felicis Ventures that valued it at $10 billion. Its business: connecting specialized contractors — scientists, doctors, lawyers, engineers — with companies that need to train AI models. Among its clients: OpenAI, Anthropic, and Google DeepMind.

To verify contractor identities, Mercor collects video interviews with face and voice data, KYC documents, and passports. It manages over 30,000 contractors and processes $2 million in daily payouts.

That data combination is what makes this breach different from a typical credential theft. It’s not a database of hashed emails and passwords. It’s a repository that contains, according to CybersecurityNews citing Lapsus$ claims: 939 GB of platform source code, 211 GB of database records with resumes and personal data, and roughly 3 TB of stored files including video interviews, face scans, and identity documents.

One company. Thirty thousand people. All in one place.

The Chain Broke Exactly Where We Said It Would

In our March 25 article on the LiteLLM attack, we wrote:

When was the last time someone on your team verified that pip install package actually installs the code from the GitHub repository?

And:

If your security posture for AI infrastructure is “install the popular package and move on,” you’re in the majority. And the majority just got hit.

Mercor used LiteLLM to manage connections to multiple AI providers — OpenAI, Anthropic, and others — through a unified interface. When TeamPCP poisoned versions 1.82.7 and 1.82.8 on PyPI, the malware entered as a legitimate dependency. Nobody needed to import it manually. The .pth file executed every time Python started.
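The .pth mechanism is worth understanding because it is legitimate CPython behavior: the site module executes any line in a site-packages .pth file that begins with "import", on every interpreter startup, before your code runs. Here is a minimal audit sketch that flags such lines — the directory layout and script name are illustrative, not from the Mercor incident:

```python
import pathlib
import site


def suspicious_pth_lines(site_dirs=None):
    """Flag .pth lines that execute code at interpreter startup.

    CPython's site module runs any .pth line starting with "import"
    every time Python starts -- the hook a poisoned package can abuse
    to execute without ever being imported by your code.
    """
    dirs = site_dirs if site_dirs is not None else site.getsitepackages()
    hits = []
    for d in dirs:
        for pth in sorted(pathlib.Path(d).glob("*.pth")):
            for line in pth.read_text(errors="ignore").splitlines():
                if line.startswith("import ") or line.startswith("import\t"):
                    hits.append((pth.name, line.strip()))
    return hits


if __name__ == "__main__":
    for name, line in suspicious_pth_lines():
        print(f"{name}: {line}")
```

Not every hit is malicious — some legitimate tools register startup hooks this way — but every hit deserves a human looking at it.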

I won’t repeat the full attack anatomy — it’s in the LiteLLM article. What matters here is the scale of the consequence: a single compromised dependency led to full infrastructure access at a $10B company.

Mercor said it was “one of thousands.” That means the LiteLLM attack’s blast radius is still materializing. Companies that didn’t audit their trust chain last week are simply waiting their turn to confirm.

Biometric Data: The Breach With No Patch

When an attacker steals credentials, there’s a clear procedure:

| Data type | Remediation | Time |
| --- | --- | --- |
| Passwords | Rotate | Minutes |
| API keys | Revoke and reissue | Minutes |
| Session tokens | Invalidate | Seconds |
| Credit cards | Cancel and reissue | Days |
| Face scan | | Permanent |
| Voice recording | | Permanent |
| Passport photo | | Permanent |

The first four rows have a remediation and a timeline. The last three don’t. There is no “rotate” button for your face.

If Lapsus$ has 3 TB of stored files including video interviews used for identity verification, the affected contractors face a problem that no security team can remediate. A face scan combined with a voice recording is exactly the material needed to generate convincing deepfakes. We’re no longer talking about phishing with fake emails — we’re talking about video calls where “the person” speaking looks and sounds exactly like the real contractor.

And these aren’t random contractors. They’re scientists, doctors, and lawyers working with the most important AI labs in the world. Their compromised identities aren’t just a problem for them — they’re an attack vector against the organizations they work for.

Think about the concrete scenario: a Mercor contractor trains models for an AI lab. An attacker has their video interview with biometric data, their full resume, and potentially their passport. With that, they can generate a convincing deepfake, access the lab’s systems using the contractor’s identity, and compromise the model training pipeline. The chain goes from Mercor to the contractor, from the contractor to the lab, from the lab to the model, from the model to millions of users.

The world’s largest password breach gets resolved with a mass reset and a press release. A biometric data breach doesn’t have that luxury.

AI Companies as High-Value Targets

Mercor isn’t just a company with bad dependency hygiene. It’s a central node in the AI data supply chain.

AI companies are exceptionally valuable targets because of a unique convergence of factors that don’t apply to conventional businesses. The data they handle is training data — not just static PII, but the material that defines how models behave. If an attacker accesses training data, they can understand and manipulate the behavior of models used by millions of people.

The concentration of specialized talent makes it worse. Mercor’s 30,000+ contractors include people with access to proprietary information from multiple AI labs. Their professional profiles, combined with identity data, create a map of who knows what across the AI ecosystem.

And then there’s the interconnection. Mercor works with OpenAI, Anthropic, and Google DeepMind. A breach at Mercor doesn’t stay at Mercor — it’s a breach at the periphery of every AI lab it works with.

This is a pattern we’ve seen in other sectors: the attack doesn’t go to the castle directly — it goes to the service provider who holds the keys to several castles at once. In the AI world, contractor recruitment and training platforms are exactly that provider. And most of them don’t have the security posture that the value of their data demands.

What’s Confirmed vs. What’s Alleged

Every time a security incident goes viral on social media, the narrative grows faster than the verified facts. In this case, the gap between what’s confirmed and what’s circulating on X is significant.

What we know for certain: Mercor publicly confirmed the breach and tied it to the LiteLLM supply chain attack (TechCrunch). The company said it was “one of thousands of companies” affected (Neowin). It stated it acted “promptly” and engaged third-party forensics experts. The LiteLLM attack mechanism was independently verified as CVE-2026-33634 with a CVSS score of 9.4. And the Lapsus$ group publicly claimed responsibility, alleging possession of 4 TB of data (CybersecurityNews, TechStartups).

Now, what hasn’t been confirmed — and is being treated as fact anyway.

The most repeated claim is that Mercor developers handed production credentials to an AI chatbot. It comes from a viral X post by Aakash Gupta. TechStartups qualified it: “posts tied to the incident suggest a developer may have exposed production credentials through an AI coding assistant.” But neither TechCrunch, nor CybersecurityNews, nor Mercor itself have confirmed this detail.

The same applies to the specific data breakdown (939 GB source code, 211 GB database, ~3 TB files), the alleged full Tailscale VPN access, and the exact types of stolen files. All of that comes from Lapsus$ claims, not from Mercor’s own disclosures.

Why dedicate an entire section to this? Because security decisions should be based on verified facts. What IS a fact: Mercor was compromised via LiteLLM. What remains unverified: the exact scale and the specific vector through which credentials were exposed.

The Recurring Pattern: AI Agents and Unaudited Installs

Regardless of whether the specific AI chatbot credential claim is true for Mercor, the general pattern is real and documented.

In our article on npm’s worst day, we cited Andrej Karpathy: “I can’t help but feel I’m playing Russian roulette with every pip install or npm install (which the LLMs also freely execute on my behalf).”

AI agents and code assistants are installing dependencies without reviewing diffs, resolving to the latest version by default, and executing at machine speed without anyone verifying what enters the environment. That scenario turns a supply chain attack into an industrial-scale breach.

The risk isn’t hypothetical. Last week we covered how the axios attack hit anyone who ran npm install within a three-hour window. With LiteLLM it was pip install. In both cases, an AI agent running automated installations would have introduced the malware without any human review.
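One concrete countermeasure to agents resolving “latest” is to assert the installed environment against an explicit, human-audited pin list before anything runs. A minimal sketch using the standard library — the package names and “known-good” versions below are placeholder assumptions, not verified safe releases:

```python
from importlib import metadata

# Hypothetical known-good pins -- replace with versions your team
# has actually audited.
PINNED = {
    "litellm": "1.82.6",
    "requests": "2.32.3",
}


def verify_pins(pins):
    """Return {package: installed_version_or_None} for every drift.

    A value of None means the package is not installed at all;
    any other value is an installed version that differs from the pin.
    """
    mismatches = {}
    for pkg, wanted in pins.items():
        try:
            installed = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[pkg] = installed
    return mismatches


if __name__ == "__main__":
    drift = verify_pins(PINNED)
    print("OK" if not drift else f"Drifted or missing: {drift}")
```

Run as a CI gate or a pre-deploy check, this turns “an agent silently upgraded a dependency” from an invisible event into a failed build. For stronger guarantees, pip’s hash-checking mode (`--require-hashes`) pins the exact artifact, not just the version number.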

The combination of poisoned dependencies + AI agents with broad permissions + accessible production credentials is exactly the recipe for what happened to Mercor. You don’t need to know whether a specific chatbot exposed the credentials. You only need to know whether that configuration exists on your team — and that the next LiteLLM is already in preparation.

What Changes in Your Incident Response Plan

Our LiteLLM article covered prevention: pin versions, isolate credentials, audit the trust chain. This article is about what comes after. Now that a named victim exists with compromised biometric data, your incident response plan needs to cover scenarios it probably doesn’t.

Start with a biometric data inventory. Does your company collect video interviews, face scans, or identity documents from candidates or contractors? Do you store them yourself or does a third party? If you use platforms like Mercor, Turing, Toptal, or any AI contractor marketplace, you need to know exactly what biometric data they hold on your people. You can’t respond to a breach if you don’t know what was exposed.
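The inventory can be as simple as one record per third-party platform, tagged with the data categories it holds, plus a check for which categories have no remediation path. A toy sketch — platform names and category labels here are invented for illustration:

```python
from dataclasses import dataclass, field

# Data categories with no remediation path if leaked (illustrative set).
IRREVOCABLE = {"video_interview", "face_scan", "voice_recording", "passport"}


@dataclass
class ThirdPartyPlatform:
    """One row of a third-party identity-data inventory."""
    name: str
    data_held: set = field(default_factory=set)

    def irrevocable_exposure(self):
        """Held categories that cannot be rotated, revoked, or reissued."""
        return self.data_held & IRREVOCABLE


inventory = [
    ThirdPartyPlatform("contractor-marketplace", {"video_interview", "passport", "resume"}),
    ThirdPartyPlatform("payroll-provider", {"bank_account", "tax_id"}),
]

for platform in inventory:
    print(platform.name, sorted(platform.irrevocable_exposure()))
```

The point isn’t the code — it’s that this table should exist somewhere before the breach notification email arrives, not after.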

Then map your concentration risk. How many of your AI dependencies flow through a single package or provider? If one compromised component gives access to your entire infrastructure — as apparently happened with Mercor — your blast radius is unlimited. At IQ Source, we map each client’s critical dependencies and assess what happens if each one is individually compromised.
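Blast radius from a single compromised package is a reverse-reachability question over your dependency graph. A sketch — the edge list below is invented; in practice you would feed in flattened output from a dependency-tree tool:

```python
from collections import defaultdict, deque


def blast_radius(edges, compromised):
    """Return every package that transitively depends on `compromised`.

    edges: iterable of (dependent, dependency) pairs, e.g. a flattened
    dependency tree for your services.
    """
    dependents = defaultdict(set)
    for dependent, dependency in edges:
        dependents[dependency].add(dependent)
    seen, queue = set(), deque([compromised])
    while queue:
        for pkg in dependents[queue.popleft()]:
            if pkg not in seen:
                seen.add(pkg)
                queue.append(pkg)
    return seen


# Hypothetical internal services depending on litellm.
edges = [
    ("billing-service", "litellm"),
    ("chat-gateway", "litellm"),
    ("admin-ui", "chat-gateway"),
    ("litellm", "httpx"),
]
print(sorted(blast_radius(edges, "litellm")))
# → ['admin-ui', 'billing-service', 'chat-gateway']
```

If the answer for any single package is “most of the graph,” that package is where your auditing budget goes first.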

There’s a blind spot most teams miss: contractor data on third-party platforms. If your contractors are on Mercor or a similar platform, their data is no longer under your control alone. A breach at the platform is a breach of your personnel information.

You also need to review your notification obligations for biometric data, which are stricter than most teams realize. GDPR classifies biometric data as sensitive data with reinforced protections. CCPA has specific categories for biometric information. In Mexico, LFPDPPP requires explicit consent. If your company operates in Latin America and uses platforms that collect biometric data on your people, failing to notify has direct legal consequences.

Finally, a question almost nobody is asking yet: AI training data provenance. If you use AI models trained on contractor data from a compromised platform, is the model still trustworthy? Regulators will ask. If training data comes from a compromised source, model integrity is an open question.


If your company uses AI recruiting platforms, contractor marketplaces, or any service that collects video interviews, identity documents, or biometric data from your teams or candidates — you need to know exactly what they hold, where they store it, and what happens when they get breached.

We run an identity data exposure map in 90 minutes: which third-party platforms hold biometric data on your people, what notification obligations you have, and what cannot be remediated if that data leaks. Reach out through our contact page.

