Mercor Breach: 4 TB of Biometric Data You Can't Rotate
Ricardo Argüello — April 1, 2026
CEO & Founder
General summary
On March 31, 2026, Mercor — the $10B AI recruiting startup that trains models for OpenAI, Anthropic, and Google — confirmed it was hit by the LiteLLM supply chain attack. Lapsus$ claims 4 TB of stolen data: source code, databases with resumes from 30,000+ contractors, and files including video interviews, face scans, and identity documents. Unlike passwords or API keys, biometric data can't be rotated.
- Mercor confirmed it was hit by the LiteLLM attack and said it was “one of thousands of companies” affected, as reported by TechCrunch
- Lapsus$ claims 939 GB of source code, 211 GB of databases with personal data, and ~3 TB of files including video interviews and KYC documents
- Biometric data — face scans, voice recordings, passport photos — is permanent and cannot be “rotated” like compromised credentials
- Mercor manages 30,000+ contractors and trains AI models for top providers, turning the breach into a risk for the AI data supply chain
- Some claims about the specific attack vector come from social media and have not been confirmed by Mercor or verified security sources
Imagine a company that manages the master lock to an office building where OpenAI, Anthropic, and Google work. One day, someone discovers the lock was built with a defective part (LiteLLM). The thieves don't just copy the key — they take the ID photos, iris scans, and voice recordings of every person who entered the building. You can change the lock. You can't change 30,000 people's faces.
AI-generated summary
Six days ago, we wrote about the LiteLLM attack. We explained how TeamPCP compromised a security scanner to poison an AI API key proxy with 97 million monthly downloads. We said: audit your trust chain. Map every intermediary.
Today we know the name of the first $10B victim.
Mercor confirmed on March 31 that it was hit by the LiteLLM supply chain attack. According to Neowin, the company said it was “one of thousands of companies” affected. The Lapsus$ group claims 4 TB of stolen data.
We already covered the attack mechanism. What was missing was seeing the real consequences — and understanding that there’s a type of stolen data with no remediation path.
What Mercor Has That Most Companies Don’t
Mercor is an AI recruiting startup founded in 2023 by three 22-year-olds. In October 2025, it raised a $350 million Series C led by Felicis Ventures that valued it at $10 billion. Its business: connecting specialized contractors — scientists, doctors, lawyers, engineers — with companies that need to train AI models. Among its clients: OpenAI, Anthropic, and Google DeepMind.
To verify contractor identities, Mercor collects video interviews with face and voice data, KYC documents, and passports. It manages over 30,000 contractors and processes $2 million in daily payouts.
That data combination is what makes this breach different from a typical credential theft. It’s not a database of hashed emails and passwords. It’s a repository that contains, according to CybersecurityNews citing Lapsus$ claims: 939 GB of platform source code, 211 GB of database records with resumes and personal data, and roughly 3 TB of stored files including video interviews, face scans, and identity documents.
One company. Thirty thousand people. All in one place.
The Chain Broke Exactly Where We Said It Would
In our March 25 article on the LiteLLM attack, we wrote:
When was the last time someone on your team verified that pip install package actually installs the code from the GitHub repository?
And:
If your security posture for AI infrastructure is “install the popular package and move on,” you’re in the majority. And the majority just got hit.
Mercor used LiteLLM to manage connections to multiple AI providers — OpenAI, Anthropic, and others — through a unified interface. When TeamPCP poisoned versions 1.82.7 and 1.82.8 on PyPI, the malware entered as a legitimate dependency. Nobody needed to import it manually. The .pth file executed every time Python started.
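Because the payload rides in a .pth file rather than in code you import, it's invisible to a casual source review. A minimal sketch of what a triage script could look for — any .pth line that starts with `import`, which the interpreter executes on every startup (note that legitimate tools such as editable installs and coverage plugins also use this mechanism, so every hit needs human review, not automatic deletion):

```python
import site
from pathlib import Path

# .pth files in site-packages may contain lines beginning with
# "import ", which Python executes on every interpreter startup --
# the mechanism the poisoned LiteLLM releases reportedly abused.
def executable_pth_lines():
    dirs = set(site.getsitepackages())
    dirs.add(site.getusersitepackages())
    hits = []
    for d in dirs:
        # glob on a nonexistent directory simply yields nothing
        for pth in Path(d).glob("*.pth"):
            for line in pth.read_text(errors="replace").splitlines():
                if line.startswith("import "):
                    hits.append((str(pth), line.strip()))
    return hits

for path, line in executable_pth_lines():
    print(f"{path}: {line}")
```

This is a detection aid, not a fix: it tells you what executes at startup so a human can decide whether each entry belongs there.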
I won’t repeat the full attack anatomy — it’s in the LiteLLM article. What matters here is the scale of the consequence: a single compromised dependency led to full infrastructure access at a $10B company.
Mercor said it was “one of thousands.” That means the LiteLLM attack’s blast radius is still materializing. Companies that didn’t audit their trust chain last week are waiting to find out whether they’re next.
Biometric Data: The Breach With No Patch
When an attacker steals credentials, there’s a clear procedure:
| Data type | Remediation | Time |
|---|---|---|
| Passwords | Rotate | Minutes |
| API keys | Revoke and reissue | Minutes |
| Session tokens | Invalidate | Seconds |
| Credit cards | Cancel and reissue | Days |
| Face scan | — | Permanent |
| Voice recording | — | Permanent |
| Passport photo | — | Permanent |
The first four rows have a remediation path and a timeline. The last three have neither. There is no “rotate” button for your face.
If Lapsus$ has 3 TB of stored files including video interviews used for identity verification, the affected contractors face a problem that no security team can remediate. A face scan combined with a voice recording is exactly the material needed to generate convincing deepfakes. We’re no longer talking about phishing with fake emails — we’re talking about video calls where “the person” speaking looks and sounds exactly like the real contractor.
And these aren’t random contractors. They’re scientists, doctors, and lawyers working with the most important AI labs in the world. Their compromised identities aren’t just a problem for them — they’re an attack vector against the organizations they work for.
Think about the concrete scenario: a Mercor contractor trains models for an AI lab. An attacker has their video interview with biometric data, their full resume, and potentially their passport. With that, they can generate a convincing deepfake, access the lab’s systems using the contractor’s identity, and compromise the model training pipeline. The chain goes from Mercor to the contractor, from the contractor to the lab, from the lab to the model, from the model to millions of users.
The world’s largest password breach gets resolved with a mass reset and a press release. A biometric data breach doesn’t have that luxury.
AI Companies as High-Value Targets
Mercor isn’t just a company with bad dependency hygiene. It’s a central node in the AI data supply chain.
AI companies are exceptionally valuable targets because of a unique convergence of factors that don’t apply to conventional businesses. The data they handle is training data — not just static PII, but the material that defines how models behave. If an attacker accesses training data, they can understand and manipulate the behavior of models used by millions of people.
The concentration of specialized talent makes it worse. Mercor’s 30,000+ contractors include people with access to proprietary information from multiple AI labs. Their professional profiles, combined with identity data, create a map of who knows what across the AI ecosystem.
And then there’s the interconnection. Mercor works with OpenAI, Anthropic, and Google DeepMind. A breach at Mercor doesn’t stay at Mercor — it’s a breach at the periphery of every AI lab it works with.
This is a pattern we’ve seen in other sectors: the attack doesn’t go to the castle directly — it goes to the service provider who holds the keys to several castles at once. In the AI world, contractor recruitment and training platforms are exactly that provider. And most of them don’t have the security posture that the value of their data demands.
What’s Confirmed vs. What’s Alleged
Every time a security incident goes viral on social media, the narrative grows faster than the verified facts. In this case, the gap between what’s confirmed and what’s circulating on X is significant.
What we know for certain: Mercor publicly confirmed the breach and tied it to the LiteLLM supply chain attack (TechCrunch). The company said it was “one of thousands of companies” affected (Neowin). It stated it acted “promptly” and engaged third-party forensics experts. The LiteLLM attack mechanism was independently verified as CVE-2026-33634 with a CVSS score of 9.4. And the Lapsus$ group publicly claimed responsibility, alleging possession of 4 TB of data (CybersecurityNews, TechStartups).
Now, what hasn’t been confirmed — and is being treated as fact anyway.
The most repeated claim is that Mercor developers handed production credentials to an AI chatbot. It comes from a viral X post by Aakash Gupta. TechStartups qualified it: “posts tied to the incident suggest a developer may have exposed production credentials through an AI coding assistant.” But neither TechCrunch, nor CybersecurityNews, nor Mercor itself have confirmed this detail.
The same applies to the specific data breakdown (939 GB source code, 211 GB database, ~3 TB files), the alleged full Tailscale VPN access, and the exact types of stolen files. All of that comes from Lapsus$ claims, not from Mercor’s own disclosures.
Why dedicate an entire section to this? Because security decisions should be based on verified facts. What IS a fact: Mercor was compromised via LiteLLM. What remains unverified: the exact scale and the specific vector through which credentials were exposed.
The Recurring Pattern: AI Agents and Unaudited Installs
Regardless of whether the specific AI chatbot credential claim is true for Mercor, the general pattern is real and documented.
In our article on npm’s worst day, we cited Andrej Karpathy: “I can’t help but feel I’m playing Russian roulette with every pip install or npm install (which the LLMs also freely execute on my behalf).”
AI agents and code assistants are installing dependencies without reviewing diffs, resolving to the latest version by default, and executing at machine speed without anyone verifying what enters the environment. That scenario turns a supply chain attack into an industrial-scale breach.
The risk isn’t hypothetical. Last week we covered how the axios attack hit anyone who ran npm install within a three-hour window. With LiteLLM it was pip install. In both cases, an AI agent running automated installations would have introduced the malware without any human review.
The combination of poisoned dependencies + AI agents with broad permissions + accessible production credentials is exactly the recipe for what happened to Mercor. You don’t need to know whether a specific chatbot exposed the credentials. You only need to know whether that configuration exists on your team — and that the next LiteLLM is already in preparation.
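One concrete countermeasure from the prevention playbook is a pre-flight check that an agent (or a CI step) runs before executing anything: compare installed versions against an explicit pin list. A minimal sketch — the pin dict and the version numbers are illustrative, not Mercor's configuration; in practice the pins would come from a hash-locked requirements file:

```python
from importlib.metadata import version, PackageNotFoundError

# Illustrative pin list. "1.82.6" is a hypothetical pin predating
# the poisoned 1.82.7/1.82.8 releases described in the article.
PINNED = {
    "litellm": "1.82.6",
}

def version_drift(pinned):
    """Return (name, expected, installed) for every package off its pin."""
    drift = []
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            continue  # not installed: nothing to compare
        if installed != expected:
            drift.append((name, expected, installed))
    return drift

for name, expected, installed in version_drift(PINNED):
    print(f"DRIFT: {name} pinned {expected}, installed {installed}")
```

The point is where the check runs: before the agent acts, not after the environment is already poisoned.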
What Changes in Your Incident Response Plan
Our LiteLLM article covered prevention: pin versions, isolate credentials, audit the trust chain. This article is about what comes after. Now that a named victim exists with compromised biometric data, your incident response plan needs to cover scenarios it probably doesn’t.
Start with a biometric data inventory. Does your company collect video interviews, face scans, or identity documents from candidates or contractors? Do you store them yourself or does a third party? If you use platforms like Mercor, Turing, Toptal, or any AI contractor marketplace, you need to know exactly what biometric data they hold on your people. You can’t respond to a breach if you don’t know what was exposed.
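Even a spreadsheet-grade inventory beats none. A sketch of what a record per vendor could track, with one flag that matters most — whether a breach there exposes anything that can't be rotated. Field names and categories are illustrative, not from any standard or from Mercor's actual data model:

```python
from dataclasses import dataclass

# Data types that cannot be rotated after a breach (illustrative set)
PERMANENT_TYPES = {"video_interview", "face_scan", "voice_recording", "passport_photo"}

@dataclass
class ThirdPartyHolding:
    vendor: str           # e.g. a contractor marketplace
    data_types: set       # what they store on your people
    people_affected: int  # how many of your contractors/candidates

    def irreversible_exposure(self) -> bool:
        # True if a breach at this vendor leaks data with no remediation path
        return bool(self.data_types & PERMANENT_TYPES)

holding = ThirdPartyHolding("ExampleMarketplace", {"face_scan", "resume"}, 120)
print(holding.irreversible_exposure())  # face_scan is permanent -> True
```

Sorting your vendor list by that flag tells you which contracts need deletion SLAs and breach-notification clauses first.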
Then map your concentration risk. How many of your AI dependencies flow through a single package or provider? If one compromised component gives access to your entire infrastructure — as apparently happened with Mercor — your blast radius is unlimited. At IQ Source, we map each client’s critical dependencies and assess what happens if each one is individually compromised.
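A first-pass version of that mapping can run against a single Python environment: count how many installed distributions directly require each package. A package with many dependents is a single point of failure — one poisoned release reaches everything downstream. The requirement-string parsing here is deliberately crude; a real audit would use a proper resolver:

```python
import re
from collections import Counter
from importlib.metadata import distributions

def direct_dependents():
    """Count direct dependents per package in the current environment."""
    counts = Counter()
    for dist in distributions():
        for req in dist.requires or []:  # requires may be None
            # Requirement strings start with the package name
            # ("requests>=2.0", "litellm (>=1.80); extra == 'proxy'")
            m = re.match(r"[A-Za-z0-9._-]+", req)
            if m:
                counts[m.group(0).lower()] += 1
    return counts

for name, n in direct_dependents().most_common(10):
    print(f"{name}: {n} direct dependents")
```

The packages at the top of that list are where one compromised release hurts the most, and where pinning and hash verification pay off first.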
There’s a blind spot most teams miss: contractor data on third-party platforms. If your contractors are on Mercor or a similar platform, their data is no longer under your control alone. A breach at the platform is a breach of your personnel information.
You also need to review your notification obligations for biometric data, which are stricter than most teams realize. GDPR classifies biometric data as sensitive data with reinforced protections. CCPA has specific categories for biometric information. In Mexico, LFPDPPP requires explicit consent. If your company operates in Latin America and uses platforms that collect biometric data on your people, failing to notify has direct legal consequences.
Finally, a question almost nobody is asking yet: AI training data provenance. If you use AI models trained on contractor data from a compromised platform, is the model still trustworthy? Regulators will ask. If training data comes from a compromised source, model integrity is an open question.
If your company uses AI recruiting platforms, contractor marketplaces, or any service that collects video interviews, identity documents, or biometric data from your teams or candidates — you need to know exactly what they hold, where they store it, and what happens when they get breached.
We run an identity data exposure map in 90 minutes: which third-party platforms hold biometric data on your people, what notification obligations you have, and what cannot be remediated if that data leaks. Reach out at contact.
Frequently Asked Questions
What happened in the Mercor breach?
On March 31, 2026, Mercor — a $10B AI recruiting startup that trains models for OpenAI, Anthropic, and Google — confirmed it was hit by the LiteLLM supply chain attack (CVE-2026-33634). The Lapsus$ group claims 4 TB of stolen data: 939 GB of source code, 211 GB of databases with resumes and personal data from over 30,000 contractors, and ~3 TB of files reportedly including video interviews, face scans, and KYC documents.
Why can’t biometric data be rotated like passwords?
Passwords can be rotated in minutes. API keys can be revoked. Credit cards can be reissued. But a face scan, a voice recording from a video interview, or a passport photo are permanent identifiers that cannot be changed. If an attacker has your biometric data, they can generate deepfakes to impersonate your identity. There is no “remediation” procedure for data that is part of your body.
How did the LiteLLM attack reach Mercor?
The TeamPCP group first compromised Trivy (a security scanner) and used stolen credentials to poison LiteLLM versions 1.82.7 and 1.82.8 on PyPI. The malware auto-executed on Python startup and harvested SSH keys, cloud credentials, and .env files. Mercor used LiteLLM as a dependency to manage connections to multiple AI providers, which exposed their complete infrastructure.
What should companies do after this breach?
Four concrete steps: inventory what biometric data third parties hold on your people, map concentration risk across your AI dependencies, review your notification obligations under GDPR, CCPA, or local data protection laws (such as LFPDPPP in Mexico), and audit data provenance if you use AI models trained on data from platforms that could be compromised.
Related Articles
AI Killed Execution. The Bottleneck Is Now You.
Simon Willison is wiped out by 11am directing agents. Andreessen says execution is dead. The bottleneck your company faces just moved.
LiteLLM Attack: Your AI Trust Chain Just Broke
LiteLLM, the AI API key proxy with 97 million monthly downloads, was poisoned via PyPI. Your security scanner was the entry point.