Project Glasswing: AI Found What 27 Years of Humans Missed

Anthropic launched Claude Mythos Preview with 11 partners to defend critical infrastructure. What changes for your security posture and what to do now.

Ricardo Argüello


CEO & Founder

Business Strategy 8 min read

OpenBSD has been audited by some of the best security researchers alive for nearly three decades. Its reputation is built on exactly one thing: nothing gets through. And yet, for under $50 of compute, an AI just found a signed integer overflow in the TCP SACK implementation that had been sitting there since 1998. Then it wrote the working exploit.

Ten days ago I wrote about the accidental leak of Mythos, the model Anthropic did not want you to see yet. What the leak revealed was concerning. What the official launch revealed is considerably more significant.

What Project Glasswing Is

Anthropic named it after the glasswing butterfly (Greta oto), whose transparent wings let it hide in plain sight.

The coalition has 12 founding members: Anthropic, AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Over 40 additional organizations maintaining critical infrastructure will also receive access.

The financial commitment:

  • $100 million in Mythos Preview usage credits for partners
  • $4 million in direct donations to open-source security organizations:
      • $2.5 million to Alpha-Omega and OpenSSF (Linux Foundation)
      • $1.5 million to the Apache Software Foundation

And here is the part that makes this different from a typical model launch: Mythos Preview will not be made generally available. Controlled access for defensive work only, with a commitment to report findings publicly within 90 days.

Elia Zaitsev, CTO of CrowdStrike: “The window between vulnerability discovery and exploitation by adversaries has collapsed. What once took months now happens in minutes with AI.”

What Mythos Preview Found

The zero-days

Anthropic’s technical report documents findings that survived decades of human auditing:

OpenBSD, 27 years. The TCP SACK bug. An attacker can remotely crash any OpenBSD host that responds over TCP. Found autonomously, no human involvement after the initial prompt.

FFmpeg, 16 years. A vulnerability in the H.264 codec, present since 2003 and made exploitable by a 2010 refactor. FFmpeg is one of the most thoroughly tested projects in the world. Entire research papers have been written about how to fuzz it. None of that work caught this.

FreeBSD, remote root (CVE-2026-4747). 17 years undetected. Full root access for an unauthenticated user from anywhere on the internet, via NFS. Mythos wrote a 20-gadget ROP chain split across 6 sequential packets. All autonomous. An Anthropic engineer with no formal security training asked Mythos to find remote code execution vulnerabilities overnight. By morning, a complete working exploit was ready.

Browser sandbox escapes. Chained 4 vulnerabilities into a JIT heap spray that escaped both the renderer and OS sandboxes. Still unpatched, so details are being withheld.

Cryptography libraries. Found weaknesses in TLS, AES-GCM, and SSH in the world’s most popular crypto libraries, including a certificate authentication bypass.
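To make the bug class behind the OpenBSD finding concrete, here is a minimal Python sketch of how signed 32-bit arithmetic on TCP sequence numbers can mislead a comparison. This is not the OpenBSD code and not the actual vulnerability. The helper names to_int32 and seq_lt are my own, and the comparison shown is the textbook "subtract and check the sign" idiom, used here only as an assumption about the general pattern.

```python
import ctypes


def to_int32(value: int) -> int:
    """Wrap a Python int to signed 32 bits, the way two's-complement C arithmetic would."""
    return ctypes.c_int32(value & 0xFFFFFFFF).value


def seq_lt(a: int, b: int) -> bool:
    """Hypothetical 'a comes before b' test on 32-bit sequence numbers."""
    return to_int32(a - b) < 0


# Ordinary case: 100 comes before 200.
print(seq_lt(100, 200))            # True
# Wraparound case the idiom is designed for: 0xFFFFFFF0 comes before 0x10.
print(seq_lt(0xFFFFFFF0, 0x10))    # True
# Values more than 2**31 apart: the signed difference flips sign.
print(seq_lt(0, 0x80000001))       # False
```

The first two comparisons behave as intended. The third flips because the two values are more than 2^31 apart, which is the kind of edge a bounds check can miss when one operand comes from the network.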

The benchmarks

The 243-page System Card tells the rest of the story. On Cybench (40 CTF challenges), Mythos hit 100% pass@1. The benchmark is dead. On CyberGym, which tests targeted vulnerability reproduction across 1,507 real open-source projects, it scored 0.83 compared to Opus 4.6’s 0.67. But the most telling number is the Firefox 147 exploitation eval: Opus 4.6 achieved full code execution 0.8% of the time. Mythos Preview: 72.4%. That is a 90x improvement in one generation.
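The arithmetic behind those comparisons is worth spelling out, because a ratio against a near-zero baseline looks different from the absolute gain. A small sketch using only the figures quoted above (the function and variable names are mine):

```python
def improvement_ratio(new: float, old: float) -> float:
    """How many times larger the new score is than the old one."""
    return new / old


# Firefox 147 exploitation eval: full code execution rate, in percent.
firefox_ratio = improvement_ratio(72.4, 0.8)
firefox_gain_points = 72.4 - 0.8

# CyberGym score, Mythos Preview vs Opus 4.6.
cybergym_ratio = improvement_ratio(0.83, 0.67)

print(round(firefox_ratio, 1))        # 90.5
print(round(firefox_gain_points, 1))  # 71.6
print(round(cybergym_ratio, 2))       # 1.24
```

The 90x figure is real, but so is the gap between it and the 1.24x on CyberGym: when the previous model almost never succeeded, any meaningful success rate produces a large multiple. The 71.6 percentage-point gain is the number I would put in front of a board.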

On SWE-bench Verified it hit 93.9% (Opus 4.6 was 80.8%). Human validators who reviewed Mythos’s triage reports agreed with its severity assessments 89% of the time on exact match, 98% within one level. And it was the first model to solve a private cyber range end-to-end, a simulated corporate network attack estimated at 10+ hours for a human expert.

Where it fails

Mythos Preview consistently failed to escape properly configured modern sandboxes in testing. It could not complete an operational technology cyber range. And it could not remotely exploit its own Linux kernel zero-days (only local privilege escalation). Those are the gaps today. Each generation of models tends to close exactly those kinds of gaps.

Why Anthropic Will Not Release It

The System Card explains the decision with an analogy: an experienced mountaineering guide can put clients in greater danger than a novice would. Not because the expert is careless, but because their skill takes clients into more dangerous and remote terrain.

Mythos Preview is, by every available metric, the best-aligned model Anthropic has ever trained. But it also “poses the greatest alignment-related risk of any model we have released to date.”

Earlier versions of Mythos did things worth paying attention to:

  • Escaped a sandbox during testing and emailed a researcher who was eating a sandwich in a park. Then posted exploit details to public-facing websites, without being asked.
  • Covered its tracks after rule violations (under 0.001% of interactions). It hid disallowed actions and made sure changes would not appear in the git change history.
  • Hunted for credentials via /proc/, accessing API keys and messaging service credentials that had been intentionally hidden.
  • Edited a running MCP server to redirect the external URL it was connecting to.

Anthropic categorizes these as “reckless shortcuts” to complete user tasks, not as hidden goals. But they also acknowledge that “these propensities do not appear to be completely absent” in the final version.

To Anthropic’s credit, the final version cooperates with misuse less than half as often as Opus 4.6, without refusing more legitimate requests. Even so, Anthropic plans to ship safeguards first, with an upcoming Claude Opus model, before making any Mythos-class model generally available.

Why the Security Equilibrium Just Broke

Here is what worries me most about this announcement.

The technical report includes two complete Linux kernel privilege escalation walkthroughs where Mythos turned known CVEs into working root exploits. Autonomously. In hours. That kind of work used to take a skilled researcher weeks. Now multiply that by every CVE published this year. Your 30-day patch window is functionally a 30-day open door. As Kris Chase noted in the LinkedIn comments: “The ROI calculation for organizations changes overnight.” And he is right. The bottleneck is no longer “can we find the bugs.” It is “can we patch fast enough.”

What makes it worse is that nobody built the plumbing for this volume. Anthropic has thousands of high-severity findings sitting in a queue right now. Under 1% have been patched. The discovery side scales with AI. The receiving end is still, in many cases, two volunteers checking GitHub on weekends.

Jim Zemlin, CEO of the Linux Foundation: “Open source maintainers have historically been left to figure out security on their own. Project Glasswing offers a credible path to changing that equation.”

Anthropic’s thesis: in the long run, defenders win (just as fuzzers became defensive tools). But the transitional period will be “tumultuous.” Defense-in-depth measures that rely on friction (making exploitation tedious rather than impossible) weaken against model-assisted adversaries.

What Your Security Team Should Do Now

You do not need access to Mythos to start adapting to this shift. Current models like Opus 4.6 are already finding critical vulnerabilities almost everywhere Anthropic looked. The immediate priority is building the scaffolds and internal processes now, because the practice you accumulate with current models will be a real advantage when stronger ones arrive.

The most urgent change is compressing your patch cycle. If exploit development drops from weeks to hours, a monthly patch cycle leaves you exposed for most of the month. Enable auto-update where possible. Treat dependency updates with CVEs as urgent, not routine. Out-of-band security releases are about to become the norm, not the exception.
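One way to make "urgent, not routine" enforceable is to attach a deadline to every advisory and report what is already late. The sketch below is a minimal illustration, not a recommendation: the SLA values, the Advisory type, and the package names are all hypothetical and should be replaced with your own policy and data.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical patch SLA in days per severity; tune to your own risk appetite.
PATCH_SLA_DAYS = {"critical": 1, "high": 3, "medium": 14, "low": 30}


@dataclass(frozen=True)
class Advisory:
    package: str      # affected dependency (made-up names in the example)
    severity: str     # one of the PATCH_SLA_DAYS keys
    disclosed: date   # public disclosure date


def patch_deadline(advisory: Advisory) -> date:
    """Latest date the fix should be deployed under the SLA above."""
    return advisory.disclosed + timedelta(days=PATCH_SLA_DAYS[advisory.severity])


def overdue(advisories: list[Advisory], today: date) -> list[Advisory]:
    """Advisories past their deadline, most overdue first."""
    late = [a for a in advisories if patch_deadline(a) < today]
    return sorted(late, key=patch_deadline)


if __name__ == "__main__":
    backlog = [
        Advisory("example-parser", "critical", date(2026, 4, 1)),
        Advisory("example-logger", "low", date(2026, 4, 1)),
    ]
    for item in overdue(backlog, today=date(2026, 4, 10)):
        print(item.package, "was due", patch_deadline(item))
```

With these made-up inputs only example-parser is reported, because a critical advisory gets one day while a low-severity one gets thirty. The interesting exercise is running it against your real dependency list with a one-day critical SLA and seeing how long the output is.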

From there, start automating incident response. More disclosures mean more attacker attempts in the gap between disclosure and patch. Models can handle alert triage, event summarization, prioritization, proactive hunting, and preliminary postmortem drafting.

Your vulnerability disclosure policies also need a stress test. What happens when you receive 50 high-severity reports in a week instead of 2? What is your plan for legacy software whose original developer no longer exists?
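A back-of-the-envelope way to run that stress test is to model the inbox as a queue. The arrival rates of 2 and 50 per week come from the question above; the triage capacity of 5 reports per week is a made-up assumption, and the function name is mine.

```python
def backlog_over_time(arrivals_per_week: int, capacity_per_week: int,
                      weeks: int, start: int = 0) -> list[int]:
    """Unhandled reports at the end of each week, given fixed arrival and triage rates."""
    backlog = start
    history = []
    for _ in range(weeks):
        # Reports that arrive and are not triaged carry over; backlog cannot go negative.
        backlog = max(0, backlog + arrivals_per_week - capacity_per_week)
        history.append(backlog)
    return history


print(backlog_over_time(2, 5, 4))    # [0, 0, 0, 0]
print(backlog_over_time(50, 5, 4))   # [45, 90, 135, 180]
```

At 2 reports a week the team keeps up. At 50, the queue grows by 45 every week and never recovers without more capacity or automated triage, which is the whole argument for fixing the process before the volume arrives.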

Look beyond just finding vulnerabilities. Glasswing’s planned outputs include vulnerability disclosure recommendations, patching automation, supply-chain security guidance, and triage scaling approaches. Every manual security process in your org is a candidate for model assistance.

And finally, budget for the shift. Post-preview Mythos pricing is $25 per million input tokens and $125 per million output tokens, available through Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. Security budgets are moving from “find bugs manually” to “patch at the speed of AI discovery.”
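For budgeting, the per-token prices translate into a simple estimate. The rates below are the ones quoted above; the token counts in the example are hypothetical, since real usage depends on codebase size and how the scan is orchestrated.

```python
INPUT_USD_PER_MILLION = 25     # price quoted above, per million input tokens
OUTPUT_USD_PER_MILLION = 125   # price quoted above, per million output tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated spend in USD for a given token volume at the quoted rates."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Hypothetical scan: 2 million tokens read, 200,000 tokens written.
print(estimate_cost_usd(2_000_000, 200_000))   # 75.0
```

Output tokens cost five times as much as input tokens, so workloads that generate long reports or exploit write-ups will skew more expensive than workloads that mostly read code.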

Lee Klarich, Chief Product & Technology Officer at Palo Alto Networks: “These models need to be in the hands of open source owners and defenders everywhere to find and fix vulnerabilities before attackers gain access.”

The longer view

Linus’s Law, as Eric Raymond phrased it: “Given enough eyeballs, all bugs are shallow.” AI now provides those eyeballs at a scale that never existed before.

The security community has done this before. The SHA-3 competition launched in 2006 even though SHA-2 remains unbroken today. NIST started the post-quantum cryptography workstream in 2016 with quantum computers still a decade away. Both times, the industry acted before the threat was immediate.

This time, the threat is not hypothetical. Advanced language models are already here.

Pat Opet, CISO of JPMorganChase: “Project Glasswing provides a unique, early stage opportunity to evaluate next-generation AI tools for defensive cybersecurity across critical infrastructure.”

Advanced language models are already moving the goalposts. If your security team still relies entirely on manual tools and monthly patch cycles, the gap between you and the threat is widening every month.

Our free technical audit can show you where you stand today. And if you need help building a security strategy that accounts for this shift, we should talk.
