You built a Foxconn factory to babysit your AI
Ricardo Argüello — June 9, 2026
CEO & Founder
General summary
Garry Tan, president of Y Combinator, confessed this week that he built 540,000 lines of code around an AI model and shouldn't have been proud of it. The image he used is the one that matters: he built a Foxconn factory for an AI worker that never needed the supervision. For decades, more code meant more capability, and that equation just inverted. The new waste isn't writing too little code. It's bolting validators, retry loops, and guardrails onto a model that already does that work, and calling it engineering rigor.
- Tan revealed that of his 540,000 lines, about 262,000 were application code and 276,000 were tests built to police the model. The audit committee was bigger than the company.
- For 36 years the rule was capability equals lines of code. That equation flipped: the model is cheap and capable, so you write the minimum code and instruct it in plain language.
- The new waste isn't writing too little code, it's over-policing: validators, sanitizers, and retry loops bolted onto a worker that already recovers on its own. Every line is a bet that the model will fail.
- The new unit is the skillpack: markdown instructions plus the minimal code, with tests. The tests are what separate it from vibe coding and what let it change without breaking.
- AI Maestro and Tech Partner from IQ Source separate the load-bearing rail from the cage that only provides a feeling of control.
Imagine hiring the best carpenter in the world and then, just in case, putting a supervisor behind them to check every nail, a second supervisor to check the first, and a 300-page manual repeating what the carpenter already knows by heart. The work ships, sure, but half your payroll exists only because you didn't trust the person who actually knew how to do it. That's what most of us are building around AI right now, and we call it rigor.
AI-generated summary
Garry Tan, the president of Y Combinator, admitted something this week that almost nobody with his profile says out loud.
In January he got back into coding and built a Rails app. About 540,000 lines, counting the tests written to police it. He was proud of it. And he wrote that he shouldn’t have been.
The image he used to describe what he’d built is the one that stayed with me all week: he built a Foxconn factory for an AI worker. Guardrails, retry loops, validators, a cage of control bolted on top of an intelligence that could already do the job, and a thousand things nobody asked for.
I’m telling you this because I recognize that engineer. I recognize him because I was that engineer, across five different eras. And the thesis here is uncomfortable: for decades, more code meant more capability. That equation just inverted, and most of us are still building as if it didn’t.
For 36 years, more code was more power
I have been in computing since 1990, since I was fifteen years old in front of a Commodore 64 with 64KB of memory that had to be defended byte by byte. And if there’s one thing every jump in the industry taught me, it was a rule that never failed: capability is measured in code. More lines, more functionality, more power. The engineer who wrote more, did more.
That rule was correct for a long time. When memory was expensive, you wrote tight code to fit. When compute was expensive, you wrote code to ration it. When services were expensive to call, you wrapped each call in layers of software to protect it. The instinct was always the same: the valuable resource is on the other side, so build around it to guard it.
Tan describes himself as a 2013 engineer dropped into 2026, building the only way he knew. It’s a good description, but it’s missing half. He isn’t only a 2013 engineer. He’s the instinct of an entire career, mine included, that says capability and lines of code are the same thing. That instinct was true for 36 years. What happened is that it stopped being true, and the body hasn’t caught up.
What a Foxconn factory of code looks like
Tan did the math on his own app, and it’s worth looking at. Of his roughly 540,000 lines, about 262,000 were application code. The other 276,000 or so were tests built to police those 262,000. The audit committee was bigger than the company it was auditing.
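The arithmetic behind that comparison is quick to check. A minimal sketch, using only the rounded figures quoted above (the variable names are illustrative):

```python
# Rounded figures quoted in the article; the variable names are illustrative.
app_lines = 262_000
test_lines = 276_000

total = app_lines + test_lines    # sum of the two rounded figures
test_share = test_lines / total   # fraction of the codebase that is tests
ratio = test_lines / app_lines    # test lines per line of application code

print(f"total: {total:,}")
print(f"test share: {test_share:.1%}")
print(f"tests per app line: {ratio:.2f}")
```

The two rounded figures sum to 538,000, which is where the "about 540,000" comes from, and the tests come out slightly larger than the application they police.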
What did those 276,000 lines do? Sanitizers checking inputs the model already handled. Validators checking outputs the model already caught. Retry loops wrapping calls the model recovers from on its own. Tan puts it in a sentence you feel in your stomach: every one of those lines is a bet that the worker will fail. And he made that bet hundreds of thousands of times, against a worker that delivered.
This is the new part, and it’s what almost nobody is seeing. The waste of 2026 isn’t writing too little code. It’s writing defensive code around a model that doesn’t need it, and calling it engineering rigor. Walk through your own codebase and count the lines that exist only because you didn’t trust the model. You’ll find more than you expect.
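One rough way to start that count is a keyword scan. The sketch below is a heuristic under loud assumptions: the marker words and the function name `count_defensive_lines` are hypothetical, and a match only flags a line for human review. It does not prove the line is unnecessary.

```python
import re
from collections import Counter

# Hypothetical marker words; tune them to your own codebase's vocabulary.
DEFENSIVE_MARKERS = ("retry", "sanitize", "validate", "fallback")


def count_defensive_lines(source: str, markers=DEFENSIVE_MARKERS) -> Counter:
    """Count non-blank lines that mention each marker (case-insensitive)."""
    counts = Counter()
    pattern = re.compile("|".join(re.escape(m) for m in markers), re.IGNORECASE)
    for line in source.splitlines():
        if not line.strip():
            continue
        counts["total"] += 1
        # Record each distinct marker that appears on this line once.
        for hit in {m.lower() for m in pattern.findall(line)}:
            counts[hit] += 1
    return counts


# Sample text only: it is scanned as a string, never executed.
sample = """
def ask_model(prompt):
    prompt = sanitize(prompt)
    for attempt in range(3):  # retry loop
        answer = call_model(prompt)
        if validate(answer):
            return answer
    return None
"""

print(count_defensive_lines(sample))
```

The output is a starting list, not a verdict: some of the flagged lines will turn out to be load-bearing, and deciding which is the actual work.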
And it isn’t just code. It’s the dashboards nobody opens on the second Monday, the 33 scheduled jobs that are 33 alarms for a worker who already shows up on time, the approval processes that duplicate a review the model already does. All of that infrastructure feels like control. Most of the time it’s a cage.
The new waste isn’t writing too little code, it’s over-policing
The reason for the shift is economic, and it’s the part the 36-year instinct doesn’t process. The model used to be expensive and code cheap, so you wrote a lot of code to ration the model, to call it carefully and sparingly. Today both halves of that equation flipped. The model is cheap, it gets cheaper every quarter, and it’s capable enough to write the code itself. So you stop writing code to babysit it. You instruct it in plain language and let it write the minimal code that’s actually needed.
Tan names the new unit: the skillpack. It isn’t a loose prompt that evaporates. It’s instruction in markdown, plus the minimal code it needs, plus unit and integration tests. That last part is what matters. The tests are what separate a skillpack from vibe coding: vibe coding is a feeling with no coverage, a skillpack has tests that let it change without breaking. The behavior lives in language you can edit, not in logic frozen the day you wrote it.
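To make the shape concrete, here is a minimal sketch of what such a unit could look like. The description above names only the three ingredients (markdown instructions, minimal code, tests), so everything specific here is an assumption: the `SKILL_MD` text, the `parse_invoice_total` helper, and the tests are hypothetical illustrations, not Tan's actual format.

```python
import re

# 1. Instructions in markdown: the editable behavior (hypothetical content).
SKILL_MD = """\
# Skill: extract invoice total
Read the invoice text and reply with one line: `TOTAL: <amount>`.
Use a dot as the decimal separator and no currency symbol.
"""


# 2. Minimal code: only the glue the instructions cannot provide.
def parse_invoice_total(model_reply: str) -> float:
    """Pull the amount out of a reply that follows SKILL_MD's format."""
    match = re.search(r"TOTAL:\s*(\d+(?:\.\d+)?)", model_reply)
    if match is None:
        raise ValueError("reply does not follow the TOTAL: <amount> format")
    return float(match.group(1))


# 3. Tests: what lets the instructions and the glue change without breaking.
def test_reads_amount():
    assert parse_invoice_total("TOTAL: 1250.50") == 1250.5


def test_rejects_off_format_reply():
    try:
        parse_invoice_total("I think it is around 1250")
    except ValueError:
        return
    raise AssertionError("expected ValueError for an off-format reply")
```

The test functions follow the naming convention that runners such as pytest discover automatically. If the markdown instructions change the reply format, the tests fail first, which is the coverage that vibe coding lacks.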
A month ago I wrote, from a different angle, that the runtime had become a commodity and the moat was the workflow you built on top. This is the next step, and it’s more uncomfortable: it isn’t only that the moat moved up. It’s that, while it moved, you were probably still bolting on bars. The piece I wrote about how code is not cheap pointed at the same thing from the accounting side: every line is a liability someone maintains. The defensive line that wasn’t needed is the most expensive liability of all, because it costs you to write, costs you to maintain, and on top of that slows down the worker you were trying to help.
What we do about it at IQ Source
When we walk into an operation that already added AI, half of what gets presented as rigor turns out to be cage. Validators repeating what the model already does. Dashboards nobody reads. Retry loops hiding a badly written prompt instead of fixing it. All of it was built with good intentions, and all of it has to be maintained forever.
The concrete work is separating the rail from the cage. A rail bears weight: it’s the validation at the boundary where data enters the system, the control that stops an agent from deleting something unrecoverable, the test that protects a real business decision. A cage only provides a feeling of control: it’s the line that exists because someone, at some point, didn’t trust. AI Maestro is the discovery where that distinction gets made process by process, before you keep building on top. And Tech Partner is the role that keeps that decision alive once the system is running in production, because the cage grows back on its own if nobody prunes it.
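As a rough illustration of two of the rails named above, the sketch below validates data once at the boundary and blocks an unrecoverable action unless it was explicitly confirmed. All names (`Order`, `parse_order`, `delete_records`, the `confirmed` flag) are hypothetical, chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    customer_id: str
    amount_cents: int


def parse_order(payload: dict) -> Order:
    """Rail 1: validate once, at the boundary where data enters the system."""
    customer_id = payload.get("customer_id")
    amount = payload.get("amount_cents")
    if not isinstance(customer_id, str) or not customer_id:
        raise ValueError("customer_id must be a non-empty string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, int) or amount <= 0:
        raise ValueError("amount_cents must be a positive integer")
    return Order(customer_id, amount)


def delete_records(store: dict, keys: list, confirmed: bool = False) -> int:
    """Rail 2: refuse an unrecoverable action unless it was confirmed."""
    if not confirmed:
        raise PermissionError("destructive action requires confirmation")
    removed = 0
    for key in keys:
        if key in store:
            del store[key]
            removed += 1
    return removed
```

Past the boundary, the rest of the system can accept an `Order` without re-checking it. Re-validating the same fields at every layer is the point where a rail starts turning into a cage.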
John Sjölander summed up the other face of this in a single line about Tan’s piece: the scarce resource became clarity, taste, and judgment, and the engineer who writes the least code is often the one building the most. It’s true, and it’s hard to swallow for anyone who learned to measure their worth in lines. I learned that way. Tan learned that way. The good news is that the craft doesn’t disappear, it relocates: from the one who writes the most to the one who knows what not to write.
So before your next architecture review, ask one question about your own system. If you removed today all the code that exists only because you don’t trust the model, what would be left, and would it still work? If the answer makes you uncomfortable, you didn’t build an application. You built a surveillance factory around a worker who already knew how to do the job.
Frequently Asked Questions
Garry Tan, president of Y Combinator, used the image of a Foxconn factory to describe the 540,000 lines of code he wrote around an AI model: validators, retry loops, and tests policing a worker that already does the job well. His point is that he built a cage of control over an intelligence that never needed it.
More code stopped meaning more capability because the economics inverted. For decades the AI model was expensive and code was cheap, so you wrote a lot of code to ration and police the model. Today the model is cheap and capable, so you write the minimum code and instruct it in plain language. More lines no longer mean more capability, they often mean more cage.
A skillpack, in the sense Garry Tan uses, is a reusable unit of capability: markdown instructions plus the minimal code it needs, with unit and integration tests. The difference from vibe coding is exactly that: vibe coding is a feeling with no coverage, while a skillpack has tests that let it change without breaking.
AI Maestro from IQ Source maps the real processes and separates the load-bearing rail from the cage that only provides a feeling of control, and Tech Partner is the role that keeps that decision in production. The goal is that you don't end up maintaining a surveillance factory around an AI worker that already did the job.