You built a Foxconn factory to babysit your AI

Garry Tan admitted he wrote 540,000 lines of code he didn't need. For 36 years, capability meant lines of code. That equation just inverted, and most of us missed it.

Ricardo Argüello

CEO & Founder

Software Development 7 min read

Garry Tan, the president of Y Combinator, admitted something this week that almost nobody with his profile says out loud.

In January he got back into coding and built a Rails app. About 540,000 lines, counting the tests written to police them. He was proud of it. And he wrote that he shouldn’t have been.

The image he used to describe what he’d built is the one that stayed with me all week: he built a Foxconn factory for an AI worker. Guardrails, retry loops, validators, a cage of control bolted on top of an intelligence that could already do the job, and a thousand things nobody asked for.

I’m telling you this because I recognize that engineer. I recognize him because I was that engineer, across five different eras. And the thesis here is uncomfortable: for decades, more code meant more capability. That equation just inverted, and most of us are still building as if it hadn’t.

For 36 years, more code was more power

I have been in computing since 1990, when I was fifteen years old in front of a Commodore 64 with 64 KB of memory that had to be defended byte by byte. And if there’s one thing every jump in the industry taught me, it was a rule that never failed: capability is measured in code. More lines, more functionality, more power. The engineer who wrote more did more.

That rule was correct for a long time. When memory was expensive, you wrote tight code to fit. When compute was expensive, you wrote code to ration it. When services were expensive to call, you wrapped each call in layers of software to protect it. The instinct was always the same: the valuable resource is on the other side, so build around it to guard it.

Tan describes himself as a 2013 engineer dropped into 2026, building the only way he knew. It’s a good description, but it’s missing half. He isn’t only a 2013 engineer. He embodies the instinct of an entire career, mine included, that says capability and lines of code are the same thing. That instinct was true for 36 years. Then it stopped being true, and the body hasn’t caught up.

What a Foxconn factory of code looks like

Tan did the math on his own app, and it’s worth looking at. Of his roughly 540,000 lines, about 262,000 were application code. The other 276,000 or so were tests built to police those 262,000. The audit committee was bigger than the company it was auditing.
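The proportions are easy to check. A minimal sketch, using only the two approximate figures quoted above (the variable names are mine):

```python
# Approximate figures as quoted in the text; the names are illustrative.
app_lines = 262_000
test_lines = 276_000

total = app_lines + test_lines
test_share = test_lines / total   # fraction of the codebase that is tests
ratio = test_lines / app_lines    # test lines per line of application code

print(f"total lines: {total:,}")                  # total lines: 538,000
print(f"test share: {test_share:.1%}")            # test share: 51.3%
print(f"test lines per app line: {ratio:.2f}")    # test lines per app line: 1.05
```

Slightly more than one line of policing for every line of product.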

What did those 276,000 lines do? Sanitizers checking inputs the model already handled. Validators checking outputs the model already caught. Retry loops wrapping calls the model recovers from on its own. Tan puts it in a sentence you feel in your stomach: every one of those lines is a bet that the worker will fail. And he made that bet hundreds of thousands of times, against a worker that delivered.

This is the new part, and it’s what almost nobody is seeing. The waste of 2026 isn’t writing too little code. It’s writing defensive code around a model that doesn’t need it, and calling it engineering rigor. Walk through your own codebase and count the lines that exist only because you didn’t trust the model. You’ll find more than you expect.

And it isn’t just code. It’s the dashboards nobody opens on the second Monday, the 33 scheduled jobs that are 33 alarms for a worker who already shows up on time, the approval processes that duplicate a review the model already does. All of that infrastructure feels like control. Most of the time it’s a cage.

The new waste isn’t writing too little, it’s over-policing

The reason for the shift is economic, and it’s the part the 36-year instinct doesn’t process. The model used to be expensive and code cheap, so you wrote a lot of code to ration the model, to call it carefully and sparingly. Today both halves of that equation have flipped. The model is cheap, it gets cheaper every quarter, and it’s capable enough to write the code itself. So you stop writing code to babysit it. You instruct it in plain language and let it write the minimal code that’s actually needed.

Tan names the new unit: the skillpack. It isn’t a loose prompt that evaporates. It’s instruction in markdown, plus the minimal code it needs, plus unit and integration tests. That last part is what matters. The tests are what separate a skillpack from vibe coding: vibe coding is a feeling with no coverage, a skillpack has tests that let it change without breaking. The behavior lives in language you can edit, not in logic frozen the day you wrote it.
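To make the three parts concrete, here is a minimal sketch of what a skillpack could look like in a single Python file. The layout is my assumption, not Tan’s format: the skill name `summarize_ticket`, the `Model` interface, and the `fake_model` stand-in are all hypothetical.

```python
from typing import Callable

# 1) The instruction, in markdown. This is where the behavior lives.
#    (Hypothetical skill; the wording is illustrative.)
INSTRUCTION_MD = """\
# Skill: summarize_ticket
Summarize the support ticket below in one sentence.
Return plain text on a single line.
"""

# Assumed interface: a model is anything that maps a prompt to text.
Model = Callable[[str], str]


def summarize_ticket(ticket: str, model: Model) -> str:
    """2) The minimal code: build the prompt, call the model, tidy the result."""
    prompt = f"{INSTRUCTION_MD}\n---\n{ticket}"
    return model(prompt).strip()


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call so the test runs offline."""
    assert "summarize_ticket" in prompt  # the instruction reached the model
    return "  Customer cannot log in after a password reset.\n"


def test_summary_is_one_clean_line() -> None:
    """3) The test: it pins the contract, not the wording of the instruction."""
    summary = summarize_ticket("I reset my password and now I can't log in.", fake_model)
    assert summary == "Customer cannot log in after a password reset."
    assert "\n" not in summary


test_summary_is_one_clean_line()
```

Changing the behavior means editing the markdown. The test tells you whether the contract still holds afterward.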

A month ago I wrote, from a different angle, that the runtime had become a commodity and the moat was the workflow you built on top. This is the next step, and it’s more uncomfortable: it isn’t only that the moat moved up. It’s that, while it moved, you were probably still bolting on bars. The piece I wrote about how code is not cheap pointed at the same thing from the accounting side: every line is a liability someone maintains. The defensive line that wasn’t needed is the most expensive liability of all, because it costs you to write, costs you to maintain, and on top of that slows down the worker you were trying to help.

What we do about it at IQ Source

When we walk into an operation that already added AI, half of what gets presented as rigor turns out to be cage. Validators repeating what the model already does. Dashboards nobody reads. Retry loops hiding a badly written prompt instead of fixing it. All of it was built with good intentions, and all of it has to be maintained forever.

The concrete work is separating the rail from the cage. A rail bears weight: it’s the validation at the boundary where data enters the system, the control that stops an agent from deleting something unrecoverable, the test that protects a real business decision. A cage only provides a feeling of control: it’s the line that exists because someone, at some point, didn’t trust the model. AI Maestro is the discovery where that distinction gets made process by process, before you keep building on top. And Tech Partner is the role that keeps that decision alive once the system is running in production, because the cage grows back on its own if nobody prunes it.
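Two of those rails can be sketched in a few lines. This is an illustration under my own assumptions, not code from a real engagement: the `Refund` type, the `RailError` exception, and the protected path prefixes are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


class RailError(Exception):
    """Raised when a load-bearing check fails."""


@dataclass(frozen=True)
class Refund:
    order_id: str
    amount_cents: int


def parse_refund(raw: dict) -> Refund:
    """Rail: validate data once, at the boundary where it enters the system."""
    order_id = raw.get("order_id")
    amount = raw.get("amount_cents")
    if not isinstance(order_id, str) or not order_id:
        raise RailError("order_id must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool) or amount <= 0:
        raise RailError("amount_cents must be a positive integer")
    return Refund(order_id, amount)


# Hypothetical prefixes for data that cannot be recreated once deleted.
UNRECOVERABLE_PREFIXES = ("prod/", "backups/")


def guarded_delete(path: str, delete: Callable[[str], None], approved: bool = False) -> None:
    """Rail: an agent cannot delete unrecoverable data without explicit approval."""
    if path.startswith(UNRECOVERABLE_PREFIXES) and not approved:
        raise RailError(f"refusing to delete {path!r} without approval")
    delete(path)


# Usage: scratch files go straight through, protected paths need approval.
deleted: list[str] = []
guarded_delete("tmp/cache.json", deleted.append)
guarded_delete("backups/2026-01.tar", deleted.append, approved=True)
```

Once data has passed `parse_refund`, nothing downstream needs to re-check it. A second validator repeating the same check three layers deeper would be cage.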

John Sjölander summed up the other face of this in a single line about Tan’s piece: the scarce resource became clarity, taste, and judgment, and the engineer who writes the least code is often the one building the most. It’s true, and it’s hard to swallow for anyone who learned to measure their worth in lines. I learned that way. Tan learned that way. The good news is that the craft doesn’t disappear, it relocates: from the one who writes the most to the one who knows what not to write.

So before your next architecture review, ask one question about your own system. If you removed today all the code that exists only because you don’t trust the model, what would be left, and would it still work? If the answer makes you uncomfortable, you didn’t build an application. You built a surveillance factory around a worker who already knew how to do the job.

