What did Garry Tan mean by building a Foxconn factory for his AI agents?

Garry Tan, president of Y Combinator, used the image of a Foxconn factory to describe the 540,000 lines of code he wrote around an AI model: validators, retry loops, and tests policing a worker that already does the job well. His point is that he built a cage of control over an intelligence that never needed it.

Why is capability no longer equal to lines of code in AI software development?

Because the economics inverted. For decades the AI model was expensive and code was cheap, so you wrote a lot of code to ration and police the model. Today the model is cheap and capable, so you write the minimum code and instruct it in plain language. More lines no longer mean more capability; often they just mean more cage.

What is a skillpack and how is it different from vibe coding?

A skillpack, in the sense Garry Tan uses, is a reusable unit of capability: markdown instructions plus the minimal code it needs, with unit and integration tests. The tests are the difference from vibe coding: vibe coding is a feeling with no coverage, while a skillpack has tests that let it change without breaking.

How does IQ Source help decide what control code is unnecessary around an AI agent?

AI Maestro from IQ Source maps the real processes and separates the load-bearing rail from the cage that only provides a feeling of control, and Tech Partner is the role that keeps that decision in production. The goal is that you don't end up maintaining a surveillance factory around an AI worker that already did the job.

www.iqsource.ai

You built a Foxconn factory to babysit your AI

Ricardo Argüello

CEO & Founder

June 9, 2026 · Software Development · 7 min read

Garry Tan, the president of Y Combinator, admitted something this week that almost nobody with his profile says out loud.

In January he got back into coding and built a Rails app. About 540,000 lines, more than half of them tests written to police the rest. He was proud of it. And he wrote that he shouldn’t have been.

The image he used to describe what he’d built is the one that stayed with me all week: he built a Foxconn factory for an AI worker. Guardrails, retry loops, validators, a cage of control bolted on top of an intelligence that could already do the job, and a thousand things nobody asked for.

I’m telling you this because I recognize that engineer. I recognize him because I was that engineer, across five different eras. And the thesis here is uncomfortable: for decades, more code meant more capability. That equation just inverted, and most of us are still building as if it didn’t.

For 36 years, more code was more power

I have been in computing since 1990, since I was fifteen years old in front of a Commodore 64 with 64 KB of memory that had to be defended byte by byte. And if there’s one thing every jump in the industry taught me, it was a rule that never failed: capability is measured in code. More lines, more functionality, more power. The engineer who wrote more did more.

That rule was correct for a long time. When memory was expensive, you wrote tight code to fit. When compute was expensive, you wrote code to ration it. When services were expensive to call, you wrapped each call in layers of software to protect it. The instinct was always the same: the valuable resource is on the other side, so build around it to guard it.

Tan describes himself as a 2013 engineer dropped into 2026, building the only way he knew. It’s a good description, but it’s missing half. He isn’t a 2013 engineer. He’s the instinct of an entire career, mine included, that says capability and lines of code are the same thing. That instinct was true for 36 years. What happened is that it stopped being true, and the body hasn’t caught up.

What a Foxconn factory of code looks like

Tan did the math on his own app, and it’s worth looking at. Of his roughly 540,000 lines, about 262,000 were application code. The other 276,000 or so were tests built to police those 262,000. The audit committee was bigger than the company it was auditing.
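The proportions are easy to check. This is a minimal sketch that uses only the two figures quoted above; the variable names are mine, and the numbers are approximate.

```python
# Approximate figures quoted in the paragraph above.
app_lines = 262_000
test_lines = 276_000

total = app_lines + test_lines        # close to the 540,000 cited
test_share = test_lines / total       # fraction of the codebase that is tests
ratio = test_lines / app_lines        # test lines per application line

print(f"total: {total:,}")                        # total: 538,000
print(f"test share: {test_share:.1%}")            # test share: 51.3%
print(f"test lines per app line: {ratio:.2f}")    # test lines per app line: 1.05
```

Slightly more than one line of policing for every line of product.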

What did those 276,000 lines do? Sanitizers checking inputs the model already handled. Validators checking outputs the model already caught. Retry loops wrapping calls the model recovers from on its own. Tan puts it in a sentence you feel in your stomach: every one of those lines is a bet that the worker will fail. And he made that bet hundreds of thousands of times, against a worker that delivered.
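To make the pattern concrete, here is a small sketch of what that kind of wrapping tends to look like next to the plain call. It is my own illustration, not code from Tan's app: `fake_model`, `caged_call`, and `direct_call` are hypothetical names, and `fake_model` is a stand-in for whatever model client you actually use.

```python
from typing import Callable, List, Optional

calls: List[str] = []

def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a model client; it always answers."""
    calls.append(prompt)
    return f"summary of: {prompt}"

def caged_call(call_model: Callable[[str], str], prompt: str,
               max_retries: int = 3) -> str:
    """The cage: a sanitizer, a retry loop, and a validator around one call."""
    cleaned = prompt.strip()                       # sanitizer
    last_error: Optional[Exception] = None
    for _ in range(max_retries):                   # retry loop
        try:
            answer = call_model(cleaned)
        except Exception as exc:                   # a bet that the call fails
            last_error = exc
            continue
        if answer.strip():                         # validator
            return answer
    raise RuntimeError(
        f"no usable answer after {max_retries} attempts"
    ) from last_error

def direct_call(call_model: Callable[[str], str], prompt: str) -> str:
    """The same work when you trust the worker."""
    return call_model(prompt)

caged = caged_call(fake_model, "  quarterly report  ")
direct = direct_call(fake_model, "quarterly report")
print(caged == direct, len(calls))
```

When the worker delivers, both paths return the same answer from a single model call each, and everything else in `caged_call` is code someone has to maintain. Whether a given wrapper is safe to delete still depends on the model and the task, which is the judgment the rest of this piece is about.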

This is the new part, and it’s what almost nobody is seeing. The waste of 2026 isn’t writing too little code. It’s writing defensive code around a model that doesn’t need it, and calling it engineering rigor. Walk through your own codebase and count the lines that exist only because you didn’t trust the model. You’ll find more than you expect.

And it isn’t just code. It’s the dashboards nobody opens on the second Monday, the 33 scheduled jobs that are 33 alarms for a worker who already shows up on time, the approval processes that duplicate a review the model already does. All of that infrastructure feels like control. Most of the time it’s a cage.

The new waste isn’t writing little, it’s over-policing

The reason for the shift is economic, and it’s the part the 36-year instinct doesn’t process. The model used to be expensive and code cheap, so you wrote a lot of code to ration the model, to call it carefully and sparingly. Today both halves of that equation have flipped. The model is cheap, it gets cheaper every quarter, and it’s capable enough to write the code itself. So you stop writing code to babysit it. You instruct it in plain language and let it write the minimal code that’s actually needed.

Tan names the new unit: the skillpack. It isn’t a loose prompt that evaporates. It’s instructions in markdown, plus the minimal code they need, plus unit and integration tests. That last part is what matters. The tests are what separate a skillpack from vibe coding: vibe coding is a feeling with no coverage, while a skillpack has tests that let it change without breaking. The behavior lives in language you can edit, not in logic frozen the day you wrote it.
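As a rough picture of those three parts in one place, here is a minimal sketch. The layout, the `triage-ticket` skill, and every name in it are my assumptions for illustration; Tan's piece does not prescribe this format, and a real skillpack would also carry integration tests against the live model, which I leave out here.

```python
import unittest

# 1. The instruction: behavior in plain language you can edit.
SKILL_MD = """\
# Skill: triage-ticket
Read the support ticket and reply with exactly one label:
billing, bug, or other.
"""

ALLOWED_LABELS = {"billing", "bug", "other"}

# 2. The minimal code: assemble the prompt, normalize the reply.
def build_prompt(ticket: str) -> str:
    return f"{SKILL_MD}\nTicket:\n{ticket.strip()}\n"

def parse_label(model_reply: str) -> str:
    label = model_reply.strip().lower()
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {model_reply!r}")
    return label

# 3. The tests: what lets the instruction change without breaking.
class TriageSkillTest(unittest.TestCase):
    def test_prompt_contains_instruction_and_ticket(self):
        prompt = build_prompt("  I was charged twice  ")
        self.assertIn("Skill: triage-ticket", prompt)
        self.assertIn("I was charged twice", prompt)

    def test_label_is_normalized(self):
        self.assertEqual(parse_label(" Billing\n"), "billing")

    def test_unknown_label_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_label("refund")
```

Running `python -m unittest` on a file containing this sketch exercises the three tests. The point of the shape is that editing `SKILL_MD` changes behavior, while the tests tell you whether the code around it still holds.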

A month ago I wrote, from a different angle, that the runtime had become a commodity and the moat was the workflow you built on top. This is the next step, and it’s more uncomfortable: it isn’t only that the moat moved up. It’s that, while it moved, you were probably still bolting on bars. The piece I wrote about how code is not cheap pointed at the same thing from the accounting side: every line is a liability someone maintains. The defensive line that wasn’t needed is the most expensive liability of all, because it costs you to write, costs you to maintain, and on top of that slows down the worker you were trying to help.

What we do about it at IQ Source

When we walk into an operation that already added AI, half of what gets presented as rigor turns out to be cage. Validators repeating what the model already does. Dashboards nobody reads. Retry loops hiding a badly written prompt instead of fixing it. All of it was built with good intentions, and all of it has to be maintained forever.

The concrete work is separating the rail from the cage. A rail bears weight: it’s the validation at the boundary where data enters the system, the control that stops an agent from deleting something unrecoverable, the test that protects a real business decision. A cage only provides a feeling of control: it’s the line that exists because someone, at some point, didn’t trust. AI Maestro is the discovery where that distinction gets made process by process, before you keep building on top. And Tech Partner is the role that keeps that decision alive once the system is running in production, because the cage grows back on its own if nobody prunes it.
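The rail half of that distinction can be sketched in a few lines. This is an assumption-heavy illustration, not how AI Maestro or any specific agent framework works: `Action`, `BlockedAction`, and `rail` are hypothetical names, and the single rule shown is the one from the paragraph above, stopping a delete that cannot be undone.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    """A hypothetical description of what an agent wants to do."""
    verb: str                 # e.g. "read", "update", "delete"
    target: str
    has_backup: bool = False

class BlockedAction(Exception):
    """Raised when the rail refuses an action."""

def rail(action: Action) -> Action:
    """Load-bearing check: refuse deletes that cannot be recovered."""
    if action.verb == "delete" and not action.has_backup:
        raise BlockedAction(
            f"refusing unrecoverable delete of {action.target}"
        )
    return action

# Everything else passes through untouched.
rail(Action("read", "invoices/2026"))
rail(Action("delete", "tmp/cache", has_backup=True))

try:
    rail(Action("delete", "customers"))
except BlockedAction as exc:
    print(exc)   # refusing unrecoverable delete of customers
```

One rule, tied to one irreversible outcome. A check that cannot name the damage it prevents is a candidate for the cage pile.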

John Sjölander summed up the other face of this in a single line about Tan’s piece: the scarce resource became clarity, taste, and judgment, and the engineer who writes the least code is often the one building the most. It’s true, and it’s hard to swallow for anyone who learned to measure their worth in lines. I learned that way. Tan learned that way. The good news is that the craft doesn’t disappear, it relocates: from the one who writes the most to the one who knows what not to write.

So before your next architecture review, ask one question about your own system. If you removed today all the code that exists only because you don’t trust the model, what would be left, and would it still work? If the answer makes you uncomfortable, you didn’t build an application. You built a surveillance factory around a worker who already knew how to do the job.
