AI writes half your code and nothing ships faster
Ricardo Argüello — June 7, 2026
CEO & Founder
General summary
AI already writes a huge share of commits on many teams, close to half. And yet time to production hasn't moved. The reason is that we solved code writing, but the chain that comes after it (tests, CI, review, delivery) didn't grow at the same pace. AI amplifies whatever system it runs through: if your delivery chain is messy, all you get is shipping the mess faster. The bottleneck shifted downstream, and most teams are still optimizing the step that stopped being the constraint.
- AI writes close to half the commits on many teams, but time to production didn't change: the time saved while writing is lost downstream in review, rework, and coordination.
- The mechanism is concrete: more commits run the full test suite more often, flaky tests that failed once a week now fail daily, and a single bad test blocks every merge queued behind it.
- The data backs it: 69% of heavy AI users report frequent deployment problems, nearly half say manual QA and remediation work increased, and only 21% of teams stand up delivery pipelines quickly (TechRadar).
- 93% of developers use AI, but measured productivity gains hover around 10%, because AI amplifies the whole system, including the downstream bottleneck (ShiftMag).
- AI Maestro from IQ Source maps the full value stream (write, test, review, deliver) before multiplying the input, so cheap code doesn't slam into the weakest link in delivery.
Picture a kitchen that can suddenly chop and prep ingredients ten times faster. It sounds like it will serve ten times the plates. But if there's one oven, one person plating, and one door for the waiters, the kitchen doesn't serve faster: it just piles prepped ingredients on the counter until they spoil. AI did exactly that to code. It speeds up the writing, but the plates still leave through the same narrow door: tests, review, and delivery. The bottleneck didn't disappear; it just moved.
AI-generated summary
AI already writes a huge share of the code. On many teams, close to half the commits. And still, the time it takes a feature to reach production has barely moved.
That should be strange. If the most visible part of the work got ten times faster, why is the total clock the same?
Here is the thesis in one line. We solved code writing, but the chain that comes after it didn’t grow with it. AI amplifies whatever system it runs through. If your delivery chain is messy, all you get is shipping the mess faster. The bottleneck didn’t disappear. It moved downstream, and most teams are still optimizing the step that stopped being the constraint.
Code got cheap. Delivery didn’t.
Priyanka Vergadia framed it this week with a clarity worth repeating: AI writes nearly half the commits on most teams, and time to production hasn’t changed at all. The savings from writing evaporate downstream, in waiting, rework, and coordination.
The data backs the observation. Per a report covered by TechRadar, 69% of heavy AI tool users report frequent deployment problems with the generated code, nearly half say manual quality assurance, remediation, and validation work has increased, and only 21% of teams can stand up build and deployment pipelines quickly. Code goes in faster. Everything after it clogged up.
I have been shipping software for decades, and if there is one thing every productivity jump teaches you, it is that the limit doesn’t get removed. It relocates. Speeding up one part of the chain doesn’t speed up the whole: it just exposes the link you didn’t touch. This time that link is delivery.
A flaky test no longer fails once a week, it fails daily
The mechanism is concrete, and anyone who runs a CI pipeline will recognize it. More commits run the full test suite far more often. A flaky test, the kind that sometimes passes and sometimes fails with nobody touching anything, stops being a weekly annoyance and starts failing every day. And a single bad test is enough to block every merge waiting in line behind it.
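The arithmetic behind that shift can be sketched in a few lines. This is a minimal illustration, assuming each suite run trips the flaky test independently with a fixed probability. The 0.4% failure rate, the five-day week, and the run counts are hypothetical numbers chosen for the example, not measurements from any team.

```python
def expected_flaky_failures(runs: int, fail_probability: float) -> float:
    """Expected number of spurious failures across `runs` independent suite runs."""
    return runs * fail_probability


def chance_of_at_least_one_failure(runs: int, fail_probability: float) -> float:
    """Probability that at least one of `runs` independent runs fails spuriously."""
    return 1.0 - (1.0 - fail_probability) ** runs


# Assumption for illustration: the test fails spuriously in 0.4% of runs.
FAIL_PROBABILITY = 0.004
WORKING_DAYS_PER_WEEK = 5  # assumption

for runs_per_day in (50, 200):
    per_week = expected_flaky_failures(
        runs_per_day * WORKING_DAYS_PER_WEEK, FAIL_PROBABILITY
    )
    daily_chance = chance_of_at_least_one_failure(runs_per_day, FAIL_PROBABILITY)
    print(
        f"{runs_per_day} runs/day: about {per_week:.1f} spurious failures a week, "
        f"{daily_chance:.0%} chance of at least one on a given day"
    )
```

Under these assumptions the same test goes from about one spurious failure a week to about four, without anyone changing the test. Only the number of runs changed.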
On top of that comes the invisible cost: every pipeline rerun triggered by a failure that wasn’t real burns compute time and, worse, burns the team’s patience. PR review becomes the new funnel, because now there is twice the volume of changes to review and each one carries code the reviewer can’t approve at a glance.
None of these problems is new. What’s new is the volume. A crack in delivery that held fine at fifty commits a day becomes a dam at two hundred. AI didn’t create the problem. It amplified it until it could no longer be ignored.
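A toy model makes the fifty-versus-two-hundred contrast visible. It is a sketch under a strong simplifying assumption: the delivery chain clears a fixed number of changes per day, and anything beyond that waits. The capacity of 60 changes a day is a hypothetical figure.

```python
def backlog_over_days(
    arrivals_per_day: int, capacity_per_day: int, days: int
) -> list[int]:
    """Track the end-of-day queue of changes waiting on delivery.

    Each day `arrivals_per_day` changes join the queue, and the chain
    (tests, review, deploy) clears at most `capacity_per_day` of them.
    """
    backlog = 0
    history = []
    for _ in range(days):
        backlog = max(0, backlog + arrivals_per_day - capacity_per_day)
        history.append(backlog)
    return history


# Assumption for illustration: the chain can clear 60 changes a day.
CAPACITY = 60

print(backlog_over_days(50, CAPACITY, days=5))   # under capacity: queue stays empty
print(backlog_over_days(200, CAPACITY, days=5))  # over capacity: queue grows daily
```

At fifty commits a day the queue never forms, which is why the crack went unnoticed. At two hundred, the same chain falls 140 changes further behind every day.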
AI amplifies the system, including the broken one
There is one statistic that sums up the whole thing. Per ShiftMag, 93% of developers use AI, but measured productivity gains hover around 10%. The gap between those two numbers is exactly the mess in the delivery chain.
Because AI is not a selective accelerator that only improves the good parts. It is an amplifier. It turns up the volume on the entire system it passes through. If your testing process is solid, it makes it fly. If it’s fragile, it multiplies the failures. If your review was already saturated, it drowns it. Accelerating the input of a messy system doesn’t give you a fast system. It gives you a system that produces mess faster.
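One way to see how a very fast writing step can coexist with a small overall gain is an Amdahl's-law style estimate. This is my own illustration, not the method behind the figures cited above: it assumes lead time splits cleanly into a writing share and everything else, and that only the writing share gets faster. The shares below are hypothetical.

```python
def overall_speedup(writing_share: float, writing_speedup: float) -> float:
    """Amdahl-style end-to-end speedup when only the writing step accelerates.

    writing_share: fraction of total lead time spent writing code (0 to 1).
    writing_speedup: how many times faster writing becomes.
    """
    if not 0.0 <= writing_share <= 1.0:
        raise ValueError("writing_share must be between 0 and 1")
    if writing_speedup <= 0:
        raise ValueError("writing_speedup must be positive")
    remaining_time = (1.0 - writing_share) + writing_share / writing_speedup
    return 1.0 / remaining_time


# Assumption for illustration: writing becomes 10x faster.
for share in (0.1, 0.3, 0.5):
    gain = overall_speedup(share, 10.0) - 1.0
    print(f"writing is {share:.0%} of lead time -> about {gain:.0%} faster overall")
```

If writing is only a tenth of the total lead time, a tenfold speedup in writing yields roughly a 10% gain end to end. The rest of the clock belongs to the steps nobody accelerated.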
I’ve written before that the bottleneck moved and that building got cheap while deciding what to build didn’t. This post looks at the same shift from a more concrete angle, one that’s easier to ignore: it’s not only about deciding what to build, it’s about what happens to everything built when it hits the delivery chain at ten times the old speed. You can decide perfectly well and still choke on a fragile CI pipeline. I see it all the time.
What we do about it at IQ Source
When a company asks us to accelerate its development with AI, we don’t look only at the code-writing step. We look at the whole chain: write, test, review, deliver. Because multiplying the input of a chain without knowing where its weakest link sits is the most expensive way to find that link.
AI Maestro is the discovery where that full value stream gets mapped before anyone touches the accelerator. We identify where delivery will break when commit volume spikes: the flaky tests that hold today, the suite that already runs too long, the review that depends on two people. And we reinforce that link before turning up the speed, not after the first jam in production. It’s the same principle I wrote about when a failure looked like the agent’s and was actually the architecture’s: the problem is almost never where everyone is looking.
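The underlying idea, stripped to its simplest form, is that a chain moves no faster than its slowest stage. The sketch below is not AI Maestro's actual method or tooling. It is a hypothetical illustration with made-up stage names and capacities, showing how a mapped value stream points at the link that will break first when volume rises.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    capacity_per_day: int  # changes this stage can handle per day (assumed)


def weakest_link(stages: list[Stage]) -> Stage:
    """Return the stage with the lowest daily capacity."""
    return min(stages, key=lambda stage: stage.capacity_per_day)


def overloaded_stages(stages: list[Stage], commits_per_day: int) -> list[Stage]:
    """Return every stage that cannot keep up with the given commit volume."""
    return [stage for stage in stages if stage.capacity_per_day < commits_per_day]


# Hypothetical value stream; every number here is an assumption.
VALUE_STREAM = [
    Stage("write", 400),
    Stage("test", 120),
    Stage("review", 60),
    Stage("deliver", 90),
]

print("weakest link:", weakest_link(VALUE_STREAM).name)
for volume in (50, 200):
    names = [stage.name for stage in overloaded_stages(VALUE_STREAM, volume)]
    print(f"{volume} commits/day overloads: {names or 'nothing'}")
```

With these made-up numbers nothing is overloaded at fifty commits a day, while two hundred overwhelms test, review, and deliver at once, with review giving way first.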
The next time someone celebrates that AI now writes half the team’s code, ask one question before the party: does that half reach production faster, or does it just pile up in a review-and-test queue nobody made bigger? If it’s the second, you didn’t speed up delivery. You just put more pressure on the same narrow door.
Find your delivery’s weak link before you accelerate
Frequently Asked Questions
Because AI only sped up writing, which is one step in the chain. Time to production also depends on tests, CI, review, and delivery, and those steps didn't grow at the same pace. The time saved while writing is lost downstream in slower reviews, rework, and coordination, so the total clock barely moves.
A flaky test is one that sometimes passes and sometimes fails without the code changing. It gets worse with AI because the team ships far more commits, the full suite runs more often, and a test that used to fail once a week now fails daily. A single flaky test can block every merge waiting behind it.
The bottleneck shifted from writing code to delivering it: tests, CI, PR review, and deployment. AI amplifies whatever system it runs through, so a higher volume of code exposes the weakest link in the delivery chain instead of removing it. Optimizing only the writing step no longer speeds anything up.
AI Maestro from IQ Source maps the full development value stream, not just the code-writing step, and identifies where delivery will break when commit volume multiplies: flaky tests, slow suites, saturated review. That weak link gets reinforced before the input is accelerated, so cheap code doesn't end up stuck on its way to production.