
AI Killed Execution. The Bottleneck Is Now You.

Simon Willison is wiped out by 11am directing agents. Andreessen says execution is dead. The bottleneck your company faces just moved.


Ricardo Argüello

CEO & Founder

AI & Automation 8 min read

Simon Willison co-created Django, coined the term “prompt injection,” and has been shipping production software for 25 years. He no longer types 95% of his code. He prompts agents from his phone while walking his dog on the beach in Half Moon Bay.

On Lenny Rachitsky’s podcast last week, he explained what that actually feels like in practice: “Using coding agents well is taking every inch of my 25 years of experience as a software engineer, and it is mentally exhausting. I can fire up four agents in parallel and have them work on four different problems, and by 11am I am wiped out.”

The quote went viral because it punctured a specific fantasy that a lot of leadership teams have been holding onto: the idea that AI agents make engineering teams more productive at lower cost. Willison’s experience tells a more complicated story: the coding bottleneck dissolved, but the cognitive bottleneck of directing that code turned out to be worse.

The vision vs. the morning after

The same week Willison’s interview aired, Marc Andreessen added fuel from a different angle: “I’m not sure there will even be a salient concept of a programming language in the way that we understand it today.” According to Andreessen, programming languages won’t decline or evolve — they’ll stop being a relevant concept altogether. He compared the keyboard to the plow, noting that 99% of humanity was once behind one, and the world spent generations worrying about what would happen when farming disappeared. His answer: everything worth doing.

There’s enough truth in the vision to be dangerous if you act on it literally. Andreessen describes where we’re headed. Willison has been living the transition for months and reporting every bruise along the way. What the transition actually feels like in practice is a lot more exhausting than any pitch deck will tell you.

The bottleneck moved to a place most teams aren’t measuring

Willison calls this the single biggest shock in software engineering right now. “The thing that used to take the time takes way less time. Now the bottlenecks are everywhere else.”

A spec handed to your engineering team used to come back three weeks later, maybe. Now it comes back in three hours. Willison no longer needs four-hour uninterrupted blocks to code. He needs two minutes every now and then to prompt his agent.

The work shifted into a form that’s harder to see from a management dashboard.

Willison now prototypes three different versions of every feature because the cost is almost nothing. The hard part comes after: deciding which version is worth pursuing, holding context architecture in his head while four agents work on different problems simultaneously, noticing when one of them took a shortcut that passes the tests but sidesteps the underlying issue. None of that shows up in a “lines of code generated” metric.

“There is a limit on human cognition,” he says. “Even if you’re not reviewing everything they’re doing, just how much you can hold in your head at one time.”

What struck me about the interview wasn't just the burnout. It was that Willison describes engineers losing sleep, waking at 4am to set off more agents, treating the tools with what he calls "an element of sort of gambling and addiction." His own New Year's resolution flipped from "focus more, take on less" to "take on more, be more ambitious." He says it's fun. He says the brain exhaustion has been "a really big surprise." I've seen the same pattern in CTOs I work with: the ones who adopted agents earliest are the ones who look the most tired.

Who benefits, who gets squeezed

ThoughtWorks ran an offsite with engineering VPs from different companies and published the findings in their Looking Glass 2026 report. They came back with a finding that should concern any CTO planning headcount. AI benefits two groups clearly: experienced engineers and brand-new ones. The middle gets squeezed.

Seniors have decades of pattern recognition to amplify. Willison describes looking at a problem and knowing it’s a one-sentence prompt, or knowing it’s a three-week problem the agent can’t touch. That judgment comes from 25 years of watching systems succeed and fail. The agents amplify it.

Juniors benefit from faster onboarding. Willison mentions that Cloudflare and Shopify each hired a thousand interns in 2025 because the ramp-up that used to take a month now takes a week with AI assistants. For someone starting fresh, the tools remove the friction of getting productive.

Mid-career engineers are in a different position. Five to ten years of experience — enough to be past the onboarding boost, not enough to direct with the kind of architectural intuition that makes agents really productive. Willison and the ThoughtWorks VPs both landed on this as the group most at risk, which matches the engineering bifurcation pattern I’ve been writing about. The work didn’t disappear for mid-career engineers. It reshaped itself into something their current skills don’t quite fit.

Levie’s organizational argument

Aaron Levie, CEO of Box, read the Willison quote and added the structural layer that was missing.

Companies don’t love being inefficient. They have managers because eventually you hit the limits of how much context one person can hold to produce useful work. So you delegate. And the person you delegate to handles their sub-context.

“For now, agents are generally only as effective as the context they’re provided, the tools they have access to, the human’s ability to keep them on track or review their work,” Levie wrote.

Agents haven’t broken free of those cognitive limits. Not yet. And Levie makes a point that should matter to every CTO planning headcount: “This is also generally why the jobs arguments from those who think people go away will be wrong.”

Directing agents is still management. You’re trading one type of cognitive load for another. From “doing the work” to “knowing what to ask for and whether the output is right.” The mental effort of directing is genuinely different from executing, and from what I see working with teams deploying agents, it’s often harder. The skills it demands — architectural judgment, pattern recognition across parallel workstreams, knowing when to stop — are exactly the skills most teams never had a reason to develop systematically until now.

What companies haven’t built yet

I’ve been working with B2B companies deploying agents for over a year, and the gap I keep running into has nothing to do with model quality or tooling. The technology works fine. The gap is organizational. Nobody sat down to define who directs the agents on a daily basis. Nobody measured how much cognitive load that direction requires. And nobody has a contingency plan for when the one person who holds all the context takes a sick day or quits.

Most companies still pitch AI internally as a productivity tool — deploy agents, get more done with the same headcount. That narrative holds up on a slide. It collapses when the senior engineer directing six agents hits a wall at 11am on a Tuesday and there’s nobody else on the team who can step in because they don’t have enough context on what those agents are doing.

What I think companies need is an agent operator role with actual defined responsibilities and staffing coverage, the same way you’d staff any critical operational function. They need evaluation frameworks focused on output quality rather than volume, because a thousand lines of code from an agent can be worth less than fifty if the direction was wrong. And they need cognitive load distribution plans so the ability to direct agents gets spread across multiple people instead of concentrating in whoever happens to be the most senior.

Willison puts it simply: “Using these tools effectively is not easy. That’s one of the great misconceptions in AI. It takes a lot of practice and it takes a lot of trying things that didn’t work.”

Speed of building used to be the competitive edge. I think that edge has shifted to something harder to acquire: the organizational capacity to direct what gets built, and to sustain that capacity over quarters rather than sprints.

A practical check

If you’re a CTO or VP of Engineering, there’s one question worth asking your team this week: who on the team can evaluate whether an agent’s output is architecturally correct? That’s a different question from who uses agents the most or who generates the most code, and the answer will probably be a shorter list than you’d like. Those people are your direction capacity. If it’s concentrated in one or two individuals, you’ve got a single point of failure that nobody put on a risk register.

The other metric I’d pay attention to is weekly cognitive drain — how exhausted is your team by Friday after a full week of directing parallel agents? If the answer is “completely,” the productivity calculation you made when you bought the licenses needs revising. You’re running your best people at a pace they can sustain for months of novelty, not years of production.

I’ve been doing agent direction capacity assessments for companies deploying AI at scale, focused on mapping whether your team can sustain the cognitive load of directing agents over the medium term. If you’re deploying agents and haven’t mapped the human side of the equation, reach out with a description of your current setup. We’ll show you where the concentration risk is.

