
Taste Debt: The Real Cost of Removing Yourself From AI

Peter Steinberger named the real failure mode of agentic workflows: pulling yourself out too early. The bill that shows up later I call taste debt.


Ricardo Argüello

CEO & Founder

Business Strategy · 6 min read

Peter Steinberger, the Austrian developer behind OpenClaw, posted something yesterday that reads more like a CFO memo than a founder tweet:

The real failure of agentic workflows comes when people remove themselves too early and expect quality without human taste in the loop. Strong output needs vision + steering + the right questions.

I had to sit with it for a minute. Then I wrote down the term I had been circling for weeks with clients: taste debt.

Steinberger doesn’t use the phrase. But he describes the mechanism exactly. Every agentic artifact your company ships without human taste adds to a balance your dashboards don’t show. The bill arrives in quarter two.

Why call it debt

The parallel to technical debt isn’t decorative. It’s structural.

Technical debt is a conscious trade: you surrender maintainability for speed. You know you’ll pay interest. Your engineering metrics track the payment: recurring bugs, fragile deploys, a codebase nobody wants to touch.

Taste debt trades the same way, but the interest doesn’t post to any dashboard you already read. It flattens your brand voice into something generic. It lets decisions get made without real judgment. And eventually a client notices, usually before you do, that nobody is watching the output anymore.

Rick Rubin — the producer behind Johnny Cash, Adele, and System of a Down — teamed up with Anthropic in 2025 on The Way of Code, a reinterpretation of the Tao Te Ching for the AI era. One line compresses the thesis: AI collapses execution time, not taste. Taste stays human because taste is a discipline, not magic.

Trung Phan unpacked it well: Jobs, Rubin, anyone who has kept a recognizable voice across decades trained a muscle. That muscle doesn’t live in any model. It gets trained by rejecting, revising, and asking again. Pull it out of the loop and your output regresses to the mean. The mean of 2026, across most channels, is noise that other machines generated.

How the interest compounds

Taste debt isn’t linear. It compounds, and it compounds fastest in three loops.

The content loop. Your agent drafts a follow-up email. Your agent drafts the next one. Your agent drafts the next proposal. Each piece is individually acceptable. Six months in, your brand voice has flattened into the same tone as three competitors using the same model with the same prompt. You didn’t see the moment it happened.

The decision loop. An agent triages tickets, classifies, responds, escalates. The rule looks solid. But the three percent of cases it escalates poorly never show up in the top-line metric. That three percent becomes the Q4 churn you will rationalize as “the market.” It wasn’t. It was an accrual of unreviewed microdecisions.

The training loop. Subtler. If you feed models text generated by other models, the distribution degrades. Decisions.com captured the pattern well: ungoverned agent ecosystems don’t fail loudly. They fail quietly. Duplicated logic here, conflicting outputs there, a prompt that worked yesterday behaving differently today. You end up with a system optimizing against itself.

Across the three loops the shape is the same: you save minutes today and pay the interest all at once, right when renegotiating is no longer an option.
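To make the compounding concrete, here is a toy model in Python. Every number in it, the retention factor and the monthly volume alike, is an illustrative assumption; the point is the shape of the curve, not the values.

```python
# Toy model of compounding taste debt: each unreviewed artifact becomes
# the implicit reference for the next one, so brand-voice distinctiveness
# decays geometrically, not linearly. All numbers are assumptions.

voice = 1.0                 # distinctiveness of the brand voice, normalized
RETENTION = 0.999           # assumed: each unreviewed pass keeps 99.9% of it
ARTIFACTS_PER_MONTH = 40    # assumed volume of agent-shipped pieces

for month in range(1, 7):
    voice *= RETENTION ** ARTIFACTS_PER_MONTH
    print(f"month {month}: distinctiveness {voice:.3f}")

# Six months in, roughly a fifth of the voice is gone, and no single
# artifact ever looked wrong on its own.
```

The exact decay rate is unknowable in advance. What the model captures is that each skipped review multiplies into the next one, which is exactly why the bill posts all at once.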

The self-automator underperforms

This is measurable, not philosophical.

Ethan Mollick, together with researchers at Harvard Business School and Boston Consulting Group, ran a field experiment with 758 consultants. They sorted behavior into three modes:

  • Centaurs divide tasks cleanly. AI does X, the human does Y.
  • Cyborgs fuse. Start a sentence, let the AI finish, edit, and loop.
  • Self-automators delegate almost everything. The human exits the loop.

MIT Sloan highlighted the finding: self-automators produced the weakest work. Not because the AI was bad. Because they stepped into what Mollick calls the jagged frontier — the zone where the model sounds competent and lands a confident answer but misses the core of the problem. Without a human reviewing with trained taste, nobody caught the miss.

That’s taste debt measured in a peer-reviewed paper.

Karpathy already paid some of this in public

The market is amortizing a chunk of this debt out loud.

Andrej Karpathy retired the term “vibe coding” in February and started defending something he calls agentic engineering. His argument: you aren’t writing code 99 percent of the time, you are orchestrating agents and acting as oversight. There is craft and expertise in the oversight itself.

This is the same person who put “vibe coding” on the map a year earlier, now walking it back. Not because the method was wrong. Because people adopted it without the ingredient that made it work: somebody with taste reviewing the output.

It’s a public admission that the default was misconfigured.

Lutke had it right in the memo

Tobi Lutke’s April 2025 Shopify memo circulated hard on LinkedIn and X. Most people quoted the line about reflexive AI usage being a baseline expectation. Few quoted the second half.

The second half is where judgment lives. AI as multiplier, human judgment at the top. Not “AI replaces human.” AI underneath, judgment above. That hierarchy is what pays taste debt down before it accumulates.

A factory that produces fast without QA just builds expensive inventory. Scaling agents without a review layer does the same thing, except the inventory is taste debt and you can’t see it from the warehouse.

What IQ Source does with it

After reading Steinberger, I have a cleaner description for our client work of the last six months: we design the amortization schedule for taste debt.

It isn’t another governance framework. It’s a working method for three decisions a team usually ducks:

  1. Non-negotiable human taste. Acceptance criteria, brand voice, anything with second-order effects. The agent can propose. A human has to sign.
  2. Freedom for the agent. Enumerations, first drafts, research passes, summaries. Forcing human review here is paying interest for no reason.
  3. The review interface itself, the piece most companies shortchange. If reviewing costs more than redoing, nobody reviews. And the debt keeps stacking. A minimal sketch of this split in code follows below.
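Here is a minimal sketch of the three-decision split as an executable policy, in Python. Everything in it is hypothetical: the task categories, the `route` function, and the sign-off hook are illustrative names, not the Team OS API.

```python
# Hypothetical sketch of the three-decision split as a review policy.
# Category names and the sign-off hook are illustrative, not the
# Team OS API.

from dataclasses import dataclass

@dataclass
class Task:
    kind: str     # e.g. "proposal", "summary", "research_pass"
    draft: str    # the agent's output

# Decision 1: non-negotiable human taste. The agent proposes, a human signs.
REQUIRES_SIGNOFF = {"proposal", "brand_copy", "acceptance_criteria"}

# Decision 2: freedom for the agent. Forcing review here is interest
# paid for no reason.
AUTO_APPROVED = {"enumeration", "first_draft", "research_pass", "summary"}

def route(task: Task) -> str:
    if task.kind in AUTO_APPROVED:
        return task.draft
    # Decision 3: the review interface. If this step costs more than
    # redoing the work, nobody uses it and the debt keeps stacking.
    # Anything unclassified also lands here; otherwise debt accrues silently.
    return request_human_signoff(task)

def request_human_signoff(task: Task) -> str:
    # Placeholder for the review surface: in practice this would block
    # shipping until a named human approves or edits the draft.
    print(f"[review queue] {task.kind} awaiting human sign-off")
    return task.draft
```

The design choice that matters is the default: unclassified work falls into review, not into auto-approval, so new task types accrue scrutiny instead of debt.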

That is the scaffolding we implement with Team OS: a governance layer where human taste lives inside the flow, not on top of it. And where the fluidity of roles stops getting confused with the absence of review.

If you are scaling agents across your company and you’ve noticed some of the output no longer sounds like you, you are probably already paying interest. The question is whether you want to design the amortization — or let it collect itself, on a calendar that isn’t yours.

Let’s talk before the quarter closes.


