What is tokenmaxxing and how did one company spend $500 million on AI in a month?

Tokenmaxxing is maximized AI usage that spirals out of control when employees generate prompts with no oversight or limits. Per an Axios report, one company spent over $500M on Claude in thirty days by granting unrestricted access with no budgets or monitoring. The problem was not the token price; it was the absence of scope and control.

Does a spending cap actually control AI costs in a company?

A spending cap slows the bill, but it is a blind instrument: it cannot separate tokens that create value from tokens that are wasted, and it cuts everyone the same. It works as an emergency brake, not a strategy. Controlling AI costs requires defining the scope and expected return of each process before turning an agent loose.

How does IQ Source keep a company's AI bill from spiraling out of control?

AI Maestro from IQ Source defines, in a two-month discovery, which processes are worth automating and how much return each should produce per token spent. By measuring token yield before building, the company sizes the spend per process and never needs a panic cap after receiving an impossible bill.


A $1,500 cap on AI treats the symptom, not the cause

Ricardo Argüello


Why did Uber put a $1,500 cap on its employees' AI spending?

Uber burned its entire 2026 AI budget in just four months because of token consumption from agentic coding tools like Claude Code. Before the cap, individual engineers were generating bills of $500 to $2,000 a month. The $1,500 per-person, per-tool cap is meant to contain the spend, though it does not address the underlying cause.

Ricardo Argüello — June 6, 2026


CEO & Founder


General summary

In a single week, Uber capped AI tool spending at $1,500 per person after burning its entire 2026 budget in four months, and it came out that one company spent over $500M on Claude in thirty days because it set no limits at all. Everyone's reaction is the same: cap it. But the cap treats the symptom. The disease is turning agents loose on work nobody sized: no one defined how much return each process should produce per token spent. A cap cannot tell which tokens were working. The fix is not to budget after the fact. It is to scope before you build.

AI-generated summary


Two things happened in the same week. Uber capped spending at $1,500 per person on each AI tool, after burning its full 2026 budget in four months. And it came out that one company spent over $500M on Claude in a single month, because it set no limits at all.

Half the internet reacted the same way: cap it. As an emergency brake, fair enough. As a strategy, it treats the symptom.

Here is the thesis in one line. The cap is not the cure. The disease is not the token price; it is turning agents loose on work nobody sized. No one defined how much return each process should produce per token spent, so when the bill lands, all that is left is to cut blindly. The fix is not to budget after the fact. It is to scope before you build.

The month that cost $500 million

The half-billion-dollar story is the scary one, so start there. Per an Axios report, one company burned more than $500M on Claude in thirty days. Not through a signed contract, but by giving employees unrestricted access: no budgets, no quotas, no monitoring. A consultant called it tokenmaxxing, the maximized usage that spirals once people generate prompts with no oversight.

Uber’s case is less dramatic but more instructive, because it is a disciplined company that still crashed. It burned its entire 2026 AI budget in the first four months. Before the cap, individual engineers were generating bills of $500 to $2,000 a month in token consumption. The answer was a $1,500 cap per person on each agentic coding tool, a usage dashboard for every employee, and a process to request more when needed.

It is a sound containment move. Uber did the right thing to stop the bleeding. But look at what a cap cannot do.

A cap can’t tell which tokens were working

A cap is a blind instrument. It cuts spend, but it cannot separate the token that closed a problem from the token wasted running in circles. It sets the same ceiling for the engineer who resolved three incidents and the one who left an agent looping all weekend.

Someone in this week’s conversation put it better than any consultant: nobody caps spend on something they can measure. The $1,500 cap is not a verdict on AI’s value. It is an admission that the company could not see which tokens were working.

And here is the part almost nobody wants to say out loud. The real problem is not the price per token, which has in fact been falling for years. The problem is that most companies turned agents loose on processes they never sized. They never asked, before starting, how much useful work this process should produce per dollar of tokens. Without that number, there is no way to know whether an $1,800 bill is a robbery or a bargain. All that is left is the reflex to cut.
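To make the "robbery or bargain" question concrete, here is a minimal Python sketch. It assumes a per-process target, agreed before building, for the most you would pay per useful outcome. The names `ProcessScope` and `judge_bill` and every dollar figure are hypothetical illustrations, not anything taken from Uber's or IQ Source's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessScope:
    """Target set before the agent runs (hypothetical structure)."""
    name: str
    max_usd_per_outcome: float  # most we agreed to pay per useful result


def judge_bill(scope: ProcessScope, bill_usd: float, outcomes: int) -> str:
    """Compare a bill against the scoped target instead of a flat cap."""
    if outcomes <= 0:
        return "waste"  # tokens were spent and nothing useful came out
    usd_per_outcome = bill_usd / outcomes
    if usd_per_outcome <= scope.max_usd_per_outcome:
        return "bargain"
    return "overspend"


# Illustrative numbers only: the same $1,800 bill, three different verdicts.
scope = ProcessScope(name="incident triage", max_usd_per_outcome=50.0)
print(judge_bill(scope, 1800.0, 60))  # $30 per outcome
print(judge_bill(scope, 1800.0, 12))  # $150 per outcome
print(judge_bill(scope, 1800.0, 0))   # nothing delivered
```

A flat cap would treat all three bills identically. The verdict only changes once a target exists to compare against.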

Token spend now looks like cloud spend

Arvind Jain, CEO of Glean, keeps making a point worth hearing: token spend is an architecture problem, not just a model problem. His company reports clients whose annual AI budget runs out in one or two months, and his thesis is that the right question is not how many tokens a system consumes, but how much useful work it produces per token. He calls it token yield, and it depends on everything around the model: how context is retrieved, how models are routed, how prior work is reused.
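Token yield can be sketched as a simple ratio. The snippet below assumes each agent run is logged with its token count and a count of useful work units (tickets closed, features shipped, whatever the process defines). The `Run` record, the `token_yield` helper, and the sample numbers are assumptions for illustration, not Glean's definition or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    workflow: str
    tokens: int        # total tokens consumed by the run
    useful_units: int  # units of useful work the run produced


def token_yield(runs: list[Run]) -> dict[str, float]:
    """Useful work units per million tokens, grouped by workflow."""
    tokens: dict[str, int] = {}
    units: dict[str, int] = {}
    for run in runs:
        tokens[run.workflow] = tokens.get(run.workflow, 0) + run.tokens
        units[run.workflow] = units.get(run.workflow, 0) + run.useful_units
    return {
        wf: (units[wf] * 1_000_000 / tokens[wf]) if tokens[wf] else 0.0
        for wf in tokens
    }


# Hypothetical run log: two workflows with very different yields.
runs = [
    Run("code-review", tokens=2_000_000, useful_units=40),
    Run("code-review", tokens=3_000_000, useful_units=60),
    Run("log-triage", tokens=10_000_000, useful_units=5),
]
yields = token_yield(runs)
for workflow, value in yields.items():
    print(f"{workflow}: {value} useful units per million tokens")
```

In this made-up log, the workflow that consumes the most tokens produces the least work per token, which is the kind of gap a raw consumption figure hides.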

It is the same transition the cloud went through fifteen years ago. Once agents can retry, loop, browse, and spawn sub-agents, spend stops looking like a fixed monthly license and starts looking like a cloud bill: variable, with no natural ceiling, and dangerous if nobody sets budgets and traces per workflow. The difference is that with the cloud we learned to measure cost per service before letting it run. With AI, too many companies skipped that step.
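Setting budgets and traces per workflow, rather than per person, might look like the following sketch. `WorkflowBudget`, its fields, and the dollar amounts are hypothetical. A real system would persist the trace and take costs from the provider's usage data instead of hard-coded values.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a step would push a workflow past its budget."""


@dataclass
class WorkflowBudget:
    workflow: str
    limit_usd: float
    spent_usd: float = 0.0
    trace: list[tuple[str, float]] = field(default_factory=list)

    def charge(self, step: str, cost_usd: float) -> None:
        """Record one agent step, refusing it if the budget would be exceeded."""
        if self.spent_usd + cost_usd > self.limit_usd:
            raise BudgetExceeded(
                f"{self.workflow}: step '{step}' would exceed ${self.limit_usd:.2f}"
            )
        self.spent_usd += cost_usd
        self.trace.append((step, cost_usd))


# Illustrative run: the third step would take spend to $5.50 against a $5.00 limit.
budget = WorkflowBudget(workflow="invoice-matching", limit_usd=5.0)
budget.charge("retrieve context", 1.5)
budget.charge("first attempt", 2.0)
try:
    budget.charge("retry loop", 2.0)
    stopped = False
except BudgetExceeded:
    stopped = True
```

Because the limit and the trace belong to one workflow, hitting the limit points at a specific process and a specific step, not at whoever happened to be running the agent.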

I have written before that cost was the guardrail you didn’t know you had, and that tokens per shipped feature is the KPI that matters. What changed this week is not the thesis. It is that the market finally reacted, and it reached for the wrong instrument. The cap comes after the spend. The scope comes before it.

What we do about it at IQ Source

When a company asks us to put AI into a process, we do not start with which model or how much budget. We start by asking how much useful work that process should produce, and what we would be willing to pay in tokens for that result. That number, defined before building, is what makes the panic cap unnecessary.

AI Maestro is the discovery where that math gets done. It spends two months mapping the real operation to decide which processes are worth automating and what token yield to expect from each. Out of it comes an AI Opportunity Score and a Go/No-Go gate that, more than once, recommends not turning the agent loose yet, precisely because the numbers do not add up. That is not austerity. It is sizing the spend before you sign for it.

A cap tells you how much you can spend. It does not tell you whether it is worth spending. Those are two different questions, and the second one gets answered before you build, not when the bill lands. The next time someone proposes capping the team’s AI spend, ask the other question first: do we know, per process, how much useful work each dollar of tokens buys us? If the answer is no, the cap will not save you. It will just hide the problem for one more month.

