
Your AI plateaus if it doesn't dream between sessions

Anthropic shipped dreaming in agents on May 6. The difference between AI that compounds and AI that plateaus is architectural, not model-driven.

Ricardo Argüello


CEO & Founder

Business Strategy · 9 min read

Anthropic shipped something Tuesday night that most enterprise AI deployments do not have. They call it dreaming in Claude Managed Agents: a layer that reviews the agent’s past sessions, extracts patterns, and curates persistent memory between conversations. The official Claude announcement crossed 2.8 million views in under twenty-four hours. By the next morning, Aakash Gupta’s thread laying out the neuroscience case had passed 220,000 views.

The line from his thread that should land in every executive committee: “Agents that dream between sessions will compound. The ones still running on raw context window will hit the same ceiling humans hit when they pull all-nighters.”

That is the executive thesis on the table today. The difference between AI that becomes a competitive advantage and AI that becomes a monthly invoice is not the model; it is whether the AI has a sleep layer. The boardroom question is no longer whether the company is using AI; it is whether the AI is compounding month over month or sitting at a ceiling. If your agent today solves the same problems as your agent six months ago, just with more raw context piled on top, the ceiling has already arrived. The model bill keeps growing while real capability flatlines.

RAG (retrieval-augmented generation) is search. You embed past sessions as vectors and pull them when relevant. It works for retrieving facts. It does not work for the three things a learning agent actually needs.

Pattern extraction. Naming what mattered across the last hundred sessions, working out which parts generalize, and spotting the sequences that predicted a good outcome. RAG does none of that. RAG finds. Pattern extraction is what Aakash described as hippocampal replay running at roughly twenty times normal speed: a ten-second sequence compressed to about 500 milliseconds. Wilson and McNaughton demonstrated the replay in rats in 1994; later experiments that disrupted the sharp-wave ripples found the animals fail the next day’s task. The replay is causal, not correlational. Most enterprise AI pipelines run with that causal piece missing entirely.

Trace reorganization. Moving learning from episodic memory (this session) into semantic memory (a reusable schema). In the brain, this moves memories out of the hippocampus and into the neocortex; that is why old memories survive hippocampal damage but recent ones do not. In your agent, this is the difference between “every Monday I re-explain to the assistant how we do things here” and “the assistant already has an internal schema and I only adjust the specifics of Monday’s case.”

Selective forgetting. Pruning what didn’t matter. Tononi and Cirelli’s synaptic homeostasis hypothesis says wakefulness is net potentiating: every interaction during the day pushes synaptic weights up. If you never prune, the signal-to-noise ratio collapses. An agent without a pruning step accumulates noise in its vector store until response quality drops, without anyone noticing the exact moment.
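To make those three operations concrete, here is a minimal sketch of what a between-session consolidation pass could look like. Everything in it (the Session and Schema structures, the consolidate function, the thresholds) is illustrative, not Anthropic’s implementation; it only shows the shape of the step most pipelines are missing.

```python
from collections import Counter
from dataclasses import dataclass, field

# Illustrative structures, not Anthropic's API. A Session is the episodic
# record of one conversation; the Schema is the semantic store it feeds.

@dataclass
class Session:
    steps: list[str]   # tool calls / decisions taken, in order
    succeeded: bool

@dataclass
class Schema:
    patterns: dict[str, int] = field(default_factory=dict)

def consolidate(sessions: list[Session], schema: Schema,
                min_support: int = 3, max_patterns: int = 200) -> Schema:
    """One offline 'sleep' pass: extract, promote, prune."""
    # 1. Pattern extraction: count step pairs that preceded good outcomes.
    wins: Counter = Counter()
    for s in sessions:
        if s.succeeded:
            wins.update(zip(s.steps, s.steps[1:]))

    # 2. Trace reorganization: promote recurring episodic sequences into
    #    the semantic schema (episodic -> semantic).
    for pair, n in wins.items():
        if n >= min_support:
            key = " -> ".join(pair)
            schema.patterns[key] = schema.patterns.get(key, 0) + n

    # 3. Selective forgetting: keep only the strongest patterns so the
    #    signal-to-noise ratio does not collapse as sessions pile up.
    strongest = sorted(schema.patterns.items(), key=lambda kv: -kv[1])
    schema.patterns = dict(strongest[:max_patterns])
    return schema
```

The counting heuristic matters less than the structure: the pass runs offline between sessions, writes to a store the agent reads at startup, and deletes as well as adds. RAG alone does the reading; this is the writing and the forgetting.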

The two agentic architectures that went viral this week name the same gap in different language. Rahul Garg’s post on the Palantir AIP stack (433.6K views) separates the ontology layer and the memory & knowledge services from agent lifecycle. Neha Sharma’s nine-layer reference (550.9K views, 1,200 saves while she was prepping the Claude Architect cert) lists five distinct kinds of memory: short-term, long-term, knowledge base, episodic store and profile store. Not one. Five. Most enterprise deployments today have one type: a vector DB with RAG. The other four layers, where the moat actually lives, are still on the whiteboard.
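For orientation, here is one way those five memory types could map onto code. The names follow Sharma’s list; the field shapes behind them are my assumptions, not her spec.

```python
from dataclasses import dataclass, field

# Illustrative mapping of the five memory types in Sharma's reference
# architecture. Field shapes are assumptions, not her specification.

@dataclass
class AgentMemory:
    short_term: list[str] = field(default_factory=list)           # current context window
    long_term: dict[str, str] = field(default_factory=dict)       # consolidated schemas
    knowledge_base: dict[str, str] = field(default_factory=dict)  # curated facts and docs
    episodic: list[dict] = field(default_factory=list)            # raw past sessions
    profile: dict[str, str] = field(default_factory=dict)         # stable user and org facts

# A vector-DB-plus-RAG deployment populates roughly the knowledge_base
# field and leaves the other four empty. That is the whiteboard gap.
```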

Five cycles of the consolidation layer since 1990

I have been doing this for thirty-six years, since I was fifteen, on a Commodore 64 and a Texas Instruments machine. I have seen this same pattern five times. Each cycle, the consolidation layer is where the winners separated from the companies that paid three times the rollout cost.

1990, data warehouses. They promised pattern extraction through batch ETL. Companies that built the consolidated data mart paid once. Companies that left data in transactional silos paid three times and are still integrating today.

2000, BI and OLAP cubes. Pattern extraction by precomputed aggregations. Whoever had a named owner of the cube went from “monthly report” to “weekly decision.” Everyone else generated PDFs nobody opened.

2010, MLOps and retraining pipelines. Model staleness is the exact equivalent of an agent that doesn’t dream. Teams with mature MLOps compounded capability every release. Teams with a model trained once flatlined by month six, exactly like an agent without dreaming today.

2020, RAG and vector DBs. The current trap. You accumulate embeddings, retrieve on demand, and it looks like learning. It isn’t. It only finds. This week’s viral conversation put that line in bold.

2026, dreaming and between-session consolidation. New cycle, same pattern. Build the sleep layer and you compound. Skip it and you pay the model subscription forever while sitting at the ceiling.

The new variable is compression. The previous cycle (RAG) took roughly four years to reveal its ceiling. This one is revealing its ceiling in months, because Anthropic named the problem in a keynote and shipped the solution the same week. The company that takes two quarters to internalize this will not lose one quarter; it will lose both.

Forgetting is the other half of sleep

Aakash made the second important point in a quote-tweet: dreaming is also about forgetting. Tononi and Cirelli’s synaptic homeostasis hypothesis says sleep down-regulates what wakefulness over-potentiated. Without that down phase, synapses saturate and stop discriminating.

The institutional translation is clean. Failing to kill projects produces technical debt. Failing to kill features produces product debt. The new flavor on the same shelf is AI debt: an agent that never forgets accumulates noise until the response quality drops silently. The hard institutional question is who owns what your agent forgets. If the answer is “nobody, it all lives in logs,” the next twelve months will reveal exactly when the read became unreliable.

There is one more data point worth landing from the Tse et al. paper out of Morris’s lab in 2007. Rats with an existing schema integrate new information in 48 hours, against the three to four weeks consolidation usually takes. The institutional translation: an agent with an explicit schema (a design.md, an ontology, a voice.md) integrates new context roughly ten times faster. That connects to the brand-readable-by-agent argument we covered yesterday: the file the agent reads at the start of every conversation is the equivalent of the prior schema in Morris’s experiment. The agent with a schema accelerates. The agent without one starts from scratch every Monday.
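Operationally, the prior schema is nothing exotic: it is a set of files loaded before the first turn. A minimal sketch, assuming the file names mentioned above (the ontology rendered here as a hypothetical ontology.md) and a hypothetical build_system_prompt helper:

```python
from pathlib import Path

# File names follow the article's examples; adjust to your own schema set.
SCHEMA_FILES = ["design.md", "ontology.md", "voice.md"]

def build_system_prompt(schema_dir: str, task: str) -> str:
    """Prepend the standing schema so each session starts warm, not cold."""
    sections = []
    for name in SCHEMA_FILES:
        path = Path(schema_dir) / name
        if path.exists():
            sections.append(f"## {name}\n{path.read_text()}")
    schema = "\n\n".join(sections) or "(no schema yet: this session starts cold)"
    return f"{schema}\n\n## Today's task\n{task}"
```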

The same principle shows up in adjacent cycles. Boris Cherny killing 80% of his prototypes before noon is the institutional version of selective forgetting: the decision about what stays and what gets pruned calibrates the signal-to-noise ratio of the entire quarter. Without that decision, the product team fills with prototypes nobody will scale and the spend committee loses the read.

Five questions for your next AI standup

Three for the CEO. Two for the CTO. If all five cannot be answered in fifteen minutes, the AI program is flying without instruments.

For the CEO.

Start with the compounding test. Point to one capability your agent has today that it did not have last month. “It has more context” does not count; new capability means a consolidated pattern, not more piled-up logs. If nothing comes to mind, you have an answer already.

Then the continuity test. If you turned off the model for a week, would the institutional memory survive? When the entire AI investment lives inside the model’s context window and the embeddings hosted by the vendor, the asset is not yours; it is the vendor’s, and it leaves with the next contract negotiation.

Last, the named-owner question. Is there a human whose name you can write down, owner of the consolidation layer for the next ninety days? Without a name, the AI program is governance-orphaned, and the first model swap erases the learning.

For the CTO.

The schema test comes first. Is there an explicit schema (design.md, ontology, voice.md, a file the agent reads at startup) against which it integrates new context? Without one, integration takes the equivalent of three to four weeks; with one, 48 hours. That gap is what separates an agent that scales with the business from one that falls short on the next campaign.

The pruning test closes the loop. Is there an explicit step in the pipeline where the agent forgets selectively? If everything stays in the vector DB forever, congratulations, your AI has insomnia, and response quality will degrade silently every quarter without anyone noticing the moment it broke.
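What an explicit forgetting step could look like, as a sketch: a decay-scored eviction pass over the store’s metadata. The half-life and cutoff are placeholders to tune, not recommendations.

```python
import math
import time

# Illustrative eviction pass. Each entry records when it was last
# retrieved and how often; usefulness decays unless retrieval refreshes it.

def keep_score(last_used_ts: float, uses: int, half_life_days: float = 30.0) -> float:
    """Exponentially decayed usefulness, boosted by retrieval count."""
    age_days = (time.time() - last_used_ts) / 86400
    return (1 + uses) * math.exp(-age_days * math.log(2) / half_life_days)

def prune(entries: list[dict], cutoff: float = 0.1) -> list[dict]:
    """The forgetting step: drop entries whose score fell below the cutoff."""
    return [e for e in entries if keep_score(e["last_used"], e["uses"]) >= cutoff]
```

Run on a schedule, this is the down phase: the agent’s equivalent of the synaptic down-regulation Tononi and Cirelli describe.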

Where IQ Source fits

AI Maestro is the institutional version of “humans around the edge.” It is the audit that maps where your AI consolidates, what it forgets, who the named owner is and where the ceiling sits. Most companies need someone outside to name that ceiling before the next quarter gets spent expanding the model context. That is the conversation we open in the first meeting.

For software companies whose product includes an agent, the Tech Partner role covers the next step. The moat is no longer the model, which commoditized two quarters ago and dropped another notch this week. The moat is the schema, the consolidation step, and the memory that survives a vendor swap. That is the same pattern we covered in “runtime is commodity, workflow is the moat,” and one the dreaming launch accelerates by another cycle.

The practical question for this week: who on your team owns the file where what your agent knows actually lives? If nobody does, the first job is not buying another model or expanding the context window. It is naming the owner and giving them ninety days to build the missing layer. The easy half of AI now costs zero. The hard half decides whether your next quarter compounds or plateaus.


dreaming · agent memory · agentic architecture · Anthropic · Claude Managed Agents · AI Maestro · Tech Partner

Related Articles

Being chosen is not the same as being seen
Business Strategy · 6 min read

Farza, Eduardo Ordax and Jaya Gupta posted the same diagnosis at two depths in one week: when building costs zero, the only thing AI can't copy is your company's shape.

organizational shape · talent · Jaya Gupta
Building costs zero. Distribution is the new investment.
Business Strategy · 11 min read

Aaron Levie, Gergely Orosz, and Eric Siu published the same thesis in 36 hours: building got commoditized. Owned distribution loops are the 2026 moat.

distribution strategy · moat · Aaron Levie