Your AI Investments Have an Expiration Date (2026)
Ricardo Argüello — March 19, 2026
CEO & Founder
General summary
In 2024, fine-tuning a model to classify contracts cost tens of thousands of dollars. In 2025, a base model with good context engineering did the same job. This pattern repeats: companies invest in compensating for model limitations, and the next model eliminates those limitations. Some techniques depreciate, and some investments appreciate with each generation.
- Fine-tuning to compensate for limitations (instruction following, tone, reasoning) depreciates every 6-12 months — fine-tuning with proprietary data the model has never seen holds up
- Naive RAG pipelines (chunk-embed-retrieve) lose value against 1M+ token windows — RAG for massive knowledge bases remains necessary
- Prompt engineering as an individual skill ages well; as a dedicated 5-person team, it doesn't
- What appreciates: clean data infrastructure, integrations with real workflows, and architecture that lets you swap models without rewriting
- The 10x filter: if the next model is 10 times better, does this investment still make sense?
It's like buying a manual transmission car because the automatics of the time were bad. You invested in a skill (driving stick) that compensated for a technology limitation. When automatics got better, that investment stopped paying off. The same thing happens in AI: many technical investments compensate for limitations that newer models no longer have.
AI-generated summary
In 2024, fine-tuning a model to classify contracts cost $10K-$50K and took weeks. In 2025, a base model with a good prompt and the full document in the context window did the same job. The fine-tuning didn’t stop working. It stopped being necessary.
Alex Wang, author of the Learn AI Together newsletter (541K+ subscribers), put it bluntly: if 90% of the AI techniques we were talking about last year are already losing relevance, what actually matters when building AI products now? His answer: the value shifted from isolated techniques to complete systems. What Wang proposes is building for model evolution, not for the current model.
What follows isn’t a debate about which technique is “best.” It’s a guide to the investment lifecycle of AI techniques: which ones depreciate, which ones appreciate, and how to filter before you spend.
The arc: from fine-tuning to context
In 2023-2024, models had small windows and followed instructions inconsistently. Fine-tuning was the default answer. Need the model to understand your contracts? Fine-tune it with thousands of examples.
Then in 2024-2025, windows grew but not enough. RAG became the standard fix: chunk documents, index, retrieve by semantic similarity. Every serious company had a RAG pipeline.
Now, 2025-2026: 200K, 1M, 2M token windows. Models that follow complex instructions out of the box. Context engineering and agents replace much of what used to require fine-tuning or RAG.
Each of those waves solved real problems, but all of them had an expiration date. The trap: companies spend heavily to compensate for a model’s current blind spots. When the next generation arrives without those blind spots, the investment evaporates, not because it was wrong, but because the limitation that justified the spend no longer exists.
Three techniques with an expiration date
Fine-tuning to compensate for limitations
There’s fine-tuning that holds up and fine-tuning that expires in months. The difference: if you’re training the model on proprietary data it has never seen (industry-specific legal terminology, regulatory report formats, internal taxonomies), that investment holds. The model simply doesn’t have that data.
But if you’re fine-tuning for instruction following, tone consistency, or better reasoning on common cases, the clock is ticking. New models do that out of the box. A 200-token system prompt already achieves what used to take weeks of training.
Enterprise fine-tuning costs range from $10K to $50K+ per cycle. And it’s not a one-time expense: every 6-12 months you need to retrain because the base model updated, your data changed, or the problem distribution shifted. It’s a treadmill.
In our experience at IQ Source, ~70% of the fine-tuning projects we evaluate are compensating for limitations the current model no longer has. The other ~30% — proprietary data, regulatory formats — still justify themselves.
Naive RAG pipelines
The RAG that works today doesn’t look like what was built in 2024.
Early pipelines were designed for 4K-8K token windows: chunk documents into 512-token fragments, generate embeddings, retrieve the 5 closest by cosine similarity. When the window was smaller than the document, chunking and searching was the only option.
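The naive pipeline described above can be sketched in a few lines. This is a toy illustration, not a production system: the hashing bag-of-words `embed` function is a stand-in for a real embedding model, and the whitespace tokenizer stands in for a real one.

```python
import numpy as np

def chunk(text, size=512):
    # Split text into fixed-size chunks (whitespace tokens as a stand-in
    # for a real tokenizer).
    tokens = text.split()
    return [" ".join(tokens[i:i + size]) for i in range(0, len(tokens), size)]

def embed(text, dim=256):
    # Toy hashing bag-of-words embedding; a real pipeline would call an
    # embedding model here.
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[hash(tok) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(query, chunks, k=5):
    # Return the k chunks closest to the query by cosine similarity.
    q = embed(query)
    return sorted(chunks, key=lambda c: -float(embed(c) @ q))[:k]
```

The failure mode the article points at is visible even here: a clause that cross-references another clause 150 pages away lands in a different 512-token fragment, and no similarity score reconnects them.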
With 1M+ token windows, many use cases work better by putting the entire document in context. A 200-page contract is ~80K tokens. It fits entirely in Claude’s window, and the model sees cross-referenced clauses that a naive RAG pipeline loses when fragmenting.
That doesn’t mean RAG dies. As I detailed in the context windows post, it remains necessary when:
- The knowledge base exceeds 1M tokens (50K-page technical manuals, for example)
- Query volume makes passing the full document every time cost-prohibitive
- Exact citations with source traceability are needed
What depreciates is the naive pipeline. What holds up is the intelligent retrieval layer — knowing what information to pull and when.
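The query-volume point is simple arithmetic. A rough sketch, using illustrative figures (the per-token price and chunk sizes below are assumptions, not any provider’s actual pricing):

```python
# Back-of-envelope input-cost comparison: full document vs. retrieved chunks.
# All figures are illustrative assumptions, not real provider pricing.
PRICE_PER_M_INPUT = 3.00      # $ per 1M input tokens (assumed)
DOC_TOKENS = 80_000           # the ~200-page contract from the example
CHUNK_TOKENS = 5 * 512        # top-5 retrieved 512-token chunks

def monthly_input_cost(tokens_per_query, queries_per_month):
    return tokens_per_query * queries_per_month * PRICE_PER_M_INPUT / 1_000_000

full = monthly_input_cost(DOC_TOKENS, 10_000)   # full document on every query
rag = monthly_input_cost(CHUNK_TOKENS, 10_000)  # retrieved chunks only

print(f"full context: ${full:,.0f}/mo, RAG: ${rag:,.0f}/mo")
# → full context: $2,400/mo, RAG: $77/mo
```

At low volume the difference is noise; at tens of thousands of queries a month over large documents, the retrieval layer pays for itself, which is exactly the distinction between the naive pipeline and the intelligent retrieval layer.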
Prompt engineering as a core competency
Everyone in your organization knowing how to interact with models effectively is a baseline skill, like knowing how to use a spreadsheet. Each model generation makes it more accessible, not less relevant. That ages well.
What doesn’t age well is a team of 5 people whose primary job is writing and maintaining prompts. Models understand simple instructions better with each generation. What used to require a 3,000-token prompt with 15 few-shot examples now resolves with a 200-token direct instruction.
Context engineering replaces prompt engineering as the operational skill that makes the difference. It’s not how you ask the model — it’s what information you give it to work with. That skill scales with each generation because better models make better use of context.
What ages well
Data infrastructure
Clean, structured, accessible data appreciates with each model generation. A 10x better model produces 10x better results, but only if it has good input data.
Data pipelines, quality governance, labeling, internal taxonomies: all of that is investment a better model uses more effectively, not less. It doesn’t matter if tomorrow’s AI uses transformers, SSMs, or an architecture that doesn’t exist yet. It’s going to need clean data.
At IQ Source, when we evaluate a client’s AI maturity, the first thing we look at isn’t which model they use. It’s the state of their data. No advanced model will save an operation running on messy data, but a clean information foundation makes even a basic model perform well.
Integration with real workflows
The value of AI is in where it connects to operations, not in the model itself.
Integrations with CRM, ERP, and support systems. Well-defined APIs. Workflow triggers and approval chains. All of that is model-agnostic: it works the same if you switch providers tomorrow.
The API strategy a company builds today becomes more valuable as models improve, because there will be more points in operations where AI can add value through those same integrations.
Human oversight and model-agnostic architecture
Two investments that appreciate together.
Human oversight: review processes, output governance, quality metrics. As AI does more, the need to monitor what it produces grows proportionally. This investment never depreciates.
Model-agnostic architecture: abstraction layers like MCP (Model Context Protocol) that define tools and data in a standardized way. They let you swap models or providers without rewriting the application. When the model of the moment changes (and it will), migration is configuration, not development.
Both pass the 10x filter easily: more model capability = more need for oversight and more value in being able to switch fast.
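The abstraction-layer idea can be shown in miniature. This is a minimal sketch of provider decoupling in general, not the MCP specification itself; `ChatModel`, `EchoModel`, and `classify_contract` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Protocol

class ChatModel(Protocol):
    # The only surface the application depends on. Structural typing
    # (Protocol) means any provider adapter with this method qualifies.
    def complete(self, system: str, user: str) -> str: ...

@dataclass
class EchoModel:
    # Stand-in provider; a real adapter would wrap an SDK client here.
    name: str
    def complete(self, system: str, user: str) -> str:
        return f"[{self.name}] {user}"

def classify_contract(model: ChatModel, contract_text: str) -> str:
    # Application code never imports a provider SDK directly, so swapping
    # models is configuration (which adapter to construct), not a rewrite.
    return model.complete("Classify this contract by type.", contract_text)

print(classify_contract(EchoModel("provider-a"), "Lease agreement..."))
```

When the model of the moment changes, only the adapter behind the interface changes; the rest of the application is untouched.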
The 10x filter
Before approving any AI investment, one question should be on the table:
If the next model is 10 times better in capability, does this investment still make sense?
| Investment | Ages well? | Why |
|---|---|---|
| Fine-tuned model for contract classification | No | Base models already do this with context engineering |
| Data pipeline connecting CRM + ERP + support | Yes | Better models = better use of this data |
| RAG pipeline for 50K-page knowledge base | Partially | Retrieval layer holds, chunking simplifies |
| Dedicated prompt engineering team (5 people) | No | Redistribute toward context engineering + domain expertise |
| Observability stack for AI outputs | Yes | More automation = more need for monitoring |
Don’t spend money fixing the layers that OpenAI, Anthropic, and Google are going to fix for you. Invest in the layers that only get better if you build them: your data, your integrations, and your governance.
What we do with this at IQ Source
Every week we evaluate AI projects from B2B clients. The most revealing question isn’t “does it work?” but “will it still work in 12 months?”
A RAG pipeline that works today but was built on 2024 assumptions (4K token windows, models that don’t follow long instructions) has months of useful life left. A well-built data pipeline feeding that RAG will feed whatever comes next.
The difference between an investment that lasts and one that depreciates is whether it compensates for a temporary limitation or builds a permanent capability.
If you want to know which of your current AI projects have an expiration date, send us the list. We’ll classify each one by shelf life — short, medium, durable — and return a one-page investment durability report.
Frequently Asked Questions
Why does fine-tuning depreciate every 6-12 months?
Because fine-tuning that compensates for model limitations — instruction following, tone consistency, reasoning about specific cases — becomes unnecessary when the next model solves those limitations out of the box. The retraining cycle every 6-12 months at $10K-$50K+ doesn't justify itself when a base model with good context engineering achieves similar results.
When is RAG still necessary?
RAG remains necessary when the knowledge base exceeds 1M tokens, when query volume makes full-context cost prohibitive, or when exact citations with source traceability are required. For documents that fit in the window — contracts, regulations, codebases — full context already works better than chunking and retrieving.
How do you know which AI investments will age well?
By using the 10x filter: if the next model is 10 times better in capability, does this investment still make sense? Investments in data, integrations, and observability pass because better models use them more effectively. Investments in compensating for model limitations — like fine-tuning for instructions or prompt engineering teams — don't pass.
What is a model-agnostic architecture?
It's an architecture where the model integration layer is decoupled from the rest of the system. It uses abstractions like MCP to define tools and data, allowing you to swap providers or models without rewriting the application. It protects investment because each model improvement is automatically used without migration costs.