Your AI Investments Have an Expiration Date (2026)
Ricardo Argüello — March 19, 2026
CEO & Founder
General summary
In 2024, fine-tuning a model to classify contracts cost tens of thousands of dollars. In 2025, a base model with good context engineering did the same job. This pattern repeats: companies invest in compensating for model limitations, and the next model eliminates those limitations. Some techniques depreciate, and some investments appreciate with each generation.
- Fine-tuning to compensate for limitations (instruction following, tone, reasoning) depreciates every 6-12 months — fine-tuning with proprietary data the model has never seen holds up
- Naive RAG pipelines (chunk-embed-retrieve) lose value against 1M+ token windows — RAG for massive knowledge bases remains necessary
- Prompt engineering as an individual skill ages well; as a dedicated 5-person team, it doesn't
- What appreciates: clean data infrastructure, integrations with real workflows, and architecture that lets you swap models without rewriting
- The 10x filter: if the next model is 10 times better, does this investment still make sense?
It's like buying a manual transmission car because the automatics of the time were bad. You invested in a skill (driving stick) that compensated for a technology limitation. When automatics got better, that investment stopped paying off. The same thing happens in AI: many technical investments compensate for limitations that newer models no longer have.
AI-generated summary
In 2024, fine-tuning a model to classify contracts cost $10K-$50K and took weeks. In 2025, a base model with a good prompt and the full document in the context window did the same job. The fine-tuning didn’t stop working. It stopped being necessary.
Alex Wang, author of the Learn AI Together newsletter (541K+ subscribers), put it bluntly: if 90% of the AI techniques we were talking about last year are already losing relevance, what actually matters when building AI products now? His answer: the value shifted from isolated techniques to complete systems. What Wang proposes is building for model evolution, not for the current model.
What follows isn’t a debate about which technique is “best.” It’s a guide to the investment lifecycle of AI techniques: which ones depreciate, which ones appreciate, and how to filter before you spend.
The arc: from fine-tuning to context
In 2023-2024, models had small windows and followed instructions inconsistently. Fine-tuning was the default answer. Need the model to understand your contracts? Fine-tune it with thousands of examples.
Then in 2024-2025, windows grew but not enough. RAG became the standard fix: chunk documents, index, retrieve by semantic similarity. Every serious company had a RAG pipeline.
Now, 2025-2026: 200K, 1M, 2M token windows. Models that follow complex instructions out of the box. Context engineering and agents replace much of what used to require fine-tuning or RAG.
Each of those waves solved real problems, but all of them had an expiration date. The trap: companies spend heavily to compensate for a model’s current blind spots. When the next generation arrives without those blind spots, the investment evaporates, not because it was wrong, but because the limitation that justified the spend no longer exists.
Three techniques with an expiration date
Fine-tuning to compensate for limitations
There’s fine-tuning that holds up and fine-tuning that expires in months. The difference: if you’re training the model on proprietary data it has never seen (industry-specific legal terminology, regulatory report formats, internal taxonomies), that investment holds. The model simply doesn’t have that data.
But if you’re fine-tuning for instruction following, tone consistency, or better reasoning on common cases, the clock is ticking. New models do that out of the box. A 200-token system prompt already achieves what used to take weeks of training.
Enterprise fine-tuning costs range from $10K to $50K+ per cycle. And it’s not a one-time expense: every 6-12 months you need to retrain because the base model updated, your data changed, or the problem distribution shifted. It’s a treadmill.
In our experience at IQ Source, ~70% of the fine-tuning projects we evaluate are compensating for limitations the current model no longer has. The other ~30% — proprietary data, regulatory formats — still justify themselves.
Naive RAG pipelines
The RAG that works today doesn’t look like what was built in 2024.
Early pipelines were designed for 4K-8K token windows: chunk documents into 512-token fragments, generate embeddings, retrieve the 5 closest by cosine similarity. When the window was smaller than the document, chunking and searching was the only option.
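The naive pipeline described above can be sketched in a few lines. This is a toy illustration, not a production system: the hashing bag-of-words `embed` function is a stand-in for a real embedding model, and the whitespace tokenizer stands in for a real one.

```python
import numpy as np

def chunk(text, size=512):
    # Split text into fixed-size chunks (whitespace tokens as a stand-in
    # for a real tokenizer).
    tokens = text.split()
    return [" ".join(tokens[i:i + size]) for i in range(0, len(tokens), size)]

def embed(text, dim=256):
    # Toy hashing bag-of-words embedding; a real pipeline would call an
    # embedding model here.
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[hash(tok) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(query, chunks, k=5):
    # Return the k chunks closest to the query by cosine similarity.
    q = embed(query)
    return sorted(chunks, key=lambda c: -float(embed(c) @ q))[:k]
```

The failure mode the article points at is visible even here: a clause that cross-references another clause 150 pages away lands in a different 512-token fragment, and no similarity score reconnects them.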
With 1M+ token windows, many use cases work better by putting the entire document in context. A 200-page contract is ~80K tokens. It fits entirely in Claude’s window, and the model sees cross-referenced clauses that a naive RAG pipeline loses when fragmenting.
That doesn’t mean RAG dies. As I detailed in the context windows post, it remains necessary when:
- The knowledge base exceeds 1M tokens (50K-page technical manuals, for example)
- Query volume makes passing the full document every time cost-prohibitive
- Exact citations with source traceability are needed
What depreciates is the naive pipeline. What holds up is the intelligent retrieval layer — knowing what information to pull and when.
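The query-volume point is simple arithmetic. A rough sketch, using illustrative figures (the per-token price and chunk sizes below are assumptions, not any provider’s actual pricing):

```python
# Back-of-envelope input-cost comparison: full document vs. retrieved chunks.
# All figures are illustrative assumptions, not real provider pricing.
PRICE_PER_M_INPUT = 3.00      # $ per 1M input tokens (assumed)
DOC_TOKENS = 80_000           # the ~200-page contract from the example
CHUNK_TOKENS = 5 * 512        # top-5 retrieved 512-token chunks

def monthly_input_cost(tokens_per_query, queries_per_month):
    return tokens_per_query * queries_per_month * PRICE_PER_M_INPUT / 1_000_000

full = monthly_input_cost(DOC_TOKENS, 10_000)   # full document on every query
rag = monthly_input_cost(CHUNK_TOKENS, 10_000)  # retrieved chunks only

print(f"full context: ${full:,.0f}/mo, RAG: ${rag:,.0f}/mo")
# → full context: $2,400/mo, RAG: $77/mo
```

At low volume the difference is noise; at tens of thousands of queries a month over large documents, the retrieval layer pays for itself, which is exactly the distinction between the naive pipeline and the intelligent retrieval layer.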
Prompt engineering as a core competency
Everyone in your organization knowing how to interact with models effectively is a baseline skill, like knowing how to use a spreadsheet. Each model generation makes it more accessible, not less relevant. That ages well.
What doesn’t age well is a team of 5 people whose primary job is writing and maintaining prompts. Models understand simple instructions better with each generation. What used to require a 3,000-token prompt with 15 few-shot examples now resolves with a 200-token direct instruction.
Context engineering replaces prompt engineering as the operational skill that makes the difference. It’s not how you ask the model — it’s what information you give it to work with. That skill scales with each generation because better models make better use of context.
What ages well
Data infrastructure
Clean, structured, accessible data appreciates with each model generation. A 10x better model produces 10x better results, but only if it has good input data.
Data pipelines, quality governance, labeling, internal taxonomies: all of that is investment a better model uses more effectively, not less. It doesn’t matter if tomorrow’s AI uses transformers, SSMs, or an architecture that doesn’t exist yet. It’s going to need clean data.
At IQ Source, when we evaluate a client’s AI maturity, the first thing we look at isn’t which model they use. It’s the state of their data. No advanced model will save an operation running on messy data, but a clean information foundation makes even a basic model perform well.
Integration with real workflows
The value of AI is in where it connects to operations, not in the model itself.
Integrations with CRM, ERP, and support systems. Well-defined APIs. Workflow triggers and approval chains. All of that is model-agnostic: it works the same if you switch providers tomorrow.
The API strategy a company builds today becomes more valuable as models improve, because there will be more points in operations where AI can add value through those same integrations.
Human oversight and model-agnostic architecture
Two investments that appreciate together.
Human oversight: review processes, output governance, quality metrics. As AI does more, the need to monitor what it produces grows proportionally. This investment never depreciates.
Model-agnostic architecture: abstraction layers like MCP (Model Context Protocol) that define tools and data in a standardized way. They let you swap models or providers without rewriting the application. When the model of the moment changes (and it will), migration is configuration, not development.
Both pass the 10x filter easily: more model capability = more need for oversight and more value in being able to switch fast.
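The abstraction-layer idea can be shown in miniature. This is a minimal sketch of provider decoupling in general, not the MCP specification itself; `ChatModel`, `EchoModel`, and `classify_contract` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Protocol

class ChatModel(Protocol):
    # The only surface the application depends on. Structural typing
    # (Protocol) means any provider adapter with this method qualifies.
    def complete(self, system: str, user: str) -> str: ...

@dataclass
class EchoModel:
    # Stand-in provider; a real adapter would wrap an SDK client here.
    name: str
    def complete(self, system: str, user: str) -> str:
        return f"[{self.name}] {user}"

def classify_contract(model: ChatModel, contract_text: str) -> str:
    # Application code never imports a provider SDK directly, so swapping
    # models is configuration (which adapter to construct), not a rewrite.
    return model.complete("Classify this contract by type.", contract_text)

print(classify_contract(EchoModel("provider-a"), "Lease agreement..."))
```

When the model of the moment changes, only the adapter behind the interface changes; the rest of the application is untouched.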
The 10x filter
Before approving any AI investment, one question should be on the table:
If the next model is 10 times better in capability, does this investment still make sense?
| Investment | Ages well? | Why |
|---|---|---|
| Fine-tuned model for contract classification | No | Base models already do this with context engineering |
| Data pipeline connecting CRM + ERP + support | Yes | Better models = better use of this data |
| RAG pipeline for 50K-page knowledge base | Partially | Retrieval layer holds, chunking simplifies |
| Dedicated prompt engineering team (5 people) | No | Redistribute toward context engineering + domain expertise |
| Observability stack for AI outputs | Yes | More automation = more need for monitoring |
Don’t spend money fixing the layers that OpenAI, Anthropic, and Google are going to fix for you. Invest in the layers that only get better if you build them: your data, your integrations, and your governance.
What we do with this at IQ Source
Every week we evaluate AI projects from B2B clients. The most revealing question isn’t “does it work?” but “will it still work in 12 months?”
A RAG pipeline that works today but was built on 2024 assumptions (4K token windows, models that don’t follow long instructions) has months of useful life left. A well-built data pipeline feeding that RAG will feed whatever comes next.
The difference between an investment that lasts and one that depreciates is whether it compensates for a temporary limitation or builds a permanent capability.
If you want to know which of your current AI projects have an expiration date, send us the list. We’ll classify each one by shelf life — short, medium, durable — and return a one-page investment durability report.
Frequently Asked Questions
Why does fine-tuning depreciate every 6-12 months?
Because fine-tuning that compensates for model limitations — instruction following, tone consistency, reasoning about specific cases — becomes unnecessary when the next model solves those limitations out of the box. The retraining cycle every 6-12 months at $10K-$50K+ doesn't justify itself when a base model with good context engineering achieves similar results.
When is RAG still necessary?
RAG remains necessary when the knowledge base exceeds 1M tokens, when query volume makes full-context cost prohibitive, or when exact citations with source traceability are required. For documents that fit in the window — contracts, regulations, codebases — full context already works better than chunking and retrieving.
How do you know which AI investments will age well?
By using the 10x filter: if the next model is 10 times better in capability, does this investment still make sense? Investments in data, integrations, and observability pass because better models use them more effectively. Investments in compensating for model limitations — like fine-tuning for instructions or prompt engineering teams — don't pass.
What is a model-agnostic architecture?
It's an architecture where the model integration layer is decoupled from the rest of the system. It uses abstractions like MCP to define tools and data, allowing you to swap providers or models without rewriting the application. It protects investment because each model improvement is automatically used without migration costs.