Ricardo Argüello
CEO & Founder
The Question Your Company Never Asked
The scene repeats itself. A company buys AI licenses, the tech team celebrates, the executives wait for results. Three months later, usage plateaus. The adoption reports show logins, but no impact.
The problem was never the tool.
In February 2026, Anthropic published the AI Fluency Index — a study based on 9,830 real conversations with Claude.ai. Not a perception survey, not a lab experiment. Real conversations from people using AI to get work done.
The core finding: the difference between teams that get twice the value from AI and those that don’t has little to do with which model they use. It’s about how they interact with it.
At IQ Source, we’ve seen this pattern in dozens of companies. They obsess over comparing models, negotiating licenses, picking the right vendor. Meanwhile, their teams accept the first response AI gives them without question. It’s like buying the finest scalpel on the market and never teaching the surgeon to use it.
What They Found in 9,830 Real Conversations
The research team — Swanson, Bent, Huang, Ludwig, Dakan, and Feller — used the 4D AI Fluency Framework, which defines 24 specific behaviors for effective human-AI collaboration. Of those 24, 11 are directly observable in Claude conversations. They didn’t measure satisfaction or speed — they measured how effectively people collaborate with AI to reach better outcomes.
Iteration Changes Everything
The most striking data point: users who iterate — who refine, question, ask for adjustments — show an average of 2.67 fluency behaviors per conversation. Those who accept the first response show 1.33.
The gap isn’t just numerical. Iterators are 5.6 times more likely to question the AI’s reasoning and 4 times more likely to identify missing context.
In business terms: the employee who accepts the first draft of an analysis, a report, a proposal is getting half the possible value. Not because the AI is bad — because they didn’t push it further.
The Perfect Output Paradox
This finding struck me as the most uncomfortable one for enterprises. The researchers call it the Artifact Paradox: when AI produces a polished-looking result — code that compiles, a well-formatted report, a presentation with clean structure — users become less critical.
The numbers: users are 5.2 percentage points less likely to identify gaps when the output looks professional, and 3.1 points less likely to question the underlying assumptions.
The implication is concerning: your teams are least vigilant exactly when the stakes are highest — when the output will be used directly to make decisions.
One of our clients learned this the hard way: an AI-generated financial report looked impeccable, the numbers were neatly formatted, the charts were clear. Nobody noticed the model had assumed a 15% growth rate when historical data showed 4%. The perfect formatting hid an absurd assumption.
What Works the Same Across Six Languages
The study analyzed conversations in six languages, and the pattern held in every one: 85.7% of users iterate at least once, but only 30% define collaboration norms — telling the AI how they want to work, what format they expect, what level of detail they need.
This isn’t a cultural problem. It’s behavioral. And that’s good news, because behavioral problems are solved with training, not tool changes.
Why Your Company Should Care (With Numbers)
Let’s do the math. Your company has a team of 20 people using AI. Each person makes 15 AI-assisted decisions per day. That’s 300 daily decisions.
If your team operates like most — accepting the first response, not iterating — they’re getting half the value on every interaction. Not because AI fails, but because they don’t ask for more.
Now multiply that over a quarter. That’s over 19,000 decisions where your team left value on the table.
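The arithmetic above fits in a few lines. One assumption is ours: roughly 65 working days per quarter (13 weeks of 5 days); the team size and decisions-per-day figures come from the scenario in the text.

```python
# Back-of-the-envelope: AI-assisted decisions per quarter for a 20-person team.
# working_days_per_quarter (~65) is our assumption, not a figure from the study.
team_size = 20
decisions_per_person_per_day = 15
working_days_per_quarter = 65  # roughly 13 weeks x 5 working days

daily_decisions = team_size * decisions_per_person_per_day
quarterly_decisions = daily_decisions * working_days_per_quarter

print(daily_decisions)      # 300
print(quarterly_decisions)  # 19500
```

If even half of those interactions stop at the first draft, that is close to ten thousand decisions per quarter made on unchallenged output.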
The gap between iterating and not is the difference between AI as a cost to justify and AI as an advantage that shows up in results. Anthropic’s study data confirms it: it’s not the tool, it’s the habit.
Three Changes That Produce Immediate Results
Establish the “Second Draft” Rule
The simplest and most effective implementation: never accept AI’s first response for business decisions.
The protocol is straightforward. AI generates a draft. The user reviews and refines — questions assumptions, asks for alternatives, adds missing context. Only then is the output used.
This sounds obvious. It isn’t. The study shows most users treat AI interaction like a Google search: ask a question, get an answer, use it. But AI isn’t a search engine. It’s a collaborator that improves with every back-and-forth.
Audit Results That Look “Perfect”
To counter the Artifact Paradox, we implement a three-question checklist with our clients before using any AI output that looks complete:
- What assumptions did the AI make that we didn’t ask for? Every model fills information gaps with implicit assumptions. If you don’t identify them, you’re accepting them unknowingly.
- What context was missing? AI works with what you gave it. If the context was incomplete, the result is an elegant answer to the wrong question.
- What would a subject matter expert challenge? If a financial analyst, a lawyer, or a senior engineer wouldn’t accept that output without questioning something, your team shouldn’t either.
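As a sketch, the checklist can be enforced as a simple review gate: the output is cleared for use only when every question has a written answer. The function and variable names here are ours for illustration, not part of any client implementation.

```python
# Hypothetical sketch of a pre-use review gate for AI outputs that look "perfect".
CHECKLIST = [
    "What assumptions did the AI make that we didn't ask for?",
    "What context was missing?",
    "What would a subject matter expert challenge?",
]

def ready_to_use(answers: dict) -> bool:
    """Clear the output only when every checklist question has a non-empty answer."""
    return all(answers.get(q, "").strip() for q in CHECKLIST)

# A partially answered checklist blocks the output.
partial = {CHECKLIST[0]: "Assumed 15% growth; historical data shows 4%."}
print(ready_to_use(partial))  # False
```

The point of the gate is friction in the right place: it forces a named reviewer to articulate the assumptions before the output reaches a decision.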
Define Explicit AI Collaboration Norms
Only 30% of users in Anthropic’s study define how they want to work with AI. It’s the lowest-effort, highest-impact change.
Simple norms: “Before giving me the final answer, list the assumptions you’re making.” “If you don’t have enough information, ask me before completing.” “Give me two alternatives with pros and cons before the final result.”
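One lightweight way to make norms like these stick is to write them down once and prepend them to every request, so they travel with the task instead of living in one person's head. A minimal sketch, assuming a plain-text prompt workflow:

```python
# Hypothetical sketch: collaboration norms stored once, prepended to every request.
NORMS = [
    "Before giving me the final answer, list the assumptions you're making.",
    "If you don't have enough information, ask me before completing.",
    "Give me two alternatives with pros and cons before the final result.",
]

def with_norms(task: str) -> str:
    """Wrap a task with the team's standing collaboration norms."""
    norms_block = "\n".join(f"- {n}" for n in NORMS)
    return f"Collaboration norms:\n{norms_block}\n\nTask: {task}"

print(with_norms("Draft the Q3 sales proposal."))
```

Most AI tools accept this kind of standing instruction as a system prompt or custom instruction, which removes even the copy-paste step.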
For a deeper look at how different departments can define these norms for their specific functions, we wrote a detailed guide on AI beyond the engineering department.
What We Do at IQ Source About This Problem
When a company tells us “AI isn’t delivering results,” the first thing we do isn’t check which model they use. We look at how they use it.
Our methodology has three components:
Interaction design. We map the workflows where AI intervenes and design specific interaction patterns for each use case. Using AI to review contracts isn’t the same as using it to generate sales proposals — each workflow needs its own iteration protocol.
Role-specific playbooks. The CFO, the operations manager, and the legal team need to interact with AI in different ways. We create role-specific guides with base prompts, collaboration norms, and validation criteria adapted to each function.
Fluency behavior measurement. Using a framework similar to Anthropic’s study, we measure how well teams interact with AI over time. We don’t measure how much they use it — we measure how well they use it. The distinction matters.
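To illustrate the kind of measurement involved — counting observable behaviors per conversation and averaging across a team — here is a toy sketch. The behavior names are illustrative; the study's full 24-behavior rubric is not reproduced here.

```python
# Toy sketch of fluency scoring: behavior names are illustrative only.
OBSERVABLE_BEHAVIORS = {
    "iterated_on_output",
    "questioned_reasoning",
    "identified_missing_context",
    "defined_collaboration_norms",
}

def fluency_score(observed: set) -> int:
    """Count recognized fluency behaviors observed in one conversation."""
    return len(observed & OBSERVABLE_BEHAVIORS)

def team_average(conversations: list) -> float:
    """Average fluency behaviors per conversation across a sample."""
    return sum(fluency_score(c) for c in conversations) / len(conversations)

sample = [
    {"iterated_on_output", "questioned_reasoning"},
    {"iterated_on_output"},
]
print(team_average(sample))  # 1.5
```

Tracked over time, a number like this shows whether training is changing habits — which login counts never reveal.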
If you want to understand the full implementation process step by step, we detail it in our guide on how to implement AI in your company.
The Real Mistake Isn’t Picking the Wrong AI
Companies spend weeks comparing ChatGPT versus Claude versus Gemini. They evaluate benchmarks, negotiate pricing, run pilot tests with the model of the moment. All of that is fine. But it’s 20% of the problem.
The other 80% is behavioral. It’s whether your team iterates or accepts the first response. It’s whether they audit outputs that look perfect or send them straight to the client. It’s whether they’ve defined how they want to collaborate with AI or whether everyone is just winging it.
Anthropic’s study quantified this with 9,830 conversations. The gap is real, measurable, and fixable.
You already have the tools. The question is whether your team knows how to get full value from them. If you want to measure the fluency gap in your organization, get in touch — this is exactly what we diagnose: not what technology you have, but how well your teams are using it and where the blind spots are costing you results.
Frequently Asked Questions
What is the AI Fluency Index?
It's a study published in February 2026 by an Anthropic team led by Swanson, Bent, Huang, Ludwig, Dakan, and Feller. Based on 9,830 real conversations with Claude.ai, it uses a four-dimension framework with 24 specific behaviors (11 directly observable) measuring how effectively people collaborate with AI beyond simple usage metrics.
What difference does iterating with AI actually make?
According to the study, users who iterate show an average of 2.67 fluency behaviors per conversation versus 1.33 for those who accept the first response. They're 5.6 times more likely to question AI reasoning and 4 times more likely to identify missing context. The gap in final output quality is significant.
What is the Artifact Paradox?
It's a finding that polished AI outputs — working code, well-formatted reports — make users less critical. Teams are 5.2 percentage points less likely to identify gaps and 3.1 points less likely to question assumptions when the output looks professional. Users are least vigilant exactly when the stakes are highest.
How can my company close the fluency gap?
Three concrete actions: implement a second-draft rule (never accept the first AI response for business decisions), audit results that look perfect using a checklist of assumptions and missing context, and define explicit AI collaboration norms. The study shows only 30% of users do the latter today.
Related Articles
AI Agents for Enterprise Operations: A Decision-Maker's Playbook
A practical playbook for deploying AI agents in enterprise operations. From procurement and customer service to compliance, with proven frameworks for bridging the demo-to-production gap.
How to Implement AI in Your B2B Company: A Practical Guide
Concrete steps for implementing AI in B2B operations: from picking the right use case to measuring results in the first 90 days.