Your Team Uses AI Wrong (and Research Proves It)
Ricardo Argüello — February 23, 2026
CEO & Founder
General summary
Anthropic analyzed 9,830 real AI conversations and found a clear pattern: the gap between teams that get twice the value and those that don't has almost nothing to do with which model they use — it's about how they interact with it. Users who iterate show 2.67 fluency behaviors per conversation versus 1.33 for those who accept the first response.
- Iteration is the single biggest predictor of AI output quality — refining prompts doubles effectiveness
- Polished-looking AI outputs actually make teams less critical, not more — a dangerous false confidence
- Only 30% of users set explicit AI collaboration norms with their teams
- A mandatory second-draft rule (never accept the first AI response for business decisions) closes most of the quality gap
- The problem isn't the tool — it's that companies invest in licenses but not in teaching people how to use them
AI-generated summary
Imagine buying a high-end camera and only ever using the auto mode. You'd get decent photos, but nothing close to what the camera can actually do. That's what most teams do with AI: they accept the first answer and move on. The research shows that teams who push back and refine their requests get dramatically better results from the exact same tool.
The Question Your Company Never Asked
The scene repeats itself. A company buys AI licenses, the tech team celebrates, the executives wait for results. Three months later, usage plateaus. The adoption reports show logins, but no impact.
The problem was never the tool.
In February 2026, Anthropic published the AI Fluency Index — a study based on 9,830 real conversations with Claude.ai. Not a perception survey, not a lab experiment. Real conversations from people using AI to get work done.
The core finding: the difference between teams that get twice the value from AI and those that don’t has little to do with which model they use. It’s about how they interact with it.
At IQ Source, we’ve seen this pattern in dozens of companies. They obsess over comparing models and negotiating licenses — while ignoring how their teams actually use the tools. Meanwhile, their teams accept the first response AI gives them without question — like buying the finest scalpel on the market and never teaching the surgeon to use it.
What They Found in 9,830 Real Conversations
The research team — Swanson, Bent, Huang, Ludwig, Dakan, and Feller — used the 4D AI Fluency Framework, which defines 24 specific behaviors for effective human-AI collaboration. Of those 24, 11 are directly observable in Claude conversations. They didn’t measure satisfaction or speed — they measured how effectively people collaborate with AI to reach better outcomes.
Iteration Changes Everything
The most striking data point: users who iterate — who refine their prompts and question what they get back — show 2.67 fluency behaviors per conversation. Those who accept the first response show 1.33.
The gap isn’t just numerical. Iterators are 5.6 times more likely to question the AI’s reasoning and 4 times more likely to identify missing context.
In business terms: the employee who accepts the first draft of an analysis or a proposal is getting half the possible value. Not because the AI is bad — because they didn’t push it further.
The Perfect Output Paradox
This finding struck me as the most uncomfortable one for enterprises. The researchers call it the Artifact Paradox: when AI produces a polished-looking result — code that compiles, a report with clean formatting — users become less critical.
The numbers: when the output looks professional, teams are 5.2 percentage points less likely to identify gaps in it, and 3.1 points less likely to question its underlying assumptions.
The implication is concerning: your teams are least vigilant exactly when the stakes are highest — when the output will be used directly to make decisions.
One of our clients learned this the hard way: an AI-generated financial report looked impeccable — neatly formatted numbers, clear charts. Nobody noticed the model had assumed a 15% growth rate when historical data showed 4%. The perfect formatting hid an absurd assumption.
What Works the Same Across Six Languages
The study analyzed conversations in six languages. The result: 85.7% of users iterate at least once, but only 30% define collaboration norms — telling the AI how they want to work and what format they expect.
This isn’t a cultural problem. It’s behavioral. And that’s good news, because behavioral problems are solved with training, not tool changes.
Why Your Company Should Care (With Numbers)
Let’s do the math. Your company has a team of 20 people using AI. Each person makes 15 AI-assisted decisions per day. That’s 300 daily decisions.
If your team operates like most — accepting the first response, not iterating — they’re getting half the value on every interaction. Not because AI fails, but because they don’t ask for more.
Now multiply that over a quarter, roughly 65 working days. That's more than 19,000 decisions where your team left value on the table.
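For transparency, here's the back-of-envelope math. The ~65 working days per quarter is our assumption; the other figures come from the scenario above:

```python
# Back-of-envelope math behind the figures above. The ~65 working days
# per quarter is an assumption; the other numbers come from the scenario.
team_size = 20
decisions_per_person_per_day = 15
working_days_per_quarter = 65  # ~13 weeks x 5 days

daily_decisions = team_size * decisions_per_person_per_day        # 300
quarterly_decisions = daily_decisions * working_days_per_quarter  # 19,500
print(daily_decisions, quarterly_decisions)  # 300 19500
```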
The gap between iterating and not is the difference between AI as a cost to justify and AI as an advantage that shows up in results. Anthropic’s study data confirms it: it’s not the tool, it’s the habit.
Three Changes That Produce Immediate Results
Establish the “Second Draft” Rule
The simplest and most effective implementation: never accept AI’s first response for business decisions.
The protocol is straightforward. AI generates a draft. The user reviews and refines — questions the assumptions, asks for alternatives. If context was missing, they add it. Only then is the output used.
This sounds obvious. It isn’t. The study shows most users treat AI interaction like a Google search: ask a question, get an answer, use it. But AI isn’t a search engine. It’s a collaborator that improves with every back-and-forth.
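Here's what that loop can look like in practice. This is a minimal sketch: the `ask_model` helper is a hypothetical stand-in for whatever chat interface or API your team uses, and the point is the shape of the protocol, not the plumbing.

```python
def ask_model(messages: list[dict]) -> str:
    """Hypothetical stand-in for your AI tool; swap in your real client or UI."""
    return f"[model response to {len(messages)} messages]"

def second_draft(task: str, context: str) -> str:
    """Run the second-draft protocol: draft, push back, refine, then review."""
    messages = [{"role": "user", "content": f"{context}\n\nTask: {task}"}]
    draft = ask_model(messages)  # draft one: a starting point, not a deliverable
    # Push back: surface assumptions, missing context, and alternatives.
    messages += [
        {"role": "assistant", "content": draft},
        {"role": "user", "content": (
            "Before I use this: list the assumptions you made, flag any "
            "context you were missing, and propose one alternative approach."
        )},
    ]
    critique = ask_model(messages)
    # Refine with the gaps on the table; a human still reviews the result.
    messages += [
        {"role": "assistant", "content": critique},
        {"role": "user", "content": (
            "Revise the draft to address those gaps, and state any "
            "remaining assumptions explicitly."
        )},
    ]
    return ask_model(messages)  # draft two: now worth a human review

print(second_draft("Summarize Q3 churn drivers", "SaaS company, 20-person team"))
```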
Audit Results That Look “Perfect”
To counter the Artifact Paradox, we implement a three-question checklist with our clients before using any AI output that looks complete (a minimal sketch of the resulting gate follows the list):
- What assumptions did the AI make that we didn’t ask for? Every model fills information gaps with implicit assumptions. If you don’t identify them, you’re accepting them unknowingly.
- What context was missing? AI works with what you gave it. Incomplete context produces an elegant answer to the wrong question.
- Would a subject matter expert accept this without pushback? If your financial analyst or senior engineer would challenge something in the output, your team shouldn’t accept it either.
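As a sketch of how that gate can work in practice (the class and field names here are ours, purely illustrative; the review itself is done by people):

```python
from dataclasses import dataclass

@dataclass
class OutputAudit:
    unstated_assumptions: list[str]  # what the model assumed without being asked
    missing_context: list[str]       # what the model never saw
    expert_objections: list[str]     # what a domain expert would push back on

def ready_to_use(audit: OutputAudit) -> bool:
    """Ship only when every finding from the three questions is resolved.
    An empty list means "we checked and found nothing", not "we didn't check"."""
    return not (audit.unstated_assumptions
                + audit.missing_context
                + audit.expert_objections)

# The financial-report example from earlier: polished formatting, broken premise.
audit = OutputAudit(
    unstated_assumptions=["15% growth rate (historical data shows 4%)"],
    missing_context=[],
    expert_objections=[],
)
print(ready_to_use(audit))  # False: looks impeccable, can't ship yet
```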
Define Explicit AI Collaboration Norms
Only 30% of users in Anthropic’s study define how they want to work with AI. It’s the lowest-effort, highest-impact change.
Simple norms work well: “Before giving me the final answer, list the assumptions you’re making.” Or: “If you don’t have enough information, ask me before completing.” Even just “Give me two alternatives with pros and cons” changes the output quality dramatically.
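One low-tech way to make norms stick is to keep them as a reusable preamble that gets prepended to every request. A minimal sketch, assuming your team routes requests through a shared snippet or template; the wording is illustrative, and each team writes its own contract:

```python
# Collaboration norms kept as a reusable preamble, prepended to every request.
COLLABORATION_NORMS = """\
How I want to work with you:
1. Before giving a final answer, list the assumptions you are making.
2. If you don't have enough information, ask me before completing the task.
3. For any recommendation, give two alternatives with pros and cons.
"""

def with_norms(request: str) -> str:
    """Every interaction starts from the same explicit contract."""
    return f"{COLLABORATION_NORMS}\n{request}"

print(with_norms("Draft a one-page proposal for the Q3 pricing change."))
```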
For a deeper look at how different departments can define these norms for their specific functions, we wrote a detailed guide on AI beyond the engineering department.
What We Do at IQ Source About This Problem
When a company tells us “AI isn’t delivering results,” the first thing we do isn’t check which model they use. We look at how they use it.
Our methodology starts with interaction design: we map the workflows where AI intervenes and design specific interaction patterns for each use case. Using AI to review contracts isn’t the same as using it to generate sales proposals — each workflow needs its own iteration protocol.
From there, we build role-specific playbooks. The CFO and the legal team need to interact with AI in fundamentally different ways. These guides include base prompts, collaboration norms, and validation criteria adapted to each function.
The third piece is fluency behavior measurement — a framework similar to Anthropic’s study. We don’t measure how much teams use AI. We measure how well they use it. The distinction matters.
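A simplified sketch of what that measurement can look like. The behavior tags below are illustrative, not Anthropic's actual 24-behavior taxonomy, and in practice the tagging itself comes from human or automated conversation review:

```python
# Count observable fluency behaviors per conversation, not logins.
FLUENCY_BEHAVIORS = {
    "refined_prompt",        # the user rewrote or narrowed the request
    "questioned_reasoning",  # the user challenged the model's logic
    "added_context",         # the user supplied missing background
    "set_norms",             # the user stated how they want to collaborate
}

def fluency_score(tagged_conversations: list[list[str]]) -> float:
    """Average number of distinct fluency behaviors per conversation."""
    total = sum(len(set(tags) & FLUENCY_BEHAVIORS)
                for tags in tagged_conversations)
    return total / len(tagged_conversations)

sample = [
    ["refined_prompt", "questioned_reasoning", "added_context"],
    ["refined_prompt"],
]
print(fluency_score(sample))  # 2.0 behaviors per conversation
```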
If you want to understand the full implementation process step by step, we detail it in our guide on how to implement AI in your company.
The Real Mistake Isn’t Picking the Wrong AI
Companies spend weeks comparing ChatGPT versus Claude versus Gemini. They evaluate benchmarks and negotiate pricing. They run pilot tests with the model of the moment. All of that is fine — but it’s 20% of the problem.
The other 80% is behavioral. Does your team iterate, or do they accept the first response? Do they audit outputs that look perfect — or send them straight to the client? And has anyone actually defined how they want to collaborate with AI, or is everyone just winging it?
Anthropic’s study quantified this with 9,830 conversations. The gap is real, measurable, and fixable.
You already have the tools. The question is whether your team knows how to get their full value. If you want to measure the fluency gap in your organization, get in touch — this is exactly what we diagnose. Not what technology you have, but how well your teams are actually using it.
Frequently Asked Questions
How do you tell real AI adoption from superficial use?
Start with usage frequency: how often each person uses AI tools weekly. Then assess prompt quality: do they iterate or accept the first result? Finally, check whether they apply AI to real business tasks. The gap between superficial use and real adoption comes down to consistent iteration.
Why do AI tools fail to deliver value for most teams?
Usually because they don't iterate. The first output from any AI model is rarely the best. Teams that get real value edit the prompt and adjust context, often across three or four cycles. Most stop at the first attempt and conclude the tool doesn't work.
What's the most common mistake when working with AI?
Accepting the first result without refining it. Teams that generate a document or analysis with AI and send it as-is get mediocre output. Those who use AI as a starting point and apply professional judgment to the result produce work that exceeds what they'd achieve without the tool.
How can a company start closing its AI fluency gap?
Start by implementing a second-draft rule: never accept the first AI response for business decisions. Then audit results that look perfect using a checklist of assumptions and missing context. Finally, define explicit AI collaboration norms. The study shows only 30% of users do the latter today.
Related Articles
Karpathy Stopped Asking AI for Answers. He Asked It to Compile His Knowledge.
17 million saw Karpathy's post about LLM knowledge bases. Most copied the folder structure. Few understood the real shift: knowledge that compounds vs. knowledge that rots.
AI Agent Traps: the web your agent sees isn't yours
Google DeepMind mapped 18 attack types against AI agents. A viral thread fabricated the paper's numbers. The irony proves the thesis.