The Model Is Only 10%: The Real Lesson of the New SDLC

TL;DR

A recent whitepaper from Google emphasizes that in AI software development, the model accounts for only about 10% of system behavior. The real focus should be on harness design and context engineering, which drive performance and cost-efficiency.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, estimates that the model determines only about 10% of an AI system’s behavior. The harness and context engineering account for the rest, which changes how organizations should approach AI development and deployment.

The whitepaper challenges the common focus on acquiring the latest AI models, arguing that most agent failures and inefficiencies stem from configuration, tooling, and context management. The experiments it cites show that changing the harness (prompts, tools, and rules) can dramatically improve performance without changing the model. In one example, a coding agent moved from outside the Top 30 to the Top 5 on Terminal Bench 2.0 through harness changes alone.

It also introduces the concept of agentic engineering, where AI is integrated within formal specifications, tests, and oversight, as opposed to vibe coding, which relies on minimal prompts and quick fixes. The whitepaper stresses that cost and security considerations favor this disciplined approach, as it reduces token waste and vulnerabilities over time.

At a glance

Report · Published early 2026

The development: Google’s new whitepaper highlights that the primary driver of AI system behavior is the harness and context, not the underlying model, marking a paradigm shift in SDLC strategies.

The Model Is Only 10% — The New SDLC With Vibe Coding

AI Dispatch · Field Notes

Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary — the differentiator is how outputs get verified

Vibe Coding

Casual prompts · “does it seem to work?” · disposable code · high risk

Structured AI-Assisted

Detailed prompts + constraints · manual testing · features in real codebases

Agentic Engineering

Formal specs · automated tests + evals + CI gates · production scale · low risk

Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding — however clever the prompt.
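The split between tests and evals can be sketched as a single release gate. This is a minimal illustration, not the whitepaper's implementation: `EvalCase`, `run_gate`, the `generate` callable, and the 0.8 threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class EvalCase:
    prompt: str
    # Hypothetical grader: scores a model output between 0.0 and 1.0.
    grade: Callable[[str], float]


def run_gate(
    unit_tests: Sequence[Callable[[], bool]],
    eval_cases: Sequence[EvalCase],
    generate: Callable[[str], str],
    eval_threshold: float = 0.8,  # assumed cutoff, tune per project
) -> bool:
    """Pass only if every deterministic test passes and the mean eval score clears the threshold."""
    # Deterministic behavior: any failing test blocks the change.
    if not all(test() for test in unit_tests):
        return False
    # No evals means the non-deterministic behavior is unverified.
    if not eval_cases:
        return False
    scores = [case.grade(generate(case.prompt)) for case in eval_cases]
    return sum(scores) / len(scores) >= eval_threshold


# Usage with a stub standing in for a real model call.
stub_model = lambda prompt: prompt.upper()
cases = [EvalCase("hi", lambda out: 1.0 if out == "HI" else 0.0)]
print(run_gate([lambda: True], cases, stub_model))
```

Under these assumptions, a change that passes its unit tests but ships without evals is rejected, which mirrors the point that a clever prompt alone does not count as verification.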

The idea worth building your strategy around

Agent = Model + Harness

MODEL ~10%

HARNESS ~90%: prompts · tools · context · hooks · sandboxes · observability

~90% is your surface area, not the provider’s

Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.

“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
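One way to picture "Agent = Model + Harness" is to keep the model behind a single call signature and put everything else in a configurable object. The sketch below is illustrative only: `Harness`, `Agent`, and `build_context` are hypothetical names, and the "model" is a stub that echoes its input so the effect of the harness is visible.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Harness:
    system_prompt: str
    rules: List[str] = field(default_factory=list)
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def build_context(self, task: str) -> str:
        """Assemble what the model actually sees: prompt, rules, tool names, task."""
        rule_text = "\n".join(f"- {rule}" for rule in self.rules)
        tool_text = ", ".join(sorted(self.tools)) or "none"
        return (
            f"{self.system_prompt}\nRules:\n{rule_text}\n"
            f"Tools: {tool_text}\nTask: {task}"
        )


@dataclass
class Agent:
    model: Callable[[str], str]  # any provider, wrapped in one call signature
    harness: Harness

    def run(self, task: str) -> str:
        return self.model(self.harness.build_context(task))


# Same stub model, two harnesses: only the configuration differs.
echo_model = lambda context: context
bare = Agent(echo_model, Harness("You are a coding agent."))
tuned = Agent(
    echo_model,
    Harness(
        "You are a coding agent.",
        rules=["Run the test suite before finishing."],
        tools={"shell": lambda command: ""},
    ),
)
print(tuned.run("Fix the failing build"))
```

In this framing, the failure modes the paper lists (a missing tool, a vague rule, a noisy context) all show up as differences in the `Harness`, not in the `model` callable.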

The economics: it’s a token-cost problem (CapEx vs OpEx)

Vibe Coding

Low CapEx · High OpEx

Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.

Agentic Engineering

High CapEx · Low OpEx

Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing — cheap models for the easy work.
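The CapEx versus OpEx trade-off reduces to a break-even calculation: cumulative cost is the upfront investment plus per-feature cost times the number of features. The numbers below are invented for illustration (a 5× per-feature gap, inside the 3–10× range quoted above); the function names are hypothetical.

```python
import math


def cumulative_cost(capex: float, opex_per_feature: float, features: int) -> float:
    """Total spend after shipping a given number of features."""
    return capex + opex_per_feature * features


def break_even_features(
    capex_low: float, opex_high: float, capex_high: float, opex_low: float
) -> int:
    """Smallest feature count at which the high-CapEx approach is no more expensive."""
    if opex_high <= opex_low:
        raise ValueError("no crossover: the upfront investment never pays back")
    return max(0, math.ceil((capex_high - capex_low) / (opex_high - opex_low)))


# Hypothetical cost units: vibe coding has no setup but costs 5 per feature;
# agentic engineering costs 40 upfront and 1 per feature.
n = break_even_features(capex_low=0, opex_high=5, capex_high=40, opex_low=1)
print(n)                          # 10
print(cumulative_cost(0, 5, n))   # 50
print(cumulative_cost(40, 1, n))  # 50
```

With these assumed figures the two approaches cost the same at the tenth feature, and every feature after that is cheaper under the high-CapEx approach. Real crossover points depend on actual token prices, rework rates, and setup effort.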

85% of devs use AI coding agents (51% daily)

41% of all new code is AI-generated

~90% of agent behavior is the harness, not the model

+19% longer on some tasks (METR): verification is the cost

The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.

thorstenmeyerai.com

Implications for AI Development Strategies

This insight urges organizations to reconsider their AI investment priorities. Instead of chasing the latest models, focus should shift to building robust harnesses and refining context. This approach can lead to significant cost savings and more reliable, secure AI systems, especially as AI becomes central to critical operations.

Industry Shift Toward Harness and Context Engineering

The whitepaper builds on the rapid adoption of AI coding agents: by its figures, as of early 2026, 85% of developers use them (51% daily). It argues that the industry has overemphasized model advancements, and the experiments it cites suggest that configuration and context matter more. The paper aligns with a broader trend toward formalized SDLC practices in AI, emphasizing verification, testing, and structured workflows.

“The biggest shift in software engineering isn’t a new language or framework—it’s moving from writing code to expressing intent and trusting machines to interpret that intent.”
— Addy Osmani

What Aspects of the Harness Are Most Critical?

While experiments show harness adjustments can dramatically improve performance, it remains unclear which specific configurations yield the best results across different domains. The optimal balance between static and dynamic context loading is still under investigation, and best practices are evolving.
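The static versus dynamic distinction can be made concrete with a small sketch: static context is always loaded, while dynamic context is pulled in only when the task calls for it and a token budget allows. Everything here is an assumption for illustration, including the keyword-matching rule, the `assemble_context` name, and the word-count stand-in for a tokenizer.

```python
from typing import Dict, List


def approx_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def assemble_context(
    static_docs: List[str],
    dynamic_docs: Dict[str, str],
    task: str,
    budget: int,
) -> List[str]:
    """Always include static docs; add dynamic docs whose keyword appears in the task while budget lasts."""
    context = list(static_docs)
    used = sum(approx_tokens(doc) for doc in context)
    for keyword, doc in dynamic_docs.items():
        cost = approx_tokens(doc)
        if keyword.lower() in task.lower() and used + cost <= budget:
            context.append(doc)
            used += cost
    return context


static = ["Follow the repo style guide."]
dynamic = {
    "billing": "Billing module notes here.",
    "auth": "Auth module notes here.",
}
print(assemble_context(static, dynamic, "Fix the billing rounding bug", budget=20))
```

Here only the billing notes are loaded for a billing task, so the unrelated auth notes never reach the model. How much to load statically and how to select dynamic material are the open questions the paragraph above describes.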

Next Steps for AI Development and Adoption

Organizations are likely to invest more in building and testing harnesses, tools, and structured workflows. Further research will clarify which configurations maximize efficiency and security, and industry standards may emerge around harness design. Expect increased emphasis on cost-effective, verified AI systems over model upgrades alone.

Key Questions

Why is the model only 10% of system behavior?

Experiments and benchmarks indicate that the surrounding harness—prompts, tools, rules—and context management play a much larger role in shaping AI outputs than the model itself.

How should organizations change their AI development approach?

Focus on designing robust harnesses, improving context engineering, and implementing verification processes rather than solely chasing newer, larger models.

What are the cost implications of this shift?

Investing in disciplined harness and context engineering can lower long-term costs by reducing token waste, improving security, and decreasing maintenance overhead.

Is vibe coding still viable?

Vibe coding remains useful for prototypes and quick tasks, but for production systems, a more disciplined approach with structured context and verification is recommended.

What remains uncertain about harness design?

Optimal configurations vary by application, and best practices are still emerging. More empirical data is needed to define industry standards.

Source: ThorstenMeyerAI.com

Author

E BusExpert Team
