
If you’ve been building products with AI tools over the past two years, you’ve been getting a remarkable deal. OpenAI, Anthropic, Google, and a dozen other providers have been selling you compute at a fraction of what it actually costs them. That era is drawing to a close — and the businesses that plan for it now will be the ones that thrive when the bill comes due.

TL;DR

  • Most AI tools and APIs are currently sold well below cost — OpenAI alone is projected to burn $14 billion in 2026
  • Usage-based pricing is replacing flat subscriptions (GitHub Copilot’s shift to token billing in June 2026 is a bellwether)
  • A 10-person dev team’s AI spend could balloon from a manageable budget line to as much as €100,000 a year as subsidies unwind
  • Businesses need to audit their AI dependencies now and build cost-resilient architectures before prices rise
  • The winners will be teams that treat AI as a strategic capability, not a cheap commodity

The Uncomfortable Maths Behind Your AI Bill

Here’s the reality that most businesses haven’t fully reckoned with: the AI tools you rely on daily are being sold at a loss. OpenAI is projected to burn through $14 billion in 2026, up from roughly $9 billion in 2025. Anthropic, despite raising billions, operates at a similar deficit. Even Google’s Gemini offerings are subsidised by its advertising revenue.

This isn’t sustainable, and the market knows it. The gap between the sticker price and the actual compute cost is where the cracks are starting to show. When you pay $20 a month for an AI assistant that costs $80 in compute to serve, someone is absorbing that difference — and that someone is venture capital, not a charitable foundation.

The four largest hyperscalers are projected to spend $650 billion on AI infrastructure by 2026. McKinsey estimates cumulative AI infrastructure investment will reach $6.7 trillion by 2030. That capital needs returns, and those returns will come from you — the customer.

The Copilot Canary in the Coal Mine

GitHub Copilot’s shift to usage-based billing in June 2026 is the clearest signal yet. Microsoft was reportedly losing an average of more than $20 per user per month on flat-rate subscriptions. The move to token-based pricing isn’t a business model experiment; it’s a correction.

This pattern will repeat across the industry. Expect every AI tool you use to raise prices, introduce usage caps, or shift to consumption-based models within the next 12 to 18 months. Enterprise pricing hikes of 25-50% are already being reported across several AI platforms.

For a 10-person development team relying heavily on LLM-powered tools, annual AI spend could balloon from a manageable budget line to anywhere between €70,000 and €100,000 — that’s not a tool subscription any more, that’s a headcount.
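To put that range in per-seat terms, here is a rough sketch of the arithmetic. It uses only the figures quoted above, and the even split across developers is an assumption for illustration.

```python
# Per-developer view of the projected annual spend quoted above.
# Assumption: the spend is split evenly across the team.
TEAM_SIZE = 10
ANNUAL_SPEND_RANGE_EUR = (70_000, 100_000)


def per_seat_monthly(annual_total: float, team_size: int) -> float:
    """Monthly cost per developer for a given annual team total."""
    return annual_total / team_size / 12


low, high = (per_seat_monthly(total, TEAM_SIZE) for total in ANNUAL_SPEND_RANGE_EUR)
print(f"Per developer: EUR {low:,.0f} to EUR {high:,.0f} per month")
```

On those figures, each seat works out at roughly €580 to €830 a month, against the $20-a-month subscriptions most teams are used to.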

What This Means for Your Technology Strategy

1. Audit Your AI Dependencies

Start by mapping every AI service your business touches. Not just the obvious ones like Copilot or ChatGPT — include the AI features embedded in your existing tools. Many SaaS platforms have quietly added AI capabilities that consume tokens behind the scenes. Understand what you’re actually using, what it costs today, and what it might cost at 3x the current price.
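A spreadsheet is enough for this exercise, but a few lines of code work too. The sketch below is a minimal example: the `AIDependency` class, the inventory entries, and every cost figure are hypothetical placeholders, to be replaced with numbers from your own invoices.

```python
from dataclasses import dataclass


@dataclass
class AIDependency:
    name: str
    monthly_cost: float  # what you pay today, in your billing currency


def project_annual_cost(deps, multiplier):
    """Annual cost of all dependencies if today's prices were scaled by `multiplier`."""
    return sum(d.monthly_cost for d in deps) * 12 * multiplier


# Hypothetical inventory; every figure here is a placeholder.
inventory = [
    AIDependency("code assistant seats", 200.0),
    AIDependency("chat assistant seats", 300.0),
    AIDependency("LLM API usage", 1_500.0),
    AIDependency("AI add-on in CRM", 250.0),
]

for multiplier in (1, 2, 3):
    total = project_annual_cost(inventory, multiplier)
    print(f"{multiplier}x pricing: {total:,.0f} per year")
```

Running the projection at several multipliers, rather than only 3x, shows how quickly each line item stops being a rounding error.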

2. Build for Cost Resilience

If your application calls GPT-4-class models for tasks that a smaller, cheaper model could handle, you’re overspending. Tiered model routing — using lightweight models for simple tasks and reserving expensive models for complex reasoning — can cut costs by 60-80% without meaningfully degrading quality.
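As a minimal sketch of what tiered routing can look like: the tier names, the prices, the list of simple task types, and the length cut-off below are all illustrative assumptions, not real quotes or a recommended policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # illustrative price, not a real quote


SMALL = ModelTier("small-model", 0.0005)
LARGE = ModelTier("large-model", 0.01)

# Task types assumed to be simple enough for the small tier.
SIMPLE_TASKS = {"classify", "extract", "summarise"}


def route(task_type: str, prompt: str, max_simple_chars: int = 4_000) -> ModelTier:
    """Pick the cheapest tier assumed adequate for the task."""
    if task_type in SIMPLE_TASKS and len(prompt) <= max_simple_chars:
        return SMALL
    return LARGE


def blended_saving(simple_share: float) -> float:
    """Fractional saving versus sending everything to the large tier,
    if `simple_share` of tokens is routed to the small tier."""
    blended = (simple_share * SMALL.cost_per_1k_tokens
               + (1 - simple_share) * LARGE.cost_per_1k_tokens)
    return 1 - blended / LARGE.cost_per_1k_tokens


print(route("classify", "Is this email spam?").name)
print(route("plan", "Design a migration strategy for our billing system.").name)
print(f"Saving with 80% of tokens on the small tier: {blended_saving(0.8):.0%}")
```

With these made-up prices, routing 80% of tokens to the small tier saves about 76%. The real figure depends entirely on your traffic mix and on whether the small model's output is good enough, which is worth testing on real samples before switching.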

Implement semantic caching so identical or near-identical queries don’t burn fresh tokens. Add cost observability to your AI pipelines — you can’t optimise what you can’t measure. At REPTILEHAUS, we build AI integrations with cost governance baked in from day one, because we’ve seen what happens when teams treat LLM calls like free API endpoints.

3. Consider Self-Hosted and Open-Source Models

The open-source model ecosystem has matured dramatically. Models like Llama 3, Mistral, and Qwen now deliver genuinely impressive results for many business use cases. Running inference on your own infrastructure, or on dedicated GPU instances, gives you more predictable costs and sharply reduces vendor lock-in.

This isn’t right for every use case. Frontier models still lead on complex reasoning, multi-step planning, and nuanced generation. But for classification, summarisation, data extraction, and many customer-facing tasks, open-source models are more than capable — and they don’t come with a VC subsidy that’s about to expire.

4. Rethink Your Build vs Buy Calculus

When AI tools were essentially free, it made sense to call external APIs for everything. As prices normalise, the economics shift. Some capabilities that were cheaper to outsource become cheaper to build — particularly if you have domain-specific requirements that don’t need a 400-billion-parameter model.

Fine-tuned smaller models, running on modest hardware, can outperform general-purpose giants for narrow tasks. The upfront investment in training pays for itself quickly when you’re no longer paying per-token for every inference.
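How quickly the investment pays back depends on your volume, so it is worth running the numbers. Here is a simple break-even sketch. All figures are hypothetical, and it ignores ongoing costs such as maintenance and retraining.

```python
import math


def breakeven_requests(upfront_cost: float,
                       api_cost_per_1k: float,
                       self_hosted_cost_per_1k: float):
    """Requests needed to recover the upfront cost, or None if
    self-hosting is not cheaper per request. Costs are per 1,000 requests."""
    saving_per_1k = api_cost_per_1k - self_hosted_cost_per_1k
    if saving_per_1k <= 0:
        return None
    return math.ceil(upfront_cost / saving_per_1k * 1_000)


# Hypothetical figures: 5,000 upfront for fine-tuning and setup,
# 10.00 per 1,000 requests via an external API, 2.00 when self-hosted.
requests_needed = breakeven_requests(5_000, 10.0, 2.0)
monthly_volume = 250_000  # assumed request volume
print(f"Break-even after {requests_needed:,} requests "
      f"(about {requests_needed / monthly_volume:.1f} months at this volume)")
```

At low volumes the same calculation can show a payback period of years, which is a good reason to keep calling an API for those workloads.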

The Vendor Lock-In Trap

Here’s the subtler risk: many businesses have built their AI workflows tightly coupled to a single provider’s API. When that provider raises prices — and they will — switching costs are enormous. Prompt engineering, output parsing, fine-tuning, and integration code all need to be reworked.

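Build provider-agnostic abstraction layers now, while you have the breathing room. Tools like LiteLLM and other LLM routers put multiple providers behind a single interface, and the Model Context Protocol (MCP) standardises how tools and data sources are exposed to models, so those integrations aren’t tied to one vendor. Together they make it far easier to swap providers without rewriting your application logic. The cost of abstraction today is a fraction of the cost of a forced migration tomorrow.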
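Even without a library, the core idea fits in a few lines. In the sketch below, `ChatProvider`, `ProviderA`, and `ProviderB` are hypothetical names. The adapters return canned strings where a real adapter would wrap a vendor SDK call.

```python
from typing import Protocol


class ChatProvider(Protocol):
    """The only interface the application code is allowed to depend on."""

    def complete(self, prompt: str) -> str: ...


class ProviderA:
    """Hypothetical adapter; a real one would call a vendor SDK here."""

    def complete(self, prompt: str) -> str:
        return f"[provider-a] {prompt}"


class ProviderB:
    """Second hypothetical adapter with the same interface."""

    def complete(self, prompt: str) -> str:
        return f"[provider-b] {prompt}"


PROVIDERS = {"a": ProviderA(), "b": ProviderB()}


def summarise_ticket(ticket_text: str, provider: ChatProvider) -> str:
    """Application logic: knows about ChatProvider, not about any vendor."""
    return provider.complete(f"Summarise this support ticket: {ticket_text}")


# In practice the key would come from config or an environment variable.
active = PROVIDERS["a"]
print(summarise_ticket("Customer cannot log in.", active))
```

Swapping vendors then means writing one new adapter and changing a config value. Prompts and output parsing may still need retuning per model, so the abstraction reduces migration cost rather than removing it.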

Who Survives the Correction?

The businesses that will weather the AI pricing correction best share a few characteristics:

  • They treat AI as a capability, not a crutch. AI augments their team’s expertise rather than replacing fundamental understanding.
  • They have cost visibility. They know exactly what they’re spending on AI, where, and why.
  • They’ve built switching capability. Their architecture doesn’t depend on any single provider’s pricing staying favourable.
  • They invest in their team. When AI tools get expensive, the teams with deep technical skills will optimise and adapt. The teams that outsourced their thinking to LLMs will struggle.

What You Should Do This Week

Don’t wait for the price increases to hit. Start with these concrete steps:

  1. Catalogue every AI dependency across your organisation — APIs, embedded features, developer tools
  2. Model the cost impact of a 3x price increase on each dependency
  3. Identify your highest-volume, lowest-complexity AI calls — these are candidates for cheaper models or self-hosting
  4. Evaluate your provider lock-in risk and begin abstracting where feasible
  5. Set up cost monitoring on all AI-related spend lines
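Steps 3 and 5 can start from the same usage log. The sketch below aggregates calls and tokens per feature, then ranks features by call volume relative to complexity. The log records, feature names, and complexity scores are hypothetical, and the score itself is a judgment call your team assigns, not a measured value.

```python
from collections import defaultdict

# Hypothetical usage records: (feature, tokens used, complexity score 1-5).
usage_log = [
    ("ticket-tagging", 300, 1),
    ("ticket-tagging", 300, 1),
    ("ticket-tagging", 400, 1),
    ("email-summary", 1_000, 2),
    ("email-summary", 1_000, 2),
    ("contract-review", 6_000, 5),
]


def summarise_usage(log):
    """Aggregate call count, token total, and complexity per feature."""
    stats = defaultdict(lambda: {"calls": 0, "tokens": 0, "complexity": 0})
    for feature, tokens, complexity in log:
        entry = stats[feature]
        entry["calls"] += 1
        entry["tokens"] += tokens
        entry["complexity"] = max(entry["complexity"], complexity)
    return dict(stats)


def rank_candidates(stats):
    """Order features by calls per unit of complexity, highest first."""
    return sorted(stats,
                  key=lambda f: stats[f]["calls"] / stats[f]["complexity"],
                  reverse=True)


stats = summarise_usage(usage_log)
for feature in rank_candidates(stats):
    s = stats[feature]
    print(f"{feature}: {s['calls']} calls, {s['tokens']} tokens, "
          f"complexity {s['complexity']}")
```

Features at the top of the list are the first candidates for a cheaper model or self-hosting. The token totals give you the per-feature spend once multiplied by your provider's price.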

The AI subsidy era gave every business a chance to experiment cheaply. The businesses that used that window to build real capability — not just to bolster fragile workflows with underpriced tokens — are the ones that will come out ahead.

Need help building cost-resilient AI architecture or evaluating your AI dependencies? Get in touch with our team — we specialise in building AI integrations that work at real-world economics, not just subsidised ones.

📷 Photo by Igor Omilaev on Unsplash