AI Token Pricing: The Free Ride Is Over for Good

The Token Tax Is Here

The free ride is over.

For two years, AI companies priced their products like venture capital would never run out. Flat-rate subscriptions. Unlimited requests. All-you-can-eat model access. The pitch was simple: pay us a fixed fee, use as much AI as you want.

That era just ended.

The Bill Comes Due

Anthropic’s model lineup tells the story in miniature. Users on Opus 4.5 migrated to Opus 4.6, then to Opus 4.7; almost nobody still runs the older releases. That’s not a lineup. That’s a treadmill. When a model is superseded before its training costs are recouped, the company eats the loss.

And Opus 4.7 pulled a neat trick. Its new tokenizer generates up to 35% more tokens for the same input text. FinOps analysts at Finout documented the impact: same price per token, but your request just got 35% more expensive without the sticker price changing. That’s a stealth price hike, and a clever one.
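The arithmetic behind that hike is easy to check. A minimal sketch, assuming hypothetical numbers (the per-token rate and request size below are illustrative, not Anthropic’s actual pricing):

```python
# Effective cost of a request when a new tokenizer emits more tokens
# for the same input text. Price and token counts are hypothetical.

PRICE_PER_1K_TOKENS = 0.015  # illustrative rate, in dollars

def request_cost(tokens: int) -> float:
    return tokens / 1000 * PRICE_PER_1K_TOKENS

old_tokens = 10_000                   # same prompt, old tokenizer
new_tokens = int(old_tokens * 1.35)   # 35% more tokens, new tokenizer

old_cost = request_cost(old_tokens)
new_cost = request_cost(new_tokens)

print(f"old: ${old_cost:.4f}, new: ${new_cost:.4f}")
print(f"effective price hike: {new_cost / old_cost - 1:.0%}")  # 35%
```

The sticker price per token never moves; the hike lives entirely in the token count.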

Meanwhile, OpenAI announced in March 2026 that it had closed $122 billion in committed capital. The largest private funding round in history. That sounds like strength. It’s not. It’s a distress signal dressed in a press release.

Internal projections reported by multiple outlets put OpenAI’s 2026 losses at approximately $14 billion. Positive free cash flow isn’t expected until 2029. That $122 billion doesn’t buy dominance. It buys three years of survival.

$122B: OpenAI committed capital
$14B: Projected 2026 losses
2029: Breakeven year (est.)

GitHub Copilot used to charge a flat fee. You paid your $10 or $39 a month and got a set number of requests. Simple.

Starting June 1, 2026, that’s gone. Copilot is switching to “AI Credits,” a token-based billing system. Your monthly subscription still costs the same, but now your usage is metered by actual token consumption.

Why? Because not every model costs the same. A lightweight completion model costs almost nothing to run. A frontier reasoning model like Opus 4.7 costs roughly 20 times more. Under flat-rate pricing, every user running expensive models was a money-losing customer.
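Mechanically, credit billing is just a per-model multiplier on token counts. A sketch of the idea (the credit rates and model names here are invented, not GitHub’s published figures):

```python
# Sketch of token-based "AI Credits" metering: each model burns
# credits in proportion to its serving cost. All rates are invented.

CREDIT_RATES = {                     # credits per 1K tokens, hypothetical
    "lightweight-completion": 0.1,
    "frontier-reasoning": 2.0,       # roughly 20x the lightweight rate
}

def credits_used(model: str, tokens: int) -> float:
    return tokens / 1000 * CREDIT_RATES[model]

# A month of mixed usage: mostly cheap completions, some frontier calls.
monthly = (credits_used("lightweight-completion", 200_000)
           + credits_used("frontier-reasoning", 50_000))
print(f"credits consumed this month: {monthly:.0f}")  # 120
```

Same subscription fee, but the meter now distinguishes a cheap completion from an expensive reasoning call.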

GitHub’s own blog post frames the change around “agentic workflows”: multi-step AI sessions in which a model reasons through a problem and executes autonomously. These sessions burn tokens at 5 to 30 times the rate of a simple chat interaction. The old pricing model couldn’t survive that kind of usage.

Microsoft can absorb losses longer than most. They’re one of the largest companies on Earth. But even Microsoft looked at the numbers and said: this doesn’t work.

Google Wins by Default

Here’s the structural asymmetry nobody talks about enough.

Google is pouring $180 to $190 billion into AI infrastructure in 2026 alone, per Alphabet’s Q1 2026 earnings. And after spending all of that, Alphabet still turns a profit.

That’s the difference. OpenAI needs to raise capital to survive. Anthropic needs to raise capital to survive. Google just… makes money. Search ads, YouTube ads, Cloud revenue. The AI investment is enormous, but it comes out of cash flow, not investor desperation.

This explains the hype gap. You don’t hear Google’s CEO doing weekly “AI will replace all jobs” press tours. They don’t need to. Dario Amodei and Sam Altman do those tours because they need your attention to become investor attention to become funding rounds to become another 18 months of runway.

The marketing is the fundraising. That’s the part most people miss.

The Jevons Trap

[Chart: The Jevons Paradox in Action. Cost per token falls from 2023 to 2026 while total AI spend rises. Cheaper tokens unlock more use cases; total spending rises even as unit costs fall.]

Here’s the counterargument, and it’s a good one: per-token costs are actually plummeting. Industry benchmarks peg the decline at 10 to 50 times annually, depending on model tier. Tokens are getting cheaper, fast.

So why are total AI bills going up?

Jevons paradox. The 19th-century economist William Stanley Jevons observed that more efficient steam engines didn’t reduce coal consumption. They increased it. Cheaper energy unlocked new uses for energy. Total demand exploded.

The same thing is happening with AI tokens. Per-token costs drop. So companies deploy AI in more places. They move from simple chatbot queries to agentic workflows that run autonomously for minutes, consuming thousands of tokens per task. The unit price falls. The total bill rises.
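The dynamic is easy to simulate. A toy model, using growth rates pulled from the ranges above (10x annual price decline, 30x annual usage growth, both invented for illustration):

```python
# Toy Jevons-paradox model: per-token price falls 10x per year while
# token consumption grows 30x per year as cheaper tokens unlock
# agentic workloads. Both rates are illustrative, not measured.

price_per_m_tokens = 30.0    # dollars per million tokens, year 0
tokens_per_month_m = 50.0    # millions of tokens consumed per month

for year in range(4):
    bill = price_per_m_tokens * tokens_per_month_m
    print(f"year {year}: ${price_per_m_tokens:.2f}/M tokens, "
          f"{tokens_per_month_m:,.0f}M tokens -> ${bill:,.0f}/month")
    price_per_m_tokens /= 10   # unit cost: 10x cheaper each year
    tokens_per_month_m *= 30   # usage: 30x more each year

# Unit price drops 1000x over three years; the bill still grows 27x.
```

Under these assumptions the bill triples every year even as the unit price collapses, which is the Jevons trap in one loop.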

Uber learned this the hard way. They encouraged company-wide AI adoption, even judging employees on usage metrics. Within four months, they’d burned through their entire annual AI budget. Not because tokens were expensive. Because cheap tokens at massive scale are still expensive.

This is the trap. AI gets cheaper per token, but the total cost of actually using it keeps climbing. Every efficiency gain gets absorbed by expanded usage.

What This Actually Means

Nobody’s going back to hand-coding everything. AI is too useful for that. Clone a repo, point an LLM at it, and you’ve got personalized documentation for any open-source project. That’s genuinely valuable.

But the “AI is basically free” era? It’s done.

Prices are going up. Not on the sticker. On the meter. Token-based billing, stealth tokenizer changes, tiered model pricing. The tools will keep working. They’ll just cost more to use heavily, and the companies building them will keep losing money for years.

The question isn’t whether AI is useful. It is. The question is who can afford to keep building it. Right now, the answer is: the company that was already profitable before AI existed.

Everyone else is fundraising.