They Keep Moving the Finish Line
I paid $20/month for an AI coding assistant. It worked great for about three weeks. Then it decided I’d coded enough for the week and locked me out.
Not a crash. Not a bug. A quota.
TL;DR
Every major AI coding IDE has followed the same playbook: launch generous, build dependency, tighten limits, monetize the anxiety. This isn’t a conspiracy. It’s a business model. And once you see the pattern, you can work around it.
The Honeymoon
When Antigravity launched in November 2025, it felt like cheating. Free, unlimited, fast. You could throw the hardest problems at Gemini Pro and it would just… handle them. No token counters. No credit dashboards. No “you have 3 requests remaining” banners.
So I did what any reasonable developer would do. I rebuilt my entire workflow around it.
Custom workflows for git commits. Automated code reviews. Design system extraction. SVG optimization pipelines. I wasn’t just using Antigravity as an autocomplete. I was using it as a second brain that happened to live inside my editor.
That’s the trap, by the way. Not a malicious one. But a trap nonetheless. Once you’ve wired an AI tool into every corner of your development process, switching costs become enormous. You’re not just changing an app. You’re rewriting dozens of scripts, prompts, and muscle-memory habits.
And that’s exactly when the goalposts start moving.
The First Move
Sometime around early 2026, the tiers appeared:
| Tier | Price | What You Get |
|---|---|---|
| Free | $0 | Gemini Flash only, heavily rate-limited |
| Pro | $20/mo | Higher limits for Gemini 3.1 Pro |
| Ultra | $249.99/mo | "Consistent, high-volume" access to frontier models |
Fair enough. Running frontier models costs real money. Nobody expected free to last forever.
But the pricing wasn’t the problem. The metering was.
Antigravity introduced something called "baseline quota" with a concept of "sprint capacity." The language alone should have been a red flag. When a company needs two layers of abstraction to describe how much of its product you can use, it's not optimizing for clarity.
Here's how it actually works: your usage isn't measured in messages, or tokens, or even API calls. It's measured in "compute power," a proxy for how much effort the model expends answering your question. A simple autocomplete costs almost nothing. An agentic multi-file refactor? That could eat your entire day's quota in one shot.
The problem: you can’t predict it. There’s no meter that says “this request will cost 12% of your daily budget.” You just use the tool until a banner appears telling you you’re done.
And if you hit the weekly cap? You’re locked out. Not for an hour. Not until tomorrow. For seven days.
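To make that concrete, here's a toy model of effort-based metering. Every number below is invented for illustration; no vendor publishes its actual cost formula, which is precisely the point:

```python
# Toy model of effort-based metering. All numbers are made up;
# no vendor publishes its real formula.

WEEKLY_QUOTA = 100.0  # abstract "compute power" units (hypothetical)

# Rough effort cost per request type (hypothetical)
EFFORT = {
    "autocomplete": 0.01,
    "single_file_edit": 0.5,
    "code_review": 2.0,
    "agentic_refactor": 25.0,  # plans, reads many files, iterates
}

def remaining_after(requests, quota=WEEKLY_QUOTA):
    """Simulate how a week's quota drains under a mixed workload."""
    for kind in requests:
        quota -= EFFORT[kind]
        if quota <= 0:
            return kind, 0.0  # this request hit the wall
    return None, quota

# Four big refactors burn what ten thousand autocompletes would not:
blocked_on, left = remaining_after(["agentic_refactor"] * 4)
print(blocked_on, left)  # agentic_refactor 0.0
```

When the cost per request spans four orders of magnitude and the rates are hidden, "how much quota do I have left?" becomes unanswerable until the banner appears.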
It’s Not Just Antigravity
Here’s the thing that turned my frustration into something more like fascination: they’re all doing it. Every single one.
Cursor launched with “500 fast requests plus unlimited slow requests.” Developers loved it. Then in June 2025, they replaced the whole system with a credit pool. Credits deplete at different rates depending on which model you use and how complex the request is. Users reported burning through their monthly credits in days. The subreddit was not kind.
Windsurf positioned itself as the predictable alternative. $15/month, transparent per-prompt pricing. Then they killed the $15 tier and raised the base to $20. Even at the higher price points, power users consistently report hitting limits within the first week of each billing cycle.
The playbook is the same every time:
- Launch generous. Subsidize usage to build market share.
- Build dependency. Wait for developers to restructure their workflows.
- Tighten. Introduce metering, credits, or quota systems.
- Monetize. Offer a premium escape hatch ($200-250/month) for developers who can’t go back.
This isn’t unique to AI IDEs. It’s the same pattern we saw with cloud computing, with CI/CD platforms, with every developer tool that starts with “free for developers.” But AI tools compress the timeline. The free period is months, not years. And the cost of running these models is real, which means the squeeze comes faster and harder.
Why This Keeps Happening
I want to be honest here: these companies aren’t villains.
Running a frontier LLM costs serious money. A single agentic workflow (where the AI plans, reasons across multiple files, runs commands, and iterates on errors) can trigger dozens of internal “thinking” steps. Each step burns tokens. Each token costs money. A heavy user on Claude Opus or Gemini Pro can easily generate costs that dwarf their $20/month subscription.
The math simply doesn’t work at $20/month for unlimited frontier model access. It never did. The free tier was a bet on future conversion, not a sustainable offering.
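Here's a rough back-of-the-envelope, using placeholder token prices (not any vendor's actual rates), to show how fast an agentic workflow outruns a $20 subscription:

```python
# Back-of-the-envelope cost of one agentic session. Prices and token
# counts are illustrative placeholders, not any vendor's actual rates.

PRICE_PER_1M_INPUT = 3.00    # USD, hypothetical frontier-model rate
PRICE_PER_1M_OUTPUT = 15.00  # USD, hypothetical

def session_cost(steps, in_tokens_per_step, out_tokens_per_step):
    """Cost of an agentic run: each plan/read/edit/retry step re-sends
    context (input tokens) and generates reasoning + code (output)."""
    total_in = steps * in_tokens_per_step
    total_out = steps * out_tokens_per_step
    return (total_in / 1e6) * PRICE_PER_1M_INPUT + \
           (total_out / 1e6) * PRICE_PER_1M_OUTPUT

# A 30-step multi-file refactor, each step carrying ~20k tokens of
# context and emitting ~2k tokens of reasoning and code:
one_refactor = session_cost(30, 20_000, 2_000)   # about $2.70
monthly = 40 * one_refactor                       # two a day, ~$108
print(f"${one_refactor:.2f} per run, ${monthly:.0f}/month vs a $20 plan")
```

Even with these conservative toy numbers, a moderately heavy user generates several times their subscription in raw inference cost before counting autocompletes, reviews, and retries.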
But knowing why it happens doesn’t make the execution less frustrating. The real issue isn’t the pricing itself. It’s three things:
Opacity. No clear usage dashboard. No way to predict what a request will cost before you make it. You’re essentially coding with an invisible meter ticking down.
Communication. Users have reported “stealth” reductions in quotas without any public announcement. One day your normal workflow runs fine. The next day you’re hitting limits at half the usage. No changelog. No email. No explanation.
Framing. These tools are marketed as productivity software (“10x developer” vibes) but billed like infrastructure. Developers expect Slack-style pricing: pay a fixed amount, use it as much as you want. Instead, they’re getting AWS-style pricing: pay a base rate, then sweat every request.
So What Do You Do?
You have a few options:
Accept and budget. Pay for a top-end tier (Ultra, Max, or the equivalent) if you can justify the cost. For a professional developer, $250/month might be worth it if it genuinely saves hours per day. Do the math for your situation.
Go hybrid. Use the AI IDE for the high-value tasks (architecture, complex refactors, debugging) and a basic editor or terminal-based tool for everything else. Don’t burn premium quota on boilerplate.
Get creative. Build systems that work within the constraints instead of fighting them. Match the model to the task. A git commit message doesn’t need the same model as a system design session.
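To give you a taste of option three before the full write-up, here's a minimal sketch of the routing idea. The model names and task kinds below are placeholders, not any vendor's real identifiers:

```python
# A minimal sketch of option three: route each task to the cheapest
# model that can handle it. Names here are hypothetical placeholders.

CHEAP_MODEL = "flash-tier"   # hypothetical: cheap, fine for grunt work
PREMIUM_MODEL = "pro-tier"   # hypothetical: save quota for hard work

GRUNT_WORK = {"commit_message", "docstring", "rename_symbol", "format"}

def pick_model(task_kind: str) -> str:
    """Send grunt work to the cheap model; anything that needs real
    reasoning (refactors, debugging, design) gets the premium one."""
    return CHEAP_MODEL if task_kind in GRUNT_WORK else PREMIUM_MODEL

assert pick_model("commit_message") == CHEAP_MODEL
assert pick_model("multi_file_refactor") == PREMIUM_MODEL
```

The routing logic is trivial; the work is in deciding which tasks actually belong in the grunt-work bucket and wiring the switch into your editor.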
I went with option three. In my next post, I’ll show you exactly how I built a system that makes my AI IDE automatically switch to cheaper models for grunt work, so I never burn premium quota on a commit message again.
The goalposts keep moving. So I built legs that move faster.