pricing · market-dynamics · data-sourcing · transparency

Why LLM Prices Change Every Month: Our 2026 Data Source

By Aniket Nigam. Published 2026-04-15.

Quick answer

LLM prices drop 30-60% per year on average because three forces compound: hardware efficiency gains, competitive pressure, and scale economies. DeepSeek V3's December 2024 launch triggered an industry-wide 60% cut over the following four months. LLMversus refreshes prices daily via OpenRouter's API, scripted scrapers against provider pricing pages, and a weekly manual verification pass for edge cases.

Why this post exists

Readers ask two questions about our pricing tables: "How recent is this?" and "Why did the number move?" This post answers both.

I also want to make the sourcing transparent. Every comparison site says its data is "current." Most update prices quarterly. We update daily and publish the update log at /changelog. That is a real difference, and it deserves a real explanation.

Table of contents

  1. A brief history of LLM price cuts, 2024-2026
  2. Why prices keep falling
  3. When prices actually go up
  4. How LLMversus sources live pricing
  5. The OpenRouter API polling cadence
  6. Provider page scraping: which endpoints, which limits
  7. Manual verification for edge cases
  8. What readers can do about price volatility

1. A brief history of price cuts

Three moments redrew the pricing map in the last two years.

December 2024: DeepSeek V3 launch. DeepSeek priced its 671B-parameter model at $0.27 per million input tokens and $1.10 per million output tokens. That was 90% below comparable frontier models at the time. Within 60 days, OpenAI cut GPT-4o input by 50% and Google cut Gemini 1.5 Flash by 30%.

July 2025: Claude Haiku 4.5 and Gemini 2.5 Flash. Anthropic priced Haiku 4.5 at $0.25/$1.25 per million, aimed at the volume tier. Google matched with Gemini 2.5 Flash at $0.075/$0.30. OpenAI shipped GPT-4o mini at $0.15/$0.60 in response.

January 2026: The o3 price reset. OpenAI dropped o3 from $60/$240 per million (the o1 pricing) to $15/$60. That is a 75% cut on the reasoning tier. Anthropic responded with Opus 4 at $15/$75 rather than the prior $30/$150.

The cumulative effect: input tokens for frontier-class models cost 85% less in April 2026 than they did in April 2024. Output tokens are down 78%.
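As a sanity check, the compounding behind those figures is easy to reproduce: an 85% cumulative drop over two years implies roughly a 61% annual decline, which sits at the top of the 30-60% range in the quick answer. A minimal sketch of the arithmetic (the 85%/78% inputs are the tracker figures above; everything else is illustrative):

```typescript
// Annualized decline implied by a cumulative price drop spread over n years.
function annualizedDecline(cumulativeDrop: number, years: number): number {
  // Fraction of the price remaining, compounded evenly per year.
  const remainingPerYear = Math.pow(1 - cumulativeDrop, 1 / years);
  return 1 - remainingPerYear;
}

const inputAnnual = annualizedDecline(0.85, 2);  // ≈ 0.61, i.e. ~61%/yr
const outputAnnual = annualizedDecline(0.78, 2); // ≈ 0.53, i.e. ~53%/yr

console.log(
  `input: ${(inputAnnual * 100).toFixed(0)}%/yr, ` +
  `output: ${(outputAnnual * 100).toFixed(0)}%/yr`,
);
```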

2. Why prices keep falling

Three forces drive the cut curve.

Hardware efficiency. The H200 delivers 1.8x the memory bandwidth of the H100 at a 30% price premium. The B200 (Blackwell) lands in early 2026 at 2.5x H100 throughput for a 60% price premium. Cost per token on fresh hardware drops faster than capex.
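Treating throughput as a rough proxy for serving capacity, the per-dollar math on those multipliers is straightforward. This is a crude sketch using only the figures quoted above; real cost per token also depends on utilization and batch shape:

```typescript
// Throughput per dollar relative to the H100 baseline (1.0x / 1.0x).
function perfPerDollar(throughputVsH100: number, priceVsH100: number): number {
  return throughputVsH100 / priceVsH100;
}

// H200: 1.8x bandwidth at a 30% premium; B200: 2.5x throughput at a 60% premium.
const h200 = perfPerDollar(1.8, 1.3); // ≈ 1.38x the H100's tokens per dollar
const b200 = perfPerDollar(2.5, 1.6); // ≈ 1.56x the H100's tokens per dollar
```

Each generation buys roughly 40-55% more tokens per dollar, which is where the headroom for the price cuts comes from.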

Competitive pressure. The market has 5 credible frontier-model providers in 2026: OpenAI, Anthropic, Google, Meta (via Together/Fireworks hosting), and DeepSeek. When any one ships a new price, the others face customer pressure within 30 days. Three cut cycles per year is the observed cadence.

Scale economies. Serving 10x more tokens per second per GPU through speculative decoding, better KV cache management, and FP8 inference reduces per-token compute cost. OpenAI's internal cost curve, based on public disclosures, is down 4x from 2023 to 2025.

These three forces stack. The hardware improves. The serving kernels improve. A competitor ships a price. The incumbent matches.

3. When prices actually go up

Price increases do happen; they are just rare. Here are two patterns I have tracked.

New model tier, not a replacement. When OpenAI shipped o3 in January 2026, the price was lower than o1, but the high-compute mode of o3 (with 32x more reasoning tokens) does raise per-task cost on hard queries. The model got cheaper. Solving a hard problem got more expensive.

Premium support and SLA tiers. Anthropic's Scale Plus tier, announced March 2026, does not change per-token prices but adds a $2,000/month floor. That is a price increase for customers who would otherwise run at $500/month on Scale. The ladder moved up, not the rate.

I have not seen a frontier model get a straight per-token price hike in the last 24 months. The direction is down.

4. How LLMversus sources live pricing

Our pricing data flows from three upstream feeds:

  1. OpenRouter API polled every 6 hours for 180+ models across 40+ providers
  2. Provider page scrapers running nightly against OpenAI, Anthropic, Google, Groq, Together, and Fireworks pricing pages
  3. Manual verification once per week for edge cases and contract-only tiers

The three sources cross-check each other. When OpenRouter disagrees with the provider page, the provider page wins. When both disagree with a human update, we re-scrape and investigate. About 2-3% of price changes flow through the manual review queue; the rest land automatically.
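The precedence rule can be sketched as a pure function. The types and names below are illustrative, not our actual schema:

```typescript
type Source = 'openrouter' | 'provider_page' | 'manual';

interface PriceObservation {
  source: Source;
  inputPerMillion: number; // USD per million input tokens
  observedAt: string;      // ISO timestamp
}

function resolvePrice(
  obs: PriceObservation[],
): PriceObservation | 'needs_review' {
  const pick = (s: Source) => obs.find((o) => o.source === s);
  const manual = pick('manual');
  // Provider page beats OpenRouter when the two disagree.
  const auto = pick('provider_page') ?? pick('openrouter');
  // A manual observation that contradicts the automated winner goes to the
  // review queue (re-scrape and investigate) instead of auto-publishing.
  if (manual && auto && manual.inputPerMillion !== auto.inputPerMillion) {
    return 'needs_review';
  }
  return manual ?? auto ?? 'needs_review';
}
```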

5. The OpenRouter API polling cadence

OpenRouter publishes its model catalog at https://openrouter.ai/api/v1/models. The response includes pricing per input token and per output token in USD. We pull it every 6 hours.

Here is the polling loop (TypeScript, runs on Vercel Cron):

import { NextResponse } from 'next/server';

// upsertPriceRow is our internal DB helper: it writes one (model, price,
// timestamp, source) row, replacing any existing row for the same model.
import { upsertPriceRow } from '@/lib/db';

export async function GET() {
  const res = await fetch('https://openrouter.ai/api/v1/models', {
    headers: {
      'HTTP-Referer': 'https://llmversus.com',
      'X-Title': 'LLMversus price tracker',
    },
    next: { revalidate: 0 }, // always bypass the Next.js fetch cache
  });

  if (!res.ok) {
    // Surface upstream failures so the cron run shows as errored.
    return NextResponse.json({ error: res.status }, { status: 502 });
  }

  const { data } = await res.json();

  for (const model of data) {
    // OpenRouter reports USD per token; we store USD per million tokens.
    await upsertPriceRow({
      modelId: model.id,
      inputPerMillion: Number(model.pricing.prompt) * 1_000_000,
      outputPerMillion: Number(model.pricing.completion) * 1_000_000,
      observedAt: new Date().toISOString(),
      source: 'openrouter',
    });
  }

  return NextResponse.json({ updated: data.length });
}

OpenRouter sometimes lags provider-direct pricing by 1-5 days. That is why we also scrape the provider pages directly.

6. Provider page scraping

Six scrapers, one per provider, run nightly at 3am UTC. They target these endpoints:

  Provider     Source page
  OpenAI       openai.com/api/pricing
  Anthropic    anthropic.com/pricing
  Google       cloud.google.com/vertex-ai/pricing
  Groq         groq.com/pricing
  Together     together.ai/pricing
  Fireworks    fireworks.ai/pricing

All six pages have parseable HTML tables. Two (Anthropic and Groq) rebuild their pricing layout every 2-3 months and break the scraper; we maintain a selector table and a fallback to plain-text extraction when the structured parser fails.
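The fallback path amounts to trying known structured patterns in order, then dropping to a loose plain-text match. A sketch, with hypothetical patterns rather than our real selector table:

```typescript
// Try each known structured pattern for a provider's pricing table in order;
// fall back to a loose "$X / MTok" text match when the layout has changed.
function extractPrice(html: string, patterns: RegExp[]): number | null {
  for (const p of patterns) {
    const m = html.match(p);
    if (m) return Number(m[1]);
  }
  // Plain-text fallback: first dollar figure followed by "/ MTok".
  const loose = html.match(/\$(\d+(?:\.\d+)?)\s*\/\s*MTok/i);
  return loose ? Number(loose[1]) : null;
}
```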

The scrapers write to a staging table. A diff job compares the staging row to the current public row. Anything with a delta over 2% triggers a Slack alert to me before it hits production. That is how I caught a badly parsed Opus 4 price in February 2026 that briefly showed $150/M instead of $75/M.
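The gate on that diff job is a one-liner. A sketch, with illustrative names (the 2% threshold is the one described above):

```typescript
// Relative delta between the live public price and the freshly staged one.
function relativeDelta(current: number, staged: number): number {
  return Math.abs(staged - current) / current;
}

const ALERT_THRESHOLD = 0.02; // 2%: anything larger goes to Slack, not prod

function shouldAlert(current: number, staged: number): boolean {
  return relativeDelta(current, staged) > ALERT_THRESHOLD;
}

// The February 2026 Opus 4 mis-parse: $75/M live vs $150/M staged is a
// 100% delta, so the row was held for review instead of publishing.
shouldAlert(75, 150); // true
```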

7. Manual verification

Some prices cannot be scraped. Examples:

  1. Enterprise-only tiers negotiated through sales
  2. Regional discounts (EU via Vertex, India via Azure)
  3. Batch tier pricing when the provider hides it behind a flag
  4. Promotional rates during model launches (sometimes valid for 30 days only)

Every Friday I run through a checklist of 18 such edge cases. Five minutes each, 90 minutes total. I note any deltas in the public changelog at /changelog so readers can see exactly what moved and when.

The changelog doubles as an accountability log. If we got a price wrong for 6 hours, you can see it. That is the contract we want to keep.

8. What readers can do about price volatility

Even with daily-fresh data, prices will move between when you pick a model and when you scale. Five tactics that compound to protect your budget:

  1. Cache aggressively. Anthropic's 90% cache read discount and OpenAI's 50% are the easiest 30-60% cut on any cache-friendly workload. Shape your prompts to land on stable prefixes.
  2. Use a routing gateway. OpenRouter or LiteLLM lets you swap models without a code push when a cheaper option ships. I have used both in production.
  3. Negotiate above $10K/month. Both OpenAI and Anthropic will negotiate custom pricing at that floor. The typical discount is 10-25%.
  4. Monitor the changelog. Subscribe to provider emails plus LLMversus changelog RSS. A 30-day lag on a 40% cut costs real money at volume.
  5. Keep a second provider wired up. Outages happen. Price cuts happen. Being 1 config change away from switching is worth the 2 days of extra integration work.
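The cache arithmetic behind tactic 1 is worth making concrete. With a 90% read discount and a 60% cache hit rate, the blended input rate is 0.6 × 0.1 + 0.4 = 46% of list price, a 54% cut; the same hit rate on a 50% discount yields a 30% cut. A sketch that ignores cache-write surcharges, where providers apply them:

```typescript
// Blended input cost as a fraction of list price, given a cache hit rate
// and the provider's cached-read discount.
function blendedInputCost(hitRate: number, readDiscount: number): number {
  return hitRate * (1 - readDiscount) + (1 - hitRate);
}

const deepDiscount = blendedInputCost(0.6, 0.9);    // 0.46 → a 54% cut
const shallowDiscount = blendedInputCost(0.6, 0.5); // 0.70 → a 30% cut
```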

FAQ

How often does the LLMversus price table update?

OpenRouter-sourced rows every 6 hours. Provider-scraped rows every 24 hours. Manual edge cases once per week. Every change is logged at /changelog.

What if I see a stale price?

Email hello@llmversus.com with the model name and the source you are checking against. We will investigate within 24 hours and publish the correction in the changelog.

Why not show historical pricing charts?

We do, on model detail pages. The chart goes back to the first date we observed the model. Most frontier models have 6-18 months of history.

Does the site show enterprise pricing?

No. Enterprise contracts are under NDA for every provider. We show the public tier prices plus a note on each model page pointing to the provider's sales team.

Do you accept affiliate payments that might bias your data?

We earn affiliate revenue on some provider signups, disclosed on each model page. Our price data comes from first-party sources, not affiliate feeds. Affiliate status does not change pricing display.

Actionable takeaways

  1. Assume any LLM price can drop 30% in 90 days; plan your unit economics with a 20% buffer
  2. Subscribe to the LLMversus changelog RSS to catch cuts before your finance team does
  3. Wire up a router (OpenRouter, LiteLLM, or your own) so you can swap models with a config change
  4. Negotiate once you pass $10,000/month in spend; most providers will discount
  5. Re-run your vendor scorecard every 90 days; the data shifts that fast
  6. Cache what you can; it is the fastest 30-60% cut independent of any price change

Sources

  • DeepSeek V3 pricing announcement, deepseek.com/blog, December 2024
  • OpenAI model pricing updates, openai.com/index/api-pricing-updates, 2025-2026
  • Anthropic pricing history, anthropic.com/pricing, accessed 2026-04-14
  • Google Cloud Vertex AI pricing, cloud.google.com/vertex-ai/pricing, accessed 2026-04-14
  • OpenRouter API documentation, openrouter.ai/docs, accessed 2026-04-14
  • LLMversus internal changelog, llmversus.com/changelog

Related: AI Pricing Trends 2026, LLM Token Pricing Explained, OpenAI vs Anthropic Pricing 2026.
