LLM API Pricing Comparison — Complete Guide

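LLM API pricing in 2026 varies widely across 92 models from 18 providers, from $0.00 to $75.00 per million input tokens. All prices below are per million tokens (/M), grouped by provider and sorted by input price: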


01.AI:

  • Yi-Lightning: $0.140/M input, $0.140/M output
  • Yi-Large: $3.00/M input, $3.00/M output

Alibaba:

  • Qwen 2.5 7B: $0.040/M input, $0.100/M output
  • Qwen 2.5 72B: $0.120/M input, $0.390/M output
  • QwQ 32B: $0.150/M input, $0.580/M output
  • Qwen 2.5 Max: $0.160/M input, $0.640/M output
  • Qwen 3 235B MoE: $0.455/M input, $1.82/M output
  • Qwen 2.5 Coder 32B: $0.660/M input, $1.00/M output

Amazon:

  • Amazon Nova Micro: $0.035/M input, $0.140/M output
  • Amazon Nova Lite: $0.060/M input, $0.240/M output
  • Amazon Nova Pro: $0.800/M input, $3.20/M output

Anthropic:

  • Claude 3.5 Haiku: $0.800/M input, $4.00/M output
  • Claude Haiku 4: $1.00/M input, $5.00/M output
  • Claude Sonnet 4: $3.00/M input, $15.00/M output
  • Claude 3.5 Sonnet: $3.00/M input, $15.00/M output
  • Claude Opus 4: $5.00/M input, $25.00/M output

Cohere:

  • Command R7B: $0.038/M input, $0.150/M output
  • Command R: $0.150/M input, $0.600/M output
  • Command A: $2.50/M input, $10.00/M output
  • Command R+: $2.50/M input, $10.00/M output

DeepSeek:

  • DeepSeek V2.5: $0.140/M input, $0.280/M output
  • DeepSeek V3: $0.259/M input, $0.420/M output
  • DeepSeek R1 Distill Qwen 32B: $0.290/M input, $0.290/M output
  • DeepSeek R1: $0.500/M input, $2.15/M output
  • DeepSeek R1 Distill Llama 70B: $0.700/M input, $0.800/M output

Fireworks AI:

  • Llama 3.3 70B (Fireworks): $0.900/M input, $0.900/M output
  • Mixtral 8x22B (Fireworks): $0.900/M input, $0.900/M output
  • Llama 3.1 405B (Fireworks): $3.00/M input, $3.00/M output

Google:

  • Gemini Experimental 1206: $0.00/M input, $0.00/M output
  • Gemini 2.0 Flash Thinking: $0.00/M input, $0.00/M output
  • Gemma 2 9B: $0.030/M input, $0.090/M output
  • Gemini 1.5 Flash 8B: $0.037/M input, $0.150/M output
  • Gemini 2.0 Flash Lite: $0.075/M input, $0.300/M output
  • Gemini 1.5 Flash: $0.075/M input, $0.300/M output
  • Gemini 2.0 Flash: $0.100/M input, $0.400/M output
  • Gemini 2.5 Flash: $0.300/M input, $2.50/M output
  • Gemma 2 27B: $0.650/M input, $0.650/M output
  • Gemini 2.5 Pro: $1.25/M input, $10.00/M output
  • Gemini 1.5 Pro: $1.25/M input, $5.00/M output

Groq:

  • Llama 3.1 8B (Groq): $0.050/M input, $0.080/M output
  • Gemma 2 9B (Groq): $0.200/M input, $0.200/M output
  • Mixtral 8x7B (Groq): $0.240/M input, $0.240/M output
  • Llama 3.3 70B (Groq): $0.590/M input, $0.790/M output
  • DeepSeek R1 (Groq): $0.750/M input, $0.990/M output

Meta:

  • Llama 3.1 8B: $0.020/M input, $0.050/M output
  • Llama 4 Scout: $0.080/M input, $0.300/M output
  • Llama 3.3 70B: $0.120/M input, $0.380/M output
  • Llama 4 Maverick: $0.150/M input, $0.600/M output
  • Llama 3.2 11B Vision: $0.245/M input, $0.245/M output
  • Llama 3.1 70B: $0.400/M input, $0.400/M output
  • Llama 3.2 90B Vision: $0.900/M input, $0.900/M output
  • Llama 3.1 405B: $3.00/M input, $3.00/M output

Microsoft:

  • Phi-4: $0.065/M input, $0.140/M output
  • Phi-3.5 Mini: $0.130/M input, $0.520/M output
  • Phi-3.5 MoE: $0.170/M input, $0.680/M output
  • Phi-3 Medium: $0.170/M input, $0.170/M output
  • WizardLM-2 8x22B: $0.620/M input, $0.620/M output

Mistral:

  • Mistral Small: $0.150/M input, $0.600/M output
  • Mistral Large: $0.500/M input, $1.50/M output

Mistral AI:

  • Mistral Nemo 12B: $0.020/M input, $0.040/M output
  • Mistral 7B: $0.110/M input, $0.190/M output
  • Codestral 22B: $0.300/M input, $0.900/M output
  • Mistral Medium 3: $0.400/M input, $2.00/M output
  • Pixtral Large: $2.00/M input, $6.00/M output

OpenAI:

  • GPT-4.1 nano: $0.100/M input, $0.400/M output
  • GPT-4o Mini: $0.150/M input, $0.600/M output
  • GPT-4.1 mini: $0.400/M input, $1.60/M output
  • GPT-3.5 Turbo: $0.500/M input, $1.50/M output
  • o3-mini: $1.10/M input, $4.40/M output
  • o1-mini: $1.10/M input, $4.40/M output
  • o4-mini: $1.10/M input, $4.40/M output
  • o3: $2.00/M input, $8.00/M output
  • GPT-4.1: $2.00/M input, $8.00/M output
  • GPT-4o: $2.50/M input, $10.00/M output
  • GPT-4o (Aug 2024): $2.50/M input, $10.00/M output
  • ChatGPT-4o Latest: $5.00/M input, $15.00/M output
  • GPT-4 Turbo: $10.00/M input, $30.00/M output
  • o1: $15.00/M input, $60.00/M output
  • GPT-4.5: $75.00/M input, $150.00/M output

Perplexity:

  • Sonar: $1.00/M input, $1.00/M output
  • Sonar Reasoning: $2.00/M input, $8.00/M output
  • Sonar Pro: $3.00/M input, $15.00/M output

Shanghai AI Lab:

  • InternLM 2.5 20B: $0.180/M input, $0.180/M output

Together AI:

  • Mistral 7B (Together): $0.200/M input, $0.200/M output
  • Llama 3.3 70B (Together): $0.880/M input, $0.880/M output
  • Qwen 2.5 72B (Together): $1.20/M input, $1.20/M output
  • DeepSeek R1 (Together): $3.00/M input, $7.00/M output
  • Llama 3.1 405B (Together): $3.50/M input, $3.50/M output

xAI:

  • Grok 3-mini: $0.300/M input, $0.500/M output
  • Grok 2: $2.00/M input, $10.00/M output
  • Grok 2 Vision: $2.00/M input, $10.00/M output
  • Grok 3: $3.00/M input, $15.00/M output

The cheapest models are Gemini Experimental 1206 and Gemini 2.0 Flash Thinking, both listed at $0.00/M for input and output tokens. The most expensive is GPT-4.5 at $75.00/M input and $150.00/M output tokens.


Use our interactive pricing table for sortable, filterable comparisons, or try the cost calculator to estimate your specific monthly spend.
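
For a rough idea of how per-million-token prices turn into a monthly bill, here is a minimal Python sketch. It is an illustration only, not the site's calculator: the function name monthly_cost, the PRICES table layout, and the 50M input / 10M output token workload are assumptions made for the example, while the three prices are copied from the list above. It also ignores discounts, caching, and free tiers, which may change real costs.

```python
# Prices per million tokens as (input, output), copied from the list above.
PRICES = {
    "Gemini 2.0 Flash": (0.100, 0.400),
    "GPT-4o": (2.50, 10.00),
    "Claude Sonnet 4": (3.00, 15.00),
}


def monthly_cost(model, input_tokens, output_tokens):
    """Estimate the cost of a month's token volume for one model.

    Hypothetical helper: cost = tokens / 1,000,000 * price per million,
    summed over input and output tokens.
    """
    input_price, output_price = PRICES[model]
    input_cost = input_tokens / 1_000_000 * input_price
    output_cost = output_tokens / 1_000_000 * output_price
    return input_cost + output_cost


# Assumed workload: 50M input tokens and 10M output tokens per month.
for model in PRICES:
    cost = monthly_cost(model, 50_000_000, 10_000_000)
    print(f"{model}: ${cost:.2f}")
```

Because output tokens usually cost several times more than input tokens, the ratio of output to input in a workload can matter as much as the headline input price when comparing models.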
