LLM Context Window Comparison 2026
Compare context window sizes across 92 large language models. Larger context windows let you process longer documents and maintain richer conversation histories.
Data verified Apr 20, 2026
All Models — Ranked by Context Window
| Model | Provider | Context Window | Max Output | Input $/M | Output $/M |
|---|---|---|---|---|---|
| Llama 4 Scout | Meta | 10.48576M | 32,768 | $0.080 | $0.300 |
| Gemini Experimental 1206 | Google | 2M | 8,192 | $0.00 | $0.00 |
| Gemini 1.5 Pro | Google | 2M | 8,192 | $1.25 | $5.00 |
| Gemini 2.5 Pro | Google | 1.048576M | 65,536 | $1.25 | $10.00 |
| Llama 4 Maverick | Meta | 1.048576M | 32,768 | $0.150 | $0.600 |
| Gemini 2.0 Flash | Google | 1.048576M | 8,192 | $0.100 | $0.400 |
| Gemini 2.0 Flash Lite | Google | 1.048576M | 8,192 | $0.075 | $0.300 |
| Gemini 2.5 Flash | Google | 1M | 8,192 | $0.300 | $2.50 |
| Gemini 1.5 Flash | Google | 1M | 8,192 | $0.075 | $0.300 |
| Gemini 1.5 Flash 8B | Google | 1M | 8,192 | $0.037 | $0.150 |
| Amazon Nova Pro | Amazon | 300K | 4,096 | $0.800 | $3.20 |
| Amazon Nova Lite | Amazon | 300K | 4,096 | $0.060 | $0.240 |
| Command A | Cohere | 256K | 4,096 | $2.50 | $10.00 |
| Codestral 22B | Mistral AI | 256K | 4,096 | $0.300 | $0.900 |
| Claude Opus 4 | Anthropic | 200K | 32,000 | $5.00 | $25.00 |
| o3 | OpenAI | 200K | 100,000 | $2.00 | $8.00 |
| o1 | OpenAI | 200K | 100,000 | $15.00 | $60.00 |
| Grok 3 | xAI | 200K | 8,192 | $3.00 | $15.00 |
| Claude Sonnet 4 | Anthropic | 200K | 64,000 | $3.00 | $15.00 |
| Claude 3.5 Sonnet | Anthropic | 200K | 8,192 | $3.00 | $15.00 |
| Claude 3.5 Haiku | Anthropic | 200K | 8,192 | $0.800 | $4.00 |
| Claude Haiku 4 | Anthropic | 200K | 8,192 | $1.00 | $5.00 |
| Sonar Pro | Perplexity | 200K | 8,192 | $3.00 | $15.00 |
| Llama 3.1 405B (Fireworks) | Fireworks AI | 131.072K | 4,096 | $3.00 | $3.00 |
| Grok 2 | xAI | 131.072K | 4,096 | $2.00 | $10.00 |
| Llama 3.3 70B (Fireworks) | Fireworks AI | 131.072K | 4,096 | $0.900 | $0.900 |
| DeepSeek R1 | DeepSeek | 128K | 8,192 | $0.500 | $2.15 |
| Qwen 3 235B MoE | Alibaba | 128K | 4,096 | $0.455 | $1.82 |
| GPT-4.5 | OpenAI | 128K | 8,192 | $75.00 | $150.00 |
| DeepSeek R1 (Groq) | Groq | 128K | 8,192 | $0.750 | $0.990 |
| DeepSeek V3 | DeepSeek | 128K | 8,192 | $0.259 | $0.420 |
| o3-mini | OpenAI | 128K | 65,536 | $1.10 | $4.40 |
| o1-mini | OpenAI | 128K | 65,536 | $1.10 | $4.40 |
| ChatGPT-4o Latest | OpenAI | 128K | 16,384 | $5.00 | $15.00 |
| GPT-4o | OpenAI | 128K | 16,384 | $2.50 | $10.00 |
| o4-mini | OpenAI | 128K | 32,768 | $1.10 | $4.40 |
| Qwen 2.5 Max | Alibaba | 128K | 8,192 | $0.160 | $0.640 |
| GPT-4o (Aug 2024) | OpenAI | 128K | 16,384 | $2.50 | $10.00 |
| DeepSeek R1 Distill Llama 70B | DeepSeek | 128K | 8,192 | $0.700 | $0.800 |
| Mistral Large | Mistral AI | 128K | 8,192 | $0.500 | $1.50 |
| GPT-4 Turbo | OpenAI | 128K | 4,096 | $10.00 | $30.00 |
| Llama 3.1 405B | Meta | 128K | 4,096 | $3.00 | $3.00 |
| Pixtral Large | Mistral AI | 128K | 4,096 | $2.00 | $6.00 |
| Qwen 2.5 72B | Alibaba | 128K | 4,096 | $0.120 | $0.390 |
| GPT-4o Mini | OpenAI | 128K | 16,384 | $0.150 | $0.600 |
| Llama 3.3 70B (Groq) | Groq | 128K | 4,096 | $0.590 | $0.790 |
| Llama 3.3 70B | Meta | 128K | 4,096 | $0.120 | $0.380 |
| Mistral Medium 3 | Mistral AI | 128K | 4,096 | $0.400 | $2.00 |
| Llama 3.3 70B (Together) | Together AI | 128K | 4,096 | $0.880 | $0.880 |
| Llama 3.2 90B Vision | Meta | 128K | 4,096 | $0.900 | $0.900 |
| Command R+ | Cohere | 128K | 4,096 | $2.50 | $10.00 |
| DeepSeek V2.5 | DeepSeek | 128K | 4,096 | $0.140 | $0.280 |
| Llama 3.1 70B | Meta | 128K | 4,096 | $0.400 | $0.400 |
| Phi-3.5 MoE | Microsoft | 128K | 4,096 | $0.170 | $0.680 |
| Mistral Small | Mistral AI | 128K | 8,192 | $0.150 | $0.600 |
| GPT-4.1 mini | OpenAI | 128K | 4,096 | $0.400 | $1.60 |
| Grok 3-mini | xAI | 128K | 4,096 | $0.300 | $0.500 |
| Phi-3 Medium | Microsoft | 128K | 4,096 | $0.170 | $0.170 |
| Llama 3.2 11B Vision | Meta | 128K | 4,096 | $0.245 | $0.245 |
| Phi-3.5 Mini | Microsoft | 128K | 4,096 | $0.130 | $0.520 |
| Qwen 2.5 7B | Alibaba | 128K | 4,096 | $0.040 | $0.100 |
| GPT-4.1 nano | OpenAI | 128K | 4,096 | $0.100 | $0.400 |
| Command R | Cohere | 128K | 4,096 | $0.150 | $0.600 |
| Mistral Nemo 12B | Mistral AI | 128K | 4,096 | $0.020 | $0.040 |
| Amazon Nova Micro | Amazon | 128K | 4,096 | $0.035 | $0.140 |
| Command R7B | Cohere | 128K | 4,096 | $0.038 | $0.150 |
| Llama 3.1 8B (Groq) | Groq | 128K | 4,096 | $0.050 | $0.080 |
| Llama 3.1 8B | Meta | 128K | 4,096 | $0.020 | $0.050 |
| Qwen 2.5 Coder 32B | Alibaba | 128K | 4,096 | $0.660 | $1.00 |
| Sonar Reasoning | Perplexity | 127K | 8,192 | $2.00 | $8.00 |
| Sonar | Perplexity | 127K | 4,096 | $1.00 | $1.00 |
| DeepSeek R1 (Together) | Together AI | 64K | 8,192 | $3.00 | $7.00 |
| DeepSeek R1 Distill Qwen 32B | DeepSeek | 64K | 8,192 | $0.290 | $0.290 |
| Mixtral 8x22B (Fireworks) | Fireworks AI | 64K | 4,096 | $0.900 | $0.900 |
| WizardLM-2 8x22B | Microsoft | 64K | 4,096 | $0.620 | $0.620 |
| Gemini 2.0 Flash Thinking | Google | 32K | 16,384 | $0.00 | $0.00 |
| QwQ 32B | Alibaba | 32K | 8,192 | $0.150 | $0.580 |
| Qwen 2.5 72B (Together) | Together AI | 32K | 4,096 | $1.20 | $1.20 |
| Yi-Large | 01.AI | 32K | 4,096 | $3.00 | $3.00 |
| Mixtral 8x7B (Groq) | Groq | 32K | 4,096 | $0.240 | $0.240 |
| InternLM 2.5 20B | Shanghai AI Lab | 32K | 4,096 | $0.180 | $0.180 |
| Mistral 7B | Mistral AI | 32K | 4,096 | $0.110 | $0.190 |
| Mistral 7B (Together) | Together AI | 32K | 4,096 | $0.200 | $0.200 |
| Phi-4 | Microsoft | 16.384K | 4,096 | $0.065 | $0.140 |
| Yi-Lightning | 01.AI | 16K | 4,096 | $0.140 | $0.140 |
| GPT-3.5 Turbo | OpenAI | 16K | 4,096 | $0.500 | $1.50 |
| Grok 2 Vision | xAI | 8.192K | 4,096 | $2.00 | $10.00 |
| GPT-4.1 | OpenAI | 8.192K | 2,048 | $2.00 | $8.00 |
| Gemma 2 27B | Google | 8K | 4,096 | $0.650 | $0.650 |
| Gemma 2 9B (Groq) | Groq | 8K | 4,096 | $0.200 | $0.200 |
| Gemma 2 9B | Google | 8K | 4,096 | $0.030 | $0.090 |
| Llama 3.1 405B (Together) | Together AI | 4K | 4,096 | $3.50 | $3.50 |
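
One way to use the table is to shortlist the models whose limits cover a given request and then rank them by price. The sketch below is illustrative only: it copies five rows from the table, assumes "K" means 1,000 tokens (the table does not say), and uses hypothetical names (`Model`, `models_that_fit`) that are not part of any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context_window: int  # tokens; "K" read as 1,000 (an assumption)
    max_output: int      # tokens
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens


# A handful of rows copied from the table above.
MODELS = [
    Model("Llama 4 Scout", 10_485_760, 32_768, 0.080, 0.300),
    Model("Gemini 2.5 Pro", 1_048_576, 65_536, 1.25, 10.00),
    Model("Claude Sonnet 4", 200_000, 64_000, 3.00, 15.00),
    Model("GPT-4o", 128_000, 16_384, 2.50, 10.00),
    Model("GPT-4o Mini", 128_000, 16_384, 0.150, 0.600),
]


def models_that_fit(models, prompt_tokens, output_tokens):
    """Return models that can hold the prompt plus the desired output,
    sorted by input price (cheapest first)."""
    fitting = [
        m for m in models
        if prompt_tokens + output_tokens <= m.context_window
        and output_tokens <= m.max_output
    ]
    return sorted(fitting, key=lambda m: m.input_per_m)


if __name__ == "__main__":
    # A 150,000-token document with a 4,000-token answer.
    for m in models_that_fit(MODELS, 150_000, 4_000):
        print(m.name, m.context_window, m.input_per_m)
```

Sorting by input price is a deliberate simplification: for long-document workloads the prompt usually dominates the bill, but an output-heavy workload would want to rank by output price or by total estimated cost instead.
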
Frequently Asked Questions
- What is a context window?
- A context window is the maximum number of tokens (words and word pieces) that a language model can process in a single request. It includes both the input prompt and the generated output. Larger context windows allow you to send longer documents, maintain longer conversation histories, and process more data in a single API call.
- Which LLM has the largest context window?
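- As of the Apr 20, 2026 data above, Llama 4 Scout leads with a context window of roughly 10.5 million tokens, followed by Gemini Experimental 1206 and Gemini 1.5 Pro with 2M tokens each. Gemini 2.5 Pro, Llama 4 Maverick, Gemini 2.0 Flash, and Gemini 2.0 Flash Lite each offer about 1M tokens. Below the million-token tier, Amazon Nova Pro and Nova Lite offer 300K tokens, Claude Opus 4 and Claude Sonnet 4 offer 200K tokens, and GPT-4o provides 128K tokens.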
- Does context window size affect price?
- Context window size itself doesn't directly affect per-token pricing, but larger context windows mean you can send more tokens per request, which increases total cost. Some providers offer cached input pricing at a discount for repeated content within the context window. Models with very large context windows (like Gemini) may also have different rate limits.
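
The two points made in the answers above (the window is shared between prompt and output, and total cost scales with the number of tokens sent) can be sketched in a few lines. This is a minimal illustration, not a provider API: the function names are hypothetical, the figures are the GPT-4o row from the table with "K" assumed to mean 1,000 tokens, and cached-input discounts are ignored because the page gives no rates for them.

```python
def output_budget(context_window, max_output, prompt_tokens):
    """Tokens left for generation once the prompt is counted.

    The result is capped by the model's max output and never negative.
    """
    remaining = context_window - prompt_tokens
    return max(0, min(max_output, remaining))


def request_cost(input_tokens, output_tokens, input_per_m, output_per_m):
    """Estimated cost in USD, given prices per million tokens."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


if __name__ == "__main__":
    # GPT-4o row from the table: 128K context, 16,384 max output, $2.50 / $10.00.
    context, max_out, in_price, out_price = 128_000, 16_384, 2.50, 10.00

    # A short prompt leaves the full output limit available.
    print(output_budget(context, max_out, 10_000))
    # A 120,000-token prompt leaves only 8,000 tokens for the answer.
    print(output_budget(context, max_out, 120_000))

    # Same per-token price, ten times the input, much larger bill.
    print(round(request_cost(10_000, 1_000, in_price, out_price), 4))
    print(round(request_cost(100_000, 1_000, in_price, out_price), 4))
```

Token counts here are taken as given. In practice they come from the provider's tokenizer, and the same text can produce different counts on different models.
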