What is a context window?

A context window is the maximum number of tokens (words and word pieces) that a language model can process in a single request. It includes both the input prompt and the generated output. Larger context windows allow you to send longer documents, maintain longer conversation histories, and process more data in a single API call.
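Because the window covers both input and output, a request has to budget for the two together. The following is a minimal sketch of that check; the function names `fits_in_window` and `remaining_for_prompt` are hypothetical, and the token counts are assumed to come from the provider's own tokenizer, which is not shown here.

```python
def fits_in_window(prompt_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Return True if the prompt plus the reserved output fits in the window."""
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_output_tokens <= context_window


def remaining_for_prompt(context_window: int, max_output_tokens: int) -> int:
    """Tokens left for the prompt after reserving room for the output."""
    return max(context_window - max_output_tokens, 0)


# Illustrative numbers: a 200K window with 32,000 tokens reserved for output.
print(fits_in_window(150_000, 32_000, 200_000))  # True
print(fits_in_window(180_000, 32_000, 200_000))  # False
print(remaining_for_prompt(200_000, 32_000))     # 168000
```

Providers may count tokens differently or enforce separate output caps, so treat this as a budgeting aid rather than a guarantee that a request will be accepted.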

Which LLM has the largest context window?

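As of 2026, Llama 4 Scout leads the models listed here with a context window of roughly 10.5 million tokens, followed by Gemini 1.5 Pro and Gemini Experimental 1206 with 2M tokens each. Gemini 2.5 Pro, Llama 4 Maverick, Gemini 2.0 Flash, and Gemini 2.0 Flash Lite each offer about 1M tokens. Among the other major providers, Claude Opus 4 and Claude Sonnet 4 offer 200K tokens, while GPT-4o provides 128K tokens.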

Does context window size affect price?

Context window size itself doesn't directly affect per-token pricing, but larger context windows mean you can send more tokens per request, which increases total cost. Some providers offer cached input pricing at a discount for repeated content within the context window. Models with very large context windows (like Gemini) may also have different rate limits.
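The cost effect is plain arithmetic: tokens multiplied by the per-million price, summed over input and output. The sketch below shows it with a hypothetical helper, `request_cost`. The prices are the GPT-4o and Gemini 2.5 Pro list prices from this page, and cached-input discounts are deliberately left out because their rates vary by provider.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in dollars for one request, given prices per one million tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# 100,000 input tokens and 2,000 output tokens at $2.50 / $10.00 per million.
print(f"${request_cost(100_000, 2_000, 2.50, 10.00):.3f}")  # $0.270

# 900,000 input tokens and 2,000 output tokens at $1.25 / $10.00 per million.
print(f"${request_cost(900_000, 2_000, 1.25, 10.00):.3f}")  # $1.145
```

The per-token price is lower in the second call, but the request costs more because the larger window allows far more input to be sent.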

LLM Context Window Comparison 2026

Compare context window sizes across 92 large language models. Larger context windows let you process longer documents and maintain richer conversation histories.

Data verified Apr 20, 2026

All Models — Ranked by Context Window

Model	Provider	Context Window (tokens)	Max Output (tokens)	Input ($/1M tokens)	Output ($/1M tokens)
Llama 4 Scout	Meta	10,485,760	32,768	$0.080	$0.300
Gemini Experimental 1206	Google	2M	8,192	$0.00	$0.00
Gemini 1.5 Pro	Google	2M	8,192	$1.25	$5.00
Gemini 2.5 Pro	Google	1,048,576	65,536	$1.25	$10.00
Llama 4 Maverick	Meta	1,048,576	32,768	$0.150	$0.600
Gemini 2.0 Flash	Google	1,048,576	8,192	$0.100	$0.400
Gemini 2.0 Flash Lite	Google	1,048,576	8,192	$0.075	$0.300
Gemini 2.5 Flash	Google	1M	8,192	$0.300	$2.50
Gemini 1.5 Flash	Google	1M	8,192	$0.075	$0.300
Gemini 1.5 Flash 8B	Google	1M	8,192	$0.037	$0.150
Amazon Nova Pro	Amazon	300K	4,096	$0.800	$3.20
Amazon Nova Lite	Amazon	300K	4,096	$0.060	$0.240
Command A	Cohere	256K	4,096	$2.50	$10.00
Codestral 22B	Mistral AI	256K	4,096	$0.300	$0.900
Claude Opus 4	Anthropic	200K	32,000	$5.00	$25.00
o3	OpenAI	200K	100,000	$2.00	$8.00
o1	OpenAI	200K	100,000	$15.00	$60.00
Grok 3	xAI	200K	8,192	$3.00	$15.00
Claude Sonnet 4	Anthropic	200K	64,000	$3.00	$15.00
Claude 3.5 Sonnet	Anthropic	200K	8,192	$3.00	$15.00
Claude 3.5 Haiku	Anthropic	200K	8,192	$0.800	$4.00
Claude Haiku 4	Anthropic	200K	8,192	$1.00	$5.00
Sonar Pro	Perplexity	200K	8,192	$3.00	$15.00
Llama 3.1 405B (Fireworks)	Fireworks AI	131,072	4,096	$3.00	$3.00
Grok 2	xAI	131,072	4,096	$2.00	$10.00
Llama 3.3 70B (Fireworks)	Fireworks AI	131,072	4,096	$0.900	$0.900
DeepSeek R1	DeepSeek	128K	8,192	$0.500	$2.15
Qwen 3 235B MoE	Alibaba	128K	4,096	$0.455	$1.82
GPT-4.5	OpenAI	128K	8,192	$75.00	$150.00
DeepSeek R1 (Groq)	Groq	128K	8,192	$0.750	$0.990
DeepSeek V3	DeepSeek	128K	8,192	$0.259	$0.420
o3-mini	OpenAI	128K	65,536	$1.10	$4.40
o1-mini	OpenAI	128K	65,536	$1.10	$4.40
ChatGPT-4o Latest	OpenAI	128K	16,384	$5.00	$15.00
GPT-4o	OpenAI	128K	16,384	$2.50	$10.00
o4-mini	OpenAI	128K	32,768	$1.10	$4.40
Qwen 2.5 Max	Alibaba	128K	8,192	$0.160	$0.640
GPT-4o (Aug 2024)	OpenAI	128K	16,384	$2.50	$10.00
DeepSeek R1 Distill Llama 70B	DeepSeek	128K	8,192	$0.700	$0.800
Mistral Large	Mistral AI	128K	8,192	$0.500	$1.50
GPT-4 Turbo	OpenAI	128K	4,096	$10.00	$30.00
Llama 3.1 405B	Meta	128K	4,096	$3.00	$3.00
Pixtral Large	Mistral AI	128K	4,096	$2.00	$6.00
Qwen 2.5 72B	Alibaba	128K	4,096	$0.120	$0.390
GPT-4o Mini	OpenAI	128K	16,384	$0.150	$0.600
Llama 3.3 70B (Groq)	Groq	128K	4,096	$0.590	$0.790
Llama 3.3 70B	Meta	128K	4,096	$0.120	$0.380
Mistral Medium 3	Mistral AI	128K	4,096	$0.400	$2.00
Llama 3.3 70B (Together)	Together AI	128K	4,096	$0.880	$0.880
Llama 3.2 90B Vision	Meta	128K	4,096	$0.900	$0.900
Command R+	Cohere	128K	4,096	$2.50	$10.00
DeepSeek V2.5	DeepSeek	128K	4,096	$0.140	$0.280
Llama 3.1 70B	Meta	128K	4,096	$0.400	$0.400
Phi-3.5 MoE	Microsoft	128K	4,096	$0.170	$0.680
Mistral Small	Mistral AI	128K	8,192	$0.150	$0.600
GPT-4.1 mini	OpenAI	128K	4,096	$0.400	$1.60
Grok 3-mini	xAI	128K	4,096	$0.300	$0.500
Phi-3 Medium	Microsoft	128K	4,096	$0.170	$0.170
Llama 3.2 11B Vision	Meta	128K	4,096	$0.245	$0.245
Phi-3.5 Mini	Microsoft	128K	4,096	$0.130	$0.520
Qwen 2.5 7B	Alibaba	128K	4,096	$0.040	$0.100
GPT-4.1 nano	OpenAI	128K	4,096	$0.100	$0.400
Command R	Cohere	128K	4,096	$0.150	$0.600
Mistral Nemo 12B	Mistral AI	128K	4,096	$0.020	$0.040
Amazon Nova Micro	Amazon	128K	4,096	$0.035	$0.140
Command R7B	Cohere	128K	4,096	$0.038	$0.150
Llama 3.1 8B (Groq)	Groq	128K	4,096	$0.050	$0.080
Llama 3.1 8B	Meta	128K	4,096	$0.020	$0.050
Qwen 2.5 Coder 32B	Alibaba	128K	4,096	$0.660	$1.00
Sonar Reasoning	Perplexity	127K	8,192	$2.00	$8.00
Sonar	Perplexity	127K	4,096	$1.00	$1.00
DeepSeek R1 (Together)	Together AI	64K	8,192	$3.00	$7.00
DeepSeek R1 Distill Qwen 32B	DeepSeek	64K	8,192	$0.290	$0.290
Mixtral 8x22B (Fireworks)	Fireworks AI	64K	4,096	$0.900	$0.900
WizardLM-2 8x22B	Microsoft	64K	4,096	$0.620	$0.620
Gemini 2.0 Flash Thinking	Google	32K	16,384	$0.00	$0.00
QwQ 32B	Alibaba	32K	8,192	$0.150	$0.580
Qwen 2.5 72B (Together)	Together AI	32K	4,096	$1.20	$1.20
Yi-Large	01.AI	32K	4,096	$3.00	$3.00
Mixtral 8x7B (Groq)	Groq	32K	4,096	$0.240	$0.240
InternLM 2.5 20B	Shanghai AI Lab	32K	4,096	$0.180	$0.180
Mistral 7B	Mistral AI	32K	4,096	$0.110	$0.190
Mistral 7B (Together)	Together AI	32K	4,096	$0.200	$0.200
Phi-4	Microsoft	16,384	4,096	$0.065	$0.140
Yi-Lightning	01.AI	16K	4,096	$0.140	$0.140
GPT-3.5 Turbo	OpenAI	16K	4,096	$0.500	$1.50
Grok 2 Vision	xAI	8,192	4,096	$2.00	$10.00
GPT-4.1	OpenAI	8,192	2,048	$2.00	$8.00
Gemma 2 27B	Google	8K	4,096	$0.650	$0.650
Gemma 2 9B (Groq)	Groq	8K	4,096	$0.200	$0.200
Gemma 2 9B	Google	8K	4,096	$0.030	$0.090
Llama 3.1 405B (Together)	Together AI	4K	4,096	$3.50	$3.50
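
To work with this table programmatically, the context-window values need to be converted to plain token counts before they can be compared. The sketch below is one way to do that and then shortlist models that can hold a given workload, cheapest input price first. The helper names `parse_tokens` and `models_that_fit` are hypothetical, and the sample rows are copied from the table above.

```python
def parse_tokens(text: str) -> int:
    """Convert values such as '200K', '1.048576M' or '32,768' to a token count."""
    text = text.strip().replace(",", "")
    multipliers = {"K": 1_000, "M": 1_000_000}
    suffix = text[-1].upper()
    if suffix in multipliers:
        # round() absorbs floating-point noise from the decimal multiplication.
        return round(float(text[:-1]) * multipliers[suffix])
    return int(text)


# (model, context window, input price in $ per 1M tokens), taken from the table.
ROWS = [
    ("Gemini 2.5 Flash", "1M", 0.300),
    ("Claude Sonnet 4", "200K", 3.00),
    ("GPT-4o", "128K", 2.50),
    ("GPT-4o Mini", "128K", 0.150),
]


def models_that_fit(rows, needed_tokens: int):
    """Return (model, input price) pairs whose window covers needed_tokens."""
    fitting = [(name, price) for name, window, price in rows
               if parse_tokens(window) >= needed_tokens]
    return sorted(fitting, key=lambda item: item[1])


print(models_that_fit(ROWS, 150_000))
# [('Gemini 2.5 Flash', 0.3), ('Claude Sonnet 4', 3.0)]
```

Sorting by input price alone is a simplification; output-heavy workloads would want the output price and the Max Output column taken into account as well.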
