Accuracy (%), 85 models ranked
MMLU Leaderboard 2026
Massive Multitask Language Understanding (MMLU) tests knowledge across 57 subjects including STEM, humanities, social sciences, and professional domains. It is the most widely reported academic benchmark for general knowledge.
Quick Answer
The best model on MMLU in 2026 is o3 by OpenAI, scoring 95%. Runner-up: Gemini 2.5 Pro (92%).
What MMLU Tests
Multiple-choice questions across 57 subjects, including mathematics, history, law, medicine, and computer science. The benchmark tests both breadth of knowledge and reasoning within those domains. The score is the percentage of questions answered correctly.
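The scoring rule above can be sketched in a few lines of Python. This is only an illustration of "percentage of questions answered correctly", not the leaderboard's actual evaluation harness. The function name `accuracy_percent` and the toy answer lists are hypothetical, and the sketch assumes each answer is a single letter choice.

```python
from typing import Sequence


def accuracy_percent(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the percentage of questions where the prediction matches the answer key.

    Hypothetical helper for illustration; assumes one letter choice per question.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    # Count exact matches between the predicted choice and the answer key.
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# Toy data (made up for this example): 3 of 4 predictions match the key.
answers = ["A", "C", "B", "D"]
predictions = ["A", "C", "D", "D"]
print(accuracy_percent(predictions, answers))  # 75.0
```

Under this rule every question carries equal weight, so subjects with more questions contribute more to the final score.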
Score Range
0–100% (human expert ~89%)