Evaluation
Chatbot Arena
Quick Answer
A crowdsourced platform where users compare models through pairwise contests.
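Chatbot Arena is a platform where users vote on the outputs of two models for the same prompt. The votes from these pairwise comparisons are aggregated into a leaderboard using an Elo-style rating system. Because the votes come from real users, the ranking captures human preferences more directly than automated metrics do. Chatbot Arena has collected millions of votes, and its results correlate well with expert evaluation. This makes it useful for identifying capability gaps between models, and its leaderboard helps drive model development. Its main limitation is sampling bias: the votes reflect the characteristics of the platform's user base, which may not match those of the wider population of users.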
Last verified: 2026-04-08