Evaluation

F1 Score

Quick Answer

A metric combining precision and recall, useful for evaluating QA and information extraction.

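F1 score is the harmonic mean of precision and recall: F1 = 2 × precision × recall / (precision + recall). Because the harmonic mean is pulled toward the smaller of the two values, a high F1 requires both few false positives (high precision) and few false negatives (high recall). F1 ranges from 0 (worst) to 1 (perfect). It is often more informative than accuracy when classes are imbalanced, since it does not count true negatives. F1 is a standard metric for question answering and information extraction.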
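As a rough illustration of the definition above, here is a minimal Python sketch. The function names `f1_from_counts` and `token_f1` are hypothetical, chosen for this example. The token-overlap variant assumes simple lowercasing and whitespace tokenization, which is one common way to score QA answers but not the only one. Real benchmarks often add further normalization, such as stripping punctuation and articles.

```python
from collections import Counter


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 from true positive, false positive, and false negative counts."""
    if tp == 0:
        # No correct predictions: precision and recall are 0 (or undefined),
        # so F1 is reported as 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted answer and a reference answer.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    fp = len(pred_tokens) - overlap  # predicted tokens not in the reference
    fn = len(ref_tokens) - overlap   # reference tokens that were missed
    return f1_from_counts(overlap, fp, fn)
```

For example, `token_f1("the eiffel tower", "eiffel tower")` has precision 2/3 and recall 1, giving an F1 of about 0.8: the extra token lowers precision even though every reference token was found.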

Last verified: 2026-04-08
