AI Benchmarks

Compare leading AI models across standardized benchmarks. Last updated 2026-03-29.

MMLU-Pro

General knowledge and reasoning across 57 subjects. Max score: 100.

| Rank | Model             | Provider  | Score (/ 100) | Released |
|------|-------------------|-----------|---------------|----------|
| #1   | Claude Opus 4.6   | Anthropic | 92.4          | 2026-03  |
| #2   | o1                | OpenAI    | 91.8          | 2025-09  |
| #3   | Gemini 2.5 Pro    | Google    | 91.2          | 2026-01  |
| #4   | GPT-4.5           | OpenAI    | 90.1          | 2025-12  |
| #5   | Llama 4 Maverick  | Meta      | 89.3          | 2026-03  |
| #6   | Claude Sonnet 4.6 | Anthropic | 88.7          | 2026-02  |
| #7   | DeepSeek V3       | DeepSeek  | 88.1          | 2025-12  |
| #8   | GPT-4o            | OpenAI    | 87.2          | 2025-05  |
| #9   | Mistral Large     | Mistral   | 86.8          | 2025-11  |
| #10  | o3-mini           | OpenAI    | 86.3          | 2025-11  |
| #11  | Llama 4 Scout     | Meta      | 85.9          | 2026-02  |
| #12  | Gemini 2.0 Flash  | Google    | 84.5          | 2025-10  |
| #13  | Claude Haiku 4.5  | Anthropic | 82.1          | 2026-01  |
| #14  | Mistral Small     | Mistral   | 78.4          | 2025-09  |