🏆 AI Model Performance Leaderboard
Compare 319 AI models by quality, cost & value
Llama 3.2 1B
Llama 3.2 · 1.2B · 128K ctx
HelpSteer2 Llama 3.1 70B
Llama 3.1 · 70.6B · 128K ctx
Qwen 2.5 14B
Qwen 2.5 · 14.8B · 128K ctx
DeepSeek R1 Distill 1.5B
DeepSeek R1 · 1.5B · 128K ctx
DeepSeek R1 Distill 14B
DeepSeek R1 · 14.8B · 128K ctx
DeepSeek R1 Distill 32B
DeepSeek R1 · 32.8B · 128K ctx
DeepSeek R1 Distill 70B
DeepSeek R1 · 70.6B · 128K ctx
DeepSeek R1
DeepSeek R1 · 671B MoE (37B active) · 128K ctx
DeepSeek V3-0324
DeepSeek V3 · 685B MoE (37B active) · 128K ctx
DeepSeek V3
DeepSeek V3 · 671B MoE (37B active) · 128K ctx
Mixtral 8x7B Instruct
Mixtral · 46.7B MoE (12.9B active) · 32K ctx
Mixtral 8x7B
Mixtral · 46.7B MoE (12.9B active) · 32K ctx
Llama 4 Scout
Llama 4 · 109B MoE (17B active) · 10240K ctx
Llama 3.1 Nemotron 51B
Llama 3.1 · 51B · 128K ctx
Llama 3.1 Nemotron 70B Reward
Llama 3.1 · 70.6B · 128K ctx
This leaderboard tracks 319 AI models across 60 GPUs and 19 providers and is updated daily. The model currently ranked first for overall quality is BGE Small EN v1.5 (quality score not reported), available from $0.00 per million output tokens. Rankings use InferenceBench's composite score, which combines benchmark results (MMLU, HumanEval, GSM8K), inference cost, and throughput efficiency.
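The page does not publish the formula behind the composite score, so the following is only an illustrative sketch of how benchmark results, cost, and throughput could be blended into one number. Everything in it is an assumption: the `ModelStats` fields, the min-max normalization, the 0.6/0.2/0.2 weights, and the placeholder model names are hypothetical and are not InferenceBench's actual method or data.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ModelStats:
    """Hypothetical per-model inputs; field names are assumptions."""
    name: str
    mmlu: float               # benchmark score, 0-100
    humaneval: float          # benchmark score, 0-100
    gsm8k: float              # benchmark score, 0-100
    cost_per_m_output: float  # USD per million output tokens
    tokens_per_second: float  # measured throughput


def min_max(values):
    """Scale values to [0, 1]; return 0.5 for all if they are identical."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_scores(models, w_quality=0.6, w_cost=0.2, w_throughput=0.2):
    """Blend quality, cheapness, and speed into one score per model.

    The weights are illustrative assumptions, not published values.
    """
    # Quality: plain average of the three benchmarks, then normalized.
    quality = min_max([mean([m.mmlu, m.humaneval, m.gsm8k]) for m in models])
    # Cost: invert so that cheaper models score higher.
    cheapness = [1.0 - c for c in min_max([m.cost_per_m_output for m in models])]
    # Throughput: faster models score higher.
    speed = min_max([m.tokens_per_second for m in models])
    return {
        m.name: w_quality * q + w_cost * c + w_throughput * s
        for m, q, c, s in zip(models, quality, cheapness, speed)
    }


def rank(models, **weights):
    """Return model names ordered from highest to lowest composite score."""
    scores = composite_scores(models, **weights)
    return sorted(scores, key=scores.get, reverse=True)
```

Under this sketch, a model with strong benchmarks but high cost can still outrank a cheaper, faster one when quality carries most of the weight. Shifting weight toward `w_cost` would produce a ranking closer to a "value" view.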