Llama 4 Scout vs Llama 4 Maverick

Llama 4 Scout

Meta · 109B params · Quality: 76

Llama 4 Maverick

Meta · 400B params · Quality: 89

Architecture Comparison

Spec | Llama 4 Scout | Llama 4 Maverick
Type | MOE | MOE
Total Parameters | 109B | 400B
Active Parameters | 17B | 17B
Layers | 48 | 96
Hidden Dimension | 5,120 | 5,120
Attention Heads | 40 | 40
KV Heads | 8 | 8
Context Length (tokens) | 10,485,760 | 1,048,576
Precision (default) | BF16 | BF16
Total Experts | 16 | 128
Active Experts | 1 | 1

Memory Requirements

Precision | Llama 4 Scout | Llama 4 Maverick
BF16 Weights | 218.0 GB | 800.0 GB
FP8 Weights | 109.0 GB | 400.0 GB
INT4 Weights | 54.5 GB | 200.0 GB
KV-Cache / Token | 196608 B | 393216 B
Activation Estimate | 2.00 GB | 3.00 GB
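
The page does not say how these figures are derived. As a rough sketch, the common back-of-the-envelope formulas below reproduce the weight and KV-cache rows from the architecture specs. The function names are hypothetical, and the head dimension is assumed to be hidden dimension divided by attention heads (5,120 / 40 = 128), which the page does not state.

```python
def weight_memory_gb(total_params_billion: float, bits_per_param: int) -> float:
    """Decimal GB needed to store the weights at a given precision.

    (params_billion * 1e9 params) * (bits / 8 bytes) / 1e9 bytes-per-GB
    simplifies to params_billion * bits / 8.
    """
    return total_params_billion * bits_per_param / 8


def kv_cache_bytes_per_token(
    layers: int,
    kv_heads: int,
    hidden_dim: int,
    attention_heads: int,
    bytes_per_value: int = 2,  # BF16 stores each value in 2 bytes
) -> int:
    """Bytes of KV cache one token occupies across all layers.

    Assumption: head_dim = hidden_dim // attention_heads.
    The leading factor of 2 accounts for storing both a key and a value.
    """
    head_dim = hidden_dim // attention_heads
    return 2 * layers * kv_heads * head_dim * bytes_per_value


# Specs taken from the Architecture Comparison table.
scout_kv = kv_cache_bytes_per_token(layers=48, kv_heads=8, hidden_dim=5120, attention_heads=40)
maverick_kv = kv_cache_bytes_per_token(layers=96, kv_heads=8, hidden_dim=5120, attention_heads=40)

print(weight_memory_gb(109, 16), weight_memory_gb(109, 8), weight_memory_gb(109, 4))
print(weight_memory_gb(400, 16), weight_memory_gb(400, 8), weight_memory_gb(400, 4))
print(scout_kv, maverick_kv)

# KV cache for a single sequence filling Scout's full context window, in decimal GB.
print(scout_kv * 10_485_760 / 1e9)
```

Under these assumptions, the per-token KV cache is small, but a single sequence that fills Scout's full context window would need roughly 2 TB of cache, far more than the weights themselves.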

Minimum GPUs Needed (BF16)

GPU | Llama 4 Scout | Llama 4 Maverick
H100 SXM | 4 GPUs | N/A
L40S | 6 GPUs | N/A
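
The page does not explain how it arrives at these GPU counts. The sketch below shows one set of assumptions that happens to reproduce them: BF16 weights plus the activation estimate must fit in a usable fraction of each GPU's memory, and anything needing more than one 8-GPU node is reported as not available. The `usable_fraction` and `max_gpus` parameters and the `min_gpus` helper are hypothetical, not the page's stated method; the 80 GB and 48 GB capacities are the standard memory sizes of the H100 SXM and L40S.

```python
import math
from typing import Optional

# Per-GPU memory in GB (H100 SXM: 80 GB, L40S: 48 GB).
GPU_MEMORY_GB = {"H100 SXM": 80, "L40S": 48}


def min_gpus(
    weights_gb: float,
    activation_gb: float,
    gpu_memory_gb: float,
    usable_fraction: float = 0.9,  # assumption: 10% reserved for runtime overhead
    max_gpus: int = 8,             # assumption: single-node limit
) -> Optional[int]:
    """Smallest GPU count whose usable memory holds weights plus activations.

    Returns None when the count would exceed max_gpus. KV cache is ignored.
    """
    required_gb = weights_gb + activation_gb
    usable_per_gpu = gpu_memory_gb * usable_fraction
    count = math.ceil(required_gb / usable_per_gpu)
    return count if count <= max_gpus else None


for gpu, mem in GPU_MEMORY_GB.items():
    scout = min_gpus(218.0, 2.0, mem)
    maverick = min_gpus(800.0, 3.0, mem)
    print(gpu, scout, maverick)
```

Because this estimate leaves out KV cache, real deployments with long contexts or large batches would likely need more GPUs than it suggests.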

Quality Benchmarks

Benchmark | Llama 4 Scout | Llama 4 Maverick
Overall | 76 | 89
MMLU | 79.0 | 89.0
HumanEval | 55.0 | 63.0
GSM8K | 85.0 | 95.0
MT-Bench | 81.0 | 88.0

Capabilities

Feature | Llama 4 Scout | Llama 4 Maverick
Tool Use | ✓ Yes | ✓ Yes
Vision | ✓ Yes | ✓ Yes
Code | ✓ Yes | ✓ Yes
Math | ✓ Yes | ✓ Yes
Reasoning | ✗ No | ✗ No
Multilingual | ✓ Yes | ✓ Yes
Structured Output | ✓ Yes | ✓ Yes

API Pricing Comparison

Cheapest Output (Llama 4 Scout): $0.30/M (Input: $0.18/M)

Cheapest Output (Llama 4 Maverick): $1.80/M (Input: $1.20/M)

Provider | Llama 4 Scout In $/M | Llama 4 Scout Out $/M | Llama 4 Maverick In $/M | Llama 4 Maverick Out $/M
together | $0.18 | $0.30 | $1.20 | $1.80
fireworks | $0.20 | $0.35 | $1.50 | $2.00
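
To turn these rates into a workload cost, the sketch below multiplies token counts by the listed prices. It assumes "$/M" means US dollars per million tokens; the `request_cost_usd` helper and the example workload (2M input tokens, 0.5M output tokens) are hypothetical.

```python
# (input $/M, output $/M) per provider and model, copied from the pricing table.
PRICES = {
    "together": {"Llama 4 Scout": (0.18, 0.30), "Llama 4 Maverick": (1.20, 1.80)},
    "fireworks": {"Llama 4 Scout": (0.20, 0.35), "Llama 4 Maverick": (1.50, 2.00)},
}


def request_cost_usd(provider: str, model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD, assuming prices are quoted per one million tokens."""
    in_price, out_price = PRICES[provider][model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload: 2M input tokens and 0.5M output tokens.
for provider, models in PRICES.items():
    for model in models:
        cost = request_cost_usd(provider, model, 2_000_000, 500_000)
        print(f"{provider:10s} {model:17s} ${cost:.2f}")
```

For this example workload on together, Scout comes to about $0.51 and Maverick to about $3.30, so the gap between the two models scales with volume rather than staying fixed.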

Recommendation Summary

  • Llama 4 Maverick scores higher on overall quality (89 vs 76).
  • Llama 4 Scout is cheaper per output token ($0.30/M vs $1.80/M).
  • Llama 4 Scout has a smaller memory footprint (218.0 GB vs 800.0 GB BF16), making it easier to deploy on fewer GPUs.
  • Llama 4 Scout supports a longer context window (10,485,760 vs 1,048,576 tokens).
  • Llama 4 Maverick is stronger at code generation (HumanEval: 63.0 vs 55.0).
  • Llama 4 Maverick is better at math reasoning (GSM8K: 95.0 vs 85.0).
