Llama 4 Behemoth
Meta · MoE · 2000B parameters · 1,048,576-token context
Quality: 93.0
Architecture Details

| Field | Value |
|---|---|
| Type | MoE |
| Total Parameters | 2000B |
| Active Parameters | 400B |
| Layers | 128 |
| Hidden Dimension | 16,384 |
| Attention Heads | 128 |
| KV Heads | 16 |
| Head Dimension | 128 |
| Vocab Size | 202,400 |
| Total Experts | 256 |
| Active Experts | 16 |
Memory Requirements

| Precision | Weights |
|---|---|
| BF16 | 4000.0 GB |
| FP8 | 2000.0 GB |
| INT4 | 1000.0 GB |

KV-Cache per Token: 4,194,304 bytes
Activation Estimate: 25.00 GB
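The weight figures above follow directly from parameter count × bytes per parameter. A minimal sketch (the function name is illustrative, not from any library):

```python
def weight_memory_gb(total_params_billion: float, bits_per_param: int) -> float:
    """Weight memory in GB (decimal): params * bytes-per-parameter."""
    return total_params_billion * 1e9 * (bits_per_param / 8) / 1e9

# 2000B parameters at each precision
bf16 = weight_memory_gb(2000, 16)  # 4000.0 GB
fp8 = weight_memory_gb(2000, 8)    # 2000.0 GB
int4 = weight_memory_gb(2000, 4)   # 1000.0 GB
```

Note these are decimal gigabytes (10^9 bytes), matching the table, and cover weights only; KV-cache and activations are extra.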
Fits on (single-node)

| GPU | Count | Precision |
|---|---|---|
| B200 NVL (pair) | x4 | INT4 |
| Instinct MI325X | x5 | INT4 |
| B300 | x5 | INT4 |
| Groq LPU | x6 | INT4 |
| B200 SXM | x7 | INT4 |
| B100 SXM | x7 | INT4 |
| GB200 NVL72 (per GPU) | x7 | INT4 |
| GB300 NVL72 (per GPU) | x7 | INT4 |
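A back-of-envelope way to estimate counts like these is to divide model memory by per-GPU memory and round up. The page's figures appear to include extra headroom (KV-cache, activations, runtime overhead), so they run higher than this naive sketch; the 192 GB capacity below is a hypothetical example value, not taken from the page:

```python
import math

def gpus_needed(weights_gb: float, gpu_mem_gb: float,
                overhead_gb: float = 0.0) -> int:
    """Naive single-node fit: (weights + fixed overhead) / per-GPU memory."""
    return math.ceil((weights_gb + overhead_gb) / gpu_mem_gb)

# INT4 weights (1000 GB) on a hypothetical 192 GB GPU, no overhead
gpus_needed(1000.0, 192.0)  # 6
```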
GPU Recommendations

| GPU | Rating | Config | Score | Throughput | Cost/Month | Cost/M Tokens |
|---|---|---|---|---|---|---|
| B200 SXM | good | BF16 · 32 GPUs · tensorrt-llm | 63/100 | 140.0 tok/s | $136,352 | $370.60 |
| B100 SXM | good | BF16 · 32 GPUs · tensorrt-llm | 63/100 | 140.0 tok/s | $136,656 | $371.43 |
| GB200 NVL72 (per GPU) | good | BF16 · 32 GPUs · tensorrt-llm | 63/100 | 140.0 tok/s | $197,392 | $536.51 |
API Pricing Comparison
| Provider | Input $/M | Output $/M | Badges |
|---|---|---|---|
| together | $5.00 | $16.00 | Cheapest |
Quality Benchmarks

| Benchmark | Score |
|---|---|
| MMLU | 92.0 |
| HumanEval | 74.0 |
| GSM8K | 97.0 |
| MT-Bench | 92.0 |
Capabilities
Features
✓ Tool Use · ✓ Vision · ✓ Code · ✓ Math · ✓ Reasoning · ✓ Multilingual · ✓ Structured Output
Supported Frameworks
Supported Precisions
BF16 (default)