
DeepSeek R1

DeepSeek · MoE · 671B parameters · 131,072 context

Quality: 92.0

Architecture Details

Type: MoE
Total Parameters: 671B
Active Parameters: 37B
Layers: 61
Hidden Dimension: 7,168
Attention Heads: 128
KV Heads: 1
Head Dimension: 128
Vocab Size: 129,280
Total Experts: 256
Active Experts: 8

Memory Requirements

BF16 Weights: 1342.0 GB
FP8 Weights: 671.0 GB
INT4 Weights: 335.5 GB
KV-Cache per Token: 31,232 bytes
Activation Estimate: 3.00 GB
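These figures follow directly from the architecture table above: weight memory is total parameters times bytes per parameter, and the per-token KV-cache figure matches a standard K+V-per-layer formula using the table's KV-head count at BF16. A minimal sketch reproducing the numbers (assuming decimal gigabytes, 1 GB = 10^9 bytes; variable names are illustrative):

```python
# Weight memory: total parameters x bytes per parameter (decimal GB assumed).
TOTAL_PARAMS = 671e9

for precision, bytes_per_param in [("BF16", 2.0), ("FP8", 1.0), ("INT4", 0.5)]:
    print(f"{precision} weights: {TOTAL_PARAMS * bytes_per_param / 1e9:.1f} GB")
# BF16: 1342.0 GB, FP8: 671.0 GB, INT4: 335.5 GB

# KV-cache per token: one K and one V vector per layer, cached at BF16.
LAYERS, KV_HEADS, HEAD_DIM, BYTES_BF16 = 61, 1, 128, 2
kv_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_BF16  # leading 2 = K and V
print(f"KV-cache per token: {kv_per_token} bytes")  # 31232
```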

Fits on (single-node)

Instinct MI325X: ×2 (INT4)
B200 NVL (pair): ×2 (INT4)
B300: ×2 (INT4)
Groq LPU: ×2 (INT4)
B200 SXM: ×3 (INT4)
B100 SXM: ×3 (INT4)
GB200 NVL72 (per GPU): ×3 (INT4)
GB300 NVL72 (per GPU): ×3 (INT4)
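These counts are consistent with dividing the INT4 footprint (weights plus the activation estimate) by usable per-device memory. A minimal sketch; the per-device memory sizes and the 85% usable-memory factor are assumptions for illustration, not values stated on this page:

```python
import math

INT4_WEIGHTS_GB = 335.5
ACTIVATIONS_GB = 3.0

# Assumed per-device memory in GB; check vendor datasheets for exact figures.
DEVICE_MEM_GB = {
    "Instinct MI325X": 256,
    "B300": 288,
    "B200 SXM": 192,
}

def min_devices(mem_gb: float, usable: float = 0.85) -> int:
    """Smallest device count whose usable memory covers weights + activations."""
    needed_gb = INT4_WEIGHTS_GB + ACTIVATIONS_GB
    return math.ceil(needed_gb / (mem_gb * usable))

for device, mem in DEVICE_MEM_GB.items():
    print(f"{device}: x{min_devices(mem)}")  # x2, x2, x3, as in the list above
```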

GPU Recommendations

B200 NVL (pair) · optimal
FP8 · 4 GPUs · tensorrt-llm
Score: 98/100
Throughput: 140.0 tok/s
Cost/Month: $39,858
Cost/M Tokens: $108.33
B200 SXM · optimal
FP8 · 8 GPUs · tensorrt-llm
Score: 93/100
Throughput: 140.0 tok/s
Cost/Month: $34,088
Cost/M Tokens: $92.65
H200 SXM · optimal
FP8 · 8 GPUs · tensorrt-llm
Score: 90/100
Throughput: 140.0 tok/s
Cost/Month: $20,422
Cost/M Tokens: $55.51
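The Cost/M Tokens figures are consistent with dividing monthly cost by the tokens produced at the quoted throughput over a 730-hour billing month. The page does not state its formula, so treat this as an inference:

```python
HOURS_PER_MONTH = 730  # common cloud-billing convention; assumed, not stated above

def cost_per_million_tokens(monthly_usd: float, tok_per_s: float) -> float:
    tokens_per_month = tok_per_s * 3600 * HOURS_PER_MONTH
    return monthly_usd / tokens_per_month * 1e6

for config, monthly, tps in [
    ("B200 NVL (pair)", 39858, 140.0),
    ("B200 SXM", 34088, 140.0),
    ("H200 SXM", 20422, 140.0),
]:
    print(f"{config}: ${cost_per_million_tokens(monthly, tps):.2f}/M tokens")
# $108.33, $92.65, $55.51, matching the recommendation cards above
```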

API Pricing Comparison

Provider    Input $/M    Output $/M    Badges
deepseek    $0.55        $2.19         Cheapest
together    $3.00        $7.00
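At these rates, the cost of a given workload is input tokens times the input rate plus output tokens times the output rate, with both rates quoted per million tokens. A quick sketch using an assumed 10M-input / 2M-output workload:

```python
def workload_cost(in_tokens: float, out_tokens: float,
                  in_rate: float, out_rate: float) -> float:
    """Total USD for a workload, given $/M-token rates."""
    return in_tokens / 1e6 * in_rate + out_tokens / 1e6 * out_rate

# Hypothetical workload: 10M input tokens, 2M output tokens.
print(workload_cost(10e6, 2e6, 0.55, 2.19))  # deepseek: 9.88
print(workload_cost(10e6, 2e6, 3.00, 7.00))  # together: 44.0
```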

Quality Benchmarks

MMLU: 90.8
HumanEval: 71.7
GSM8K: 97.3
MT-Bench: 89.0

Capabilities

Features

Tool Use · Vision · Code · Math · Reasoning · Multilingual · Structured Output

Supported Frameworks

vllm · sglang · tensorrt-llm

Supported Precisions

BF16 (default) · FP8 · INT4
