Google

Gemini 2.0 Pro

Google · MoE · 600B parameters · 2,000,000-token context

Quality: 88.0

Architecture Details

Type: MoE
Total Parameters: 600B
Active Parameters: 150B
Layers: 96
Hidden Dimension: 12,288
Attention Heads: 96
KV Heads: 16
Head Dimension: 128
Vocab Size: 256,000
Total Experts: 16
Active Experts: 2

Memory Requirements

BF16 Weights: 1200.0 GB
FP8 Weights: 600.0 GB
INT4 Weights: 300.0 GB
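
The weight figures above follow from multiplying the total parameter count by the storage size of each precision. The sketch below reproduces them, assuming decimal gigabytes (1 GB = 1e9 bytes) and no quantization overhead such as scales or zero points; the function and constant names are illustrative and do not come from the page. For a MoE model, the total parameter count (not the active count) is the relevant one, since every expert has to be resident in memory.

```python
# Bytes needed to store one parameter at each precision.
# INT4 packs two weights into a single byte.
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weight_memory_gb(total_params: float, precision: str) -> float:
    """Return weight storage in decimal gigabytes (assumes 1 GB = 1e9 bytes)."""
    return total_params * BYTES_PER_PARAM[precision] / 1e9


TOTAL_PARAMS = 600e9  # total parameters listed under Architecture Details

for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_memory_gb(TOTAL_PARAMS, precision):.1f} GB")
# BF16: 1200.0 GB
# FP8: 600.0 GB
# INT4: 300.0 GB
```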

KV-Cache per Token: 2,359,296 bytes
Activation Estimate: 10.00 GB
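
The page does not state how the per-token KV-cache figure is derived. As a rough check, the listed value matches 2 × layers × hidden dimension at one byte per element; that is an inference from the numbers, not a documented formula. A grouped-query estimate using the listed KV heads and head dimension at 2 bytes per element (BF16) gives a smaller figure. The sketch below computes both; all function and variable names are hypothetical.

```python
def kv_bytes_per_token(layers: int, kv_width: int, bytes_per_elem: int) -> int:
    """Bytes of KV cache for one token.

    The factor 2 covers one key vector and one value vector per layer;
    kv_width is the number of elements in each of those vectors.
    """
    return 2 * layers * kv_width * bytes_per_elem


# Values listed under Architecture Details.
LAYERS, HIDDEN_DIM, KV_HEADS, HEAD_DIM = 96, 12_288, 16, 128

# Assumption: full hidden width, 1 byte per element. Matches the listed figure.
full_width = kv_bytes_per_token(LAYERS, HIDDEN_DIM, 1)

# Assumption: grouped-query attention width (KV heads x head dim), BF16.
gqa_bf16 = kv_bytes_per_token(LAYERS, KV_HEADS * HEAD_DIM, 2)

print(full_width)  # 2359296
print(gqa_bf16)    # 786432

# Cache size for a full 2,000,000-token context at the listed per-token cost.
print(f"{2_000_000 * full_width / 1e9:.1f} GB")  # 4718.6 GB
```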

Fits on (single-node)

B200 NVL (pair), INT4
B200 SXM x2, INT4
B100 SXM x2, INT4
GB200 NVL72 (per GPU) x2, INT4
GB300 NVL72 (per GPU) x2, INT4
H100 NVL 94GB (per GPU pair) x2, INT4
Instinct MI300X x2, INT4
Instinct MI325X x2, INT4

GPU Recommendations

B200 NVL (pair): good
BF16 · 4 GPUs · tensorrt-llm
Score: 68/100
Throughput: 140.0 tok/s
Cost/Month: $39,858
Cost/M Tokens: $108.33

Instinct MI325X: good
BF16 · 8 GPUs · vllm
Score: 65/100
Throughput: 140.0 tok/s
Cost/Month: $18,904
Cost/M Tokens: $51.38

B200 SXM: good
BF16 · 8 GPUs · tensorrt-llm
Score: 63/100
Throughput: 140.0 tok/s
Cost/Month: $34,088
Cost/M Tokens: $92.65
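
The Cost/M Tokens figures above can be reproduced from the monthly cost and throughput if the configuration is assumed to run at full throughput for 730 hours per month (8,760 hours per year divided by 12). The 730-hour figure and full utilization are assumptions inferred from the numbers, not stated on the page, and the helper name is illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year / 12


def cost_per_million_tokens(cost_per_month: float,
                            tokens_per_second: float,
                            hours_per_month: float = HOURS_PER_MONTH) -> float:
    """Dollars per million generated tokens at sustained full throughput."""
    tokens_per_month = tokens_per_second * 3600 * hours_per_month
    return cost_per_month / (tokens_per_month / 1e6)


# Monthly costs as listed in the GPU recommendations; throughput is 140.0 tok/s for all.
monthly_cost = {
    "B200 NVL (pair)": 39858,
    "Instinct MI325X": 18904,
    "B200 SXM": 34088,
}

for name, cost in monthly_cost.items():
    print(f"{name}: ${cost_per_million_tokens(cost, 140.0):.2f}")
# B200 NVL (pair): $108.33
# Instinct MI325X: $51.38
# B200 SXM: $92.65
```

Real utilization is usually below 100%, so the effective cost per million tokens for a self-hosted deployment would be higher than these figures.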

API Pricing Comparison

Provider   Input $/M   Output $/M   Badges
google     $1.00       $4.00        Cheapest

Quality Benchmarks

MMLU: 87.0
HumanEval: 68.0
GSM8K: 93.0
MT-Bench: 88.0

Capabilities

Features

Tool Use, Vision, Code, Math, Reasoning, Multilingual, Structured Output

Supported Frameworks

Supported Precisions

BF16 (default)
