Gemini 2.0 Pro
Google · MoE · 600B parameters · 2,000,000-token context
Quality: 88.0
Architecture Details
Type: MoE
Total Parameters: 600B
Active Parameters: 150B
Layers: 96
Hidden Dimension: 12,288
Attention Heads: 96
KV Heads: 16
Head Dimension: 128
Vocab Size: 256,000
Total Experts: 16
Active Experts: 2
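The listed dimensions can be cross-checked against each other. The following is a minimal sketch using only the figures above; the variable names are illustrative, and reading the 16 KV heads as grouped-query attention is an assumption based on there being fewer KV heads than attention heads.

```python
# Figures copied from the architecture details above.
hidden_dim = 12_288
attention_heads = 96
kv_heads = 16
head_dim = 128
total_params_b = 600   # billions
active_params_b = 150  # billions

# The hidden dimension should equal attention heads x head dimension.
assert attention_heads * head_dim == hidden_dim

# Assumption: grouped-query attention, so several query heads share one KV head.
queries_per_kv_head = attention_heads // kv_heads

# Share of the parameters that is active for each token.
active_fraction = active_params_b / total_params_b

print(queries_per_kv_head, active_fraction)
```

Under these assumptions, six query heads share each KV head and a quarter of the parameters are active per token.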
Memory Requirements
BF16 Weights: 1200.0 GB
FP8 Weights: 600.0 GB
INT4 Weights: 300.0 GB
KV-Cache per Token: 2,359,296 bytes
Activation Estimate: 10.00 GB
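The weight figures follow from parameter count times bytes per parameter, and the KV-cache figure can be reproduced from the architecture numbers. The sketch below is one possible reading, not the page's documented method: the helper names are hypothetical, gigabytes are taken as 10^9 bytes, and the page does not say which head count or value width it used for the KV cache.

```python
# Standard storage size per parameter for each precision, in bytes.
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weight_memory_gb(total_params: float, precision: str) -> float:
    """Weight footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return total_params * BYTES_PER_PARAM[precision] / 1e9


def kv_cache_bytes_per_token(layers: int, heads: int, head_dim: int,
                             bytes_per_value: int) -> int:
    """Keys plus values stored for one token across all layers."""
    return 2 * layers * heads * head_dim * bytes_per_value


total_params = 600e9
weights = {p: weight_memory_gb(total_params, p) for p in BYTES_PER_PARAM}

# Assumption: the listed 2,359,296 bytes matches all 96 attention heads
# at 1 byte per value. The page does not state this.
kv_listed = kv_cache_bytes_per_token(96, 96, 128, 1)

# Alternative: only the 16 KV heads at 2 bytes per value (BF16).
kv_gqa_bf16 = kv_cache_bytes_per_token(96, 16, 128, 2)

print(weights, kv_listed, kv_gqa_bf16)
```

If the cache stores only the 16 KV heads in BF16, the per-token figure would be 786,432 bytes, a third of the listed value, so the listed number is best treated as an upper-end estimate until the page's formula is known.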
Fits on (single-node)
B200 NVL (pair) INT4
B200 SXM x2 INT4
B100 SXM x2 INT4
GB200 NVL72 (per GPU) x2 INT4
GB300 NVL72 (per GPU) x2 INT4
H100 NVL 94GB (per GPU pair) x2 INT4
Instinct MI300X x2 INT4
Instinct MI325X x2 INT4
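A rough way to see why only INT4 appears in the list is to compare the weight plus activation footprint with the combined memory of two units. This is a sketch under stated assumptions: the per-unit capacities come from public vendor specifications rather than from this page, the `fits` helper is hypothetical, and KV cache and runtime overhead are ignored.

```python
# Footprints listed under Memory Requirements, in GB.
WEIGHTS_GB = {"BF16": 1200.0, "FP8": 600.0, "INT4": 300.0}
ACTIVATION_GB = 10.0

# Assumption: memory per unit in GB, taken from public vendor specs.
UNIT_MEMORY_GB = {
    "H100 NVL 94GB (per GPU pair)": 188.0,  # 2 x 94 GB
    "Instinct MI300X": 192.0,
    "Instinct MI325X": 256.0,
}


def fits(precision: str, unit_memory_gb: float, units: int) -> bool:
    """True if weights plus the activation estimate fit in total memory."""
    needed = WEIGHTS_GB[precision] + ACTIVATION_GB
    return units * unit_memory_gb >= needed


# Check every precision against two units of each device.
results = {
    name: {p: fits(p, mem, 2) for p in WEIGHTS_GB}
    for name, mem in UNIT_MEMORY_GB.items()
}

for name, by_precision in results.items():
    print(name, by_precision)
```

With these capacities, two units hold the INT4 footprint (310 GB) for all three devices, while FP8 (610 GB) and BF16 (1,210 GB) do not fit, which agrees with the list showing INT4 only.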
GPU Recommendations
| GPU | Rating | Configuration | Score | Throughput | Cost/Month | Cost/M Tokens |
|---|---|---|---|---|---|---|
| B200 NVL (pair) | good | BF16 · 4 GPUs · tensorrt-llm | 68/100 | 140.0 tok/s | $39,858 | $108.33 |
| Instinct MI325X | good | BF16 · 8 GPUs · vllm | 65/100 | 140.0 tok/s | $18,904 | $51.38 |
| B200 SXM | good | BF16 · 8 GPUs · tensorrt-llm | 63/100 | 140.0 tok/s | $34,088 | $92.65 |
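The Cost/M Tokens figures can be reproduced from Cost/Month and Throughput. The sketch below assumes continuous generation at the listed throughput and a 730-hour month; the 730-hour value is inferred because it reproduces the listed numbers, not because the page states it, and the function name is hypothetical.

```python
# Assumption: hours per month, inferred from the listed figures.
HOURS_PER_MONTH = 730


def cost_per_million_tokens(cost_per_month: float, tokens_per_second: float,
                            hours_per_month: float = HOURS_PER_MONTH) -> float:
    """Monthly cost divided by millions of tokens generated in a month."""
    tokens_per_month = tokens_per_second * 3600 * hours_per_month
    return cost_per_month / (tokens_per_month / 1e6)


# (monthly cost in USD, listed cost per million tokens in USD)
listed = {
    "B200 NVL (pair)": (39858, 108.33),
    "Instinct MI325X": (18904, 51.38),
    "B200 SXM": (34088, 92.65),
}

computed = {
    name: round(cost_per_million_tokens(monthly, 140.0), 2)
    for name, (monthly, _) in listed.items()
}

for name, value in computed.items():
    print(name, value, "listed:", listed[name][1])
```

Because all three configurations list the same 140.0 tok/s, the per-token cost differences come entirely from the monthly price.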
API Pricing Comparison
| Provider | Input $/M | Output $/M | Badges |
|---|---|---|---|
| | $1.00 | $4.00 | Cheapest |
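As a small illustration of how the per-million prices translate into a per-request cost, the sketch below applies the listed $1.00 input and $4.00 output rates. The token counts are a hypothetical workload, and the function name is illustrative.

```python
# Listed API prices in USD per million tokens.
INPUT_PRICE_PER_M = 1.00
OUTPUT_PRICE_PER_M = 4.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-million-token prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1e6


# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
cost = request_cost(10_000, 2_000)
print(f"${cost:.4f}")
```

At these rates, output tokens cost four times as much as input tokens, so long generations dominate the bill even when prompts are large.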
Quality Benchmarks
MMLU: 87.0
HumanEval: 68.0
GSM8K: 93.0
MT-Bench: 88.0
Capabilities
Features
✓ Tool Use
✓ Vision
✓ Code
✓ Math
✓ Reasoning
✓ Multilingual
✓ Structured Output
Supported Frameworks
Supported Precisions
BF16 (default)