# o1

OpenAI · MoE · 200B parameters · 200,000 context

Quality score: 95.0
## Architecture Details

| Spec | Value |
|---|---|
| Type | MoE |
| Total Parameters | 200B |
| Active Parameters | 50B |
| Layers | 80 |
| Hidden Dimension | 10,240 |
| Attention Heads | 80 |
| KV Heads | 10 |
| Head Dimension | 128 |
| Vocab Size | 200,000 |
| Total Experts | 16 |
| Active Experts | 2 |
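One quick sanity check on the architecture numbers above: the attention width (heads × head dimension) should equal the hidden dimension, and it does here. A minimal sketch (variable names are ours, not from the page):

```python
# Consistency check on the listed architecture specs:
# attention heads × head dimension should equal the hidden dimension.
HIDDEN_DIM = 10_240
ATTN_HEADS = 80
HEAD_DIM = 128

assert ATTN_HEADS * HEAD_DIM == HIDDEN_DIM  # 80 × 128 = 10,240
```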
## Memory Requirements

| Item | Memory |
|---|---|
| BF16 Weights | 400.0 GB |
| FP8 Weights | 200.0 GB |
| INT4 Weights | 100.0 GB |
| KV-Cache per Token | 204,800 bytes |
| Activation Estimate | 4.00 GB |
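The weight and KV-cache figures above can be reproduced directly from the architecture specs. A minimal sketch, under two assumptions the page does not state: 1 GB = 10⁹ bytes, and the KV-cache holds both K and V at 1 byte per element (FP8), which is what makes the per-token figure come out to 204,800 bytes:

```python
# Reconstructing the memory table from the architecture specs.
# Assumptions (not stated on the page): 1 GB = 1e9 bytes, KV-cache
# stored in FP8 (1 byte per element) for both K and V.
TOTAL_PARAMS = 200e9  # 200B parameters
LAYERS = 80
KV_HEADS = 10
HEAD_DIM = 128

def weight_memory_gb(bytes_per_param: float) -> float:
    """Weight footprint at a given precision, in GB."""
    return TOTAL_PARAMS * bytes_per_param / 1e9

def kv_cache_bytes_per_token(bytes_per_elem: int = 1) -> int:
    """K and V vectors across all layers, per token."""
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * bytes_per_elem

print(weight_memory_gb(2))         # BF16 → 400.0 GB
print(weight_memory_gb(1))         # FP8  → 200.0 GB
print(weight_memory_gb(0.5))       # INT4 → 100.0 GB
print(kv_cache_bytes_per_token())  # → 204800 bytes
```

Note the INT4 figure covers weights only; serving still needs headroom for the KV-cache and activations on top of it.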
## Fits on (single-node)

- B200 SXM (INT4)
- B100 SXM (INT4)
- GB200 NVL72, per GPU (INT4)
- GB300 NVL72, per GPU (INT4)
- H200 SXM (INT4)
- H100 NVL 94GB, per GPU pair (INT4)
- Instinct MI300X (INT4)
- Instinct MI325X (FP8)
## GPU Recommendations

| GPU | Rating | Config | Score | Throughput | Cost/Month | Cost/M Tokens |
|---|---|---|---|---|---|---|
| B200 SXM | optimal | BF16 · 4 GPUs · tensorrt-llm | 93/100 | 280.0 tok/s | $17,044 | $23.16 |
| B100 SXM | optimal | BF16 · 4 GPUs · tensorrt-llm | 93/100 | 280.0 tok/s | $17,082 | $23.21 |
| B200 NVL (pair) | optimal | BF16 · 2 GPUs · tensorrt-llm | 93/100 | 280.0 tok/s | $19,929 | $27.08 |
## API Pricing Comparison

| Provider | Input $/M | Output $/M | Badges |
|---|---|---|---|
| OpenAI | $15.00 | $60.00 | Cheapest |
## Quality Benchmarks

| Benchmark | Score |
|---|---|
| MMLU | 92.3 |
| HumanEval | 83.4 |
| GSM8K | 98.0 |
| MT-Bench | 91.0 |
## Capabilities

✓ Tool Use · ✓ Vision · ✓ Code · ✓ Math · ✓ Reasoning · ✓ Multilingual · ✓ Structured Output
## Supported Precisions

BF16 (default)