Grok-2
xAI · MoE · 314B parameters · 131,072 context
Quality: 87.0
Architecture Details
| Field | Value |
|---|---|
| Type | MoE |
| Total Parameters | 314B |
| Active Parameters | 50B |
| Layers | 64 |
| Hidden Dimension | 8,192 |
| Attention Heads | 64 |
| KV Heads | 8 |
| Head Dimension | 128 |
| Vocab Size | 131,072 |
| Total Experts | 8 |
| Active Experts | 2 |
Memory Requirements
| Precision | Weights |
|---|---|
| BF16 | 628.0 GB |
| FP8 | 314.0 GB |
| INT4 | 157.0 GB |

KV-Cache per Token: 262,144 bytes
Activation Estimate: 3.00 GB
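The memory figures above follow directly from the architecture table. A minimal sketch, assuming BF16 = 2, FP8 = 1, and INT4 = 0.5 bytes per parameter, BF16 K/V tensors, and decimal GB (1 GB = 1e9 bytes) to match the table:

```python
# Derive the weight and KV-cache numbers from the architecture table.
# Byte-per-parameter counts and BF16 K/V cache are assumptions.

TOTAL_PARAMS = 314e9  # from the architecture table
LAYERS = 64
KV_HEADS = 8          # grouped-query attention: 8 KV heads, not 64
HEAD_DIM = 128

def weight_gb(bytes_per_param: float) -> float:
    """Weight memory in decimal GB, matching the table's convention."""
    return TOTAL_PARAMS * bytes_per_param / 1e9

def kv_cache_bytes_per_token(bytes_per_elem: int = 2) -> int:
    """Two tensors (K and V) per layer, each kv_heads x head_dim."""
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * bytes_per_elem

print(weight_gb(2))                # BF16 -> 628.0
print(weight_gb(1))                # FP8  -> 314.0
print(weight_gb(0.5))              # INT4 -> 157.0
print(kv_cache_bytes_per_token())  # -> 262144
```

Note that the KV-cache formula uses the 8 KV heads, not the 64 attention heads; that 8x reduction is what keeps the per-token cache at 256 KiB.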
Fits on (single-node)
INT4: B100 SXM · GB200 NVL72 (per GPU) · GB300 NVL72 (per GPU) · H100 NVL 94GB (per GPU pair) · Instinct MI300X · Instinct MI325X · B200 NVL (pair) · B300
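The fit test behind this list can be sketched as: INT4 weights (157 GB) plus the activation estimate and some KV-cache headroom must stay under the accelerator's memory. The HBM capacities and the 10 GB headroom below are illustrative assumptions, not figures from this page:

```python
# Single-node fit check: quantized weights + activations + KV headroom
# must fit in GPU memory. Capacities below are assumed for illustration.

INT4_WEIGHTS_GB = 157.0
ACTIVATIONS_GB = 3.0

GPU_MEM_GB = {                    # assumed HBM capacities
    "Instinct MI300X": 192,
    "Instinct MI325X": 256,
    "H100 NVL 94GB (per GPU pair)": 188,
}

def fits(gpu: str, kv_headroom_gb: float = 10.0) -> bool:
    need = INT4_WEIGHTS_GB + ACTIVATIONS_GB + kv_headroom_gb
    return GPU_MEM_GB[gpu] >= need

for gpu in GPU_MEM_GB:
    print(gpu, fits(gpu))  # all True under these assumptions
```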
GPU Recommendations
| GPU | Tier | Config | Score | Throughput | Cost/Month | Cost/M Tokens |
|---|---|---|---|---|---|---|
| H200 SXM | optimal | FP8 · 4 GPUs · tensorrt-llm | 95/100 | 280.0 tok/s | $10,211 | $13.88 |
| B100 SXM | optimal | FP8 · 2 GPUs · tensorrt-llm | 93/100 | 280.0 tok/s | $8,541 | $11.61 |
| GB200 NVL72 (per GPU) | optimal | FP8 · 2 GPUs · tensorrt-llm | 93/100 | 280.0 tok/s | $12,337 | $16.77 |
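The Cost/M Tokens column can be recovered from Cost/Month and throughput. A sketch, assuming a 730-hour month (24 × 365 / 12) and 100% utilization, which reproduces the figures shown:

```python
# Cost per million tokens = monthly cost / millions of tokens per month.
# The 730-hour month and full utilization are assumptions.

HOURS_PER_MONTH = 730  # assumed averaging convention

def cost_per_m_tokens(cost_per_month: float, tok_per_s: float) -> float:
    tokens_per_month = tok_per_s * 3600 * HOURS_PER_MONTH
    return cost_per_month / (tokens_per_month / 1e6)

print(round(cost_per_m_tokens(10211, 280.0), 2))  # H200 SXM   -> 13.88
print(round(cost_per_m_tokens(8541, 280.0), 2))   # B100 SXM   -> 11.61
print(round(cost_per_m_tokens(12337, 280.0), 2))  # GB200 NVL72 -> 16.77
```

At 280 tok/s this works out to about 736M tokens per month, so each extra dollar of monthly cost adds roughly $0.0014 per million tokens.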
API Pricing Comparison
| Provider | Input $/M | Output $/M | Badges |
|---|---|---|---|
| xAI | $2.00 | $10.00 | Cheapest |
Quality Benchmarks
| Benchmark | Score |
|---|---|
| MMLU | 87.5 |
| HumanEval | 64.0 |
| GSM8K | 93.0 |
| MT-Bench | 88.0 |
Capabilities
Features
✓ Tool Use · ✓ Vision · ✓ Code · ✓ Math · ✓ Reasoning · ✓ Multilingual · ✓ Structured Output
Supported Frameworks
vllm
Supported Precisions
BF16 (default) · FP8