Kimi K2.5
Moonshot AI · MoE · 1000B parameters · 131,072-token context
Quality: 50.0
Architecture Details
Type: MoE
Total Parameters: 1000B
Active Parameters: 32B
Layers: 64
Hidden Dimension: 6,144
Attention Heads: 48
KV Heads: 8
Head Dimension: 128
Vocab Size: 131,072
Total Experts: 256
Active Experts: 8
Memory Requirements
BF16 Weights: 2000.0 GB
FP8 Weights: 1000.0 GB
INT4 Weights: 500.0 GB
KV-Cache per Token: 262,144 bytes
Activation Estimate: 3.00 GB
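The weight and KV-cache figures above can be reproduced from the architecture numbers on this card. The sketch below is a rough check, not the site's actual method: it assumes decimal gigabytes (1 GB = 1e9 bytes), a 2-byte (BF16) KV cache, and a standard grouped-query layout that stores one key and one value vector per KV head per layer. The function names are illustrative.

```python
# Figures copied from the card; everything else is an assumption for illustration.
TOTAL_PARAMS = 1000e9   # 1000B total parameters
LAYERS = 64
KV_HEADS = 8
HEAD_DIM = 128
CONTEXT = 131_072       # maximum context length in tokens

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weight_gb(precision: str) -> float:
    """Weight footprint in decimal gigabytes (assumes 1 GB = 1e9 bytes)."""
    return TOTAL_PARAMS * BYTES_PER_PARAM[precision] / 1e9


def kv_bytes_per_token(bytes_per_element: int = 2) -> int:
    """Keys plus values for one token across all layers (assumes a BF16 cache)."""
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * bytes_per_element


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"{precision} weights: {weight_gb(precision):.1f} GB")
    per_token = kv_bytes_per_token()
    print(f"KV cache per token: {per_token} bytes")
    # One sequence filling the whole context window.
    print(f"KV cache at full context: {per_token * CONTEXT / 2**30:.0f} GiB")
```

Under these assumptions a single sequence at the full 131,072-token context needs about 32 GiB of KV cache on top of the weights, which is worth budgeting for when reading the hardware lists below.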
Fits on (single-node)
B200 NVL (pair) · x2 · INT4
Instinct MI325X · x3 · INT4
B300 · x3 · INT4
Groq LPU · x3 · INT4
B200 SXM · x4 · INT4
B100 SXM · x4 · INT4
GB200 NVL72 (per GPU) · x4 · INT4
GB300 NVL72 (per GPU) · x4 · INT4
GPU Recommendations
B200 NVL (pair) · optimal
FP8 · 4 GPUs · tensorrt-llm
Score: 98/100
Throughput: 140.0 tok/s
Cost/Month: $39,858
Cost/M Tokens: $108.33
B200 SXM · optimal
FP8 · 8 GPUs · tensorrt-llm
Score: 93/100
Throughput: 140.0 tok/s
Cost/Month: $34,088
Cost/M Tokens: $92.65
B100 SXM · optimal
FP8 · 8 GPUs · tensorrt-llm
Score: 93/100
Throughput: 140.0 tok/s
Cost/Month: $34,164
Cost/M Tokens: $92.86
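The Cost/M Tokens figures above appear to follow from the monthly cost and the listed throughput. The sketch below reproduces them under two assumptions that the card does not state: a 730-hour month (8,760 hours / 12) and the deployment sustaining the listed throughput for the entire month. The function name is illustrative.

```python
# Assumption: the card bills a 730-hour month (8760 hours / 12 months).
HOURS_PER_MONTH = 730


def cost_per_million_tokens(monthly_cost_usd: float, tokens_per_second: float) -> float:
    """Dollars per million generated tokens at full, sustained utilisation."""
    tokens_per_month = tokens_per_second * 3600 * HOURS_PER_MONTH
    return monthly_cost_usd / (tokens_per_month / 1e6)


if __name__ == "__main__":
    # (name, monthly cost in USD, throughput in tok/s) as listed on the card.
    configs = [
        ("B200 NVL (pair)", 39858, 140.0),
        ("B200 SXM", 34088, 140.0),
        ("B100 SXM", 34164, 140.0),
    ]
    for name, monthly, tps in configs:
        print(f"{name}: ${cost_per_million_tokens(monthly, tps):.2f} per M tokens")
```

Because the calculation assumes 100% utilisation, real per-token costs will be higher whenever the hardware sits partly idle.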
API Pricing Comparison
| Provider | Input $/M | Output $/M | Badges |
|---|---|---|---|
| moonshot | $0.60 | $2.40 | Cheapest |
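To turn the per-million prices in the table into a per-request figure, input and output tokens are priced separately and summed. This is a minimal sketch using the listed moonshot rates; the function name and the example token counts are illustrative, and it ignores any caching discounts or tiered pricing a provider might apply.

```python
# Listed API rates from the table above, in USD per million tokens.
INPUT_USD_PER_M = 0.60
OUTPUT_USD_PER_M = 2.40


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request, pricing input and output tokens separately."""
    return (input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M) / 1e6


if __name__ == "__main__":
    # Hypothetical request: a 10,000-token prompt with a 2,000-token reply.
    print(f"${request_cost_usd(10_000, 2_000):.4f}")
```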
Capabilities
Features
✓ Tool Use · ✗ Vision · ✓ Code · ✓ Math · ✓ Reasoning · ✓ Multilingual · ✓ Structured Output
Supported Frameworks
vllm · sglang
Supported Precisions
BF16 · FP8 (default) · INT4