Kimi K2.5

Moonshot AI · MoE · 1000B parameters · 131,072-token context

Quality
50.0

Architecture Details

Type: MoE
Total Parameters: 1000B
Active Parameters: 32B
Layers: 64
Hidden Dimension: 6,144
Attention Heads: 48
KV Heads: 8
Head Dimension: 128
Vocab Size: 131,072
Total Experts: 256
Active Experts: 8
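
The listed attention and expert figures can be cross-checked against each other. The sketch below takes the numbers on this page at face value and assumes a conventional grouped-query attention layout (query heads shared evenly across KV heads); the function names are illustrative and do not come from any framework.

```python
def attention_shape(num_heads: int, num_kv_heads: int, head_dim: int) -> dict:
    """Derive projection widths from the listed head counts.

    Assumes grouped-query attention, where each KV head serves an
    equal-sized group of query heads.
    """
    if num_heads % num_kv_heads != 0:
        raise ValueError("query heads must divide evenly across KV heads")
    return {
        "q_width": num_heads * head_dim,        # total query projection width
        "kv_width": num_kv_heads * head_dim,    # width of each of K and V
        "queries_per_kv_head": num_heads // num_kv_heads,
    }


def active_fraction(active: float, total: float) -> float:
    """Share of a resource (parameters or experts) used per token."""
    return active / total


shape = attention_shape(num_heads=48, num_kv_heads=8, head_dim=128)
print(shape)
print(f"active parameters: {active_fraction(32, 1000):.1%}")
print(f"active experts:    {active_fraction(8, 256):.1%}")
```

With the listed values, 48 heads of dimension 128 give a query width of 6,144, which matches the hidden dimension, and roughly 3% of parameters and experts are active per token.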

Memory Requirements

BF16 Weights: 2000.0 GB
FP8 Weights: 1000.0 GB
INT4 Weights: 500.0 GB
KV-Cache per Token: 262,144 bytes
Activation Estimate: 3.00 GB
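
The weight and KV-cache figures above can be reproduced from the architecture details with simple arithmetic. This is a minimal sketch under stated assumptions: decimal gigabytes (1 GB = 10^9 bytes), 2/1/0.5 bytes per parameter for BF16/FP8/INT4, and a BF16 KV cache that stores one key and one value vector per KV head per layer. The helper names are hypothetical.

```python
# Assumed storage cost per parameter for each precision.
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weight_memory_gb(total_params_billions: float, precision: str) -> float:
    """Weight footprint in decimal GB (billions of params x bytes per param)."""
    return total_params_billions * BYTES_PER_PARAM[precision]


def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    """KV-cache bytes for one token: K and V for every KV head in every layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def kv_cache_gb(tokens: int, bytes_per_token: int) -> float:
    """KV-cache footprint in decimal GB for a given number of cached tokens."""
    return tokens * bytes_per_token / 1e9


for precision in ("BF16", "FP8", "INT4"):
    print(precision, weight_memory_gb(1000, precision), "GB")

per_token = kv_cache_bytes_per_token(layers=64, kv_heads=8, head_dim=128)
print("KV cache per token:", per_token, "bytes")
print("KV cache at full context:", round(kv_cache_gb(131_072, per_token), 2), "GB")
```

Under these assumptions a single sequence at the full 131,072-token context needs about 34 GB of KV cache on top of the weights, which is worth budgeting for when reading the GPU counts below.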

Fits on (single-node)

B200 NVL (pair) x2 · INT4
Instinct MI325X x3 · INT4
B300 x3 · INT4
Groq LPU x3 · INT4
B200 SXM x4 · INT4
B100 SXM x4 · INT4
GB200 NVL72 (per GPU) x4 · INT4
GB300 NVL72 (per GPU) x4 · INT4

GPU Recommendations

B200 NVL (pair) (optimal)
FP8 · 4 GPUs · tensorrt-llm
Score: 98/100
Throughput: 140.0 tok/s
Cost/Month: $39,858
Cost/M Tokens: $108.33

B200 SXM (optimal)
FP8 · 8 GPUs · tensorrt-llm
Score: 93/100
Throughput: 140.0 tok/s
Cost/Month: $34,088
Cost/M Tokens: $92.65

B100 SXM (optimal)
FP8 · 8 GPUs · tensorrt-llm
Score: 93/100
Throughput: 140.0 tok/s
Cost/Month: $34,164
Cost/M Tokens: $92.86
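
The Cost/M Tokens figures in the recommendations above are consistent with dividing the monthly cost by the tokens generated in a month at the listed throughput. The sketch below assumes a 730-hour month (365 × 24 / 12) and 100% utilization at a constant 140 tok/s. These assumptions reproduce the listed numbers, but the page does not state its method, so treat this as a plausible reconstruction.

```python
HOURS_PER_MONTH = 730  # assumption: 365 days * 24 hours / 12 months


def cost_per_million_tokens(cost_per_month: float, tokens_per_second: float) -> float:
    """Dollars per million generated tokens at full, constant utilization."""
    tokens_per_month = tokens_per_second * HOURS_PER_MONTH * 3600
    return cost_per_month / (tokens_per_month / 1e6)


configs = {
    "B200 NVL (pair)": 39858,
    "B200 SXM": 34088,
    "B100 SXM": 34164,
}
for name, monthly_cost in configs.items():
    print(f"{name}: ${cost_per_million_tokens(monthly_cost, 140.0):.2f} per M tokens")
```

Real deployments rarely sustain full utilization, so the effective cost per million tokens would be higher than these figures in proportion to idle time.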

API Pricing Comparison

Provider   Input $/M   Output $/M   Badges
moonshot   $0.60       $2.40        Cheapest
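
To turn the per-million-token prices above into a per-request estimate, input and output tokens are billed separately. This is a minimal sketch using the listed moonshot prices as defaults; the function name is hypothetical, and any caching discounts or minimum charges a provider may apply are not modeled.

```python
def api_cost_usd(input_tokens: int, output_tokens: int,
                 input_per_m: float = 0.60, output_per_m: float = 2.40) -> float:
    """Estimated request cost in dollars from per-million-token prices."""
    return (input_tokens / 1e6) * input_per_m + (output_tokens / 1e6) * output_per_m


# Example: a 10,000-token prompt with a 2,000-token completion.
print(f"${api_cost_usd(10_000, 2_000):.4f}")
```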

Capabilities

Features

Tool Use · Vision · Code · Math · Reasoning · Multilingual · Structured Output

Supported Frameworks

vllm · sglang

Supported Precisions

BF16 · FP8 (default) · INT4