YaLM 100B
Yandex · dense · 100B parameters · 2,048-token context
Quality: 50.0
Architecture Details
Type: Dense
Total Parameters: 100B
Active Parameters: 100B
Layers: 80
Hidden Dimension: 10,240
Attention Heads: 80
KV Heads: 80
Head Dimension: 128
Vocab Size: 128,000
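As a rough cross-check of the figures above, the parameter count can be approximated from the layer count, hidden dimension, and vocabulary size. This is a minimal sketch that assumes a conventional GPT-style decoder block (four attention projections plus an MLP with 4x expansion) and ignores biases and layer norms; the page does not state the block layout, and `estimate_dense_params` is a hypothetical helper, not part of any library.

```python
def estimate_dense_params(layers: int, hidden: int, vocab: int) -> int:
    """Rough parameter count for a GPT-style decoder (assumed layout)."""
    attention = 4 * hidden * hidden   # Q, K, V and output projections
    mlp = 8 * hidden * hidden         # two matrices with an assumed 4x expansion
    embeddings = vocab * hidden       # token embedding table
    return layers * (attention + mlp) + embeddings


# Values taken from the Architecture Details listing.
heads, head_dim, hidden = 80, 128, 10_240
assert heads * head_dim == hidden     # head layout is consistent with the hidden size

total = estimate_dense_params(layers=80, hidden=hidden, vocab=128_000)
print(f"{total / 1e9:.1f}B parameters")
```

Under these assumptions the estimate lands close to the listed 100B total.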
Memory Requirements
BF16 Weights: 200.0 GB
FP8 Weights: 100.0 GB
INT4 Weights: 50.0 GB
KV-Cache per Token: 3,276,800 bytes
Activation Estimate: 3.50 GB
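The weight and KV-cache figures follow from simple arithmetic on the architecture values. The sketch below assumes decimal gigabytes (1 GB = 10^9 bytes) and a 2-byte (BF16) KV cache, which is the combination that reproduces the numbers listed here; the function names are illustrative only.

```python
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weight_gb(params: float, precision: str) -> float:
    """Weight footprint in decimal GB for a given storage precision."""
    return params * BYTES_PER_PARAM[precision] / 1e9


def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    """One key and one value vector per KV head, per layer, per token."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


for precision in ("BF16", "FP8", "INT4"):
    print(f"{precision}: {weight_gb(100e9, precision):.1f} GB")

per_token = kv_cache_bytes_per_token(layers=80, kv_heads=80, head_dim=128)
print(f"KV cache per token: {per_token:,} bytes")

# Cache for a single sequence at the full 2,048-token context.
full_context_gb = per_token * 2_048 / 1e9
print(f"KV cache at 2,048 tokens: {full_context_gb:.2f} GB")
```

Because KV Heads equals Attention Heads (80), every head stores its own keys and values, which is why the per-token cache is comparatively large.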
Fits on (single-node)
B200 SXM: FP8
B100 SXM: FP8
GB200 NVL72 (per GPU): FP8
GB300 NVL72 (per GPU): FP8
H200 SXM: FP8
H100 SXM: INT4
H100 PCIe: INT4
H100 NVL: INT4
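One plausible reading of this list is that each GPU is paired with the highest precision whose total footprint fits in its memory. The sketch below is an assumption about that rule, not the page's documented method: it sums weights, the activation estimate, and a BF16 KV cache for one full-context sequence, and it uses nominal HBM capacities of 80 GB (H100 SXM) and 141 GB (H200 SXM).

```python
# Figures from the Memory Requirements listing.
WEIGHTS_GB = {"BF16": 200.0, "FP8": 100.0, "INT4": 50.0}
ACTIVATION_GB = 3.50
KV_BYTES_PER_TOKEN = 3_276_800
CONTEXT_TOKENS = 2_048

# Assumed nominal memory capacities in GB (not stated on the page).
GPU_MEMORY_GB = {"H100 SXM": 80, "H200 SXM": 141}


def best_precision(gpu_memory_gb: float):
    """Return the highest precision that fits, or None if nothing fits."""
    kv_gb = KV_BYTES_PER_TOKEN * CONTEXT_TOKENS / 1e9
    for precision in ("BF16", "FP8", "INT4"):   # ordered highest to lowest
        needed_gb = WEIGHTS_GB[precision] + ACTIVATION_GB + kv_gb
        if needed_gb <= gpu_memory_gb:
            return precision
    return None


for gpu, memory_gb in GPU_MEMORY_GB.items():
    print(f"{gpu}: {best_precision(memory_gb)}")
```

With these assumptions the sketch agrees with the list for both GPUs: FP8 on H200 SXM and INT4 on H100 SXM.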
GPU Recommendations
B200 SXM (optimal)
FP8 · 1 GPU · tensorrt-llm
Score: 100/100
Throughput: 280.0 tok/s
Cost/Month: $4,261
Cost/M Tokens: $5.79
B100 SXM (optimal)
FP8 · 1 GPU · tensorrt-llm
Score: 100/100
Throughput: 280.0 tok/s
Cost/Month: $4,271
Cost/M Tokens: $5.80
GB200 NVL72 (per GPU) (optimal)
FP8 · 1 GPU · tensorrt-llm
Score: 100/100
Throughput: 280.0 tok/s
Cost/Month: $6,169
Cost/M Tokens: $8.38
API Pricing Comparison
No API pricing data available for this model.
Capabilities
Features
✗ Tool Use · ✗ Vision · ✗ Code · ✗ Math · ✗ Reasoning · ✓ Multilingual · ✗ Structured Output
Supported Frameworks
vllm · tgi
Supported Precisions
BF16 (default) · FP8 · INT4