Falcon 180B
TII · dense · 180B parameters · 2,048-token context
Quality: 60.0
Architecture Details
Type: DENSE
Total Parameters: 180B
Active Parameters: 180B
Layers: 80
Hidden Dimension: 14,848
Attention Heads: 232
KV Heads: 8
Head Dimension: 64
Vocab Size: 65,024
Memory Requirements
BF16 Weights: 360.0 GB
FP8 Weights: 180.0 GB
INT4 Weights: 90.0 GB
KV-Cache per Token: 163,840 bytes
Activation Estimate: 4.00 GB
Fits on (single-node)
B200 SXM: INT4
B100 SXM: INT4
GB200 NVL72 (per GPU): INT4
GB300 NVL72 (per GPU): INT4
H200 SXM: INT4
H100 NVL 94GB (per GPU pair): INT4
Instinct MI300X: INT4
Instinct MI325X: FP8
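
The weight and KV-cache figures above can be reproduced from the architecture values with simple arithmetic. The sketch below is illustrative only and is not the page's actual calculator: the function names are hypothetical, and it assumes 1 GB = 10^9 bytes, a 2-byte (BF16) KV cache, and no overhead for quantization scales or runtime buffers.

```python
# Illustrative sketch; function names are hypothetical, not from any library.

def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Weight footprint in decimal GB (1 GB = 1e9 bytes), ignoring overhead."""
    return params * bits_per_param / 8 / 1e9


def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    """KV-cache bytes per token: one K and one V vector per layer per KV head."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


PARAMS = 180e9  # total parameters listed for Falcon 180B

for name, bits in [("BF16", 16), ("FP8", 8), ("INT4", 4)]:
    print(f"{name} weights: {weight_memory_gb(PARAMS, bits):.1f} GB")

per_token = kv_cache_bytes_per_token(layers=80, kv_heads=8, head_dim=64)
print(f"KV cache per token: {per_token} bytes")
print(f"KV cache for a 2,048-token sequence: {per_token * 2048 / 2**20:.0f} MiB")
```

With the values listed on this page, the sketch gives 360.0, 180.0, and 90.0 GB for the three precisions and 163,840 bytes of KV cache per token, so one full 2,048-token sequence would need about 320 MiB of cache under these assumptions.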
GPU Recommendations
| GPU | Rating | Configuration | Score | Throughput | Cost/Month | Cost/M Tokens |
|---|---|---|---|---|---|---|
| B200 SXM | optimal | FP8 · 2 GPUs · tensorrt-llm | 98/100 | 280.0 tok/s | $8,522 | $11.58 |
| B100 SXM | optimal | FP8 · 2 GPUs · tensorrt-llm | 98/100 | 280.0 tok/s | $8,541 | $11.61 |
| H200 SXM | optimal | FP8 · 2 GPUs · tensorrt-llm | 95/100 | 280.0 tok/s | $5,106 | $6.94 |
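
The Cost/M Tokens values appear to be the monthly cost divided by the number of tokens generated in a month at the listed throughput. The sketch below reproduces the listed figures under two assumptions that the page does not state: a 730-hour month (8,760 hours per year divided by 12) and continuous generation at the listed 280.0 tok/s. The function name is hypothetical.

```python
# Illustrative sketch; the 730-hour month and 100% utilization are assumptions.
HOURS_PER_MONTH = 730  # 8,760 hours per year / 12


def cost_per_million_tokens(cost_per_month: float, tokens_per_second: float) -> float:
    """Dollars per one million generated tokens at sustained throughput."""
    tokens_per_month = tokens_per_second * 3600 * HOURS_PER_MONTH
    return cost_per_month / tokens_per_month * 1e6


for gpu, monthly in [("B200 SXM", 8522), ("B100 SXM", 8541), ("H200 SXM", 5106)]:
    cost = cost_per_million_tokens(monthly, 280.0)
    print(f"{gpu}: ${cost:.2f} per million tokens")
```

Under these assumptions the three configurations come out at $11.58, $11.61, and $6.94 per million tokens. Real deployments rarely sustain full utilization, so the effective cost per token would be higher.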
API Pricing Comparison
| Provider | Input $/M | Output $/M | Badges |
|---|---|---|---|
| tii | $2.40 | $2.40 | Cheapest |
Quality Benchmarks
MMLU: 68.6
HumanEval: 33.0
GSM8K: 55.0
MT-Bench: 72.0
Capabilities
Features
✗ Tool Use
✗ Vision
✓ Code
✗ Math
✗ Reasoning
✓ Multilingual
✗ Structured Output
Supported Frameworks
vllm, sglang, tgi, tensorrt-llm
Supported Precisions
BF16 (default), FP8, INT4