Sources: GPU Pricing, API Token Pricing, Model Registry

Command R+

Cohere · dense · 104B parameters · 131,072-token context

Quality

78.0

Architecture Details

Type: dense

Total Parameters: 104B

Active Parameters: 104B

Layers: 64

Hidden Dimension: 12,288

Attention Heads: 96

KV Heads: 96

Head Dimension: 128

Vocab Size: 256,000

Memory Requirements

BF16 Weights

208.0 GB

FP8 Weights

104.0 GB

INT4 Weights

52.0 GB
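
The three weight figures above follow from the parameter count multiplied by the storage size of one parameter. A minimal sketch, assuming decimal gigabytes (1 GB = 1e9 bytes) and 2, 1, and 0.5 bytes per parameter for BF16, FP8, and INT4; the function name is illustrative and quantization metadata such as scales is ignored:

```python
# Assumed storage size per parameter, in bytes, for each listed precision.
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return the weight footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    total_params = 104e9  # total parameters listed under Architecture Details
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weight_memory_gb(total_params, precision):.1f} GB")
```

With 104B parameters this reproduces 208.0 GB, 104.0 GB, and 52.0 GB.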

KV-Cache per Token: 3,145,728 bytes

Activation Estimate: 3.00 GB
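
The listed KV-cache figure is consistent with storing one key and one value vector per KV head in every layer at 2 bytes per element. A small sketch under that assumption (a BF16 cache); the function names are illustrative, and the inputs are the layer count, KV head count, and head dimension as listed on this page:

```python
def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    """Bytes of KV cache one token occupies across all layers.

    The leading factor of 2 covers the separate key and value tensors.
    bytes_per_value=2 assumes a BF16 cache.
    """
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def kv_cache_gb(tokens: int, bytes_per_token: int) -> float:
    """KV cache size in decimal gigabytes for a given number of cached tokens."""
    return tokens * bytes_per_token / 1e9


if __name__ == "__main__":
    per_token = kv_cache_bytes_per_token(layers=64, kv_heads=96, head_dim=128)
    print(per_token)  # 3145728
    print(f"{kv_cache_gb(4096, per_token):.2f} GB for a 4,096-token sequence")
```

Because the cache grows linearly with sequence length, the per-token figure matters more than the weights once many long sequences are held in memory at the same time.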

Fits on (single-node)

B200 SXM FP8 · B100 SXM FP8 · GB200 NVL72 (per GPU) FP8 · GB300 NVL72 (per GPU) FP8 · H200 SXM FP8 · H100 SXM INT4 · H100 PCIe INT4 · H100 NVL INT4
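
One plausible way to arrive at a list like the one above is to pick, for each GPU, the highest precision whose weights plus the activation estimate fit in device memory. The page does not state its rule, so the sketch below is an assumption: it ignores KV cache and runtime overhead, and the GPU capacities (80 GB for H100 SXM, 141 GB for H200 SXM) are nominal values that do not come from this page.

```python
from typing import Optional

# Weight sizes and activation estimate as listed under Memory Requirements.
WEIGHTS_GB = {"BF16": 208.0, "FP8": 104.0, "INT4": 52.0}
ACTIVATION_GB = 3.0

# Assumed nominal memory capacities in GB; not taken from this page.
GPU_MEMORY_GB = {"H100 SXM": 80.0, "H200 SXM": 141.0}


def best_precision(gpu_memory_gb: float) -> Optional[str]:
    """Return the highest precision that fits on one GPU, or None if none fits."""
    for precision in ("BF16", "FP8", "INT4"):  # ordered from highest to lowest
        if WEIGHTS_GB[precision] + ACTIVATION_GB <= gpu_memory_gb:
            return precision
    return None


if __name__ == "__main__":
    for gpu, capacity in GPU_MEMORY_GB.items():
        print(gpu, best_precision(capacity))
```

Under these assumptions the sketch selects INT4 for H100 SXM and FP8 for H200 SXM, which agrees with the entries listed for those two GPUs.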

GPU Recommendations

B200 SXM · optimal

FP8 · 1 GPU · tensorrt-llm

Score: 100/100

Throughput

280.0 tok/s

Cost/Month

$4,261

Cost/M Tokens

$5.79

B100 SXM · optimal

FP8 · 1 GPU · tensorrt-llm

Score: 100/100

Throughput

280.0 tok/s

Cost/Month

$4,271

Cost/M Tokens

$5.80

GB200 NVL72 (per GPU) · optimal

FP8 · 1 GPU · tensorrt-llm

Score: 100/100

Throughput

280.0 tok/s

Cost/Month

$6,169

Cost/M Tokens

$8.38
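
The Cost/M Tokens values in the three recommendations can be reproduced from Cost/Month and Throughput. The sketch below assumes a 730-hour month (8,760 hours per year divided by 12) and full utilization at the listed throughput; both are assumptions that happen to match the listed numbers, not a documented formula.

```python
HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year / 12


def cost_per_million_tokens(monthly_cost_usd: float, tokens_per_second: float) -> float:
    """Dollars per million generated tokens at 100% utilization."""
    tokens_per_month = tokens_per_second * HOURS_PER_MONTH * 3600
    return monthly_cost_usd / (tokens_per_month / 1e6)


if __name__ == "__main__":
    configs = {
        "B200 SXM": 4261,
        "B100 SXM": 4271,
        "GB200 NVL72 (per GPU)": 6169,
    }
    for gpu, monthly_cost in configs.items():
        print(f"{gpu}: ${cost_per_million_tokens(monthly_cost, 280.0):.2f}")
```

At 280 tok/s a GPU produces roughly 735.8 million tokens per month, so any idle time raises the effective per-token cost proportionally.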

API Pricing Comparison

Provider	Input $/M	Output $/M	Badges
together	$2.00	$2.00	Cheapest
cohere	$2.50	$10.00
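
Because input and output tokens are priced separately, which provider is cheaper in practice depends on the mix of a workload. A minimal sketch using the prices from the table; the 3M-input, 1M-output workload is a hypothetical example, not data from this page:

```python
# (input $/M tokens, output $/M tokens) from the API pricing table.
PRICES = {
    "together": (2.00, 2.00),
    "cohere": (2.50, 10.00),
}


def api_cost_usd(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Total API cost in dollars for a given number of input and output tokens."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1e6


if __name__ == "__main__":
    # Hypothetical workload: 3M input tokens and 1M output tokens.
    for provider in PRICES:
        print(f"{provider}: ${api_cost_usd(provider, 3_000_000, 1_000_000):.2f}")
```

For this workload the sketch gives $8.00 for together and $17.50 for cohere; the gap widens as the share of output tokens grows.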

Quality Benchmarks

MMLU

80.0

HumanEval

50.0

GSM8K

88.0

MT-Bench

83.0

Capabilities

Features

✓ Tool Use · ✗ Vision · ✓ Code · ✗ Math · ✗ Reasoning · ✓ Multilingual · ✓ Structured Output

Supported Frameworks

vllm · sglang · tgi · tensorrt-llm

Supported Precisions

BF16 (default) · FP8 · INT4

Similar Models

Command R

35B params · dense

Quality: 68

from $0.50/M

Command R (August 2024)

35B params · dense

Quality: 50

from $0.60/M

Yi-Large

102.6B params · moe

Quality: 74

from $3.00/M

Inflection 3

100B params · dense

Quality: 74

from $15.00/M

YaLM 100B

100B params · dense

Quality: 50