Mixtral 8x22B
Mistral AI · MoE · 141B parameters · 65,536-token context
Quality: 73.0
Architecture Details
| Attribute | Value |
|---|---|
| Type | MoE |
| Total Parameters | 141B |
| Active Parameters | 39B |
| Layers | 56 |
| Hidden Dimension | 6,144 |
| Attention Heads | 48 |
| KV Heads | 8 |
| Head Dimension | 128 |
| Vocab Size | 32,768 |
| Total Experts | 8 |
| Active Experts | 2 |
Memory Requirements
| Quantity | Size |
|---|---|
| BF16 Weights | 282.0 GB |
| FP8 Weights | 141.0 GB |
| INT4 Weights | 70.5 GB |
| KV Cache per Token | 229,376 bytes |
| Activation Estimate | 2.50 GB |
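These figures follow directly from the architecture table. A minimal sketch of the arithmetic, assuming decimal GB (1 GB = 10⁹ bytes) and a BF16 KV cache, which is what the numbers imply:

```python
# Back-of-the-envelope memory math for Mixtral 8x22B.
# All sizes use decimal GB (1 GB = 1e9 bytes), matching the table above.

total_params = 141e9

# Weight footprint = parameter count x bytes per parameter.
for name, bytes_per_param in [("BF16", 2), ("FP8", 1), ("INT4", 0.5)]:
    print(f"{name}: {total_params * bytes_per_param / 1e9:.1f} GB")
# BF16: 282.0 GB, FP8: 141.0 GB, INT4: 70.5 GB

# KV cache per token: K and V vectors for every layer at 2 bytes each,
# sized by the 8 KV heads (grouped-query attention), not the 48 query heads.
layers, kv_heads, head_dim, kv_bytes = 56, 8, 128, 2
kv_per_token = 2 * layers * kv_heads * head_dim * kv_bytes
print(kv_per_token)  # 229376 bytes

# At the full 65,536-token context, one sequence's cache costs:
print(f"{kv_per_token * 65536 / 1e9:.1f} GB")  # ~15.0 GB
```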
Fits on (single-node)
| GPU | Precision |
|---|---|
| B200 SXM | FP8 |
| B100 SXM | FP8 |
| GB200 NVL72 (per GPU) | FP8 |
| GB300 NVL72 (per GPU) | FP8 |
| H200 SXM | INT4 |
| H100 NVL | INT4 |
| H20 | INT4 |
| H100 NVL 94GB (per GPU pair) | FP8 |
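A rough way to reproduce entries in this list is to check quantized weights plus a full-context KV cache and the activation estimate against each GPU's memory. The capacities below are the commonly quoted HBM sizes for a few of these parts, not figures from this page, so treat this as an illustrative sketch:

```python
# Quick single-GPU feasibility check: do the quantized weights plus a
# full-context KV cache and activations fit in GPU memory?
# GPU capacities are assumed (commonly quoted HBM sizes), not from this page.
GPU_MEMORY_GB = {
    "B200 SXM": 192,   # assumed 192 GB HBM3e
    "H200 SXM": 141,   # assumed 141 GB HBM3e
    "H100 NVL": 94,    # assumed 94 GB per GPU
    "H20": 96,         # assumed 96 GB
}
WEIGHTS_GB = {"BF16": 282.0, "FP8": 141.0, "INT4": 70.5}

KV_FULL_CONTEXT_GB = 229376 * 65536 / 1e9  # ~15.0 GB at 65,536 tokens
ACTIVATIONS_GB = 2.5                       # estimate from the table above

def fits(gpu: str, precision: str) -> bool:
    need = WEIGHTS_GB[precision] + KV_FULL_CONTEXT_GB + ACTIVATIONS_GB
    return need <= GPU_MEMORY_GB[gpu]

print(fits("H200 SXM", "INT4"))  # True  (~88 GB needed, 141 GB available)
print(fits("B200 SXM", "FP8"))   # True  (~158.5 GB needed, 192 GB available)
print(fits("H200 SXM", "FP8"))   # False (~158.5 GB needed, 141 GB available)
```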
GPU Recommendations
| GPU | Rating | Config | Score | Throughput | Cost/Month | Cost/M Tokens |
|---|---|---|---|---|---|---|
| B100 SXM | optimal | FP8 · 1 GPU · tensorrt-llm | 100/100 | 280.0 tok/s | $4,271 | $5.80 |
| GB200 NVL72 (per GPU) | optimal | FP8 · 1 GPU · tensorrt-llm | 100/100 | 280.0 tok/s | $6,169 | $8.38 |
| GB300 NVL72 (per GPU) | optimal | FP8 · 1 GPU · tensorrt-llm | 100/100 | 280.0 tok/s | $7,118 | $9.67 |
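The Cost/M Tokens column is consistent with dividing Cost/Month by the tokens produced in an average month (365.25/12 ≈ 30.44 days) at the quoted throughput, assuming 24/7 utilization; a sketch of that derivation:

```python
# Reverse-engineering Cost/M Tokens from Cost/Month and throughput,
# assuming full utilization and an average month of 365.25/12 days.
SECONDS_PER_MONTH = 365.25 / 12 * 86400  # ~2.63e6 s

def cost_per_m_tokens(cost_per_month: float, tok_per_s: float) -> float:
    tokens_per_month = tok_per_s * SECONDS_PER_MONTH
    return cost_per_month / (tokens_per_month / 1e6)

print(f"${cost_per_m_tokens(4271, 280.0):.2f}")  # $5.80 (B100 SXM)
print(f"${cost_per_m_tokens(6169, 280.0):.2f}")  # $8.38 (GB200 NVL72)
print(f"${cost_per_m_tokens(7118, 280.0):.2f}")  # $9.67 (GB300 NVL72)
```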
API Pricing Comparison
| Provider | Input $/M | Output $/M | Badges |
|---|---|---|---|
| together | $1.20 | $1.20 | Cheapest |
| mistral | $2.00 | $6.00 | |
Quality Benchmarks
| Benchmark | Score |
|---|---|
| MMLU | 77.8 |
| HumanEval | 46.0 |
| GSM8K | 78.4 |
| MT-Bench | 80.0 |
Capabilities
Features
✓ Tool Use · ✗ Vision · ✓ Code · ✓ Math · ✗ Reasoning · ✓ Multilingual · ✓ Structured Output
Supported Frameworks
vllm · sglang · tgi · tensorrt-llm
Supported Precisions
BF16 (default) · FP8 · INT4
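As a rough starting point, a single-node FP8 deployment with vLLM (one of the frameworks listed above) might look like the sketch below. The Hugging Face model ID, parallelism degree, and sampling settings are illustrative assumptions, not recommendations from this page; at 141 GB of FP8 weights this needs either several 80 GB-class GPUs or one of the large-memory parts listed above:

```python
# Minimal offline-inference sketch with vLLM.
# Model ID, tensor_parallel_size, and sampling settings are assumptions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mixtral-8x22B-Instruct-v0.1",  # assumed HF model ID
    quantization="fp8",
    tensor_parallel_size=2,   # split the 141 GB of FP8 weights across 2 GPUs
    max_model_len=65536,      # the model's full context window
)
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(
    ["Explain mixture-of-experts routing in one paragraph."], params
)
print(outputs[0].outputs[0].text)
```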