
Mistral Large 2 vs Llama 3.1 405B

Mistral Large 2

Mistral AI · 123B params · Quality: 82

Llama 3.1 405B

Meta · 405B params · Quality: 88

Architecture Comparison

| Spec | Mistral Large 2 | Llama 3.1 405B |
|---|---|---|
| Type | Dense | Dense |
| Total Parameters | 123B | 405B |
| Active Parameters | 123B | 405B |
| Layers | 88 | 126 |
| Hidden Dimension | 12,288 | 16,384 |
| Attention Heads | 96 | 128 |
| KV Heads | 8 | 8 |
| Context Length | 131,072 | 131,072 |
| Precision (default) | BF16 | BF16 |

Memory Requirements

| Precision | Mistral Large 2 | Llama 3.1 405B |
|---|---|---|
| BF16 Weights | 246.0 GB | 810.0 GB |
| FP8 Weights | 123.0 GB | 405.0 GB |
| INT4 Weights | 61.5 GB | 202.5 GB |
| KV Cache / Token | 360,448 B (352 KiB) | 516,096 B (504 KiB) |
| Activation Estimate | 3.50 GB | 5.00 GB |
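The per-token KV-cache figures follow directly from the architecture table: 2 (K and V) × layers × KV heads × head dim × 2 bytes (BF16), where head dim is assumed to be hidden dimension / attention heads (128 for both models). A minimal sketch that reproduces the numbers above:

```python
def weight_mem_gb(params_billion: float, bytes_per_param: float) -> float:
    """Weight memory in GB: 1e9 params * bytes each / 1e9 bytes per GB."""
    return params_billion * bytes_per_param

def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    """Per-token KV cache: 2 (K and V) * layers * kv_heads * head_dim * dtype size."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value

# Mistral Large 2: 123B params, 88 layers, 8 KV heads, head_dim = 12288 / 96 = 128
print(weight_mem_gb(123, 2))                  # 246.0 GB in BF16
print(kv_cache_bytes_per_token(88, 8, 128))   # 360448 B

# Llama 3.1 405B: 405B params, 126 layers, 8 KV heads, head_dim = 16384 / 128 = 128
print(weight_mem_gb(405, 2))                  # 810.0 GB in BF16
print(kv_cache_bytes_per_token(126, 8, 128))  # 516096 B
```

The same formula scales linearly with context: a full 131,072-token cache for Llama 3.1 405B would need roughly 67.6 GB on its own.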

Minimum GPUs Needed (BF16)

| GPU | Mistral Large 2 | Llama 3.1 405B |
|---|---|---|
| H100 SXM (80 GB) | 4 GPUs | N/A |
| L40S (48 GB) | 7 GPUs | N/A |

N/A indicates the model's BF16 weights (810 GB) do not fit on a single 8-GPU node of that type.
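The GPU counts are consistent with a simple capacity rule: BF16 weight size plus runtime overhead, divided by per-GPU memory, capped at one 8-GPU node. The 20% overhead factor and the 8-GPU cap are assumptions chosen to reproduce the table; real requirements depend on the serving stack, batch size, and context length:

```python
import math

def min_gpus(weights_gb: float, gpu_mem_gb: float,
             overhead: float = 1.2, max_gpus: int = 8):
    """Smallest GPU count whose combined memory holds weights plus overhead;
    returns None if the model exceeds a single node of max_gpus."""
    needed = math.ceil(weights_gb * overhead / gpu_mem_gb)
    return needed if needed <= max_gpus else None

print(min_gpus(246.0, 80))  # 4    (H100 SXM, 80 GB)
print(min_gpus(246.0, 48))  # 7    (L40S, 48 GB)
print(min_gpus(810.0, 80))  # None -> "N/A": 13 GPUs needed, over one 8-GPU node
```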

Quality Benchmarks

| Benchmark | Mistral Large 2 | Llama 3.1 405B |
|---|---|---|
| Overall | 82 | 88 |
| MMLU | 84.0 | 88.6 |
| HumanEval | 53.0 | 61.0 |
| GSM8K | 91.2 | 96.8 |
| MT-Bench | 84.0 | 88.0 |


Capabilities

| Feature | Mistral Large 2 | Llama 3.1 405B |
|---|---|---|
| Tool Use | ✓ Yes | ✓ Yes |
| Vision | ✗ No | ✗ No |
| Code | ✓ Yes | ✓ Yes |
| Math | ✓ Yes | ✓ Yes |
| Reasoning | ✗ No | ✗ No |
| Multilingual | ✓ Yes | ✓ Yes |
| Structured Output | ✓ Yes | ✓ Yes |

API Pricing Comparison

Cheapest Output (Mistral Large 2): $2.50/M (input: $2.50/M)

Cheapest Output (Llama 3.1 405B): $3.00/M (input: $3.00/M)

| Provider | Mistral Large 2 In $/M | Out $/M | Llama 3.1 405B In $/M | Out $/M |
|---|---|---|---|---|
| together | $2.50 | $2.50 | $3.50 | $3.50 |
| fireworks | — | — | $3.00 | $3.00 |
| mistral | $2.00 | $6.00 | — | — |
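Per-request cost at any of these providers is just token counts times the per-million rates. A small sketch using each model's cheapest rates above (the 10k-input / 2k-output workload is purely illustrative):

```python
def request_cost(in_tokens: int, out_tokens: int,
                 in_price_per_m: float, out_price_per_m: float) -> float:
    """Dollar cost of one request at the given per-million-token prices."""
    return in_tokens / 1e6 * in_price_per_m + out_tokens / 1e6 * out_price_per_m

# 10k input + 2k output tokens per request, cheapest provider for each model
print(request_cost(10_000, 2_000, 2.50, 2.50))  # 0.03  Mistral Large 2 (together)
print(request_cost(10_000, 2_000, 3.00, 3.00))  # 0.036 Llama 3.1 405B (fireworks)
```

Note that on Mistral's own API the output rate ($6.00/M) is 2.4× the input rate, so output-heavy workloads shift the comparison more than the headline numbers suggest.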

Recommendation Summary

  • Llama 3.1 405B scores higher on overall quality (88 vs 82).
  • Mistral Large 2 is cheaper per output token ($2.50/M vs $3.00/M).
  • Mistral Large 2 has a smaller memory footprint (246.0 GB vs 810.0 GB BF16), making it easier to deploy on fewer GPUs.
  • Llama 3.1 405B is stronger at code generation (HumanEval: 61.0 vs 53.0).
  • Llama 3.1 405B is better at math reasoning (GSM8K: 96.8 vs 91.2).
