
Llama 3.2 11B Vision vs Llama 3.2 90B Vision

Llama 3.2 11B Vision

Meta · 11B params · Quality: 50

Llama 3.2 90B Vision

Meta · 90B params · Quality: 50

Architecture Comparison

| Spec | Llama 3.2 11B Vision | Llama 3.2 90B Vision |
|---|---|---|
| Type | Dense | Dense |
| Total Parameters | 11B | 90B |
| Active Parameters | 11B | 90B |
| Layers | 40 | 80 |
| Hidden Dimension | 4,096 | 8,192 |
| Attention Heads | 32 | 64 |
| KV Heads | 8 | 8 |
| Context Length | 131,072 | 131,072 |
| Precision (default) | BF16 | BF16 |
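The KV-cache figures in the memory table below follow directly from these architecture specs. As a hedged sketch, assuming standard grouped-query attention with `head_dim = hidden_dim / attention_heads` and 2-byte BF16 cache elements:

```python
# Estimate KV-cache bytes per token from the architecture specs above.
# Assumption: head_dim = hidden_dim // attention_heads, BF16 (2-byte) cache.

def kv_cache_bytes_per_token(layers, hidden_dim, attn_heads, kv_heads, bytes_per_elem=2):
    head_dim = hidden_dim // attn_heads
    # K and V each store kv_heads * head_dim values per layer per token.
    return 2 * layers * kv_heads * head_dim * bytes_per_elem

print(kv_cache_bytes_per_token(40, 4096, 32, 8))   # Llama 3.2 11B Vision -> 163840
print(kv_cache_bytes_per_token(80, 8192, 64, 8))   # Llama 3.2 90B Vision -> 327680
```

Both models share the same per-layer KV width (8 KV heads × 128 head dim), so the 90B model's doubled cache cost comes entirely from its doubled layer count.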

Memory Requirements

| Precision | Llama 3.2 11B Vision | Llama 3.2 90B Vision |
|---|---|---|
| BF16 Weights | 22.0 GB | 180.0 GB |
| FP8 Weights | 11.0 GB | 90.0 GB |
| INT4 Weights | 5.5 GB | 45.0 GB |
| KV-Cache / Token | 163,840 B (160 KB) | 327,680 B (320 KB) |
| Activation Estimate | 1.00 GB | 3.00 GB |
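The weight rows are simply total parameters times bytes per parameter (2 for BF16, 1 for FP8, 0.5 for INT4), in decimal gigabytes. A minimal sketch:

```python
# Weight memory estimate: parameters x bytes per parameter, decimal GB.
# Assumes all parameters are stored at the same precision (no mixed layers).

BYTES_PER_PARAM = {"BF16": 2, "FP8": 1, "INT4": 0.5}

def weight_gb(params_billion, precision):
    return params_billion * BYTES_PER_PARAM[precision]

for params in (11, 90):
    print(f"{params}B:",
          weight_gb(params, "BF16"),   # 22.0 / 180.0 GB
          weight_gb(params, "FP8"),    # 11.0 /  90.0 GB
          weight_gb(params, "INT4"))   #  5.5 /  45.0 GB
```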

Minimum GPUs Needed (BF16)

| GPU | Llama 3.2 11B Vision | Llama 3.2 90B Vision |
|---|---|---|
| H100 SXM | 1 GPU | 3 GPUs |
| L40S | 1 GPU | 5 GPUs |
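These counts are consistent with dividing BF16 weight memory (plus headroom) by per-GPU VRAM. A hedged sketch, where the ~20% headroom factor for KV cache, activations, and fragmentation is an assumption chosen to reproduce the table rather than a published formula:

```python
import math

# Minimum GPU count: ceil(weight_gb * headroom / vram_gb).
# Assumptions: H100 SXM = 80 GB VRAM, L40S = 48 GB VRAM, 1.2x headroom.

GPU_VRAM_GB = {"H100 SXM": 80, "L40S": 48}

def min_gpus(weight_gb, vram_gb, headroom=1.2):
    return math.ceil(weight_gb * headroom / vram_gb)

for gpu, vram in GPU_VRAM_GB.items():
    print(gpu, min_gpus(22.0, vram), min_gpus(180.0, vram))
```

Real deployments also depend on context length (KV cache grows with tokens served) and batch size, so treat these as floor estimates.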

Capabilities

| Feature | Llama 3.2 11B Vision | Llama 3.2 90B Vision |
|---|---|---|
| Tool Use | ✓ Yes | ✓ Yes |
| Vision | ✓ Yes | ✓ Yes |
| Code | ✓ Yes | ✓ Yes |
| Math | ✓ Yes | ✓ Yes |
| Reasoning | ✗ No | ✗ No |
| Multilingual | ✓ Yes | ✓ Yes |
| Structured Output | ✓ Yes | ✓ Yes |

API Pricing Comparison

Cheapest Output (Llama 3.2 11B Vision): $0.18/M (Input: $0.18/M)

Cheapest Output (Llama 3.2 90B Vision): $0.90/M (Input: $0.90/M)

| Provider | Llama 3.2 11B Vision In $/M | Out $/M | Llama 3.2 90B Vision In $/M | Out $/M |
|---|---|---|---|---|
| together | $0.18 | $0.18 | $1.20 | $1.20 |
| fireworks | $0.20 | $0.20 | $0.90 | $0.90 |
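To compare providers for a concrete workload, multiply each rate by the millions of tokens in each direction. A minimal sketch using the rates from the table (token counts are illustrative):

```python
# Per-request cost from per-million-token rates.
# Rates taken from the pricing table above; workloads are illustrative.

PRICES = {  # provider -> model -> (input $/M, output $/M)
    "together":  {"11B": (0.18, 0.18), "90B": (1.20, 1.20)},
    "fireworks": {"11B": (0.20, 0.20), "90B": (0.90, 0.90)},
}

def cost_usd(provider, model, in_tokens, out_tokens):
    in_rate, out_rate = PRICES[provider][model]
    return in_tokens / 1e6 * in_rate + out_tokens / 1e6 * out_rate

# 1M input + 1M output tokens on the 90B model:
for provider in PRICES:
    print(provider, round(cost_usd(provider, "90B", 1_000_000, 1_000_000), 2))
```

Note that the cheapest provider differs by model: together is cheapest for the 11B, while fireworks is cheapest for the 90B at any input/output mix.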

Recommendation Summary

  • Llama 3.2 11B Vision is cheaper per output token ($0.18/M vs $0.90/M).
  • Llama 3.2 11B Vision has a smaller memory footprint (22.0 GB vs 180.0 GB BF16), making it easier to deploy on fewer GPUs.
