# Llama 3.2 11B Vision vs Llama 3.2 90B Vision

## Architecture Comparison
| Spec | Llama 3.2 11B Vision | Llama 3.2 90B Vision |
|---|---|---|
| Type | Dense | Dense |
| Total Parameters | 11B | 90B |
| Active Parameters | 11B | 90B |
| Layers | 40 | 80 |
| Hidden Dimension | 4,096 | 8,192 |
| Attention Heads | 32 | 64 |
| KV Heads | 8 | 8 |
| Context Length | 131,072 | 131,072 |
| Precision (default) | BF16 | BF16 |
## Memory Requirements
| Metric | Llama 3.2 11B Vision | Llama 3.2 90B Vision |
|---|---|---|
| BF16 Weights | 22.0 GB | 180.0 GB |
| FP8 Weights | 11.0 GB | 90.0 GB |
| INT4 Weights | 5.5 GB | 45.0 GB |
| KV-Cache / Token (BF16) | 163,840 B | 327,680 B |
| Activation Estimate | 1.00 GB | 3.00 GB |
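As a sanity check, the weight and KV-cache figures above can be reproduced from the architecture table. A minimal sketch, assuming a head dimension of 128 (hidden dimension / attention heads: 4,096 / 32 for the 11B, 8,192 / 64 for the 90B) and 2 bytes per BF16 value:

```python
def weight_gb(params_billions, bytes_per_param):
    """Weight memory in decimal GB: one value stored per parameter."""
    return params_billions * bytes_per_param

def kv_cache_bytes_per_token(layers, kv_heads, head_dim=128, bytes_per_val=2):
    """Per-token KV cache: each layer stores K and V (factor of 2)
    for kv_heads heads of head_dim values each."""
    return layers * kv_heads * head_dim * 2 * bytes_per_val

# Llama 3.2 11B Vision: 40 layers, 8 KV heads
print(weight_gb(11, 2))                 # 22.0 (GB, BF16)
print(kv_cache_bytes_per_token(40, 8))  # 163840

# Llama 3.2 90B Vision: 80 layers, 8 KV heads
print(weight_gb(90, 2))                 # 180.0
print(kv_cache_bytes_per_token(80, 8))  # 327680
```

The FP8 and INT4 rows follow by substituting 1 and 0.5 bytes per parameter.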
## Minimum GPUs Needed (BF16)

| GPU | Llama 3.2 11B Vision | Llama 3.2 90B Vision |
|---|---|---|
| H100 SXM (80 GB) | 1 GPU | 3 GPUs |
| L40S (48 GB) | 1 GPU | 5 GPUs |
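These counts are consistent with a simple fit-the-weights estimate. A sketch, assuming a hypothetical 10% memory reserve for framework overhead and ignoring KV cache and activations:

```python
import math

def min_gpus(weights_gb, gpu_mem_gb, overhead_frac=0.10):
    """Smallest GPU count whose usable memory holds the BF16 weights.
    overhead_frac reserves a slice of each GPU (assumption, not measured)."""
    usable_per_gpu = gpu_mem_gb * (1 - overhead_frac)
    return math.ceil(weights_gb / usable_per_gpu)

print(min_gpus(180.0, 80))  # 3  (90B on H100 SXM 80 GB)
print(min_gpus(180.0, 48))  # 5  (90B on L40S 48 GB)
print(min_gpus(22.0, 80))   # 1  (11B on H100 SXM)
```

Serving long contexts raises the real requirement, since KV cache grows linearly with tokens (327,680 B/token for the 90B).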
## Capabilities

| Feature | Llama 3.2 11B Vision | Llama 3.2 90B Vision |
|---|---|---|
| Tool Use | ✓ Yes | ✓ Yes |
| Vision | ✓ Yes | ✓ Yes |
| Code | ✓ Yes | ✓ Yes |
| Math | ✓ Yes | ✓ Yes |
| Reasoning | ✗ No | ✗ No |
| Multilingual | ✓ Yes | ✓ Yes |
| Structured Output | ✓ Yes | ✓ Yes |
## API Pricing Comparison

Cheapest Llama 3.2 11B Vision pricing: $0.18/M input, $0.18/M output.
Cheapest Llama 3.2 90B Vision pricing: $0.90/M input, $0.90/M output.
| Provider | Llama 3.2 11B Vision In $/M | Out $/M | Llama 3.2 90B Vision In $/M | Out $/M |
|---|---|---|---|---|
| Together | $0.18 | $0.18 | $1.20 | $1.20 |
| Fireworks | $0.20 | $0.20 | $0.90 | $0.90 |
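To see what these rates mean per request, a small helper applies them to a hypothetical workload of 2,000 input and 500 output tokens (the workload is an assumption for illustration, not from the source):

```python
def request_cost_usd(in_tokens, out_tokens, in_per_m, out_per_m):
    """Cost of one request given per-million-token input/output rates."""
    return (in_tokens * in_per_m + out_tokens * out_per_m) / 1_000_000

# Llama 3.2 11B Vision at the cheapest listed rate ($0.18 in / $0.18 out)
print(request_cost_usd(2000, 500, 0.18, 0.18))  # 0.00045

# Llama 3.2 90B Vision at the cheapest listed rate ($0.90 in / $0.90 out)
print(request_cost_usd(2000, 500, 0.90, 0.90))  # 0.00225
```

At these rates the 90B model costs 5x more per request than the 11B.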
## Recommendation Summary

- Llama 3.2 11B Vision is cheaper per output token ($0.18/M vs $0.90/M).
- Llama 3.2 11B Vision has a smaller memory footprint (22.0 GB vs 180.0 GB in BF16), making it easier to deploy on fewer GPUs.