
Llama 3.2 11B Vision vs Llama 3.2 90B Vision

Llama 3.2 11B Vision

Meta · 11B params · Quality: 50

Llama 3.2 90B Vision

Meta · 90B params · Quality: 50

Architecture Comparison

| Spec | Llama 3.2 11B Vision | Llama 3.2 90B Vision |
|---|---|---|
| Type | Dense | Dense |
| Total Parameters | 11B | 90B |
| Active Parameters | 11B | 90B |
| Layers | 40 | 80 |
| Hidden Dimension | 4,096 | 8,192 |
| Attention Heads | 32 | 64 |
| KV Heads | 8 | 8 |
| Context Length | 131,072 | 131,072 |
| Precision (default) | BF16 | BF16 |
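The KV-cache figures in the memory table below follow directly from these architecture specs. As a hedged sketch, assuming standard grouped-query attention with `head_dim = hidden_dim / attention_heads` and 2-byte BF16 cache elements:

```python
# Estimate KV-cache bytes per token from the architecture specs above.
# Assumption: head_dim = hidden_dim // attention_heads, BF16 (2-byte) cache.

def kv_cache_bytes_per_token(layers, hidden_dim, attn_heads, kv_heads, bytes_per_elem=2):
    head_dim = hidden_dim // attn_heads
    # K and V each store kv_heads * head_dim values per layer per token.
    return 2 * layers * kv_heads * head_dim * bytes_per_elem

print(kv_cache_bytes_per_token(40, 4096, 32, 8))   # Llama 3.2 11B Vision -> 163840
print(kv_cache_bytes_per_token(80, 8192, 64, 8))   # Llama 3.2 90B Vision -> 327680
```

Both models share the same per-layer KV width (8 KV heads × 128 head dim), so the 90B model's doubled cache cost comes entirely from its doubled layer count.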

Memory Requirements

| Precision | Llama 3.2 11B Vision | Llama 3.2 90B Vision |
|---|---|---|
| BF16 Weights | 22.0 GB | 180.0 GB |
| FP8 Weights | 11.0 GB | 90.0 GB |
| INT4 Weights | 5.5 GB | 45.0 GB |
| KV-Cache / Token | 163,840 B (160 KB) | 327,680 B (320 KB) |
| Activation Estimate | 1.00 GB | 3.00 GB |
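The weight rows are simply total parameters times bytes per parameter (2 for BF16, 1 for FP8, 0.5 for INT4), in decimal gigabytes. A minimal sketch:

```python
# Weight memory estimate: parameters x bytes per parameter, decimal GB.
# Assumes all parameters are stored at the same precision (no mixed layers).

BYTES_PER_PARAM = {"BF16": 2, "FP8": 1, "INT4": 0.5}

def weight_gb(params_billion, precision):
    return params_billion * BYTES_PER_PARAM[precision]

for params in (11, 90):
    print(f"{params}B:",
          weight_gb(params, "BF16"),   # 22.0 / 180.0 GB
          weight_gb(params, "FP8"),    # 11.0 /  90.0 GB
          weight_gb(params, "INT4"))   #  5.5 /  45.0 GB
```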

Minimum GPUs Needed (BF16)

| GPU | Llama 3.2 11B Vision | Llama 3.2 90B Vision |
|---|---|---|
| H100 SXM | 1 GPU | 3 GPUs |
| L40S | 1 GPU | 5 GPUs |
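These counts are consistent with dividing BF16 weight memory (plus headroom) by per-GPU VRAM. A hedged sketch, where the ~20% headroom factor for KV cache, activations, and fragmentation is an assumption chosen to reproduce the table rather than a published formula:

```python
import math

# Minimum GPU count: ceil(weight_gb * headroom / vram_gb).
# Assumptions: H100 SXM = 80 GB VRAM, L40S = 48 GB VRAM, 1.2x headroom.

GPU_VRAM_GB = {"H100 SXM": 80, "L40S": 48}

def min_gpus(weight_gb, vram_gb, headroom=1.2):
    return math.ceil(weight_gb * headroom / vram_gb)

for gpu, vram in GPU_VRAM_GB.items():
    print(gpu, min_gpus(22.0, vram), min_gpus(180.0, vram))
```

Real deployments also depend on context length (KV cache grows with tokens served) and batch size, so treat these as floor estimates.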

Capabilities

| Feature | Llama 3.2 11B Vision | Llama 3.2 90B Vision |
|---|---|---|
| Tool Use | ✓ Yes | ✓ Yes |
| Vision | ✓ Yes | ✓ Yes |
| Code | ✓ Yes | ✓ Yes |
| Math | ✓ Yes | ✓ Yes |
| Reasoning | ✗ No | ✗ No |
| Multilingual | ✓ Yes | ✓ Yes |
| Structured Output | ✓ Yes | ✓ Yes |

API Pricing Comparison

Cheapest Output (Llama 3.2 11B Vision): $0.18/M (Input: $0.18/M)

Cheapest Output (Llama 3.2 90B Vision): $0.90/M (Input: $0.90/M)

| Provider | Llama 3.2 11B Vision In $/M | Out $/M | Llama 3.2 90B Vision In $/M | Out $/M |
|---|---|---|---|---|
| together | $0.18 | $0.18 | $1.20 | $1.20 |
| fireworks | $0.20 | $0.20 | $0.90 | $0.90 |
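To compare providers for a concrete workload, multiply each rate by the millions of tokens in each direction. A minimal sketch using the rates from the table (token counts are illustrative):

```python
# Per-request cost from per-million-token rates.
# Rates taken from the pricing table above; workloads are illustrative.

PRICES = {  # provider -> model -> (input $/M, output $/M)
    "together":  {"11B": (0.18, 0.18), "90B": (1.20, 1.20)},
    "fireworks": {"11B": (0.20, 0.20), "90B": (0.90, 0.90)},
}

def cost_usd(provider, model, in_tokens, out_tokens):
    in_rate, out_rate = PRICES[provider][model]
    return in_tokens / 1e6 * in_rate + out_tokens / 1e6 * out_rate

# 1M input + 1M output tokens on the 90B model:
for provider in PRICES:
    print(provider, round(cost_usd(provider, "90B", 1_000_000, 1_000_000), 2))
```

Note that the cheapest provider differs by model: together is cheapest for the 11B, while fireworks is cheapest for the 90B at any input/output mix.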

Recommendation Summary

  • Llama 3.2 11B Vision is cheaper per output token ($0.18/M vs $0.90/M).
  • Llama 3.2 11B Vision has a smaller memory footprint (22.0 GB vs 180.0 GB BF16), making it easier to deploy on fewer GPUs.
