Cost to run

Llama 3.2 90B Vision Instruct

meta-llama/llama-3.2-90b-vision-instruct

Family: Llama 3.2
Context: 131,072 tokens

Llama 3.2 90B Vision Instruct can be run self-hosted (rent a GPU and serve it with vLLM or TGI) or through a serverless API (pay per token). Live pricing comparisons for each option:

Self-hosted

Rent a GPU from Lambda, RunPod, Vast.ai, or CoreWeave. You keep full control of the deployment.

Serverless

Pay per token via Together, Fireworks, DeepInfra, or Groq.
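
To make the trade-off between the two options concrete, here is a rough Python sketch that converts an hourly GPU rental into an effective price per million tokens and finds the volume at which renting becomes cheaper than per-token billing. Every number in it (hourly rate, GPU count, throughput, serverless price) is a hypothetical placeholder, not a quote from any provider listed here, and the function names are illustrative only. It also assumes the rented GPUs are billed for every hour whether or not they are busy.

```python
def self_hosted_usd_per_million(gpu_hourly_usd: float, num_gpus: int,
                                tokens_per_second: float) -> float:
    """Effective USD per 1M tokens for rented GPUs at a sustained throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd * num_gpus / tokens_per_hour * 1_000_000


def break_even_tokens_per_hour(gpu_hourly_usd: float, num_gpus: int,
                               serverless_usd_per_million: float) -> float:
    """Tokens per hour above which renting beats paying per token."""
    if serverless_usd_per_million <= 0:
        raise ValueError("serverless_usd_per_million must be positive")
    hourly_rental = gpu_hourly_usd * num_gpus
    return hourly_rental / serverless_usd_per_million * 1_000_000


if __name__ == "__main__":
    # Hypothetical inputs: substitute live prices and measured throughput.
    gpu_hourly_usd = 2.0       # assumed rental price per GPU per hour
    num_gpus = 4               # assumed GPU count needed to serve the model
    tokens_per_second = 1000   # assumed sustained aggregate throughput
    serverless_price = 1.0     # assumed USD per 1M tokens

    cost = self_hosted_usd_per_million(gpu_hourly_usd, num_gpus, tokens_per_second)
    threshold = break_even_tokens_per_hour(gpu_hourly_usd, num_gpus, serverless_price)
    print(f"Self-hosted: ${cost:.2f} per 1M tokens at full utilization")
    print(f"Break-even volume: {threshold:,.0f} tokens per hour")
```

The sketch ignores setup time, idle capacity, and any difference between input and output token pricing, so treat its result as a first-pass estimate rather than a budget.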

Need to benchmark this model against another? Try the calculator or see where it ranks on the InferenceScore leaderboard.