Skip to content

Qwen 2.5 VL 7B vs Qwen 2.5 VL 72B

Alibaba
Qwen 2.5 VL 7B

Alibaba · 7.6B params · Quality: 50

Alibaba
Qwen 2.5 VL 72B

Alibaba · 72.7B params · Quality: 50

Architecture Comparison

SpecQwen 2.5 VL 7BQwen 2.5 VL 72B
TypeDENSEDENSE
Total Parameters7.6B72.7B
Active Parameters7.6B72.7B
Layers2880
Hidden Dimension3,5848,192
Attention Heads2864
KV Heads48
Context Length131,072131,072
Precision (default)BF16BF16

Memory Requirements

PrecisionQwen 2.5 VL 7BQwen 2.5 VL 72B
BF16 Weights15.2 GB145.4 GB
FP8 Weights7.6 GB72.7 GB
INT4 Weights3.8 GB36.4 GB
KV-Cache / Token57344 B327680 B
Activation Estimate0.80 GB3.00 GB

Minimum GPUs Needed (BF16)

H100 SXM1 GPU3 GPUs
L40S1 GPU4 GPUs

Capabilities

FeatureQwen 2.5 VL 7BQwen 2.5 VL 72B
Tool Use✓ Yes✓ Yes
Vision✓ Yes✓ Yes
Code✓ Yes✓ Yes
Math✓ Yes✓ Yes
Reasoning✗ No✗ No
Multilingual✓ Yes✓ Yes
Structured Output✓ Yes✓ Yes

API Pricing Comparison

Cheapest Output (Qwen 2.5 VL 7B)

$0.20/M

Input: $0.20/M

Cheapest Output (Qwen 2.5 VL 72B)

$0.90/M

Input: $0.90/M

ProviderQwen 2.5 VL 7B In $/MOut $/MQwen 2.5 VL 72B In $/MOut $/M
together$0.20$0.20$0.90$0.90

Recommendation Summary

  • Qwen 2.5 VL 7B is cheaper per output token ($0.20/M vs $0.90/M).
  • Qwen 2.5 VL 7B has a smaller memory footprint (15.2 GB vs 145.4 GB BF16), making it easier to deploy on fewer GPUs.

Compare Other Models