Question 1

Is Qwen 2.5 72B better than DeepSeek R1?

Accepted Answer

DeepSeek R1 has a higher overall quality score. Qwen 2.5 72B scores 84/100 while DeepSeek R1 scores 92/100. The best choice depends on your use case, budget, and deployment constraints.

Question 2

Which is cheaper, Qwen 2.5 72B or DeepSeek R1?

Accepted Answer

Qwen 2.5 72B is cheaper for output tokens. Qwen 2.5 72B starts at $0.90/M output tokens, while DeepSeek R1 starts at $2.19/M output tokens.

Question 3

How much VRAM do Qwen 2.5 72B and DeepSeek R1 need?

Accepted Answer

Qwen 2.5 72B requires 145.4 GB (BF16) or 36.4 GB (INT4). DeepSeek R1 requires 1342.0 GB (BF16) or 335.5 GB (INT4). Additional memory is needed for KV-cache and activations.

Question 4

What is the context length of Qwen 2.5 72B vs DeepSeek R1?

Accepted Answer

Qwen 2.5 72B supports 131,072 tokens context, while DeepSeek R1 supports 131,072 tokens.

Provider	Qwen 2.5 72B In $/M	Out $/M	DeepSeek R1 In $/M	Out $/M
together	$0.90	$0.90	$3.00	$7.00
fireworks	$0.90	$0.90	—	—
deepseek	—	—	$0.55	$2.19

Qwen 2.5 72B vs DeepSeek R1

Architecture Comparison

Memory Requirements

Minimum GPUs Needed (BF16)

Quality Benchmarks

Qwen 2.5 72B

DeepSeek R1

Capabilities

API Pricing Comparison

Recommendation Summary

Compare Other Models