# DeepSeek V3 vs Qwen 2.5 72B

## Architecture Comparison
| Spec | DeepSeek V3 | Qwen 2.5 72B |
|---|---|---|
| Type | MoE | Dense |
| Total Parameters | 671B | 72.7B |
| Active Parameters | 37B | 72.7B |
| Layers | 61 | 80 |
| Hidden Dimension | 7,168 | 8,192 |
| Attention Heads | 128 | 64 |
| KV Heads | 1 | 8 |
| Context Length | 131,072 | 131,072 |
| Precision (default) | BF16 | BF16 |
| Total Experts | 256 | N/A |
| Active Experts | 8 | N/A |
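The per-token KV-cache size in the memory table below follows from these architecture numbers. A sketch for the standard grouped-query-attention case, assuming a head dimension of hidden size / attention heads (8,192 / 64 = 128 for Qwen) and BF16 (2 bytes per element):

```python
def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_elem: int = 2) -> int:
    """Per-token KV cache: one K and one V vector per layer per KV head."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem

# Qwen 2.5 72B: 80 layers, 8 KV heads, assumed head_dim = 8192 / 64 = 128
print(kv_cache_bytes_per_token(80, 8, 128))  # -> 327680
```

Note this formula applies only to Qwen here: DeepSeek V3 uses Multi-head Latent Attention, which stores a compressed latent instead of full K/V heads, which is why its per-token cache (31,232 B) is an order of magnitude smaller despite the much larger model.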
## Memory Requirements

| Precision | DeepSeek V3 | Qwen 2.5 72B |
|---|---|---|
| BF16 Weights | 1342.0 GB | 145.4 GB |
| FP8 Weights | 671.0 GB | 72.7 GB |
| INT4 Weights | 335.5 GB | 36.4 GB |
| KV-Cache / Token | 31,232 B | 327,680 B |
| Activation Estimate | 3.00 GB | 2.50 GB |
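The weight rows above are simply parameter count times bytes per parameter (2 for BF16, 1 for FP8, 0.5 for INT4), counting 1 GB as 1e9 bytes:

```python
def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Estimated weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * bytes_per_param

print(weight_memory_gb(671.0, 2))    # DeepSeek V3, BF16  -> 1342.0
print(weight_memory_gb(72.7, 2))     # Qwen 2.5 72B, BF16 -> 145.4
print(weight_memory_gb(72.7, 0.5))   # Qwen 2.5 72B, INT4 -> 36.35
```

Note that for an MoE model like DeepSeek V3, all 671B parameters must reside in memory even though only 37B are active per token.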
## Minimum GPUs Needed (BF16)

| GPU | DeepSeek V3 | Qwen 2.5 72B |
|---|---|---|
| H100 SXM | N/A | 3 GPUs |
| L40S | N/A | 4 GPUs |
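These GPU counts are consistent with summing weights, activations, and a full-context (131,072-token) KV cache, then dividing by per-GPU memory; the 80 GB (H100 SXM) and 48 GB (L40S) capacities are assumptions. A sketch under those assumptions:

```python
import math

def gpus_needed(weights_gb: float, activation_gb: float,
                kv_per_token_bytes: int, context_len: int,
                gpu_mem_gb: float) -> int:
    """Weights + activations + a full-context KV cache, split across GPUs."""
    kv_gb = kv_per_token_bytes * context_len / 1e9
    total_gb = weights_gb + activation_gb + kv_gb
    return math.ceil(total_gb / gpu_mem_gb)

# Qwen 2.5 72B in BF16: 145.4 GB weights, 2.5 GB activations, 327,680 B/token KV
print(gpus_needed(145.4, 2.5, 327_680, 131_072, 80))  # H100 SXM (80 GB) -> 3
print(gpus_needed(145.4, 2.5, 327_680, 131_072, 48))  # L40S (48 GB)     -> 4
```

Under the same arithmetic, DeepSeek V3's 1342.0 GB of BF16 weights alone exceeds any single-node configuration of these GPUs, which is presumably why its columns read N/A.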
## Quality Benchmarks

| Benchmark | DeepSeek V3 | Qwen 2.5 72B |
|---|---|---|
| Overall | 86 | 84 |
| MMLU | 87.1 | 85.3 |
| HumanEval | 65.0 | 56.0 |
| GSM8K | 89.3 | 91.6 |
| MT-Bench | 87.0 | 86.0 |
## Capabilities

| Feature | DeepSeek V3 | Qwen 2.5 72B |
|---|---|---|
| Tool Use | ✓ Yes | ✓ Yes |
| Vision | ✗ No | ✗ No |
| Code | ✓ Yes | ✓ Yes |
| Math | ✓ Yes | ✓ Yes |
| Reasoning | ✗ No | ✗ No |
| Multilingual | ✓ Yes | ✓ Yes |
| Structured Output | ✓ Yes | ✓ Yes |
## API Pricing Comparison

- Cheapest output, DeepSeek V3: $0.42/M (input: $0.28/M)
- Cheapest output, Qwen 2.5 72B: $0.90/M (input: $0.90/M)
| Provider | DeepSeek V3 In ($/M) | DeepSeek V3 Out ($/M) | Qwen 2.5 72B In ($/M) | Qwen 2.5 72B Out ($/M) |
|---|---|---|---|---|
| deepseek | $0.28 | $0.42 | — | — |
| together | $0.50 | $2.80 | $0.90 | $0.90 |
| fireworks | — | — | $0.90 | $0.90 |
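To compare providers on a concrete workload, per-request cost is just tokens times the per-million-token price. A minimal sketch using the table's cheapest rates:

```python
def request_cost_usd(in_tokens: int, out_tokens: int,
                     in_price_per_m: float, out_price_per_m: float) -> float:
    """Cost of one request given per-million-token input/output prices."""
    return (in_tokens * in_price_per_m + out_tokens * out_price_per_m) / 1e6

# Example workload: 10k input tokens, 2k output tokens
print(round(request_cost_usd(10_000, 2_000, 0.28, 0.42), 4))  # DeepSeek V3 via deepseek -> 0.0036
print(round(request_cost_usd(10_000, 2_000, 0.90, 0.90), 4))  # Qwen 2.5 72B via together -> 0.0108
```

On input-heavy workloads like this one, DeepSeek V3's lower input price widens its cost advantage beyond the headline output-price gap.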
## Recommendation Summary

- DeepSeek V3 scores higher on overall quality (86 vs 84).
- DeepSeek V3 is cheaper per output token ($0.42/M vs $0.90/M).
- Qwen 2.5 72B has a far smaller memory footprint (145.4 GB vs 1342.0 GB in BF16), making it deployable on far fewer GPUs.
- DeepSeek V3 uses a Mixture-of-Experts (MoE) architecture while Qwen 2.5 72B is dense; the MoE model activates only a fraction of its parameters per token (37B of 671B), improving inference efficiency.
- DeepSeek V3 is stronger at code generation (HumanEval: 65.0 vs 56.0).
- Qwen 2.5 72B is better at math reasoning (GSM8K: 91.6 vs 89.3).