
B200 SXM vs H200 SXM

Side-by-side comparison of the NVIDIA B200 SXM and the NVIDIA H200 SXM for AI inference workloads.

Specifications

| Spec | B200 SXM | H200 SXM |
| --- | --- | --- |
| Generation | Blackwell | Hopper |
| Memory Type | HBM3e | HBM3e |
| VRAM | 180 GB | 141 GB |
| Memory Bandwidth | 8,000 GB/s | 4,800 GB/s |
| BF16 TFLOPS | 2,250 | 990 |
| FP16 TFLOPS | 2,250 | 990 |
| FP8 TFLOPS | 4,500 | 1,979 |
| INT8 TOPS | 4,500 | 1,979 |
| TDP | 1,000 W | 700 W |
| Interconnect | NVLink | NVLink |
| NVLink Bandwidth | 1,800 GB/s | 900 GB/s |
| Max GPUs per Node | 8 | 8 |
| PCIe Gen | Gen 6 | Gen 5 |
| CUDA Compute Capability | 10 | 9 |
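For inference, memory bandwidth often matters more than peak TFLOPS: single-batch token generation is typically bound by how fast the weights stream from HBM, so a rough ceiling on decode throughput is bandwidth divided by model size. A minimal sketch using the bandwidth figures above (the 14 GB model size is an illustrative assumption for a 7B-parameter model at FP16, not a benchmark):

```python
# Rough upper bound on single-batch decode throughput:
#   tokens/s <= memory bandwidth / bytes read per token (~= model size).
# Real throughput is lower due to kernel and scheduling overheads.

GPUS = {"B200 SXM": 8000, "H200 SXM": 4800}  # GB/s, from the spec table

def max_decode_tps(bandwidth_gbps: float, model_gb: float) -> float:
    """Bandwidth-bound ceiling on tokens/second for one sequence."""
    return bandwidth_gbps / model_gb

model_gb = 14  # assumed: 7B params * 2 bytes (FP16)
for name, bw in GPUS.items():
    print(f"{name}: <= {max_decode_tps(bw, model_gb):.0f} tok/s")
```

By this estimate the B200's 8,000 GB/s gives it the same ~1.67x advantage in decode throughput as its bandwidth ratio, independent of model size.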

Pricing

B200 SXM

| Provider | On-Demand | Reserved | Spot |
| --- | --- | --- | --- |
| CoreWeave | $7.50/hr | $5.50/hr | - |
| Lambda | $5.99/hr | $4.49/hr | - |
| RunPod | $7.20/hr | - | - |

H200 SXM

| Provider | On-Demand | Reserved | Spot |
| --- | --- | --- | --- |
| Lambda | $3.49/hr | $2.69/hr | - |
| CoreWeave | $4.25/hr | $3.19/hr | - |
| RunPod | $4.69/hr | - | - |
| TensorDock | $3.80/hr | - | $2.90/hr |

Cheapest available rate: B200 SXM at $4.49/hr vs H200 SXM at $2.69/hr. The H200 SXM is about 40% cheaper (equivalently, the B200 SXM costs about 67% more).
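The percentage difference depends on which card you use as the baseline, which is a common source of confusion in price comparisons. A quick check of the arithmetic:

```python
# Cheapest rates from the pricing tables above ($/hr).
b200, h200 = 4.49, 2.69

premium = (b200 - h200) / h200 * 100   # B200 price relative to H200
savings = (b200 - h200) / b200 * 100   # H200 discount relative to B200
print(f"B200 costs {premium:.0f}% more; H200 is {savings:.0f}% cheaper")
```

The same $1.80/hr gap reads as a ~67% premium against the H200's price but only a ~40% discount against the B200's.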

Efficiency Metrics

| Metric | B200 SXM | H200 SXM |
| --- | --- | --- |
| TFLOPS / Watt (BF16) | 2.3 | 1.4 |
| VRAM / Dollar (GB per $/hr) | 40.1 | 52.4 |
| Bandwidth / Watt (GB/s/W) | 8.0 | 6.9 |
| Models runnable (FP16, 1 GPU) | 220 | 193 |
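These ratios follow directly from the spec and pricing tables; a sketch of the derivation, using each card's cheapest hourly rate (small differences from the figures above come only from rounding):

```python
specs = {
    # (BF16 TFLOPS, VRAM GB, bandwidth GB/s, TDP W, cheapest $/hr)
    "B200 SXM": (2250, 180, 8000, 1000, 4.49),
    "H200 SXM": (990, 141, 4800, 700, 2.69),
}

for name, (tflops, vram, bw, tdp, price) in specs.items():
    print(f"{name}: {tflops / tdp:.2f} TFLOPS/W, "
          f"{vram / price:.1f} GB per $/hr, {bw / tdp:.1f} GB/s/W")
```

Note that VRAM per dollar is the one metric where the H200 wins, because its price advantage (~40%) outweighs the B200's VRAM advantage (~28%).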

Model Compatibility (FP16, Single GPU)

Only on B200 SXM (27)

  • Code Llama 70B
  • Dolphin 2.9 72B
  • DeepSeek R1 Distill 70B
  • Llama 3 70B 1M Context
  • Llama 2 70B
  • Llama 3 70B
  • Llama 3.1 70B
  • Llama 3.3 70B
  • WizardMath 70B
  • Hermes 3 70B
  • HelpSteer2 Llama 3.1 70B
  • Llama 3.1 Nemotron 70B Instruct
  • Llama 3.1 Nemotron 70B Reward
  • Nemotron 70B
  • Qwen 2.5 72B
  • Qwen 2.5 Math 72B
  • Qwen 2.5 VL 72B
  • Llama 3.1 70B Turbo
  • Claude Sonnet 4
  • o1-mini
  • +7 more

Both (193)

  • Yi 1.5 34B
  • Yi 1.5 9B
  • Yi Coder 9B
  • Jamba 1.5 Mini
  • GTE Qwen2 7B
  • Marco O1
  • Qwen 1.5 MoE A2.7B
  • Qwen 2 Audio 7B
  • Qwen 2.5 14B
  • Qwen 2.5 32B
  • Qwen 2.5 3B
  • Qwen 2.5 Coder 32B
  • OLMo 2 13B
  • OLMo 2 7B
  • Amazon Nova Lite
  • Amazon Nova Pro
  • OpenELM 3B
  • BGE Large EN v1.5
  • BGE M3
  • Baichuan 2 13B
  • +173 more

Only on H200 SXM (0)

None
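The split above is driven almost entirely by VRAM: a 70B-class model at FP16 needs roughly 140 GB for weights alone, which leaves headroom in the B200's 180 GB but exceeds the H200's 141 GB once KV cache and runtime overhead are counted. A rough sizing check (the 20% overhead factor is an assumption, not a measured value):

```python
def fits(params_b: float, vram_gb: float, bytes_per_param: int = 2,
         overhead: float = 1.2) -> bool:
    """Rough check: weights * overhead (KV cache, activations) vs VRAM.
    params_b is in billions; billions of params * bytes ~= GB of weights."""
    return params_b * bytes_per_param * overhead <= vram_gb

for vram, name in [(180, "B200 SXM"), (141, "H200 SXM")]:
    print(f"70B model at FP16 on {name} ({vram} GB): {fits(70, vram)}")
```

With these assumptions a 70B model needs about 168 GB, so it fits on the B200 but not the H200, matching the 70B-class entries in the "Only on B200 SXM" list.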

Summary

The B200 SXM (Blackwell generation) offers 180 GB of HBM3e with 2,250 BF16 TFLOPS and 8,000 GB/s of memory bandwidth at a 1,000 W TDP.

The H200 SXM (Hopper generation) offers 141 GB of HBM3e with 990 BF16 TFLOPS and 4,800 GB/s of memory bandwidth at a 700 W TDP.

The B200 SXM has about 28% more VRAM (180 GB vs 141 GB), allowing it to run larger models, such as 70B-parameter models at FP16, without multi-GPU setups.

From a cost perspective, the H200 SXM is more affordable at its cheapest rate: $2.69/hr vs $4.49/hr for the B200 SXM.
