
H100 SXM vs L40S

Side-by-side comparison of the NVIDIA H100 SXM and the NVIDIA L40S for AI inference workloads.

Specifications

| Spec | H100 SXM | L40S |
|---|---|---|
| Generation | Hopper | Ada Lovelace |
| Memory Type | HBM3 | GDDR6 |
| VRAM | 80 GB | 48 GB |
| Memory Bandwidth | 3,350 GB/s | 864 GB/s |
| BF16 TFLOPS | 990 | 362 |
| FP16 TFLOPS | 990 | 362 |
| FP8 TFLOPS | 1,979 | 733 |
| INT8 TOPS | 1,979 | 733 |
| TDP | 700 W | 350 W |
| Interconnect | NVLink | PCIe |
| NVLink Bandwidth | 900 GB/s | N/A |
| Max GPUs per Node | 8 | 8 |
| PCIe Gen | Gen 5 | Gen 4 |
| CUDA Compute Capability | 9.0 | 8.9 |

Pricing

H100 SXM

| Provider | On-Demand | Reserved | Spot |
|---|---|---|---|
| RunPod | $4.18/hr | - | $3.29/hr |
| Lambda | $2.49/hr | $1.89/hr | - |
| CoreWeave | $3.79/hr | $2.57/hr | - |
| AWS | $5.12/hr | $3.59/hr | - |
| GCP | $4.85/hr | $3.40/hr | - |
| Azure | $4.98/hr | $3.49/hr | - |
| Vast.ai | $3.40/hr | - | $2.50/hr |
| TensorDock | $3.29/hr | - | $2.49/hr |
| FluidStack | $2.85/hr | - | $2.10/hr |

L40S

| Provider | On-Demand | Reserved | Spot |
|---|---|---|---|
| RunPod | $1.90/hr | - | $1.49/hr |
| Lambda | $1.59/hr | $1.19/hr | - |
| CoreWeave | $1.84/hr | $1.34/hr | - |
| AWS | $2.56/hr | $1.69/hr | - |
| GCP | $2.45/hr | $1.62/hr | - |
| Vast.ai | $1.29/hr | - | $0.95/hr |
| TensorDock | $1.19/hr | - | $0.89/hr |
| FluidStack | $1.09/hr | - | $0.85/hr |

Cheapest available rate: H100 SXM at $1.89/hr vs L40S at $0.85/hr. The L40S is about 55% cheaper (equivalently, the H100 SXM costs about 122% more per hour).
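The percentage comparison can be checked directly from the cheapest rates in the tables above (a snapshot; real prices change frequently):

```python
# Cheapest rates taken from the pricing tables on this page.
h100_cheapest = 1.89  # Lambda reserved, lowest H100 SXM rate listed
l40s_cheapest = 0.85  # FluidStack spot, lowest L40S rate listed

# The same gap reads as a 122% premium or a 55% saving, depending on the base.
premium = (h100_cheapest - l40s_cheapest) / l40s_cheapest * 100
savings = (h100_cheapest - l40s_cheapest) / h100_cheapest * 100
print(f"H100 SXM costs {premium:.0f}% more per hour")  # ~122% more
print(f"L40S is {savings:.0f}% cheaper per hour")      # ~55% cheaper
```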

Efficiency Metrics

| Metric | H100 SXM | L40S | Units |
|---|---|---|---|
| TFLOPS / Watt (BF16) | 1.4 | 1.0 | TFLOPS/W |
| VRAM / Dollar | 42.3 | 56.5 | GB/$/hr |
| Bandwidth / Watt | 4.8 | 2.5 | GB/s/W |
| Models (FP16, 1 GPU) | 182 | 162 | models |
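Each efficiency figure is a simple ratio of values from the spec and pricing tables; a sketch that reproduces them (dollar figures use each GPU's cheapest listed rate):

```python
# Specs copied from the tables on this page.
specs = {
    "H100 SXM": {"bf16_tflops": 990, "tdp_w": 700, "vram_gb": 80,
                 "bw_gbs": 3350, "cheapest_hr": 1.89},
    "L40S":     {"bf16_tflops": 362, "tdp_w": 350, "vram_gb": 48,
                 "bw_gbs": 864,  "cheapest_hr": 0.85},
}

for name, s in specs.items():
    tflops_per_w  = s["bf16_tflops"] / s["tdp_w"]      # BF16 TFLOPS per watt
    gb_per_dollar = s["vram_gb"] / s["cheapest_hr"]    # GB of VRAM per $/hr
    bw_per_w      = s["bw_gbs"] / s["tdp_w"]           # GB/s of bandwidth per watt
    print(f"{name}: {tflops_per_w:.1f} TFLOPS/W, "
          f"{gb_per_dollar:.1f} GB/$/hr, {bw_per_w:.1f} GB/s/W")
```

The bandwidth-per-watt gap (4.8 vs 2.5) is the H100's clearest efficiency win, since memory bandwidth is usually the binding constraint for LLM inference.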

Model Compatibility (FP16, Single GPU)

Only on H100 SXM (20)

  • Yi 1.5 34B
  • Qwen 2.5 32B
  • Qwen 2.5 Coder 32B
  • Aya 23 35B
  • Command R
  • DeepSeek Coder 33B
  • DeepSeek R1 Distill 32B
  • Gemma 2 27B
  • Gemma 3 27B
  • InternVL2 26B
  • Vicuna 33B
  • Code Llama 34B
  • Mistral Small 24B
  • Mistral Small 3.1 24B
  • Qwen 3 32B
  • WizardCoder 33B
  • Qwen 3 30B-A3B
  • JAIS 30B
  • Command R (August 2024)
  • MPT 30B

Both (162)

  • Yi 1.5 9B
  • Yi Coder 9B
  • GTE Qwen2 7B
  • Marco O1
  • Qwen 1.5 MoE A2.7B
  • Qwen 2 Audio 7B
  • Qwen 2.5 14B
  • Qwen 2.5 3B
  • OLMo 2 13B
  • OLMo 2 7B
  • Amazon Nova Lite
  • OpenELM 3B
  • BGE Large EN v1.5
  • BGE M3
  • Baichuan 2 13B
  • OctoCoder 15B
  • StarCoder2 15B
  • StarCoder2 3B
  • StarCoder2 7B
  • Aya 23 8B
  • +142 more

Only on L40S (0)

None
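The split above follows from a simple VRAM estimate: at FP16, each parameter takes 2 bytes, plus overhead for the KV cache, activations, and CUDA context. A rough rule-of-thumb sketch (the 20% overhead factor is an illustrative assumption, not a figure from this page):

```python
def fits_fp16(params_billions: float, vram_gb: float, overhead: float = 1.2) -> bool:
    """Rough check: do a model's FP16 weights, plus ~20% overhead for
    KV cache/activations/context, fit in the given VRAM?"""
    weight_gb = params_billions * 2  # 2 bytes per parameter at FP16
    return weight_gb * overhead <= vram_gb

# A ~32B model needs ~64 GB of weights alone: it fits on the 80 GB H100 SXM
# but not the 48 GB L40S, which is why the 27B-35B models above are H100-only.
print(fits_fp16(32, 80))  # True  - e.g. Qwen 2.5 32B on the H100 SXM
print(fits_fp16(32, 48))  # False - too large for the L40S
print(fits_fp16(9, 48))   # True  - e.g. Yi 1.5 9B fits both GPUs
```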

Summary

The H100 SXM (Hopper generation) offers 80 GB of HBM3 with 990 BF16 TFLOPS and 3,350 GB/s of memory bandwidth at a 700 W TDP.

The L40S (Ada Lovelace generation) offers 48 GB of GDDR6 with 362 BF16 TFLOPS and 864 GB/s of memory bandwidth at a 350 W TDP.

The H100 SXM has 67% more VRAM (80 GB vs 48 GB), allowing it to run larger models without multi-GPU setups.

From a cost perspective, the L40S is far more affordable: its cheapest listed rate is $0.85/hr (spot) vs $1.89/hr (reserved) for the H100 SXM.
