
H100 SXM vs L40S

Side-by-side comparison of the NVIDIA H100 SXM and the NVIDIA L40S for AI inference workloads.

Specifications

| Spec | H100 SXM | L40S |
|---|---|---|
| Generation | Hopper | Ada Lovelace |
| Memory Type | HBM3 | GDDR6 |
| VRAM | 80 GB | 48 GB |
| Memory Bandwidth | 3,350 GB/s | 864 GB/s |
| BF16 TFLOPS | 990 | 362 |
| FP16 TFLOPS | 990 | 362 |
| FP8 TFLOPS | 1,979 | 733 |
| INT8 TOPS | 1,979 | 733 |
| TDP | 700 W | 350 W |
| Interconnect | NVLink | PCIe |
| NVLink Bandwidth | 900 GB/s | N/A |
| Max GPUs per Node | 8 | 8 |
| PCIe Gen | Gen 5 | Gen 4 |
| CUDA Compute Capability | 9.0 | 8.9 |

Pricing

H100 SXM

| Provider | On-Demand | Reserved | Spot |
|---|---|---|---|
| RunPod | $4.18/hr | - | $3.29/hr |
| Lambda | $2.49/hr | $1.89/hr | - |
| CoreWeave | $3.79/hr | $2.57/hr | - |
| AWS | $5.12/hr | $3.59/hr | - |
| GCP | $4.85/hr | $3.40/hr | - |
| Azure | $4.98/hr | $3.49/hr | - |
| Vast.ai | $3.40/hr | - | $2.50/hr |
| TensorDock | $3.29/hr | - | $2.49/hr |
| FluidStack | $2.85/hr | - | $2.10/hr |

L40S

| Provider | On-Demand | Reserved | Spot |
|---|---|---|---|
| RunPod | $1.90/hr | - | $1.49/hr |
| Lambda | $1.59/hr | $1.19/hr | - |
| CoreWeave | $1.84/hr | $1.34/hr | - |
| AWS | $2.56/hr | $1.69/hr | - |
| GCP | $2.45/hr | $1.62/hr | - |
| Vast.ai | $1.29/hr | - | $0.95/hr |
| TensorDock | $1.19/hr | - | $0.89/hr |
| FluidStack | $1.09/hr | - | $0.85/hr |

Cheapest available rate: H100 SXM at $1.89/hr vs L40S at $0.85/hr. The L40S is about 55% cheaper (equivalently, the H100 SXM costs about 122% more per hour).
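The percentage comparison can be checked directly from the cheapest rates in the tables above (a snapshot; real prices change frequently):

```python
# Cheapest rates taken from the pricing tables on this page.
h100_cheapest = 1.89  # Lambda reserved, lowest H100 SXM rate listed
l40s_cheapest = 0.85  # FluidStack spot, lowest L40S rate listed

# The same gap reads as a 122% premium or a 55% saving, depending on the base.
premium = (h100_cheapest - l40s_cheapest) / l40s_cheapest * 100
savings = (h100_cheapest - l40s_cheapest) / h100_cheapest * 100
print(f"H100 SXM costs {premium:.0f}% more per hour")  # ~122% more
print(f"L40S is {savings:.0f}% cheaper per hour")      # ~55% cheaper
```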

Efficiency Metrics

| Metric | H100 SXM | L40S | Units |
|---|---|---|---|
| TFLOPS / Watt (BF16) | 1.4 | 1.0 | TFLOPS/W |
| VRAM / Dollar | 42.3 | 56.5 | GB/$/hr |
| Bandwidth / Watt | 4.8 | 2.5 | GB/s/W |
| Models (FP16, 1 GPU) | 182 | 162 | models |
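Each efficiency figure is a simple ratio of values from the spec and pricing tables; a sketch that reproduces them (dollar figures use each GPU's cheapest listed rate):

```python
# Specs copied from the tables on this page.
specs = {
    "H100 SXM": {"bf16_tflops": 990, "tdp_w": 700, "vram_gb": 80,
                 "bw_gbs": 3350, "cheapest_hr": 1.89},
    "L40S":     {"bf16_tflops": 362, "tdp_w": 350, "vram_gb": 48,
                 "bw_gbs": 864,  "cheapest_hr": 0.85},
}

for name, s in specs.items():
    tflops_per_w  = s["bf16_tflops"] / s["tdp_w"]      # BF16 TFLOPS per watt
    gb_per_dollar = s["vram_gb"] / s["cheapest_hr"]    # GB of VRAM per $/hr
    bw_per_w      = s["bw_gbs"] / s["tdp_w"]           # GB/s of bandwidth per watt
    print(f"{name}: {tflops_per_w:.1f} TFLOPS/W, "
          f"{gb_per_dollar:.1f} GB/$/hr, {bw_per_w:.1f} GB/s/W")
```

The bandwidth-per-watt gap (4.8 vs 2.5) is the H100's clearest efficiency win, since memory bandwidth is usually the binding constraint for LLM inference.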

Model Compatibility (FP16, Single GPU)

Only on H100 SXM (20)

  • Yi 1.5 34B
  • Qwen 2.5 32B
  • Qwen 2.5 Coder 32B
  • Aya 23 35B
  • Command R
  • DeepSeek Coder 33B
  • DeepSeek R1 Distill 32B
  • Gemma 2 27B
  • Gemma 3 27B
  • InternVL2 26B
  • Vicuna 33B
  • Code Llama 34B
  • Mistral Small 24B
  • Mistral Small 3.1 24B
  • Qwen 3 32B
  • WizardCoder 33B
  • Qwen 3 30B-A3B
  • JAIS 30B
  • Command R (August 2024)
  • MPT 30B

Both (162)

  • Yi 1.5 9B
  • Yi Coder 9B
  • GTE Qwen2 7B
  • Marco O1
  • Qwen 1.5 MoE A2.7B
  • Qwen 2 Audio 7B
  • Qwen 2.5 14B
  • Qwen 2.5 3B
  • OLMo 2 13B
  • OLMo 2 7B
  • Amazon Nova Lite
  • OpenELM 3B
  • BGE Large EN v1.5
  • BGE M3
  • Baichuan 2 13B
  • OctoCoder 15B
  • StarCoder2 15B
  • StarCoder2 3B
  • StarCoder2 7B
  • Aya 23 8B
  • +142 more

Only on L40S (0)

None
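The split above follows from a simple VRAM estimate: at FP16, each parameter takes 2 bytes, plus overhead for the KV cache, activations, and CUDA context. A rough rule-of-thumb sketch (the 20% overhead factor is an illustrative assumption, not a figure from this page):

```python
def fits_fp16(params_billions: float, vram_gb: float, overhead: float = 1.2) -> bool:
    """Rough check: do a model's FP16 weights, plus ~20% overhead for
    KV cache/activations/context, fit in the given VRAM?"""
    weight_gb = params_billions * 2  # 2 bytes per parameter at FP16
    return weight_gb * overhead <= vram_gb

# A ~32B model needs ~64 GB of weights alone: it fits on the 80 GB H100 SXM
# but not the 48 GB L40S, which is why the 27B-35B models above are H100-only.
print(fits_fp16(32, 80))  # True  - e.g. Qwen 2.5 32B on the H100 SXM
print(fits_fp16(32, 48))  # False - too large for the L40S
print(fits_fp16(9, 48))   # True  - e.g. Yi 1.5 9B fits both GPUs
```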

Summary

The H100 SXM (Hopper generation) offers 80 GB of HBM3 with 990 BF16 TFLOPS and 3,350 GB/s of memory bandwidth at a 700 W TDP.

The L40S (Ada Lovelace generation) offers 48 GB of GDDR6 with 362 BF16 TFLOPS and 864 GB/s of memory bandwidth at a 350 W TDP.

The H100 SXM has 67% more VRAM (80 GB vs 48 GB), allowing it to run larger models without multi-GPU setups.

From a cost perspective, the L40S is far more affordable: its cheapest listed rate is $0.85/hr (spot) vs $1.89/hr (reserved) for the H100 SXM.
