# H100 SXM vs L40S
Side-by-side comparison of the NVIDIA H100 SXM and the NVIDIA L40S for AI inference workloads.
## Specifications

| Spec | H100 SXM | L40S |
|---|---|---|
| Architecture | Hopper | Ada Lovelace |
| VRAM | 80 GB HBM3 | 48 GB GDDR6 |
| BF16 compute | 990 TFLOPS | 362 TFLOPS |
| Memory bandwidth | 3,350 GB/s | 864 GB/s |
| TDP | 700 W | 350 W |
## Pricing

### H100 SXM
| Provider | On-Demand | Reserved | Spot |
|---|---|---|---|
| RunPod | $4.18/hr | - | $3.29/hr |
| Lambda | $2.49/hr | $1.89/hr | - |
| CoreWeave | $3.79/hr | $2.57/hr | - |
| AWS | $5.12/hr | $3.59/hr | - |
| GCP | $4.85/hr | $3.40/hr | - |
| Azure | $4.98/hr | $3.49/hr | - |
| Vast.ai | $3.40/hr | - | $2.50/hr |
| TensorDock | $3.29/hr | - | $2.49/hr |
| FluidStack | $2.85/hr | - | $2.10/hr |
### L40S
| Provider | On-Demand | Reserved | Spot |
|---|---|---|---|
| RunPod | $1.90/hr | - | $1.49/hr |
| Lambda | $1.59/hr | $1.19/hr | - |
| CoreWeave | $1.84/hr | $1.34/hr | - |
| AWS | $2.56/hr | $1.69/hr | - |
| GCP | $2.45/hr | $1.62/hr | - |
| Vast.ai | $1.29/hr | - | $0.95/hr |
| TensorDock | $1.19/hr | - | $0.89/hr |
| FluidStack | $1.09/hr | - | $0.85/hr |
Cheapest available rate: $1.89/hr for the H100 SXM vs $0.85/hr for the L40S — the L40S is about 55% cheaper (put another way, the H100 SXM's cheapest rate is about 122% higher).
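The cheapest-rate comparison can be sketched as follows; the rate lists below are transcribed from the two tables above (all tiers pooled), and the two percentage figures show why "122% cheaper" was incorrect: the 122% figure is the H100's premium over the L40S, not the L40S's discount.

```python
# Cheapest available rate per GPU, pooling on-demand, reserved, and spot
# prices from the tables above.
h100_rates = [4.18, 3.29, 2.49, 1.89, 3.79, 2.57, 5.12, 3.59,
              4.85, 3.40, 4.98, 3.49, 3.40, 2.50, 3.29, 2.49, 2.85, 2.10]
l40s_rates = [1.90, 1.49, 1.59, 1.19, 1.84, 1.34, 2.56, 1.69,
              2.45, 1.62, 1.29, 0.95, 1.19, 0.89, 1.09, 0.85]

h100_min = min(h100_rates)  # 1.89 (Lambda reserved)
l40s_min = min(l40s_rates)  # 0.85 (FluidStack spot)

# Discount relative to the H100 price vs premium relative to the L40S price.
l40s_discount_pct = (h100_min - l40s_min) / h100_min * 100  # ~55%
h100_premium_pct = (h100_min - l40s_min) / l40s_min * 100   # ~122%

print(f"L40S is {l40s_discount_pct:.0f}% cheaper; "
      f"H100 SXM costs {h100_premium_pct:.0f}% more")
```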
## Efficiency Metrics

| Metric | H100 SXM | L40S |
|---|---|---|
| TFLOPS / Watt (BF16) | 1.4 | 1.0 |
| VRAM / Dollar (GB per $/hr) | 42.3 | 56.5 |
| Bandwidth / Watt (GB/s/W) | 4.8 | 2.5 |
| Models supported (FP16, single GPU) | 182 | 162 |
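Each efficiency metric derives directly from the headline specs and the cheapest rate found above; a minimal sketch, assuming the VRAM/Dollar figure uses the cheapest available rate from the pricing tables:

```python
# Derive efficiency metrics from the spec sheet and cheapest rates above.
specs = {
    "H100 SXM": {"vram_gb": 80, "bf16_tflops": 990, "bw_gbs": 3350,
                 "tdp_w": 700, "cheapest_rate": 1.89},
    "L40S":     {"vram_gb": 48, "bf16_tflops": 362, "bw_gbs": 864,
                 "tdp_w": 350, "cheapest_rate": 0.85},
}

metrics = {}
for gpu, s in specs.items():
    metrics[gpu] = {
        "tflops_per_watt": round(s["bf16_tflops"] / s["tdp_w"], 1),   # BF16 TFLOPS / TDP
        "vram_per_dollar": round(s["vram_gb"] / s["cheapest_rate"], 1),  # GB per $/hr
        "bw_per_watt":     round(s["bw_gbs"] / s["tdp_w"], 1),        # GB/s per W
    }
```

Running this reproduces the table: 990/700 ≈ 1.4 and 3,350/700 ≈ 4.8 for the H100 SXM, while the L40S wins on VRAM per dollar (48/0.85 ≈ 56.5 vs 80/1.89 ≈ 42.3) because of its much lower rental price.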
## Model Compatibility (FP16, Single GPU)

### Only on H100 SXM (20)
- Yi 1.5 34B
- Qwen 2.5 32B
- Qwen 2.5 Coder 32B
- Aya 23 35B
- Command R
- DeepSeek Coder 33B
- DeepSeek R1 Distill 32B
- Gemma 2 27B
- Gemma 3 27B
- InternVL2 26B
- Vicuna 33B
- Code Llama 34B
- Mistral Small 24B
- Mistral Small 3.1 24B
- Qwen 3 32B
- WizardCoder 33B
- Qwen 3 30B-A3B
- JAIS 30B
- Command R (August 2024)
- MPT 30B
### Both (162)
- Yi 1.5 9B
- Yi Coder 9B
- GTE Qwen2 7B
- Marco O1
- Qwen 1.5 MoE A2.7B
- Qwen 2 Audio 7B
- Qwen 2.5 14B
- Qwen 2.5 3B
- OLMo 2 13B
- OLMo 2 7B
- Amazon Nova Lite
- OpenELM 3B
- BGE Large EN v1.5
- BGE M3
- Baichuan 2 13B
- OctoCoder 15B
- StarCoder2 15B
- StarCoder2 3B
- StarCoder2 7B
- Aya 23 8B
- +142 more
### Only on L40S (0)
None
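The split above follows from VRAM arithmetic: FP16 weights take 2 bytes per parameter, so the ~27B–34B models in the H100-only list need roughly 54–68 GB for weights alone, which exceeds the L40S's 48 GB but fits in the H100 SXM's 80 GB. A rough fit check, assuming a hypothetical 20% overhead factor for KV cache and activations (not the exact methodology behind the counts above):

```python
def fits_fp16(params_billions: float, vram_gb: float, overhead: float = 1.2) -> bool:
    """Rough single-GPU fit check for FP16 inference.

    FP16 weights occupy 2 bytes per parameter; `overhead` adds headroom
    for KV cache and activations (assumed rule of thumb, not measured).
    """
    return params_billions * 2 * overhead <= vram_gb

# A 32B model (e.g. Qwen 2.5 32B) needs ~76.8 GB with overhead:
# fits the H100 SXM's 80 GB, not the L40S's 48 GB.
print(fits_fp16(32, 80))  # True
print(fits_fp16(32, 48))  # False
# A 14B model (e.g. Qwen 2.5 14B) fits both at ~33.6 GB.
print(fits_fp16(14, 48))  # True
```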
## Summary
The H100 SXM (Hopper generation) offers 80GB of HBM3 with 990 BF16 TFLOPS and 3,350 GB/s memory bandwidth at 700W TDP.
The L40S (Ada Lovelace generation) offers 48GB of GDDR6 with 362 BF16 TFLOPS and 864 GB/s memory bandwidth at 350W TDP.
The H100 SXM has +67% more VRAM, allowing it to run larger models without multi-GPU setups.
From a cost perspective, the L40S is more affordable at $0.85/hr vs $1.89/hr for the H100 SXM.