L40S vs L40
Side-by-side comparison of the NVIDIA L40S and the NVIDIA L40 for AI inference workloads.
Specifications
Pricing
L40S
| Provider | On-Demand | Reserved | Spot |
|---|---|---|---|
| RunPod | $1.90/hr | - | $1.49/hr |
| Lambda | $1.59/hr | $1.19/hr | - |
| CoreWeave | $1.84/hr | $1.34/hr | - |
| AWS | $2.56/hr | $1.69/hr | - |
| GCP | $2.45/hr | $1.62/hr | - |
| Vast.ai | $1.29/hr | - | $0.95/hr |
| TensorDock | $1.19/hr | - | $0.89/hr |
| Fluidstack | $1.09/hr | - | $0.85/hr |
L40
| Provider | On-Demand | Reserved | Spot |
|---|---|---|---|
| RunPod | $1.59/hr | - | $1.19/hr |
| CoreWeave | $1.58/hr | $1.14/hr | - |
| Vast.ai | $1.09/hr | - | $0.79/hr |
| TensorDock | $0.99/hr | - | $0.75/hr |
Cheapest available rate: L40S at $0.85/hr vs L40 at $0.75/hr. The L40 is about 12% cheaper (equivalently, the L40S costs about 13% more).
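The cheapest-rate comparison can be reproduced from the two pricing tables. The sketch below is illustrative only: the dictionary layout and the `cheapest` helper are assumptions, and the rates are copied from the tables as (on-demand, reserved, spot) with `None` for a tier that is not listed.

```python
# Hourly rates in USD as (on-demand, reserved, spot); None = tier not offered.
L40S_RATES = {
    "RunPod": (1.90, None, 1.49),
    "Lambda": (1.59, 1.19, None),
    "CoreWeave": (1.84, 1.34, None),
    "AWS": (2.56, 1.69, None),
    "GCP": (2.45, 1.62, None),
    "Vast.ai": (1.29, None, 0.95),
    "TensorDock": (1.19, None, 0.89),
    "Fluidstack": (1.09, None, 0.85),
}
L40_RATES = {
    "RunPod": (1.59, None, 1.19),
    "CoreWeave": (1.58, 1.14, None),
    "Vast.ai": (1.09, None, 0.79),
    "TensorDock": (0.99, None, 0.75),
}


def cheapest(rates):
    """Return (price, provider) for the lowest listed rate across all tiers."""
    return min(
        (price, provider)
        for provider, tiers in rates.items()
        for price in tiers
        if price is not None
    )


l40s_price, l40s_provider = cheapest(L40S_RATES)
l40_price, l40_provider = cheapest(L40_RATES)

# The same gap expressed against each baseline.
saving_pct = (l40s_price - l40_price) / l40s_price * 100   # L40 relative to L40S
premium_pct = (l40s_price - l40_price) / l40_price * 100   # L40S relative to L40

print(f"L40S: ${l40s_price:.2f}/hr ({l40s_provider})")
print(f"L40:  ${l40_price:.2f}/hr ({l40_provider})")
print(f"L40 is {saving_pct:.1f}% cheaper; L40S costs {premium_pct:.1f}% more")
```

The two percentages differ because they use different baselines: the gap is about 11.8% of the L40S price but about 13.3% of the L40 price.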
Efficiency Metrics
| Metric | L40S | L40 | Unit |
|---|---|---|---|
| TFLOPS / Watt | 1.0 | 1.2 | BF16 TFLOPS/W |
| VRAM / Dollar | 56.5 | 64.0 | GB/$/hr |
| Bandwidth / Watt | 2.5 | 2.9 | GB/s/W |
| Models (FP16, 1 GPU) | 162 | 162 | models |
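The page does not state how the efficiency figures above are derived. As a hedged sketch, the ratios below (spec divided by TDP, and VRAM divided by the cheapest hourly rate) reproduce the listed values when fed the specifications from the Summary section. The `Gpu` class and its field names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    """Specs as listed on this page; field names are illustrative."""
    name: str
    bf16_tflops: float
    vram_gb: float
    bandwidth_gbs: float
    tdp_w: float
    cheapest_usd_hr: float

    def metrics(self):
        # Assumed formulas: each metric is a plain ratio, rounded to one decimal.
        return {
            "tflops_per_watt": round(self.bf16_tflops / self.tdp_w, 1),
            "vram_gb_per_usd_hr": round(self.vram_gb / self.cheapest_usd_hr, 1),
            "bandwidth_gbs_per_watt": round(self.bandwidth_gbs / self.tdp_w, 1),
        }


L40S = Gpu("L40S", bf16_tflops=362, vram_gb=48, bandwidth_gbs=864,
           tdp_w=350, cheapest_usd_hr=0.85)
L40 = Gpu("L40", bf16_tflops=362, vram_gb=48, bandwidth_gbs=864,
          tdp_w=300, cheapest_usd_hr=0.75)

for gpu in (L40S, L40):
    print(gpu.name, gpu.metrics())
```

Because the listed TFLOPS, VRAM, and bandwidth are identical for both cards, every difference in these ratios comes from the lower TDP and lower cheapest rate of the L40.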
Model Compatibility (FP16, Single GPU)
Only on L40S (0)
None
Both (162)
- Yi 1.5 9B
- Yi Coder 9B
- GTE Qwen2 7B
- Marco O1
- Qwen 1.5 MoE A2.7B
- Qwen 2 Audio 7B
- Qwen 2.5 14B
- Qwen 2.5 3B
- OLMo 2 13B
- OLMo 2 7B
- Amazon Nova Lite
- OpenELM 3B
- BGE Large EN v1.5
- BGE M3
- Baichuan 2 13B
- OctoCoder 15B
- StarCoder2 15B
- StarCoder2 3B
- StarCoder2 7B
- Aya 23 8B
- +142 more
Only on L40 (0)
None
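Both cards carry 48 GB of VRAM, which is consistent with the two compatibility lists being identical. The page does not give its fit criterion, so the check below is only an assumed rule of thumb: FP16 weights take 2 bytes per parameter, plus a hypothetical 20% allowance for activations and KV cache. Parameter counts are the nominal sizes from the model names.

```python
BYTES_PER_PARAM_FP16 = 2  # FP16 stores each weight in 2 bytes


def fp16_weight_gb(params_billion):
    """Approximate weight footprint in GB (1e9 params * 2 bytes = 2 GB per billion)."""
    return params_billion * BYTES_PER_PARAM_FP16


def fits(params_billion, vram_gb=48.0, overhead_fraction=0.2):
    """Assumed rule: weights plus a fractional overhead must fit in VRAM."""
    return fp16_weight_gb(params_billion) * (1 + overhead_fraction) <= vram_gb


# Nominal sizes taken from the model names in the list above.
for name, size_b in [("Qwen 2.5 14B", 14), ("StarCoder2 15B", 15), ("Yi 1.5 9B", 9)]:
    print(f"{name}: {fp16_weight_gb(size_b):.0f} GB weights, fits={fits(size_b)}")

# A hypothetical 70B model needs about 140 GB of FP16 weights and would not fit.
print("70B model fits:", fits(70))
```

Under this rule the result depends only on VRAM, so any model that fits one card fits the other.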
Summary
The L40S (Ada Lovelace generation) offers 48 GB of GDDR6 with 362 BF16 TFLOPS and 864 GB/s memory bandwidth at 350 W TDP.
The L40 (Ada Lovelace generation) offers 48 GB of GDDR6 with 362 BF16 TFLOPS and 864 GB/s memory bandwidth at 300 W TDP.
From a cost perspective, the L40 is more affordable: its cheapest listed rate is $0.75/hr vs $0.85/hr for the L40S.