What is the difference between L40S and L40?

In this comparison the two cards share the same headline specs: 48 GB of GDDR6, 362 BF16 TFLOPS, and 864 GB/s of memory bandwidth. The listed difference is TDP: 350 W for the L40S vs 300 W for the L40.

Which GPU is cheaper, L40S or L40?

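The cheapest listed rate for the L40S is $0.85/hr, while the L40 starts at $0.75/hr; both are spot prices. On-demand, the L40S starts at $1.09/hr and the L40 at $0.99/hr. At the lowest listed rate, the L40 is about 12% cheaper.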

How many AI models fit on the L40S vs the L40?

At FP16 precision on a single GPU, the L40S and the L40 can each run the same 162 models from our catalog. No model fits on one card but not the other.

L40S vs L40

Side-by-side comparison of the NVIDIA L40S and the NVIDIA L40 for AI inference workloads.

Specifications

Spec	L40S	L40
Generation	Ada	Ada
Memory Type	GDDR6	GDDR6
VRAM	48 GB	48 GB
Memory Bandwidth	864 GB/s	864 GB/s
BF16 TFLOPS	362	362
FP16 TFLOPS	362	362
FP8 TFLOPS	733	733
INT8 TOPS	733	733
TDP	350 W	300 W
Interconnect	PCIe	PCIe
Max GPUs per Node	8	8
PCIe Gen	Gen 4	Gen 4
CUDA Compute Capability	8.9	8.9

Pricing

L40S

Provider	On-Demand	Reserved	Spot
runpod	$1.90/hr	-	$1.49/hr
lambda	$1.59/hr	$1.19/hr	-
coreweave	$1.84/hr	$1.34/hr	-
aws	$2.56/hr	$1.69/hr	-
gcp	$2.45/hr	$1.62/hr	-
vast ai	$1.29/hr	-	$0.95/hr
tensordock	$1.19/hr	-	$0.89/hr
fluidstack	$1.09/hr	-	$0.85/hr

L40

Provider	On-Demand	Reserved	Spot
runpod	$1.59/hr	-	$1.19/hr
coreweave	$1.58/hr	$1.14/hr	-
vast ai	$1.09/hr	-	$0.79/hr
tensordock	$0.99/hr	-	$0.75/hr

Cheapest available rate (spot): L40S at $0.85/hr vs L40 at $0.75/hr. The L40 is about 12% cheaper.
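The cheapest-rate comparison can be reproduced from the two pricing tables. The sketch below is illustrative only: the dictionary layout and the helper names `cheapest_rate` and `percent_cheaper` are assumptions made for this example, not part of any provider API. It also shows why the percentage depends on the baseline: measured against the L40S price the L40 is about 11.8% cheaper, while measured against the L40 price the L40S costs about 13.3% more.

```python
# Rates in $/hr, copied from the pricing tables; None marks a tier with no listed price.
L40S_RATES = {
    "runpod":     {"on_demand": 1.90, "reserved": None, "spot": 1.49},
    "lambda":     {"on_demand": 1.59, "reserved": 1.19, "spot": None},
    "coreweave":  {"on_demand": 1.84, "reserved": 1.34, "spot": None},
    "aws":        {"on_demand": 2.56, "reserved": 1.69, "spot": None},
    "gcp":        {"on_demand": 2.45, "reserved": 1.62, "spot": None},
    "vast ai":    {"on_demand": 1.29, "reserved": None, "spot": 0.95},
    "tensordock": {"on_demand": 1.19, "reserved": None, "spot": 0.89},
    "fluidstack": {"on_demand": 1.09, "reserved": None, "spot": 0.85},
}
L40_RATES = {
    "runpod":     {"on_demand": 1.59, "reserved": None, "spot": 1.19},
    "coreweave":  {"on_demand": 1.58, "reserved": 1.14, "spot": None},
    "vast ai":    {"on_demand": 1.09, "reserved": None, "spot": 0.79},
    "tensordock": {"on_demand": 0.99, "reserved": None, "spot": 0.75},
}


def cheapest_rate(rates, tier=None):
    """Return (price, provider, tier) for the lowest listed price.

    If `tier` is given, only that pricing tier is considered.
    """
    candidates = [
        (price, provider, t)
        for provider, tiers in rates.items()
        for t, price in tiers.items()
        if price is not None and (tier is None or t == tier)
    ]
    return min(candidates)


def percent_cheaper(cheaper, pricier):
    """Saving of `cheaper` relative to `pricier`, in percent."""
    return (pricier - cheaper) / pricier * 100


if __name__ == "__main__":
    l40s_price, l40s_provider, l40s_tier = cheapest_rate(L40S_RATES)
    l40_price, l40_provider, l40_tier = cheapest_rate(L40_RATES)
    print(f"L40S: ${l40s_price:.2f}/hr ({l40s_provider}, {l40s_tier})")
    print(f"L40:  ${l40_price:.2f}/hr ({l40_provider}, {l40_tier})")
    print(f"L40 is {percent_cheaper(l40_price, l40s_price):.1f}% cheaper")
```

Passing `tier="on_demand"` restricts the comparison to on-demand prices, which gives $1.09/hr for the L40S and $0.99/hr for the L40.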

Efficiency Metrics

Metric	L40S	L40
BF16 TFLOPS / Watt	1.0	1.2
VRAM / Dollar (GB per $/hr)	56.5	64.0
Bandwidth / Watt (GB/s/W)	2.5	2.9
Models (FP16, 1 GPU)	162	162
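The efficiency figures follow from the specification table and the cheapest listed rates by simple division. The sketch below recomputes them. The `Gpu` dataclass and the `efficiency` helper are hypothetical names introduced for this example, and rounding to one decimal place is assumed to match how the page displays the values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    bf16_tflops: float     # from the Specifications table
    vram_gb: float
    bandwidth_gb_s: float
    tdp_w: float
    cheapest_rate: float   # lowest listed $/hr across all providers and tiers


def efficiency(gpu):
    """Derive the per-watt and per-dollar metrics, rounded to one decimal."""
    return {
        "tflops_per_watt": round(gpu.bf16_tflops / gpu.tdp_w, 1),
        "vram_gb_per_dollar_hr": round(gpu.vram_gb / gpu.cheapest_rate, 1),
        "bandwidth_per_watt": round(gpu.bandwidth_gb_s / gpu.tdp_w, 1),
    }


L40S = Gpu("L40S", bf16_tflops=362, vram_gb=48, bandwidth_gb_s=864,
           tdp_w=350, cheapest_rate=0.85)
L40 = Gpu("L40", bf16_tflops=362, vram_gb=48, bandwidth_gb_s=864,
          tdp_w=300, cheapest_rate=0.75)

if __name__ == "__main__":
    for gpu in (L40S, L40):
        print(gpu.name, efficiency(gpu))
```

Because the compute, memory, and bandwidth figures are identical for both cards in this comparison, every per-watt gap comes from the 350 W vs 300 W TDP, and the VRAM-per-dollar gap comes from the price difference alone.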

Model Compatibility (FP16, Single GPU)

Only on L40S (0)

None

Both (162)

Yi 1.5 9B
Yi Coder 9B
GTE Qwen2 7B
Marco O1
Qwen 1.5 MoE A2.7B
Qwen 2 Audio 7B
Qwen 2.5 14B
Qwen 2.5 3B
OLMo 2 13B
OLMo 2 7B
Amazon Nova Lite
OpenELM 3B
BGE Large EN v1.5
BGE M3
Baichuan 2 13B
OctoCoder 15B
StarCoder2 15B
StarCoder2 3B
StarCoder2 7B
Aya 23 8B
+142 more

Only on L40 (0)

None
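The page does not state how the catalog decides whether a model fits, so the following is only a rough rule of thumb, not the catalog's method. FP16 stores each parameter in 2 bytes, so the weights of an N-billion-parameter model need about 2N GB (taking 1 GB as 10^9 bytes). The `headroom` fraction reserved for KV cache and activations is a hypothetical value chosen for illustration, and the parameter counts are read off the model names.

```python
BYTES_PER_PARAM_FP16 = 2  # FP16 uses 16 bits per parameter


def fp16_weights_gb(params_billions):
    """Approximate size of the FP16 weights in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * BYTES_PER_PARAM_FP16 / 1e9


def fits_on_gpu(params_billions, vram_gb=48.0, headroom=0.2):
    """Rough single-GPU fit check.

    `headroom` is an assumed fraction of VRAM kept free for the KV cache,
    activations, and runtime overhead; real requirements vary by workload.
    """
    usable_gb = vram_gb * (1 - headroom)
    return fp16_weights_gb(params_billions) <= usable_gb


if __name__ == "__main__":
    # Parameter counts (in billions) inferred from the model names in the list.
    for name, size in [("StarCoder2 15B", 15), ("Qwen 2.5 14B", 14), ("Yi 1.5 9B", 9)]:
        print(name, f"{fp16_weights_gb(size):.0f} GB", fits_on_gpu(size))
```

Since both cards have 48 GB of VRAM, any check that depends only on memory capacity gives the same answer for the L40S and the L40, which is consistent with the identical model counts.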

Summary

The L40S and the L40 (both Ada generation) each offer 48 GB of GDDR6, 362 BF16 TFLOPS, and 864 GB/s of memory bandwidth. They differ in TDP: 350 W for the L40S vs 300 W for the L40.

From a cost perspective, the L40 is more affordable: its cheapest listed rate is $0.75/hr vs $0.85/hr for the L40S, both spot prices.
