What is the difference between RTX 4090 and L40S?

The RTX 4090 has 24 GB of GDDR6X with 165 BF16 TFLOPS, while the L40S has 48 GB of GDDR6 with 362 BF16 TFLOPS. The RTX 4090 has higher memory bandwidth: 1,008 GB/s vs 864 GB/s for the L40S.

Which GPU is cheaper, RTX 4090 or L40S?

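The cheapest available rate (spot pricing) for the RTX 4090 is $0.39/hr, while the L40S starts at $0.85/hr. The RTX 4090 is about 54% cheaper; equivalently, the L40S costs about 118% more. On-demand rates start at $0.59/hr for the RTX 4090 and $1.09/hr for the L40S.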

How many AI models fit on the RTX 4090 vs the L40S?

At FP16 precision on a single GPU, the RTX 4090 can run 136 models from our catalog, while the L40S can run 175 models. The L40S supports 39 more models due to its 48GB VRAM.

RTX 4090 vs L40S

Side-by-side comparison of the NVIDIA RTX 4090 and the NVIDIA L40S for AI inference workloads.

Specifications

Spec	RTX 4090	L40S
Generation	Ada	Ada
Memory Type	GDDR6X	GDDR6
VRAM	24 GB	48 GB
Memory Bandwidth	1,008 GB/s	864 GB/s
BF16 TFLOPS	165	362
FP16 TFLOPS	165	362
FP8 TFLOPS	330	733
INT8 TOPS	330	733
TDP	450 W	350 W
Interconnect	PCIe	PCIe
Max GPUs per Node	4	8
PCIe Gen	Gen 4	Gen 4
CUDA Compute Capability	8.9	8.9

Pricing

RTX 4090

Provider	On-Demand	Reserved	Spot
RunPod	$1.10/hr	-	$0.79/hr
Lambda	$0.89/hr	-	-
Vast.ai	$0.74/hr	-	$0.44/hr
TensorDock	$0.69/hr	-	$0.44/hr
FluidStack	$0.59/hr	-	$0.39/hr

L40S

Provider	On-Demand	Reserved	Spot
RunPod	$1.90/hr	-	$1.49/hr
Lambda	$1.59/hr	$1.19/hr	-
CoreWeave	$1.84/hr	$1.34/hr	-
AWS	$2.56/hr	$1.69/hr	-
GCP	$2.45/hr	$1.62/hr	-
Vast.ai	$1.29/hr	-	$0.95/hr
TensorDock	$1.19/hr	-	$0.89/hr
FluidStack	$1.09/hr	-	$0.85/hr

Cheapest available rate (spot): RTX 4090 at $0.39/hr vs L40S at $0.85/hr. The RTX 4090 is about 54% cheaper (the L40S costs about 118% more).
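The two percentages quoted for this price gap differ only in which price is used as the baseline. A minimal sketch of that arithmetic, using the cheapest rates from the tables above (the function names are illustrative, not part of any pricing API):

```python
def pct_cheaper(low: float, high: float) -> float:
    """Percent by which `low` undercuts `high`, measured against `high`."""
    return (high - low) / high * 100


def pct_more_expensive(low: float, high: float) -> float:
    """Percent by which `high` exceeds `low`, measured against `low`."""
    return (high - low) / low * 100


# Cheapest listed rates (both are spot prices in the tables above).
rtx_4090_usd_hr = 0.39
l40s_usd_hr = 0.85

print(f"RTX 4090 is {pct_cheaper(rtx_4090_usd_hr, l40s_usd_hr):.0f}% cheaper")        # 54%
print(f"L40S is {pct_more_expensive(rtx_4090_usd_hr, l40s_usd_hr):.0f}% more expensive")  # 118%
```

The same $0.46/hr gap reads as 54% or 118% depending on the denominator, so it helps to state which GPU is the reference.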

Efficiency Metrics

Metric	RTX 4090	L40S
BF16 TFLOPS / Watt	0.4	1.0
VRAM / Dollar (GB per $/hr)	61.5	56.5
Bandwidth / Watt (GB/s/W)	2.2	2.5
Models (FP16, 1 GPU)	136	175
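The efficiency figures above appear to be simple ratios of the specification and pricing values on this page. A minimal sketch that reproduces them, assuming TDP is the power figure and the cheapest listed hourly rate is the price (the `Gpu` class and its field names are hypothetical, introduced only for illustration):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    vram_gb: float
    bandwidth_gbs: float
    bf16_tflops: float
    tdp_w: float
    cheapest_usd_hr: float  # assumption: cheapest listed rate (spot)

    def tflops_per_watt(self) -> float:
        return self.bf16_tflops / self.tdp_w

    def vram_per_dollar(self) -> float:
        return self.vram_gb / self.cheapest_usd_hr

    def bandwidth_per_watt(self) -> float:
        return self.bandwidth_gbs / self.tdp_w


rtx_4090 = Gpu("RTX 4090", 24, 1008, 165, 450, 0.39)
l40s = Gpu("L40S", 48, 864, 362, 350, 0.85)

for gpu in (rtx_4090, l40s):
    print(
        f"{gpu.name}: "
        f"{gpu.tflops_per_watt():.1f} TFLOPS/W, "
        f"{gpu.vram_per_dollar():.1f} GB per $/hr, "
        f"{gpu.bandwidth_per_watt():.1f} GB/s/W"
    )
# RTX 4090: 0.4 TFLOPS/W, 61.5 GB per $/hr, 2.2 GB/s/W
# L40S: 1.0 TFLOPS/W, 56.5 GB per $/hr, 2.5 GB/s/W
```

Swapping in on-demand prices instead of spot prices changes the VRAM per dollar values but not the per-watt ratios.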

Model Compatibility (FP16, Single GPU)

Only on RTX 4090 (0)

None

Both (136)

Yi 1.5 9B
Yi Coder 9B
GTE Qwen2 7B
Marco O1
Qwen 2 Audio 7B
Qwen 2.5 3B
OLMo 2 7B
OpenELM 3B
BGE Base EN v1.5
BGE Large EN v1.5
BGE M3
BGE Small EN v1.5
Baichuan 2 7B
SantaCoder 1.1B
StarCoder2 3B
StarCoder2 7B
BTLM 3B
Aya 23 8B
Command R 7B
Cohere Embed English v3
+116 more

Only on L40S (39)

Qwen 1.5 MoE A2.7B
Qwen 2.5 14B
OLMo 2 13B
Amazon Nova Lite
Claude 3.5 Haiku
Baichuan 2 13B
OctoCoder 15B
StarCoder2 15B
FLUX.1 Dev
FLUX.2
Cerebras GPT 13B
DeepSeek MoE 16B
DeepSeek R1 Distill 14B
DeepSeek V2 Lite
Gemma 3 12B
InternLM 2.5 20B
ELYZA 13B
KULLM 12.8B
Vicuna 13B
Code Llama 13B
+19 more
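The split between the two lists is consistent with a rough weights-only estimate: at FP16 each parameter takes 2 bytes, so a model needs about 2 GB of VRAM per billion parameters before any KV cache or activation overhead. The sketch below illustrates that rule of thumb only; the exact fit criterion behind the catalog is not stated on this page, and the parameter counts are the nominal sizes from the model names.

```python
BYTES_PER_PARAM_FP16 = 2


def fp16_weights_gb(params_billion: float) -> float:
    """Approximate FP16 weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM_FP16


def fits(params_billion: float, vram_gb: float, headroom_gb: float = 0.0) -> bool:
    """True if the weights plus optional headroom fit in the given VRAM.

    `headroom_gb` is a hypothetical allowance for KV cache and activations.
    """
    return fp16_weights_gb(params_billion) + headroom_gb <= vram_gb


# Nominal sizes taken from model names in the lists above.
for name, size_b in [("Yi 1.5 9B", 9), ("Qwen 2.5 14B", 14), ("InternLM 2.5 20B", 20)]:
    print(
        f"{name}: ~{fp16_weights_gb(size_b):.0f} GB weights, "
        f"24 GB: {fits(size_b, 24)}, 48 GB: {fits(size_b, 48)}"
    )
# Yi 1.5 9B: ~18 GB weights, 24 GB: True, 48 GB: True
# Qwen 2.5 14B: ~28 GB weights, 24 GB: False, 48 GB: True
# InternLM 2.5 20B: ~40 GB weights, 24 GB: False, 48 GB: True
```

In practice some headroom is needed for the KV cache and activations, so models near the boundary may not fit even when the weights alone do.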

Summary

The RTX 4090 (Ada generation) offers 24 GB of GDDR6X with 165 BF16 TFLOPS and 1,008 GB/s memory bandwidth at 450 W TDP.

The L40S (Ada generation) offers 48 GB of GDDR6 with 362 BF16 TFLOPS and 864 GB/s memory bandwidth at 350 W TDP.

The L40S has twice the VRAM (48 GB vs 24 GB), allowing it to run larger models without multi-GPU setups.

From a cost perspective, the RTX 4090 is more affordable: its cheapest spot rate is $0.39/hr vs $0.85/hr for the L40S, and its cheapest on-demand rate is $0.59/hr vs $1.09/hr.
