
L40S vs L40

Side-by-side comparison of the NVIDIA L40S and the NVIDIA L40 for AI inference workloads.

Specifications

Spec                       L40S       L40
Generation                 Ada        Ada
Memory Type                GDDR6      GDDR6
VRAM                       48 GB      48 GB
Memory Bandwidth           864 GB/s   864 GB/s
BF16 TFLOPS                362        362
FP16 TFLOPS                362        362
FP8 TFLOPS                 733        733
INT8 TOPS                  733        733
TDP                        350 W      300 W
Interconnect               PCIe       PCIe
Max GPUs per Node          8          8
PCIe Gen                   Gen 4      Gen 4
CUDA Compute Capability    8.9        8.9

Pricing

L40S

Provider     On-Demand   Reserved   Spot
RunPod       $1.90/hr    -          $1.49/hr
Lambda       $1.59/hr    $1.19/hr   -
CoreWeave    $1.84/hr    $1.34/hr   -
AWS          $2.56/hr    $1.69/hr   -
GCP          $2.45/hr    $1.62/hr   -
Vast.ai      $1.29/hr    -          $0.95/hr
TensorDock   $1.19/hr    -          $0.89/hr
FluidStack   $1.09/hr    -          $0.85/hr

L40

Provider     On-Demand   Reserved   Spot
RunPod       $1.59/hr    -          $1.19/hr
CoreWeave    $1.58/hr    $1.14/hr   -
Vast.ai      $1.09/hr    -          $0.79/hr
TensorDock   $0.99/hr    -          $0.75/hr

Cheapest available rate: L40S at $0.85/hr (FluidStack spot) vs. L40 at $0.75/hr (TensorDock spot). At these rates the L40 is about 12% cheaper, or equivalently the L40S is about 13% more expensive.
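The cheapest-rate comparison above can be reproduced from the pricing tables. A minimal sketch, using the listed prices (cloud rates change frequently, so treat these as a snapshot):

```python
# All per-hour rates from the pricing tables above (on-demand, reserved, and spot).
l40s_rates = [1.90, 1.49, 1.59, 1.19, 1.84, 1.34, 2.56, 1.69, 2.45, 1.62, 1.29, 0.95, 1.19, 0.89, 1.09, 0.85]
l40_rates = [1.59, 1.19, 1.58, 1.14, 1.09, 0.79, 0.99, 0.75]

cheapest_l40s = min(l40s_rates)  # 0.85 (FluidStack spot)
cheapest_l40 = min(l40_rates)    # 0.75 (TensorDock spot)

# Premium of the L40S relative to the cheaper L40
premium = (cheapest_l40s - cheapest_l40) / cheapest_l40 * 100
print(f"L40S is {premium:.0f}% more expensive at the cheapest rate")  # -> 13%
```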

Efficiency Metrics

Metric                           L40S    L40
TFLOPS / Watt (BF16)             1.0     1.2
VRAM / Dollar (GB per $/hr)      56.5    64.0
Bandwidth / Watt (GB/s per W)    2.5     2.9
Models (FP16, 1 GPU)             162     162
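These efficiency figures follow directly from the spec and pricing tables: divide BF16 TFLOPS and memory bandwidth by TDP, and VRAM by the cheapest hourly rate. A small sketch using the values on this page:

```python
# Specs and cheapest hourly rates taken from the tables on this page.
specs = {
    "L40S": {"bf16_tflops": 362, "bandwidth_gbs": 864, "vram_gb": 48, "tdp_w": 350, "cheapest_hr": 0.85},
    "L40":  {"bf16_tflops": 362, "bandwidth_gbs": 864, "vram_gb": 48, "tdp_w": 300, "cheapest_hr": 0.75},
}

for name, s in specs.items():
    tflops_per_watt = s["bf16_tflops"] / s["tdp_w"]      # compute per watt
    vram_per_dollar = s["vram_gb"] / s["cheapest_hr"]    # memory per $/hr
    bw_per_watt = s["bandwidth_gbs"] / s["tdp_w"]        # bandwidth per watt
    print(f"{name}: {tflops_per_watt:.1f} TFLOPS/W, "
          f"{vram_per_dollar:.1f} GB/$/hr, {bw_per_watt:.1f} GB/s/W")
```

Since the two cards share identical compute, memory, and bandwidth specs, the L40's edge on every metric comes entirely from its lower TDP and lower price.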

Model Compatibility (FP16, Single GPU)

Only on L40S (0): None

Both (162):

  • Yi 1.5 9B
  • Yi Coder 9B
  • GTE Qwen2 7B
  • Marco O1
  • Qwen 1.5 MoE A2.7B
  • Qwen 2 Audio 7B
  • Qwen 2.5 14B
  • Qwen 2.5 3B
  • OLMo 2 13B
  • OLMo 2 7B
  • Amazon Nova Lite
  • OpenELM 3B
  • BGE Large EN v1.5
  • BGE M3
  • Baichuan 2 13B
  • OctoCoder 15B
  • StarCoder2 15B
  • StarCoder2 3B
  • StarCoder2 7B
  • Aya 23 8B
  • +142 more

Only on L40 (0): None
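The identical compatibility lists are expected: both cards have 48 GB of VRAM, and single-GPU FP16 fit is driven almost entirely by whether the weights fit in memory. A rough rule of thumb is 2 bytes per parameter at FP16, plus headroom for activations and KV cache. A sketch of that check (the 10% overhead factor is an illustrative assumption, not a measured value):

```python
def fits_fp16(params_billion: float, vram_gb: float = 48.0, overhead: float = 0.10) -> bool:
    """Rough single-GPU FP16 fit check: 2 bytes/param plus assumed overhead."""
    weights_gb = params_billion * 2  # FP16 stores each parameter in 2 bytes
    return weights_gb * (1 + overhead) <= vram_gb

print(fits_fp16(14))   # Qwen 2.5 14B: ~28 GB of weights -> True
print(fits_fp16(15))   # StarCoder2 15B: ~30 GB of weights -> True
print(fits_fp16(70))   # a 70B model: ~140 GB of weights -> False
```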

Summary

The L40S and L40 are both Ada-generation cards with identical headline specs: 48 GB of GDDR6, 362 BF16 TFLOPS, and 864 GB/s of memory bandwidth. The only hardware difference in this comparison is power draw: 350 W TDP for the L40S versus 300 W for the L40.

From a cost perspective, the L40 is also more affordable, starting at $0.75/hr versus $0.85/hr for the L40S.
