What is the difference between L4 and T4?

The L4 has 24 GB of GDDR6 and 121 BF16 TFLOPS, while the T4 has 16 GB of GDDR6 and 65 BF16 TFLOPS. Both cards are listed at 300 GB/s of memory bandwidth.

Which GPU is cheaper, L4 or T4?

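The cheapest listed rate for the L4 is $0.29/hr (spot), while the T4 starts at $0.12/hr (spot). On-demand, the lowest rates are $0.39/hr for the L4 and $0.19/hr for the T4. At the cheapest rates the T4 is about 59% cheaper; equivalently, the L4 costs about 142% more.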

How many AI models fit on the L4 vs the T4?

At FP16 precision on a single GPU, the L4 can run 124 models from our catalog, while the T4 can run 92. The L4 supports 32 more models because of its larger 24 GB of VRAM.

L4 vs T4

Side-by-side comparison of the NVIDIA L4 and the NVIDIA T4 for AI inference workloads.

Specifications

Spec	L4	T4
Generation	Ada	Turing
Memory Type	GDDR6	GDDR6
VRAM	24 GB	16 GB
Memory Bandwidth	300 GB/s	300 GB/s
BF16 TFLOPS	121	65
FP16 TFLOPS	121	65
FP8 TFLOPS	242	65
INT8 TOPS	242	130
TDP	72 W	70 W
Interconnect	PCIe	PCIe
Max GPUs per Node	8	8
PCIe Gen	Gen 4	Gen 3
CUDA Compute Capability	8.9	7.5

Pricing

L4

Provider	On-Demand	Reserved	Spot
RunPod	$0.69/hr	-	$0.49/hr
Lambda	$0.59/hr	-	-
AWS	$0.81/hr	$0.52/hr	-
GCP	$0.70/hr	$0.49/hr	-
Vast.ai	$0.45/hr	-	$0.30/hr
TensorDock	$0.39/hr	-	$0.29/hr

T4

Provider	On-Demand	Reserved	Spot
AWS	$0.53/hr	$0.33/hr	$0.16/hr
GCP	$0.35/hr	$0.22/hr	-
Azure	$0.45/hr	$0.28/hr	-
RunPod	$0.37/hr	-	$0.22/hr
Vast.ai	$0.25/hr	-	$0.14/hr
TensorDock	$0.19/hr	-	$0.12/hr

Cheapest available rate (spot): L4 at $0.29/hr vs T4 at $0.12/hr. The T4 is about 59% cheaper; equivalently, the L4 costs about 142% more.
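The cheapest-rate comparison can be recomputed from the two pricing tables. The sketch below is illustrative only: the dictionary layout and the function names `cheapest`, `percent_cheaper`, and `percent_more_expensive` are assumptions made for this example, not part of any provider API. It also shows why "59% cheaper" and "142% more expensive" describe the same price gap from two different baselines.

```python
# Rates in USD per hour, copied from the pricing tables above.
# None marks a tier that a provider does not list.
L4_RATES = {
    "runpod":     {"on_demand": 0.69, "reserved": None, "spot": 0.49},
    "lambda":     {"on_demand": 0.59, "reserved": None, "spot": None},
    "aws":        {"on_demand": 0.81, "reserved": 0.52, "spot": None},
    "gcp":        {"on_demand": 0.70, "reserved": 0.49, "spot": None},
    "vast ai":    {"on_demand": 0.45, "reserved": None, "spot": 0.30},
    "tensordock": {"on_demand": 0.39, "reserved": None, "spot": 0.29},
}
T4_RATES = {
    "aws":        {"on_demand": 0.53, "reserved": 0.33, "spot": 0.16},
    "gcp":        {"on_demand": 0.35, "reserved": 0.22, "spot": None},
    "azure":      {"on_demand": 0.45, "reserved": 0.28, "spot": None},
    "runpod":     {"on_demand": 0.37, "reserved": None, "spot": 0.22},
    "vast ai":    {"on_demand": 0.25, "reserved": None, "spot": 0.14},
    "tensordock": {"on_demand": 0.19, "reserved": None, "spot": 0.12},
}


def cheapest(rates):
    """Return (price, provider, tier) for the lowest listed rate."""
    candidates = [
        (price, provider, tier)
        for provider, tiers in rates.items()
        for tier, price in tiers.items()
        if price is not None
    ]
    return min(candidates)


def percent_cheaper(low, high):
    """How much cheaper `low` is, relative to the higher price."""
    return (high - low) / high * 100


def percent_more_expensive(low, high):
    """How much more `high` costs, relative to the lower price."""
    return (high - low) / low * 100


l4_price, l4_provider, l4_tier = cheapest(L4_RATES)
t4_price, t4_provider, t4_tier = cheapest(T4_RATES)

print(f"L4: ${l4_price:.2f}/hr ({l4_provider}, {l4_tier})")
print(f"T4: ${t4_price:.2f}/hr ({t4_provider}, {t4_tier})")
print(f"T4 is {percent_cheaper(t4_price, l4_price):.0f}% cheaper")
print(f"L4 costs {percent_more_expensive(t4_price, l4_price):.0f}% more")
```

Both cheapest rates are spot prices. Filtering the candidates to the `on_demand` tier gives $0.39/hr for the L4 and $0.19/hr for the T4.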

Efficiency Metrics

Metric	L4	T4
TFLOPS / Watt (BF16)	1.7	0.9
VRAM / Dollar (GB/$/hr)	82.8	133.3
Bandwidth / Watt (GB/s/W)	4.2	4.3
Models (FP16, 1 GPU)	124	92
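The three ratio metrics appear to be derived directly from the spec table and the cheapest listed rate. The following is a minimal sketch that reproduces them under that assumption; the `Gpu` dataclass and the `efficiency` function are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    vram_gb: float
    bandwidth_gb_s: float       # memory bandwidth, GB/s
    bf16_tflops: float
    tdp_w: float
    cheapest_usd_per_hr: float  # cheapest listed rate (spot in both cases)


# Values copied from the Specifications and Pricing sections.
L4 = Gpu("L4", vram_gb=24, bandwidth_gb_s=300, bf16_tflops=121,
         tdp_w=72, cheapest_usd_per_hr=0.29)
T4 = Gpu("T4", vram_gb=16, bandwidth_gb_s=300, bf16_tflops=65,
         tdp_w=70, cheapest_usd_per_hr=0.12)


def efficiency(gpu):
    """Compute the ratio metrics, rounded to one decimal place."""
    return {
        "tflops_per_watt": round(gpu.bf16_tflops / gpu.tdp_w, 1),
        "vram_per_dollar": round(gpu.vram_gb / gpu.cheapest_usd_per_hr, 1),
        "bandwidth_per_watt": round(gpu.bandwidth_gb_s / gpu.tdp_w, 1),
    }


for gpu in (L4, T4):
    print(gpu.name, efficiency(gpu))
```

Because VRAM / Dollar divides by the cheapest spot rate, it would change if on-demand or reserved pricing were used instead.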

Model Compatibility (FP16, Single GPU)

Only on L4 (32)

Yi 1.5 9B
Yi Coder 9B
GTE Qwen2 7B
Marco O1
Qwen 2 Audio 7B
Aya 23 8B
DeepSeek R1 Distill 8B
Gemma 2 9B
InternLM 2.5 7B
Llama 3 8B
Llama 3.1 8B
Llama 3.2 11B Vision
Llama Guard 3 8B
Ministral 8B
Hermes 3 8B
Minitron 8B
NV Embed v2
Qwen 2.5 7B
Qwen 2.5 Coder 7B
Qwen 2.5 Math 7B
+12 more

Both (92)

Qwen 2.5 3B
OLMo 2 7B
OpenELM 3B
BGE Large EN v1.5
BGE M3
StarCoder2 3B
StarCoder2 7B
Command R 7B
DeepSeek Coder 6.7B
DeepSeek Math 7B
Falcon 7B
Gemma 1.1 2B
Gemma 2 2B
Gemma 3 1B
Gemma 3 4B
RecurrentGemma 2B
H2O Danube3 500M
SmolLM2 1.7B
Zephyr 7B
E5 Mistral 7B
+72 more

Only on T4 (0)

None
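The exact rule the catalog uses to decide whether a model fits is not stated here. As a rough sketch, FP16 stores each parameter in 2 bytes, so the weights alone need about 2 GB per billion parameters. The 10% allowance for activations and KV cache below is a hypothetical figure chosen for illustration, and `fp16_footprint_gb` and `fits` are assumed helper names. The parameter counts are the nominal sizes from the model names, not exact counts.

```python
BYTES_PER_PARAM_FP16 = 2  # FP16 uses 16 bits per parameter


def fp16_footprint_gb(params_billion, overhead_fraction=0.10):
    """Estimate memory in GB (1 GB = 1e9 bytes) for FP16 inference.

    overhead_fraction is an assumed allowance for activations and
    KV cache; real overhead depends on batch size and context length.
    """
    weights_gb = params_billion * BYTES_PER_PARAM_FP16
    return weights_gb * (1 + overhead_fraction)


def fits(params_billion, vram_gb, overhead_fraction=0.10):
    """True if the estimated footprint fits within the given VRAM."""
    return fp16_footprint_gb(params_billion, overhead_fraction) <= vram_gb


# Nominal sizes taken from the model names in the lists above.
for name, params_billion in [("Llama 3 8B", 8), ("Qwen 2.5 3B", 3)]:
    need = fp16_footprint_gb(params_billion)
    print(f"{name}: ~{need:.1f} GB, "
          f"L4 (24 GB): {fits(params_billion, 24)}, "
          f"T4 (16 GB): {fits(params_billion, 16)}")
```

Under these assumptions an 8B model needs about 16 GB for weights alone, which already fills the T4, while a 3B model fits comfortably on either card. Models near the 7B boundary depend on their exact parameter count and on the overhead assumed, so this estimate will not reproduce the catalog's split in every case.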

Summary

The L4 (Ada generation) offers 24 GB of GDDR6 with 121 BF16 TFLOPS and 300 GB/s of memory bandwidth at a 72 W TDP.

The T4 (Turing generation) offers 16 GB of GDDR6 with 65 BF16 TFLOPS and 300 GB/s of memory bandwidth at a 70 W TDP.

The L4 has 50% more VRAM (24 GB vs 16 GB), allowing it to run larger models without multi-GPU setups.

From a cost perspective, the T4 is more affordable: its cheapest listed rate is $0.12/hr (spot) vs $0.29/hr (spot) for the L4.
