RTX 4090
NVIDIA · Ada Lovelace · 24 GB GDDR6X · 450 W TDP
VRAM: 24 GB · BF16 TFLOPS: 165 · Bandwidth: 1008 GB/s · From: $0.39/hr
Spec Sheet
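| Spec | Value |
|---|---|
| VRAM | 24 GB GDDR6X |
| Memory Bandwidth | 1008 GB/s |
| BF16 TFLOPS | 165 |
| FP16 TFLOPS | 165 |
| FP8 TFLOPS | 330 |
| INT8 TOPS | 330 |
| TDP | 450 W |
| Interconnect | PCIe |
| Max per Node | 4 |
| PCIe Gen | 4 |
| CUDA Compute Capability | 8.9 |
| Tensor Cores | Yes |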
Pricing by Provider
| Provider | On-Demand | Reserved | Spot | Badge |
|---|---|---|---|---|
| fluidstack | $0.59/hr | - | $0.39/hr | Cheapest |
| vast_ai | $0.74/hr | - | $0.44/hr | |
| tensordock | $0.69/hr | - | $0.44/hr | |
| runpod | $1.10/hr | - | $0.79/hr | |
| lambda | $0.89/hr | - | - | |
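To compare the offers in the table above, the following minimal sketch picks the cheapest provider per pricing tier and computes the spot discount. The prices are copied from the table; the names `PRICES`, `cheapest`, and `spot_discount` are hypothetical helpers introduced for illustration, not part of any provider's API.

```python
# Prices in USD/hour, copied from the table above as (on_demand, spot).
# None means the provider lists no price for that tier.
PRICES = {
    "fluidstack": (0.59, 0.39),
    "vast_ai": (0.74, 0.44),
    "tensordock": (0.69, 0.44),
    "runpod": (1.10, 0.79),
    "lambda": (0.89, None),
}

ON_DEMAND, SPOT = 0, 1  # tuple indices, illustrative names only


def cheapest(prices, tier):
    """Return (provider, price) for the lowest listed price in the given tier."""
    offers = {name: p[tier] for name, p in prices.items() if p[tier] is not None}
    name = min(offers, key=offers.get)
    return name, offers[name]


def spot_discount(on_demand, spot):
    """Fraction saved by using spot instead of on-demand."""
    return 1 - spot / on_demand


if __name__ == "__main__":
    print(cheapest(PRICES, ON_DEMAND))
    print(cheapest(PRICES, SPOT))
    print(f"{spot_discount(0.59, 0.39):.1%}")
```

Spot instances can be interrupted, so the discount is only a fair comparison for workloads that tolerate preemption.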
Pricing History
| Provider | Price | Change | Period | Low | High |
|---|---|---|---|---|---|
| runpod | $1.10/hr | 0.0% | 2024-01-01 to 2025-03-01 | $1.10 | $1.44 |
| vast | $0.54/hr | 0.0% | 2024-01-01 to 2025-03-01 | $0.54 | $0.89 |
Compatible Models (218)
Single GPU (153 models)
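| Model | Parameters | Precision |
|---|---|---|
| GigaChat 20B | 20B | FP8 |
| InternLM 20B | 20B | FP8 |
| InternLM 2.5 20B | 19.9B | FP8 |
| CogVLM2 19B | 19B | FP8 |
| DeepSeek MoE 16B | 16.4B | FP8 |
| CodeGen2 16B | 16B | FP8 |
| DeepSeek V2 Lite | 15.7B | FP8 |
| OctoCoder 15B | 15.5B | FP8 |
| StarCoder2 15B | 15.5B | FP8 |
| Nemotron 15B | 15B | FP8 |
| Qwen 2.5 14B | 14.8B | FP8 |
| DeepSeek R1 Distill 14B | 14.8B | FP8 |
| Phi-4 | 14.7B | FP8 |
| Qwen 2.5 Coder 14B | 14.7B | FP8 |
| Qwen 1.5 MoE A2.7B | 14.3B | FP8 |
| RWKV-6 14B | 14.1B | FP8 |
| Phi 3 Medium 14B | 14B | FP8 |
| Nekomata 14B | 14B | FP8 |
| OLMo 2 13B | 13B | FP8 |
| Baichuan 2 13B | 13B | FP8 |

+133 more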
Multi-GPU (65 models)
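| Model | GPUs | Precision |
|---|---|---|
| Falcon 40B | x2 | FP8 |
| VILA 1.5 40B | x2 | FP8 |
| Aya 23 35B | x2 | FP8 |
| Command R | x2 | FP8 |
| Command R (August 2024) | x2 | FP8 |
| Yi 1.5 34B | x2 | FP8 |
| Code Llama 34B | x2 | FP8 |
| DeepSeek Coder 33B | x2 | FP8 |
| Vicuna 33B | x2 | FP8 |
| WizardCoder 33B | x2 | FP8 |
| DeepSeek R1 Distill 32B | x2 | FP8 |
| Qwen 3 32B | x2 | FP8 |
| Qwen 2.5 32B | x2 | FP8 |
| Qwen 2.5 Coder 32B | x2 | FP8 |
| Qwen 3 30B-A3B | x2 | FP8 |

+50 more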
Training Capabilities
Estimated GPU count for full fine-tuning (AdamW, BF16) and QLoRA
| Model Size | Full Fine-Tune | QLoRA |
|---|---|---|
| 7B model | 6 GPUs | 1 GPU |
| 13B model | 11 GPUs | 1 GPU |
| 70B model | 55 GPUs | 2 GPUs |
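The page does not state how the full fine-tuning counts above are derived. As a rough sketch under stated assumptions, mixed-precision AdamW training is commonly budgeted at about 16 bytes per parameter (2 for BF16 weights, 2 for BF16 gradients, 4 for FP32 master weights, 8 for the two FP32 Adam moments), with the state sharded evenly across GPUs and activations ignored. The function name and the `bytes_per_param` parameter below are hypothetical.

```python
import math

VRAM_GB = 24  # RTX 4090 VRAM, from the spec sheet


def gpus_for_full_finetune(params_billion, bytes_per_param=16.0, vram_gb=VRAM_GB):
    """Lower-bound GPU count to hold weights, gradients, and AdamW state.

    Assumptions (not from the source page): state is sharded perfectly
    across GPUs, and activations, buffers, and fragmentation are ignored.
    """
    total_gb = params_billion * bytes_per_param  # 1e9 params * bytes/param = GB
    return math.ceil(total_gb / vram_gb)


if __name__ == "__main__":
    for size in (7, 13, 70):
        print(size, gpus_for_full_finetune(size))
```

At 16 bytes per parameter this floor gives 5, 9, and 47 GPUs, lower than the table's 6, 11, and 55, which suggests the page budgets extra memory for activations or overhead. Under this sketch, a value of about 18.7 bytes per parameter happens to reproduce the table, but that is an observation, not the page's documented method.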
Energy Efficiency
Estimated tokens/second per Watt for popular models
| Model | Efficiency | Precision |
|---|---|---|
| Mistral 7B | 0.31 t/s/W | FP8 |
| Qwen 2.5 7B | 0.29 t/s/W | FP8 |
| Llama 3.1 8B | 0.28 t/s/W | FP8 |
| Llama 3.1 70B | 0.03 t/s/W | FP8 |
| Qwen 2.5 72B | 0.03 t/s/W | FP8 |
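The efficiency figures above can be converted into throughput, energy per token, and an approximate cost per million tokens. This is a minimal sketch that assumes the page divided throughput by the 450 W TDP; the page does not say which wattage it used, so treat the results as illustrative. All function names are hypothetical.

```python
TDP_WATTS = 450  # from the spec sheet; assumed to be the wattage behind t/s/W


def tokens_per_second(tps_per_watt, watts=TDP_WATTS):
    """Throughput implied by an efficiency figure at the given power draw."""
    return tps_per_watt * watts


def joules_per_token(tps_per_watt):
    """Energy per token: 1 W = 1 J/s, so J/token is the reciprocal of t/s/W."""
    return 1.0 / tps_per_watt


def usd_per_million_tokens(tps_per_watt, usd_per_hour, watts=TDP_WATTS):
    """Rental cost per million tokens at full, uninterrupted utilization."""
    tokens_per_hour = tokens_per_second(tps_per_watt, watts) * 3600
    return usd_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Mistral 7B at 0.31 t/s/W on the lowest listed price of $0.39/hr
    print(tokens_per_second(0.31))
    print(joules_per_token(0.31))
    print(usd_per_million_tokens(0.31, 0.39))
```

Under these assumptions, Mistral 7B at 0.31 t/s/W works out to roughly 140 tokens/s, about 3.2 J per token, and a little under $0.80 per million tokens at $0.39/hr. Real utilization is rarely 100%, so actual cost per token will be higher.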
Similar GPUs
| GPU | VRAM | BF16 TFLOPS | BW (GB/s) | From |
|---|---|---|---|---|
| L4 | 24 GB | 121 | 300 | $0.29/hr |
| RTX 4080 | 16 GB | 97 | 717 | $0.32/hr |
| RTX 4060 Ti 16GB | 16 GB | 44 | 288 | $0.30/hr |
| RTX 4070 Ti | 12 GB | 93 | 504 | $0.25/hr |
| RTX 4070 Super | 12 GB | 55 | 504 | $0.22/hr |