H100 NVL (94 GB per GPU, sold as a pair)
NVIDIA · Hopper · 188 GB HBM3 total · 800 W TDP
- VRAM: 188 GB
- BF16 TFLOPS: 1670
- Bandwidth: 7876 GB/s
- Price: from $5.49/hr
Spec Sheet
| Spec | Value |
|---|---|
| VRAM | 188 GB HBM3 |
| Memory Bandwidth | 7876 GB/s |
| BF16 TFLOPS | 1670 |
| FP16 TFLOPS | 1670 |
| FP8 TFLOPS | 3341 |
| INT8 TOPS | 3341 |
| TDP | 800 W |
| Interconnect | NVLink |
| NVLink Bandwidth | 600 GB/s |
| Max per Node | 4 |
| PCIe | Gen 5 |
| CUDA Compute Capability | 9.0 |
| Tensor Cores | Yes |
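The bandwidth figure above gives a quick roofline for inference: autoregressive decoding is typically memory-bandwidth-bound, so each generated token must stream roughly the full set of model weights from HBM. A minimal sketch (an estimate under that assumption, ignoring KV-cache traffic and batching):

```python
# Roofline estimate of single-stream decode throughput: assumes decoding is
# memory-bandwidth-bound and each token streams all weights from HBM once.
BANDWIDTH_GBS = 7876  # H100 NVL pair, from the spec sheet above

def max_decode_tokens_per_s(params_b: float, bytes_per_param: float) -> float:
    """Upper bound on tokens/s; ignores KV cache and kernel overheads."""
    model_gb = params_b * bytes_per_param
    return BANDWIDTH_GBS / model_gb

# Llama 3.1 70B in FP8 (1 byte/param): ~112 tokens/s ceiling
print(round(max_decode_tokens_per_s(70.6, 1.0)))  # → 112
```

Real throughput lands below this ceiling; batching raises aggregate tokens/s well past it because weights are streamed once per step, not once per sequence.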
Pricing by Provider
| Provider | On-Demand | Reserved | Spot | Badge |
|---|---|---|---|---|
| coreweave | $7.49/hr | $5.49/hr | - | Cheapest |
| runpod | $7.89/hr | - | - | - |
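To compare these rates over a billing cycle, the arithmetic is simple (a sketch assuming a 730-hour average month; rates taken from the coreweave row above):

```python
HOURS_PER_MONTH = 730  # average month (8760 h / 12)

on_demand, reserved = 7.49, 5.49  # coreweave $/hr from the table above
monthly_od = on_demand * HOURS_PER_MONTH
monthly_res = reserved * HOURS_PER_MONTH
savings_pct = (1 - reserved / on_demand) * 100

print(f"on-demand ${monthly_od:,.2f}/mo, reserved ${monthly_res:,.2f}/mo "
      f"({savings_pct:.0f}% cheaper)")
```

Reserved pricing here works out to roughly 27% below on-demand, or about $1,460 saved per GPU-pair per month of continuous use.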
Compatible Models (249)
Single GPU (226 models)
Mixtral 8x22B (141B, FP8), DBRX Base (132B, FP8), DBRX Instruct (132B, FP8), Mistral Large 2411 (123B, FP8), Mistral Large 2 (123B, FP8), Llama 4 Scout (109B, FP8), Command R+ (104B, FP8), Yi-Large (102.6B, FP8), YaLM 100B (100B, FP8), Llama 3.2 90B Vision (90B, FP8), Llama 3.2 90B Vision Instruct (88.8B, FP8), Qwen 2.5 72B (72.7B, FP8), Qwen 2.5 Math 72B (72.7B, FP8), Qwen 2.5 VL 72B (72.7B, FP8), Dolphin 2.9 72B (72B, FP8), DeepSeek R1 Distill 70B (70.6B, FP8), Llama 3 70B 1M Context (70.6B, FP8), Llama 3 70B (70.6B, FP8), Llama 3.1 70B (70.6B, FP8), Llama 3.3 70B (70.6B, FP8), +206 more
Multi-GPU (23 models)
Grok-2 (×2, FP8), DeepSeek Coder V2 236B (×2, FP8), DeepSeek V2.5 (×2, FP8), Qwen 3 235B (×2, FP8), Falcon 180B (×2, FP8), Command A (×2, BF16), Inflection 3 (×2, BF16), DeepSeek R1 (×3, INT4), DeepSeek V3 (×3, INT4), Llama 3.1 405B (×3, FP8), Llama 4 Maverick (×3, FP8), Jamba 1.5 Large (×3, FP8), Snowflake Arctic 128x3B (×3, FP8), Nemotron 340B (×3, FP8), Claude Opus 4 (×3, BF16), +8 more
Training Capabilities
Estimated GPU count for full fine-tuning (AdamW, BF16) and QLoRA
| Model Size | Full Fine-Tune | QLoRA |
|---|---|---|
| 7B model | 1 GPU | 1 GPU |
| 13B model | 2 GPUs | 1 GPU |
| 70B model | 8 GPUs | 1 GPU |
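The table's counts can be approximated with a standard bytes-per-parameter rule of thumb. A minimal sketch, assuming ~12 bytes/param for a full AdamW BF16 fine-tune (2 weights + 2 gradients + 8 FP32 optimizer states) and ~1 byte/param for QLoRA (4-bit base weights plus adapter and optimizer overhead); the exact accounting behind the table is not stated, and this rule gives 9 rather than 8 GPUs for the 70B full fine-tune:

```python
from math import ceil

GPU_VRAM_GB = 94  # one H100 NVL GPU

# Assumed rules of thumb, not vendor figures:
#   full: 2 B (BF16 weights) + 2 B (grads) + 8 B (FP32 AdamW states) per param
#   qlora: ~0.5 B (4-bit weights) + ~0.5 B adapter/optimizer/activation overhead
BYTES_PER_PARAM = {"full": 12.0, "qlora": 1.0}

def gpus_needed(params_b: float, mode: str) -> int:
    """Estimate GPUs needed to hold training state (activations ignored)."""
    mem_gb = params_b * BYTES_PER_PARAM[mode]  # params in billions -> GB
    return ceil(mem_gb / GPU_VRAM_GB)

for size in (7, 13, 70):
    print(f"{size}B: full={gpus_needed(size, 'full')}, "
          f"qlora={gpus_needed(size, 'qlora')}")
```

Activation memory, sequence length, and sharding strategy (ZeRO stage, tensor parallelism) shift the real requirement, which is why published counts vary around such estimates.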
Energy Efficiency
Estimated tokens/second per Watt for popular models
| Model | Tokens/s per Watt | Precision |
|---|---|---|
| Mistral 7B | 1.35 | FP8 |
| Qwen 2.5 7B | 1.30 | FP8 |
| Llama 3.1 8B | 1.23 | FP8 |
| Llama 3.1 70B | 0.14 | FP8 |
| Qwen 2.5 72B | 0.14 | FP8 |
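Tokens/s per watt is the same thing as tokens per joule, so multiplying by 3.6 million (joules per kWh) converts these figures into tokens per kWh, and from there into an electricity cost per million tokens. A sketch using an assumed $0.10/kWh electricity price (not from the source):

```python
def tokens_per_kwh(tps_per_watt: float) -> float:
    # tokens/s per W == tokens per joule; 1 kWh = 3.6e6 J
    return tps_per_watt * 3_600_000

PRICE_PER_KWH = 0.10  # assumed electricity price, not from the source

def energy_cost_per_m_tokens(tps_per_watt: float) -> float:
    return 1_000_000 / tokens_per_kwh(tps_per_watt) * PRICE_PER_KWH

# Mistral 7B at 1.35 t/s/W -> 4.86M tokens per kWh
print(f"Mistral 7B: {tokens_per_kwh(1.35):,.0f} tokens/kWh, "
      f"${energy_cost_per_m_tokens(1.35):.3f}/M tokens")
```

By this measure the ~10× efficiency gap between the 7B-class and 70B-class rows translates directly into a ~10× gap in energy cost per token.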