Instinct MI325X
AMD · CDNA3 · 256 GB HBM3e · 750W TDP
VRAM: 256 GB · BF16 TFLOPS: 1307 · Bandwidth: 6000 GB/s · From: $2.49/hr
Spec Sheet

| Spec | Value |
|---|---|
| VRAM | 256 GB HBM3e |
| Memory Bandwidth | 6000 GB/s |
| BF16 TFLOPS | 1307 |
| FP16 TFLOPS | 1307 |
| FP8 TFLOPS | 2614 |
| INT8 TOPS | 2614 |
| TDP | 750W |
| Interconnect | Infinity Fabric |
| Max per Node | 8 |
| Host Interface | PCIe Gen5 |
| Tensor Cores | No |
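As a worked example of what these numbers imply: batch-1 token generation is usually bandwidth-bound, so dividing the 6000 GB/s figure by the model's weight footprint gives a quick throughput ceiling. A minimal sketch — the FP8/BF16 byte sizes are standard, but the ceiling ignores KV-cache traffic and kernel overheads:

```python
# Sketch: rough memory-bound decode ceiling for a single MI325X, using the
# spec-sheet numbers above. Batch-1 generation is typically limited by memory
# bandwidth, since every new token streams the full weight set from HBM.

BANDWIDTH_GBPS = 6000  # memory bandwidth (GB/s) from the spec sheet

def decode_tokens_per_sec(params_b: float, bytes_per_param: float = 1.0) -> float:
    """Upper bound on batch-1 decode speed: bandwidth / weight bytes.
    bytes_per_param=1.0 assumes FP8 weights; use 2.0 for BF16."""
    weight_gb = params_b * bytes_per_param
    return BANDWIDTH_GBPS / weight_gb

# A 70B model in FP8 (~70 GB of weights): 6000 / 70 ≈ 85.7 tokens/s ceiling
print(round(decode_tokens_per_sec(70), 1))
```

Real throughput lands below this bound; it is only a sanity check on the bandwidth figure.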
Pricing by Provider
| Provider | On-Demand | Reserved | Spot | Badge |
|---|---|---|---|---|
| Vast.ai | $3.49/hr | - | $2.49/hr | Cheapest |
| CoreWeave | $4.19/hr | $3.19/hr | - | |
| Lambda | $3.29/hr | - | - | |
Compatible Models (251)
Single GPU (228 models)
Falcon 180B (180B, FP8), Mixtral 8x22B (141B, FP8), DBRX Base (132B, FP8), DBRX Instruct (132B, FP8), Mistral Large 2411 (123B, FP8), Mistral Large 2 (123B, FP8), Llama 4 Scout (109B, FP8), Command R+ (104B, FP8), Yi-Large (102.6B, FP8), Inflection 3 (100B, BF16), YaLM 100B (100B, FP8), Llama 3.2 90B Vision (90B, FP8), Llama 3.2 90B Vision Instruct (88.8B, FP8), Qwen 2.5 72B (72.7B, FP8), Qwen 2.5 Math 72B (72.7B, FP8), Qwen 2.5 VL 72B (72.7B, FP8), Dolphin 2.9 72B (72B, FP8), DeepSeek R1 Distill 70B (70.6B, FP8), Llama 3 70B 1M Context (70.6B, FP8), Llama 3 70B (70.6B, FP8), +208 more
Multi-GPU (23 models)
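The single-GPU compatibility list above can be approximated with a simple weights-plus-headroom check. A sketch, where the 20% overhead factor for KV cache and activations is an illustrative assumption, not the site's actual rule:

```python
# Sketch: rough "does it fit on one 256 GB GPU" check, mirroring the
# compatibility list above. Assumes weights dominate memory use, with a
# 20% headroom factor (illustrative) for KV cache and activations.

VRAM_GB = 256

def fits_single_gpu(params_b: float, bytes_per_param: float,
                    overhead: float = 1.2) -> bool:
    """True if params * bytes/param * overhead fits in VRAM.
    bytes_per_param: 1.0 for FP8, 2.0 for BF16."""
    return params_b * bytes_per_param * overhead <= VRAM_GB

print(fits_single_gpu(180, 1.0))   # Falcon 180B, FP8: 216 GB -> True
print(fits_single_gpu(70.6, 2.0))  # Llama 3 70B, BF16: ~169 GB -> True
print(fits_single_gpu(123, 2.0))   # Mistral Large 2, BF16: ~295 GB -> False
```

This is consistent with the list above, where most 100B+ models appear only at FP8.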
Training Capabilities
Estimated GPU count for full fine-tuning (AdamW, BF16) and QLoRA
| Model Size | Full Fine-Tune | QLoRA |
|---|---|---|
| 7B model | 1 GPU | 1 GPU |
| 13B model | 1 GPU | 1 GPU |
| 70B model | 6 GPUs | 1 GPU |
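These counts can be reproduced with a back-of-envelope memory model. A sketch assuming ~16 bytes/param for mixed-precision AdamW (BF16 weights and gradients, FP32 master copy plus two moments), ~0.7 bytes/param for QLoRA (4-bit base weights plus small adapters), and 85% usable VRAM; all three figures are rule-of-thumb assumptions, not the site's exact formula:

```python
import math

# Sketch: estimate GPU counts for fine-tuning on a 256 GB MI325X.
# Assumptions (rules of thumb, not vendor numbers):
#   full fine-tune: ~16 bytes/param (BF16 weights + grads, FP32 master + 2 Adam moments)
#   QLoRA:          ~0.7 bytes/param (4-bit base weights + small adapters)
#   85% of VRAM usable, leaving headroom for activations and fragmentation

VRAM_GB = 256
USABLE = 0.85

def gpus_needed(params_b: float, bytes_per_param: float) -> int:
    return math.ceil(params_b * bytes_per_param / (VRAM_GB * USABLE))

for size in (7, 13, 70):
    # matches the table: 7B -> 1/1, 13B -> 1/1, 70B -> 6/1
    print(size, gpus_needed(size, 16.0), gpus_needed(size, 0.7))
```

Activation memory grows with batch size and sequence length, so treat these as lower bounds.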
Energy Efficiency
Estimated tokens/second per Watt for popular models
| Model | Efficiency | Precision |
|---|---|---|
| Mistral 7B | 1.10 t/s/W | FP8 |
| Qwen 2.5 7B | 1.05 t/s/W | FP8 |
| Llama 3.1 8B | 1.00 t/s/W | FP8 |
| DeepSeek V3 | 0.22 t/s/W | FP8 |
| Llama 3.1 70B | 0.11 t/s/W | FP8 |
| Qwen 2.5 72B | 0.11 t/s/W | FP8 |
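Multiplying efficiency by power gives absolute throughput. A sketch assuming the GPU sustains its full 750W TDP under load (an assumption; actual draw varies with workload):

```python
# Sketch: convert the tokens/second-per-Watt figures above into raw
# throughput, assuming sustained draw at the 750 W TDP (illustrative;
# real power draw depends on the workload).

TDP_W = 750

def tokens_per_sec(tps_per_watt: float, power_w: float = TDP_W) -> float:
    return tps_per_watt * power_w

print(round(tokens_per_sec(1.10)))     # Mistral 7B: ~825 tokens/s
print(round(tokens_per_sec(0.11), 1))  # Llama 3.1 70B: ~82.5 tokens/s
```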
Similar GPUs
| GPU | VRAM | BF16 TFLOPS | BW (GB/s) | From |
|---|---|---|---|---|
| Instinct MI300X | 192 GB | 1307 | 5300 | $1.79/hr |
| Groq LPU | 230 GB | 188 | 80000 | - |
| B300 | 288 GB | 2800 | 12000 | - |
| B100 SXM | 192 GB | 1750 | 8000 | $4.50/hr |
| GB200 NVL72 (per GPU) | 192 GB | 2250 | 8000 | $6.50/hr |