TPU v5e
Google · TPU · 16 GB HBM2e · 200W TDP
VRAM: 16 GB · BF16 TFLOPS: 200 · Bandwidth: 820 GB/s · From: $0.85/hr
Spec Sheet

| Spec | Value |
|---|---|
| VRAM | 16 GB HBM2e |
| Memory Bandwidth | 820 GB/s |
| BF16 TFLOPS | 200 |
| FP16 TFLOPS | 200 |
| FP8 TFLOPS | 400 |
| INT8 TOPS | 400 |
| TDP | 200W |
| Interconnect | PCIe |
| Max per Node | 256 |
| PCIe Generation | Gen4 |
| Tensor Cores | No |
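One figure worth deriving from the sheet: the ratio of peak BF16 compute to memory bandwidth gives the roofline ridge point, i.e. the arithmetic intensity a kernel needs before it stops being bandwidth-bound. A quick back-of-the-envelope check in Python:

```python
BF16_TFLOPS = 200      # peak BF16 compute from the spec sheet
BANDWIDTH_GBPS = 820   # HBM2e bandwidth from the spec sheet

# Roofline ridge point: FLOPs a kernel must perform per byte moved
# from HBM before compute, rather than bandwidth, becomes the limit.
ridge = (BF16_TFLOPS * 1e12) / (BANDWIDTH_GBPS * 1e9)
print(f"ridge point ≈ {ridge:.0f} FLOPs/byte")  # ≈ 244
```

Batch-1 LLM decoding performs roughly 2 FLOPs per byte of weights read, far below 244, so single-stream inference on this chip is bandwidth-bound; the headline TFLOPS matter mainly for prefill and training.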
Pricing by Provider
| Provider | On-Demand | Reserved | Spot | Notes |
|---|---|---|---|---|
| GCP | $1.20/hr | $0.85/hr | - | Cheapest |
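For budgeting, the on-demand/reserved gap compounds over a month. A quick comparison using the table's rates (730 hours ≈ one month; the figures below are straightforward arithmetic, not provider quotes):

```python
HOURS_PER_MONTH = 730     # average hours in a month
on_demand_rate = 1.20     # $/hr, from the pricing table
reserved_rate = 0.85      # $/hr, from the pricing table

print(f"On-demand: ${on_demand_rate * HOURS_PER_MONTH:.2f}/month")  # $876.00
print(f"Reserved:  ${reserved_rate * HOURS_PER_MONTH:.2f}/month")   # $620.50
print(f"Savings:   {1 - reserved_rate / on_demand_rate:.0%}")       # 29%
```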
Compatible Models (252)
Single GPU (134 models)
OLMo 2 13B (13B, FP8) · Baichuan 2 13B (13B, FP8) · Vicuna 13B (13B, FP8) · Code Llama 13B (13B, FP8) · Llama 2 13B (13B, FP8) · Orca 2 13B (13B, FP8) · VILA 1.5 13B (13B, FP8) · ELYZA 13B (13B, FP8) · Cerebras GPT 13B (13B, FP8) · KULLM 12.8B (12.8B, FP8) · StableLM 2 12B (12.1B, FP8) · Amazon Nova Lite (12B, FP8) · Gemma 3 12B (12B, FP8) · Mistral Nemo 12B (12B, FP8) · Pixtral 12B (12B, FP8) · FLUX.1 Dev (12B, FP8) · Llama 3.2 11B Vision (11B, FP8) · Falcon 11B (11B, FP8) · SOLAR 10.7B (10.7B, FP8) · GLM-4 9B (9.4B, FP8) · +114 more
Multi-GPU (118 models)
Gemma 2 27B (×2, FP8) · Gemma 3 27B (×2, FP8) · InternVL2 26B (×2, FP8) · Mistral Small 24B (×2, FP8) · Mistral Small 3.1 24B (×2, FP8) · Codestral 22B (×2, FP8) · Solar Pro 22B (×2, FP8) · GigaChat 20B (×2, FP8) · InternLM 20B (×2, FP8) · InternLM 2.5 20B (×2, FP8) · CogVLM2 19B (×2, FP8) · DeepSeek MoE 16B (×2, FP8) · CodeGen2 16B (×2, FP8) · DeepSeek V2 Lite (×2, FP8) · OctoCoder 15B (×2, FP8) · +103 more
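The single/multi-GPU split above is, to a good approximation, just FP8 weight size (~1 byte per parameter) measured against usable memory. A minimal sketch, assuming roughly 13.5 GB of the 16 GB is usable after KV cache and runtime overhead (an assumption, but one that reproduces the 13B/15B cutoff in the lists):

```python
import math

USABLE_VRAM_GB = 13.5        # assumption: 16 GB minus KV cache / runtime overhead
FP8_BYTES_PER_PARAM = 1.0    # FP8 stores one byte per weight

def chips_for_fp8_inference(params_billion: float) -> int:
    """Rough chip count needed just to hold the FP8 weights."""
    return math.ceil(params_billion * FP8_BYTES_PER_PARAM / USABLE_VRAM_GB)

print(chips_for_fp8_inference(13))  # 1 -> single-GPU list (e.g. Llama 2 13B)
print(chips_for_fp8_inference(15))  # 2 -> multi-GPU list (e.g. OctoCoder 15B)
print(chips_for_fp8_inference(27))  # 2 -> multi-GPU list (e.g. Gemma 2 27B)
```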
Training Capabilities
Estimated GPU count for full fine-tuning (AdamW, BF16) and QLoRA
| Model Size | Full Fine-Tune | QLoRA |
|---|---|---|
| 7B model | 9 GPUs | 1 GPU |
| 13B model | 16 GPUs | 1 GPU |
| 70B model | 83 GPUs | 3 GPUs |
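These counts are consistent with standard memory accounting: full fine-tuning with AdamW needs on the order of 16 bytes per parameter (BF16 weights and gradients plus FP32 master weights and two optimizer moments), while QLoRA needs about 0.5 bytes per parameter for the NF4-quantized base model, the adapter states being negligible. A sketch that reproduces the table, assuming ~13.5 GB usable per chip:

```python
import math

USABLE_VRAM_GB = 13.5          # assumption: 16 GB minus activations / runtime overhead
FULL_FT_BYTES_PER_PARAM = 16   # BF16 weights+grads (4) + FP32 master + AdamW moments (12)
QLORA_BYTES_PER_PARAM = 0.5    # NF4 4-bit base weights; LoRA adapter states are tiny

def chips_needed(params_billion: float, bytes_per_param: float) -> int:
    """Chips required to hold the training state, ignoring parallelism overhead."""
    total_gb = params_billion * bytes_per_param  # billions of params * bytes -> GB
    return math.ceil(total_gb / USABLE_VRAM_GB)

for size in (7, 13, 70):
    full = chips_needed(size, FULL_FT_BYTES_PER_PARAM)
    qlora = chips_needed(size, QLORA_BYTES_PER_PARAM)
    print(f"{size}B: full fine-tune {full} GPUs, QLoRA {qlora} GPU(s)")
# 7B: 9/1, 13B: 16/1, 70B: 83/3 -- matching the table above
```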
Energy Efficiency
Estimated tokens/second per Watt for popular models
| Model | Efficiency | Precision |
|---|---|---|
| Mistral 7B | 0.56 t/s/W | FP8 |
| Qwen 2.5 7B | 0.54 t/s/W | FP8 |
| Llama 3.1 8B | 0.51 t/s/W | FP8 |
| DeepSeek V3 | 0.11 t/s/W | FP8 |
| Llama 3.1 70B | 0.06 t/s/W | FP8 |
| Qwen 2.5 72B | 0.06 t/s/W | FP8 |
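The metric is decode throughput divided by board power, so at the 200W TDP the table implies roughly 112 tok/s for Mistral 7B down to about 12 tok/s for the 70B-class models. The conversion is trivial (the absolute throughputs below are back-derived from the table, not independent measurements):

```python
TDP_WATTS = 200  # from the spec sheet

# t/s/W figures copied from the efficiency table above
efficiency_tsw = {
    "Mistral 7B": 0.56,
    "Qwen 2.5 7B": 0.54,
    "Llama 3.1 8B": 0.51,
    "DeepSeek V3": 0.11,
    "Llama 3.1 70B": 0.06,
    "Qwen 2.5 72B": 0.06,
}

for model, tsw in efficiency_tsw.items():
    print(f"{model}: {tsw * TDP_WATTS:.0f} tok/s at {TDP_WATTS} W")
```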
Similar GPUs
| GPU | VRAM | BF16 TFLOPS | BW (GB/s) | From |
|---|---|---|---|---|
| TPU v4 | 32 GB | 275 | 1200 | $2.25/hr |
| TPU v6e (Trillium) | 32 GB | 460 | 1640 | $1.75/hr |
| A4000 | 16 GB | 76 | 448 | $0.17/hr |
| RTX 4080 | 16 GB | 97 | 717 | $0.32/hr |
| T4 | 16 GB | 65 | 300 | $0.12/hr |