How much VRAM does Groq LPU have?

The Groq LPU has 230 GB of SRAM, listed here as VRAM, with a memory bandwidth of 80000 GB/s.

Sources: GPU Pricing, API Token Pricing, Model Registry

Groq LPU

other · other · 230 GB SRAM · 300W TDP

VRAM	230 GB
BF16 TFLOPS	188
Bandwidth	80000 GB/s
From	$0.00/hr

Spec Sheet

VRAM	230 GB SRAM
Memory Bandwidth	80000 GB/s
BF16 TFLOPS	188
FP16 TFLOPS	188
FP8 TFLOPS	376
INT8 TOPS	750
TDP	300W
Interconnect	PCIe
Max per Node	8
PCIe Gen	5
Tensor Cores	No

Pricing by Provider

Provider	On-Demand	Reserved	Spot	Badge
groq	$0.00/hr	-	-	Cheapest

Training Capabilities

Estimated GPU count for full fine-tuning (AdamW, BF16) and QLoRA

Model Size	Full Fine-Tune	QLoRA
7B model	1 GPU	1 GPU
13B model	2 GPUs	1 GPU
70B model	6 GPUs	1 GPU
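The GPU counts above can be approximated with a simple memory budget. The sketch below is not InferenceBench's published formula. It assumes roughly 16 bytes per parameter for full fine-tuning with AdamW in BF16 mixed precision (weights, gradients, FP32 master weights, and two optimizer moments), 0.5 bytes per parameter for 4-bit QLoRA base weights, and hypothetical overhead factors of 1.2 and 1.5 for activations and adapters. The function names and all default values are illustrative assumptions.

```python
import math

MEM_PER_DEVICE_GB = 230.0  # memory per Groq LPU as listed on this page


def gpus_for_full_finetune(params_billion, mem_gb=MEM_PER_DEVICE_GB,
                           bytes_per_param=16.0, overhead=1.2):
    """Estimate device count for full fine-tuning (assumed 16 B/param + 20% overhead)."""
    total_gb = params_billion * bytes_per_param * overhead
    return max(1, math.ceil(total_gb / mem_gb))


def gpus_for_qlora(params_billion, mem_gb=MEM_PER_DEVICE_GB,
                   bytes_per_param=0.5, overhead=1.5):
    """Estimate device count for QLoRA (assumed 4-bit base weights + 50% overhead)."""
    total_gb = params_billion * bytes_per_param * overhead
    return max(1, math.ceil(total_gb / mem_gb))


for size in (7, 13, 70):
    print(f"{size}B: full fine-tune {gpus_for_full_finetune(size)} GPU(s), "
          f"QLoRA {gpus_for_qlora(size)} GPU(s)")
```

Under these assumed factors the sketch yields 1, 2, and 6 devices for full fine-tuning and 1 device for QLoRA at each size, which matches the table. Different overhead assumptions would shift the results.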

Energy Efficiency

Estimated tokens/second per Watt for popular models

Model	Tokens/s per Watt	Precision
Mistral 7B	36.53	FP8
Qwen 2.5 7B	35.09	FP8
Llama 3.1 8B	33.21	FP8
DeepSeek V3	7.21	FP8
Llama 3.1 70B	3.78	FP8
Qwen 2.5 72B	3.67	FP8
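To relate these efficiency figures to absolute throughput, the per-Watt values can be multiplied by a power draw. The sketch below assumes the figures are normalized by the listed 300W TDP, which the page does not state explicitly; `implied_tokens_per_s` is a hypothetical helper name.

```python
TDP_WATTS = 300.0  # listed TDP; assumed to be the normalization basis

# Tokens/s per Watt at FP8, as listed in this section
EFFICIENCY_T_S_W = {
    "Mistral 7B": 36.53,
    "Qwen 2.5 7B": 35.09,
    "Llama 3.1 8B": 33.21,
    "DeepSeek V3": 7.21,
    "Llama 3.1 70B": 3.78,
    "Qwen 2.5 72B": 3.67,
}


def implied_tokens_per_s(tokens_per_s_per_watt, watts=TDP_WATTS):
    """Convert tokens/s per Watt into tokens/s at a given power draw."""
    return tokens_per_s_per_watt * watts


for model, eff in EFFICIENCY_T_S_W.items():
    print(f"{model}: ~{implied_tokens_per_s(eff):.0f} tokens/s at {TDP_WATTS:.0f} W")
```

If actual power draw under load is below TDP, the implied throughput would be correspondingly lower.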

Similar GPUs

GPU	VRAM	BF16 TFLOPS	BW (GB/s)	From
Trainium2	96 GB	756	3200	$1.95/hr
Cloud AI 100	32 GB	150	134	$0.00/hr
Instinct MI325X	256 GB	1307	6000	$2.49/hr
B100 SXM	192 GB	1750	8000	$4.50/hr
GB200 NVL72 (per GPU)	192 GB	2250	8000	$6.50/hr

Methodology Note

Performance estimates for the Groq LPU are based on InferenceBench's roofline performance model, which assumes CUDA kernel-level optimizations including FlashAttention v2 and PagedAttention. Memory calculations account for model weights (230 GB SRAM available), KV-cache allocation, and activation memory. Throughput predictions use the Groq LPU's rated 80000 GB/s memory bandwidth and 188 BF16 TFLOPS compute capacity as roofline ceilings, with empirical correction factors applied per GPU architecture (here: other). See our full methodology.
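As a rough illustration of how a roofline ceiling works for single-stream decoding, the sketch below takes the lower of a memory-bound limit (every weight byte read once per token) and a compute-bound limit (about 2 FLOPs per parameter per token). It is a simplified textbook form, not InferenceBench's actual model: it ignores KV-cache traffic, batching, and the empirical correction factors mentioned above. The function name and parameters are assumptions for illustration.

```python
def roofline_decode_tokens_per_s(params_billion, bytes_per_param,
                                 bandwidth_gb_s, tflops):
    """Upper bound on batch-1 decode throughput under a simple roofline model."""
    # Memory-bound ceiling: all weights are streamed once per generated token.
    weight_gb = params_billion * bytes_per_param
    memory_bound = bandwidth_gb_s / weight_gb

    # Compute-bound ceiling: roughly 2 FLOPs per parameter per token.
    flops_per_token = 2.0 * params_billion * 1e9
    compute_bound = (tflops * 1e12) / flops_per_token

    return min(memory_bound, compute_bound)


# Example: a 7B-parameter model at FP8 (1 byte/param) with the listed
# 80000 GB/s bandwidth and 376 FP8 TFLOPS.
ceiling = roofline_decode_tokens_per_s(7.0, 1.0, 80000.0, 376.0)
print(f"Roofline ceiling: {ceiling:.0f} tokens/s")
```

In this example the memory-bound term is the smaller one, so bandwidth rather than compute sets the ceiling.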

Frequently Asked Questions

How many AI models can run on Groq LPU?

The Groq LPU can run 300 AI models from our database within a single node. Compatible models range across various parameter sizes depending on the quantization precision (BF16, FP8, INT4). Smaller models fit on a single GPU while larger models may require multi-GPU setups up to 8x Groq LPU.

What is the Groq LPU inference throughput?

The Groq LPU delivers 188 BF16 TFLOPS and 376 FP8 TFLOPS with 80000 GB/s memory bandwidth. Actual inference throughput (tokens/sec) depends on the model size, precision, and batch size. Use our calculator for model-specific throughput estimates.

How much does Groq LPU cost per hour?

The Groq LPU is listed from $0.00/hour via Groq. Prices vary by provider and pricing tier (on-demand, reserved, spot). Compare pricing across all providers in the table above.

Compatible Models (300)

Single GPU (253 models)

Multi-GPU (47 models)
