What GPUs does Groq offer?

Groq is an inference API provider, not a GPU rental service. It offers 10 AI model endpoints.

What is Groq pricing?

Groq model pricing starts from $0.05/M input tokens and $0.08/M output tokens.

Does Groq offer autoscaling?

Yes, Groq supports autoscaling for dynamic workload management. Cold start time is negligible.

Groq

Inference API Provider

Reputation: 86/100

Get your Groq API key: groq.com

Groq offers 10 model endpoints with output pricing starting at $0.08/million tokens. Compared to the market average of $1.03/million output tokens across inference API providers, Groq's entry-level pricing is 92% below average.
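
The 92% figure follows directly from the two prices quoted above. A minimal Python check of that arithmetic, where `percent_below` is a helper name introduced here for illustration only:

```python
def percent_below(price: float, average: float) -> float:
    """Return how far `price` sits below `average`, as a percentage of `average`."""
    return (average - price) / average * 100


# Figures quoted on this page: $0.08/M entry-level output price
# against a $1.03/M market average for output tokens.
gap = percent_below(0.08, 1.03)
print(f"{gap:.1f}% below average")  # prints "92.2% below average"
```

The page rounds this to 92%.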

Provider Overview

Type: inference
Billing: Per token
Egress: Free
SLA Uptime: 99.9%
Autoscaling: Yes
Cold Start: None

Model Pricing (10)

Model	Input $/M	Output $/M	Latency	Throughput	Context
llama-3.1-8b (Cheapest)	$0.05	$0.08	0.05s	1250 t/s	128k
qwen-2.5-7b	$0.05	$0.08	0.05s	1100 t/s	32k
phi-3-mini-128k	$0.05	$0.08	0.04s	1300 t/s	128k
gemma-2-9b	$0.20	$0.20	0.06s	900 t/s	8k
mixtral-8x7b	$0.24	$0.24	0.08s	575 t/s	33k
qwen-2.5-32b	$0.40	$0.40	0.1s	400 t/s	32k
gemma-2-27b	$0.50	$0.50	0.08s	500 t/s	8k
llama-3.3-70b	$0.59	$0.79	0.1s	350 t/s	128k
llama-3.1-70b	$0.59	$0.79	0.1s	330 t/s	128k
deepseek-r1-distill-llama-70b	$0.75	$0.99	0.3s	275 t/s	128k
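
Because billing is per token, the cost of a request can be estimated straight from the table. The Python sketch below copies three rows and is illustrative only: `ModelRow`, `estimate_cost`, and `estimate_seconds` are hypothetical names, not part of any Groq SDK. It also assumes that the Latency column is time to first token and that Throughput is output tokens per second, which the page does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRow:
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens
    latency_s: float     # "Latency" column; assumed to be time to first token
    tokens_per_s: float  # "Throughput" column; assumed to be output tokens per second


# Three rows copied from the table above; the others follow the same pattern.
MODELS = {
    "llama-3.1-8b": ModelRow(0.05, 0.08, 0.05, 1250),
    "mixtral-8x7b": ModelRow(0.24, 0.24, 0.08, 575),
    "llama-3.3-70b": ModelRow(0.59, 0.79, 0.10, 350),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed per-million-token prices."""
    row = MODELS[model]
    total = input_tokens * row.input_per_m + output_tokens * row.output_per_m
    return total / 1_000_000


def estimate_seconds(model: str, output_tokens: int) -> float:
    """Rough generation time: listed latency plus output tokens over throughput."""
    row = MODELS[model]
    return row.latency_s + output_tokens / row.tokens_per_s


# Example workload: a 2,000-token prompt with a 500-token completion.
for name in MODELS:
    cost = estimate_cost(name, input_tokens=2_000, output_tokens=500)
    secs = estimate_seconds(name, output_tokens=500)
    print(f"{name}: ${cost:.6f} per request, roughly {secs:.2f}s")
```

These are back-of-the-envelope numbers. Real latency and throughput vary with load and prompt length.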

Reputation Details

Pricing: 90
Reliability: 90
Features: 75

Highlights

Very competitive token pricing
99.9%+ SLA
Autoscaling supported
Fast cold start

Compare with Others

Provider	Overall	Pricing	Reliability	Features	Models
Groq	86	90	90	75	10
Together AI	78	70	90	75	20
Fireworks AI	78	70	90	75	14
DeepInfra	86	90	90	75	21
DeepSeek	72	70	70	75	3
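
The page does not say how the Overall column is derived from the three sub-scores. One weighting that happens to reproduce all five rows is 40% Pricing, 30% Reliability, and 30% Features, rounded half up. This is inferred from the table, not a documented formula, and other weightings may fit equally well. A short Python check, with `weighted_overall` as a hypothetical helper:

```python
# Rows copied from the comparison table: (overall, pricing, reliability, features).
ROWS = {
    "Groq": (86, 90, 90, 75),
    "Together AI": (78, 70, 90, 75),
    "Fireworks AI": (78, 70, 90, 75),
    "DeepInfra": (86, 90, 90, 75),
    "DeepSeek": (72, 70, 70, 75),
}


def weighted_overall(pricing: int, reliability: int, features: int) -> int:
    """Assumed 40/30/30 weighting, rounded half up.

    Integer arithmetic avoids floating-point surprises at the .5 boundary.
    """
    return (4 * pricing + 3 * reliability + 3 * features + 5) // 10


for provider, (overall, pricing, reliability, features) in ROWS.items():
    guess = weighted_overall(pricing, reliability, features)
    print(f"{provider}: listed {overall}, reconstructed {guess}")
```

Under this assumption, Groq and DeepInfra tie at 86 because their three sub-scores are identical. The Models count does not enter the score.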
