Groq

Inference API Provider

Reputation: 86/100
groq.com

Provider Overview

Type: inference
Billing: Per token
Egress: Free
SLA Uptime: 99.9%
Autoscaling: Yes
Cold Start: None

Model Pricing (10)

Model                          Input $/M  Output $/M  Latency  Throughput  Context
llama-3.1-8b (cheapest)        $0.05      $0.08       0.05s    1250 t/s    128k
qwen-2.5-7b                    $0.05      $0.08       0.05s    1100 t/s    32k
phi-3-mini-128k                $0.05      $0.08       0.04s    1300 t/s    128k
gemma-2-9b                     $0.20      $0.20       0.06s    900 t/s     8k
mixtral-8x7b                   $0.24      $0.24       0.08s    575 t/s     33k
qwen-2.5-32b                   $0.40      $0.40       0.1s     400 t/s     32k
gemma-2-27b                    $0.50      $0.50       0.08s    500 t/s     8k
llama-3.3-70b                  $0.59      $0.79       0.1s     350 t/s     128k
llama-3.1-70b                  $0.59      $0.79       0.1s     330 t/s     128k
deepseek-r1-distill-llama-70b  $0.75      $0.99       0.3s     275 t/s     128k
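
Because billing is per token, the listed rates can be turned into a rough per-request estimate. The sketch below is illustrative only: it assumes "$/M" means US dollars per million tokens, that "Latency" is the delay before the first token, and that "Throughput" is output tokens per second. The names ModelRate, RATES, estimate_cost, and estimate_seconds are hypothetical and not part of any Groq API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRate:
    input_per_m: float     # USD per million input tokens (from the table)
    output_per_m: float    # USD per million output tokens (from the table)
    latency_s: float       # assumed: seconds before the first token
    throughput_tps: float  # assumed: output tokens per second


# A few rows copied from the pricing table above.
RATES = {
    "llama-3.1-8b": ModelRate(0.05, 0.08, 0.05, 1250),
    "llama-3.3-70b": ModelRate(0.59, 0.79, 0.1, 350),
    "deepseek-r1-distill-llama-70b": ModelRate(0.75, 0.99, 0.3, 275),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    rate = RATES[model]
    total = input_tokens * rate.input_per_m + output_tokens * rate.output_per_m
    return total / 1_000_000


def estimate_seconds(model: str, output_tokens: int) -> float:
    """Return a rough generation time: first-token delay plus streaming time."""
    rate = RATES[model]
    return rate.latency_s + output_tokens / rate.throughput_tps


if __name__ == "__main__":
    # Example: a 2,000-token prompt with a 500-token reply.
    for name in RATES:
        cost = estimate_cost(name, 2_000, 500)
        secs = estimate_seconds(name, 500)
        print(f"{name}: ${cost:.6f}, about {secs:.2f}s")
```

Real invoices and timings depend on the provider's actual metering and load, so these figures are best treated as a way to compare models, not as a quote.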

Reputation Details

Pricing: 90
Reliability: 90
Features: 75

Highlights

  • Competitive token pricing, from $0.05 input / $0.08 output per million tokens
  • 99.9% uptime SLA
  • Autoscaling supported
  • No cold start

Compare with Others

Provider      Overall  Pricing  Reliability  Features  Models
Groq          86       90       90           75        10
Together AI   78       70       90           75        20
Fireworks AI  78       70       90           75        14
DeepInfra     86       90       90           75        21
DeepSeek      72       70       70           75        3