DeepInfra

Inference API Provider

Reputation: 86/100
deepinfra.com

Provider Overview

Type: inference
Billing: Per token
Egress: Free
SLA Uptime: 99.9%
Autoscaling: Yes
Cold Start: None

Model Pricing (21)

Model                     Input $/M   Output $/M   Latency   Throughput   Context
llama-3.2-1b (cheapest)   $0.02       $0.02        0.08s     350 t/s      128k
llama-3.2-3b              $0.04       $0.04        0.1s      280 t/s      128k
phi-3-mini-128k           $0.05       $0.05        0.12s     230 t/s      128k
llama-3.1-8b              $0.06       $0.06        0.15s     200 t/s      128k
gemma-2-9b                $0.06       $0.06        0.12s     200 t/s      8k
qwen-2.5-7b               $0.07       $0.07        0.15s     180 t/s      32k
llama-3.2-11b-vision      $0.12       $0.12        0.2s      150 t/s      128k
phi-4-14b                 $0.12       $0.12        0.15s     160 t/s      16k
phi-3-medium-128k         $0.14       $0.14        0.2s      130 t/s      128k
qwen-2.5-32b              $0.18       $0.20        0.25s     100 t/s      32k
qwen-2.5-coder-32b        $0.18       $0.20        0.25s     95 t/s       32k
mixtral-8x7b              $0.24       $0.24        0.2s      120 t/s      33k
deepseek-v3               $0.30       $0.30        0.3s      80 t/s       64k
gemma-2-27b               $0.30       $0.30        0.25s     90 t/s       8k
llama-3.1-70b             $0.35       $0.40        0.3s      85 t/s       128k
llama-3.3-70b             $0.35       $0.40        0.28s     90 t/s       128k
qwen-2.5-72b              $0.35       $0.40        0.35s     75 t/s       32k
mixtral-8x22b             $0.65       $0.65        0.4s      65 t/s       66k
llama-3.2-90b-vision      $0.65       $0.65        0.5s      50 t/s       128k
llama-3.1-405b            $1.80       $1.80        0.7s      35 t/s       128k
deepseek-r1               $1.50       $4.00        2s        30 t/s       64k
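
Since billing is per token and prices are quoted per million tokens, the cost of a single request can be estimated from its token counts. The sketch below is illustrative only: the function names are hypothetical, only a few rows of the table are copied in, and the time estimate assumes the Latency column means time to first token and the Throughput column means output tokens per second, which the listing does not state.

```python
# Per-model figures copied from the pricing table:
# (input $/M tokens, output $/M tokens, latency in seconds, throughput in tokens/s)
MODELS = {
    "llama-3.2-1b": (0.02, 0.02, 0.08, 350),
    "llama-3.1-8b": (0.06, 0.06, 0.15, 200),
    "llama-3.3-70b": (0.35, 0.40, 0.28, 90),
    "deepseek-r1": (1.50, 4.00, 2.0, 30),
}


def request_cost(model, input_tokens, output_tokens):
    """Estimated cost in USD for one request (hypothetical helper)."""
    input_price, output_price, _, _ = MODELS[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def estimated_seconds(model, output_tokens):
    """Rough response time, assuming latency is time to first token
    and throughput is output tokens per second (both are assumptions)."""
    _, _, latency, throughput = MODELS[model]
    return latency + output_tokens / throughput


if __name__ == "__main__":
    # 2,000 input tokens and 1,000 output tokens on deepseek-r1
    print(f"{request_cost('deepseek-r1', 2000, 1000):.4f} USD")
    print(f"{estimated_seconds('llama-3.1-8b', 500):.2f} s")
```

For models with different input and output prices, such as deepseek-r1, output-heavy workloads dominate the bill, so it is worth estimating both sides separately.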

Reputation Details

Pricing: 90
Reliability: 90
Features: 75

Highlights

  • Competitive per-token pricing, starting at $0.02 per million tokens (llama-3.2-1b)
  • 99.9% uptime SLA
  • Autoscaling supported
  • No cold start

Compare with Others

Provider       Overall   Pricing   Reliability   Features   Models
DeepInfra      86        90        90            75         21
Together AI    78        70        90            75         20
Fireworks AI   78        70        90            75         14
Groq           86        90        90            75         10
DeepSeek       72        70        70            75         3
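
The listing does not say how the Overall score is derived from the three sub-scores. As a hedged illustration, one weighting that happens to reproduce every row of the comparison table is 40% Pricing, 30% Reliability, 30% Features, rounded half up. These weights are an inference from the five rows shown, not a documented formula, and the function name is hypothetical.

```python
def overall_score(pricing, reliability, features):
    """Weighted score with assumed weights 0.4 / 0.3 / 0.3, rounded half up.

    Integer arithmetic (weights in tenths) avoids floating-point rounding issues.
    """
    return (4 * pricing + 3 * reliability + 3 * features + 5) // 10


# Sub-scores and listed Overall values copied from the comparison table.
ROWS = {
    "DeepInfra": (90, 90, 75, 86),
    "Together AI": (70, 90, 75, 78),
    "Fireworks AI": (70, 90, 75, 78),
    "Groq": (90, 90, 75, 86),
    "DeepSeek": (70, 70, 75, 72),
}

if __name__ == "__main__":
    for provider, (p, r, f, listed) in ROWS.items():
        print(provider, overall_score(p, r, f), "listed:", listed)
```

Other weightings may fit the same five rows, so this should be read as a consistency check on the table rather than as the scoring method.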