Fireworks AI

Inference API Provider

Reputation: 78/100
fireworks.ai

Provider Overview

Type: inference
Billing: Per token
Egress: Free
SLA Uptime: 99.9%
Autoscaling: Yes
Cold Start: None
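
To put the 99.9% SLA uptime figure in concrete terms, the sketch below converts an uptime percentage into a permitted downtime budget. The function name and the 30-day window are assumptions made for illustration; the listing does not say over what period the SLA is measured.

```python
def allowed_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted by an uptime SLA over `days` days."""
    total_minutes = days * 24 * 60
    return (100.0 - uptime_percent) / 100.0 * total_minutes


if __name__ == "__main__":
    # 99.9% over an assumed 30-day window leaves roughly 43 minutes of downtime.
    print(f"{allowed_downtime_minutes(99.9):.1f} minutes")
```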

Model Pricing (14)

Model                     Input $/M  Output $/M  Latency  Throughput  Context
llama-3.1-8b (Cheapest)   $0.20      $0.20       0.15s    250 t/s     128k
gemma-2-9b                $0.20      $0.20       0.15s    180 t/s     8k
qwen-2.5-7b               $0.20      $0.20       0.15s    200 t/s     32k
deepseek-v3               $0.50      $0.50       0.35s    75 t/s      64k
mixtral-8x7b              $0.50      $0.50       0.2s     120 t/s     33k
qwen-2.5-32b              $0.50      $0.50       0.25s    110 t/s     32k
qwen-2.5-coder-32b        $0.50      $0.50       0.25s    105 t/s     32k
llama-3.1-70b             $0.90      $0.90       0.3s     90 t/s      128k
llama-3.3-70b             $0.90      $0.90       0.28s    95 t/s      128k
qwen-2.5-72b              $0.90      $0.90       0.35s    80 t/s      32k
gemma-2-27b               $0.90      $0.90       0.3s     85 t/s      8k
mixtral-8x22b             $1.20      $1.20       0.45s    65 t/s      66k
llama-3.1-405b            $3.00      $3.00       0.7s     40 t/s      128k
deepseek-r1               $3.00      $8.00       2.5s     25 t/s      64k
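
As a rough illustration of how per-token billing turns into a per-request charge, the sketch below multiplies token counts by the listed prices. It assumes that "$/M" means US dollars per million tokens; the dictionary, the function name, and the example token counts are hypothetical and not part of any provider API.

```python
# USD per million tokens as (input, output), copied from the pricing table.
# Assumption: "$/M" in the table means dollars per one million tokens.
PRICES_PER_MILLION = {
    "llama-3.1-8b": (0.20, 0.20),
    "llama-3.3-70b": (0.90, 0.90),
    "deepseek-r1": (3.00, 8.00),
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated charge for one request under per-token billing."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2,000 prompt tokens and 500 completion tokens.
    for name in PRICES_PER_MILLION:
        print(f"{name}: ${estimate_cost_usd(name, 2_000, 500):.6f}")
```

deepseek-r1 is the only listed model whose output price ($8.00) differs from its input price ($3.00), so output-heavy workloads cost proportionally more on that model.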

Reputation Details

Pricing: 70
Reliability: 90
Features: 75

Highlights

  • Good pricing
  • 99.9%+ SLA
  • Autoscaling supported
  • No cold start

Compare with Others

Provider      Overall  Pricing  Reliability  Features  Models
Fireworks AI  78       70       90           75        14
Together AI   78       70       90           75        20
Groq          86       90       90           75        10
DeepInfra     86       90       90           75        21
DeepSeek      72       70       70           75        3