Together AI vs Fireworks AI
Token Pricing Comparison
| Model | Together AI In ($/M tokens) | Together AI Out ($/M tokens) | Fireworks AI In ($/M tokens) | Fireworks AI Out ($/M tokens) |
|---|---|---|---|---|
| phi-3-mini-128k | $0.10 | $0.10 | — | — |
| llama-3.1-8b | $0.18 | $0.18 | $0.20 | $0.20 |
| qwen-2.5-7b | $0.20 | $0.20 | $0.20 | $0.20 |
| codellama-7b | $0.20 | $0.20 | — | — |
| gemma-2-9b | $0.30 | $0.30 | $0.20 | $0.20 |
| codellama-13b | $0.22 | $0.22 | — | — |
| phi-4-14b | $0.30 | $0.30 | — | — |
| deepseek-v3 | $0.50 | $0.50 | $0.50 | $0.50 |
| mixtral-8x7b | $0.60 | $0.60 | $0.50 | $0.50 |
| qwen-2.5-32b | $0.50 | $0.50 | $0.50 | $0.50 |
| qwen-2.5-coder-32b | $0.50 | $0.50 | $0.50 | $0.50 |
| phi-3-medium-128k | $0.50 | $0.50 | — | — |
| codellama-34b | $0.78 | $0.78 | — | — |
| gemma-2-27b | $0.80 | $0.80 | $0.90 | $0.90 |
| llama-3.1-70b | $0.88 | $0.88 | $0.90 | $0.90 |
| llama-3.3-70b | $0.88 | $0.88 | $0.90 | $0.90 |
| qwen-2.5-72b | $0.90 | $0.90 | $0.90 | $0.90 |
| mixtral-8x22b | $1.20 | $1.20 | $1.20 | $1.20 |
| llama-3.1-405b | $3.50 | $3.50 | $3.00 | $3.00 |
| deepseek-r1 | $3.00 | $7.50 | $3.00 | $8.00 |
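
To turn the per-million-token rates above into a workload cost, multiply input and output token counts by their respective rates. The sketch below is illustrative only: the `PRICES` dictionary copies two rows from the table, while the function name `estimate_cost` and the monthly token volumes are assumptions made up for the example, not figures from either provider.

```python
# Prices in USD per million tokens, copied from the table above: (input, output).
PRICES = {
    "together": {"llama-3.3-70b": (0.88, 0.88), "deepseek-r1": (3.00, 7.50)},
    "fireworks": {"llama-3.3-70b": (0.90, 0.90), "deepseek-r1": (3.00, 8.00)},
}


def estimate_cost(provider, model, input_tokens, output_tokens):
    """Return the estimated USD cost of a workload at the listed rates."""
    in_rate, out_rate = PRICES[provider][model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Hypothetical workload: 50M input tokens and 10M output tokens per month.
for provider in PRICES:
    cost = estimate_cost(provider, "deepseek-r1", 50_000_000, 10_000_000)
    print(f"{provider}: ${cost:.2f}")
# together: $225.00
# fireworks: $230.00
```

Because deepseek-r1 is the only model in the table with different input and output rates, the split between prompt and completion tokens matters only for that row; for every other model the cost depends on the total token count alone.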
Latency & Throughput
| Model | Together AI Latency | Together AI tok/s | Fireworks AI Latency | Fireworks AI tok/s |
|---|---|---|---|---|
| phi-3-mini-128k | 0.15s | 220 | — | — |
| llama-3.1-8b | 0.2s | 200 | 0.15s | 250 |
| qwen-2.5-7b | 0.2s | 180 | 0.15s | 200 |
| codellama-7b | 0.15s | 200 | — | — |
| gemma-2-9b | 0.2s | 160 | 0.15s | 180 |
| codellama-13b | 0.2s | 150 | — | — |
| phi-4-14b | 0.2s | 140 | — | — |
| deepseek-v3 | 0.4s | 70 | 0.35s | 75 |
| mixtral-8x7b | 0.3s | 100 | 0.2s | 120 |
| qwen-2.5-32b | 0.3s | 110 | 0.25s | 110 |
| qwen-2.5-coder-32b | 0.3s | 105 | 0.25s | 105 |
| phi-3-medium-128k | 0.25s | 120 | — | — |
| codellama-34b | 0.4s | 70 | — | — |
| gemma-2-27b | 0.3s | 85 | 0.3s | 85 |
| llama-3.1-70b | 0.4s | 80 | 0.3s | 90 |
| llama-3.3-70b | 0.35s | 85 | 0.28s | 95 |
| qwen-2.5-72b | 0.4s | 75 | 0.35s | 80 |
| mixtral-8x22b | 0.5s | 60 | 0.45s | 65 |
| llama-3.1-405b | 0.8s | 35 | 0.7s | 40 |
| deepseek-r1 | 2s | 30 | 2.5s | 25 |
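
The two columns can be combined into a rough end-to-end estimate for a single response. The sketch below assumes that the latency column is time to first token and that tok/s is a steady generation rate; the page does not define either metric, so treat the result as a back-of-the-envelope figure. The name `estimate_response_seconds` and the 500-token response length are hypothetical.

```python
# (latency in seconds, throughput in tokens/second), copied from the table above.
PERF = {
    "together": {"llama-3.1-70b": (0.4, 80), "deepseek-r1": (2.0, 30)},
    "fireworks": {"llama-3.1-70b": (0.3, 90), "deepseek-r1": (2.5, 25)},
}


def estimate_response_seconds(provider, model, output_tokens):
    """Approximate total time: initial latency plus generation time."""
    latency, tokens_per_second = PERF[provider][model]
    return latency + output_tokens / tokens_per_second


# Hypothetical 500-token response on each provider.
for provider in PERF:
    seconds = estimate_response_seconds(provider, "llama-3.1-70b", 500)
    print(f"{provider}: {seconds:.2f}s")
```

For long responses the throughput term dominates, so a small latency difference matters mostly for short, interactive completions.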
Feature Comparison
| Feature | Together AI | Fireworks AI |
|---|---|---|
| Provider Type | Inference API | Inference API |
| Billing Granularity | Token | Token |
| Autoscaling | Yes | Yes |
| SLA Uptime | 99.9% | 99.9% |
| Cold Start | None | None |
| Free Egress | Yes | Yes |
| Storage Cost | N/A | N/A |
| GPU Count | N/A | N/A |
| Models Offered | 20 models | 14 models |
Pros & Cons Summary
Together AI
Strengths
- Lower token pricing than Fireworks AI on 5 of the 14 shared models (equal on 6, higher on 3)
- Broader model catalog (20 vs 14 models)
Fireworks AI
Weaknesses
- Higher token pricing than Together AI on 5 of the 14 shared models (equal on 6, lower on 3)
- Smaller model catalog (14 vs 20 models)
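
The pricing comparison in this summary can be checked against the pricing table with a short tally. The sketch below copies the 14 models that both providers list and compares the combined input plus output rate; using the combined rate as the comparison key is an assumption made for this example, and the name `tally` is hypothetical.

```python
import math

# USD per million tokens for models listed by both providers:
# (together_in, together_out, fireworks_in, fireworks_out)
SHARED = {
    "llama-3.1-8b": (0.18, 0.18, 0.20, 0.20),
    "qwen-2.5-7b": (0.20, 0.20, 0.20, 0.20),
    "gemma-2-9b": (0.30, 0.30, 0.20, 0.20),
    "deepseek-v3": (0.50, 0.50, 0.50, 0.50),
    "mixtral-8x7b": (0.60, 0.60, 0.50, 0.50),
    "qwen-2.5-32b": (0.50, 0.50, 0.50, 0.50),
    "qwen-2.5-coder-32b": (0.50, 0.50, 0.50, 0.50),
    "gemma-2-27b": (0.80, 0.80, 0.90, 0.90),
    "llama-3.1-70b": (0.88, 0.88, 0.90, 0.90),
    "llama-3.3-70b": (0.88, 0.88, 0.90, 0.90),
    "qwen-2.5-72b": (0.90, 0.90, 0.90, 0.90),
    "mixtral-8x22b": (1.20, 1.20, 1.20, 1.20),
    "llama-3.1-405b": (3.50, 3.50, 3.00, 3.00),
    "deepseek-r1": (3.00, 7.50, 3.00, 8.00),
}


def tally(prices):
    """Count which provider has the lower combined in+out rate per model."""
    counts = {"together": 0, "fireworks": 0, "tie": 0}
    for t_in, t_out, f_in, f_out in prices.values():
        t_total, f_total = t_in + t_out, f_in + f_out
        if math.isclose(t_total, f_total):
            counts["tie"] += 1
        elif t_total < f_total:
            counts["together"] += 1
        else:
            counts["fireworks"] += 1
    return counts


print(tally(SHARED))
# {'together': 5, 'fireworks': 3, 'tie': 6}
```

On the listed rates, Together AI is cheaper on five shared models, Fireworks AI on three, and the remaining six are priced identically.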