Together AI

Inference API Provider

Reputation: 78/100
together.ai

Together AI offers 20 model endpoints, with output pricing starting at $0.10/million tokens. That entry-level price is about 90% below the market average of $1.03/million output tokens across inference API providers.
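The "90% below average" figure can be checked with a short calculation. The sketch below is illustrative only: the function name `percent_below_average` is an assumption introduced here, and the two prices are the ones quoted on this page.

```python
def percent_below_average(price: float, average: float) -> float:
    """Return how far `price` sits below `average`, as a percentage of the average."""
    if average <= 0:
        raise ValueError("average must be positive")
    return (average - price) / average * 100.0


# Figures quoted on this page (USD per million output tokens).
entry_price = 0.10
market_average = 1.03

discount = percent_below_average(entry_price, market_average)
print(f"{discount:.1f}% below average")  # prints "90.3% below average"
```

The comparison uses the cheapest listed model against a market-wide average, so it describes the price floor rather than typical pricing across all 20 endpoints.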

Provider Overview

Type: inference
Billing: Per token
Egress: Free
SLA Uptime: 99.9%
Autoscaling: Yes
Cold Start: None

Model Pricing (20)

Model                       Input $/M  Output $/M  Latency  Throughput  Context
phi-3-mini-128k (cheapest)  $0.10      $0.10       0.15s    220 t/s     128k
llama-3.1-8b                $0.18      $0.18       0.2s     200 t/s     128k
qwen-2.5-7b                 $0.20      $0.20       0.2s     180 t/s     32k
codellama-7b                $0.20      $0.20       0.15s    200 t/s     16k
codellama-13b               $0.22      $0.22       0.2s     150 t/s     16k
gemma-2-9b                  $0.30      $0.30       0.2s     160 t/s     8k
phi-4-14b                   $0.30      $0.30       0.2s     140 t/s     16k
deepseek-v3                 $0.50      $0.50       0.4s     70 t/s      64k
qwen-2.5-32b                $0.50      $0.50       0.3s     110 t/s     32k
qwen-2.5-coder-32b          $0.50      $0.50       0.3s     105 t/s     32k
phi-3-medium-128k           $0.50      $0.50       0.25s    120 t/s     128k
mixtral-8x7b                $0.60      $0.60       0.3s     100 t/s     33k
codellama-34b               $0.78      $0.78       0.4s     70 t/s      16k
gemma-2-27b                 $0.80      $0.80       0.3s     85 t/s      8k
llama-3.1-70b               $0.88      $0.88       0.4s     80 t/s      128k
llama-3.3-70b               $0.88      $0.88       0.35s    85 t/s      128k
qwen-2.5-72b                $0.90      $0.90       0.4s     75 t/s      32k
mixtral-8x22b               $1.20      $1.20       0.5s     60 t/s      66k
llama-3.1-405b              $3.50      $3.50       0.8s     35 t/s      128k
deepseek-r1                 $3.00      $7.50       2s       30 t/s      64k
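Because billing is per token, the cost of a single request follows directly from the input and output prices listed above. The following is a minimal sketch, not an official calculator: `PRICES` and `estimate_cost` are hypothetical names, only three rows of the table are copied in, and the token counts in the example are made up for illustration.

```python
# (input $/M tokens, output $/M tokens), copied from three rows of the pricing table.
PRICES = {
    "phi-3-mini-128k": (0.10, 0.10),
    "llama-3.3-70b": (0.88, 0.88),
    "deepseek-r1": (3.00, 7.50),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request under per-token billing."""
    input_price, output_price = PRICES[model]
    # Prices are quoted per million tokens, so scale the total down by 1e6.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical request: 2,000 prompt tokens and 500 completion tokens.
cost = estimate_cost("deepseek-r1", input_tokens=2_000, output_tokens=500)
print(f"${cost:.5f}")  # prints "$0.00975"
```

Keeping input and output prices separate matters for models such as deepseek-r1, where output tokens cost 2.5 times as much as input tokens.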

Reputation Details

Pricing: 70
Reliability: 90
Features: 75

Highlights

  • Output pricing from $0.10/million tokens
  • 99.9% uptime SLA
  • Autoscaling supported
  • No cold start

Compare with Others

Provider      Overall  Pricing  Reliability  Features  Models
Together AI   78       70       90           75        20
Fireworks AI  78       70       90           75        14
Groq          86       90       90           75        10
DeepInfra     86       90       90           75        21
DeepSeek      72       70       70           75        3

Embed Badge

<a href="https://inferencebench.io/providers/together-ai/"><img src="data:image/svg+xml,%3Csvg%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%20width%3D%22254%22%20height%3D%2220%22%20role%3D%22img%22%20aria-label%3D%22InferenceBench%20Verified%3A%20Together%20AI%22%3E%0A%20%20%3Ctitle%3EInferenceBench%20Verified%3A%20Together%20AI%3C%2Ftitle%3E%0A%20%20%3ClinearGradient%20id%3D%22s%22%20x2%3D%220%22%20y2%3D%22100%25%22%3E%0A%20%20%20%20%3Cstop%20offset%3D%220%22%20stop-color%3D%22%23bbb%22%20stop-opacity%3D%22.1%22%2F%3E%0A%20%20%20%20%3Cstop%20offset%3D%221%22%20stop-opacity%3D%22.1%22%2F%3E%0A%20%20%3C%2FlinearGradient%3E%0A%20%20%3CclipPath%20id%3D%22r%22%3E%0A%20%20%20%20%3Crect%20width%3D%22254%22%20height%3D%2220%22%20rx%3D%223%22%20fill%3D%22%23fff%22%2F%3E%0A%20%20%3C%2FclipPath%3E%0A%20%20%3Cg%20clip-path%3D%22url(%23r)%22%3E%0A%20%20%20%20%3Crect%20width%3D%22166%22%20height%3D%2220%22%20fill%3D%22%23333%22%2F%3E%0A%20%20%20%20%3Crect%20x%3D%22166%22%20width%3D%2288%22%20height%3D%2220%22%20fill%3D%22%238b5cf6%22%2F%3E%0A%20%20%20%20%3Crect%20width%3D%22254%22%20height%3D%2220%22%20fill%3D%22url(%23s)%22%2F%3E%0A%20%20%3C%2Fg%3E%0A%20%20%3Cg%20fill%3D%22%23fff%22%20text-anchor%3D%22middle%22%20font-family%3D%22Verdana%2CGeneva%2CDejaVu%20Sans%2Csans-serif%22%20text-rendering%3D%22geometricPrecision%22%20font-size%3D%2211%22%3E%0A%20%20%20%20%3Ctext%20aria-hidden%3D%22true%22%20x%3D%2283%22%20y%3D%2214%22%20fill%3D%22%23010101%22%20fill-opacity%3D%22.3%22%3EInferenceBench%20Verified%3C%2Ftext%3E%0A%20%20%20%20%3Ctext%20x%3D%2283%22%20y%3D%2213%22%3EInferenceBench%20Verified%3C%2Ftext%3E%0A%20%20%20%20%3Ctext%20aria-hidden%3D%22true%22%20x%3D%22210%22%20y%3D%2214%22%20fill%3D%22%23010101%22%20fill-opacity%3D%22.3%22%3ETogether%20AI%3C%2Ftext%3E%0A%20%20%20%20%3Ctext%20x%3D%22210%22%20y%3D%2213%22%3ETogether%20AI%3C%2Ftext%3E%0A%20%20%3C%2Fg%3E%0A%3C%2Fsvg%3E" alt="InferenceBench Verified — Together AI" /></a>