What GPUs does Together AI offer?

This page lists Together AI as an inference API provider with 20 AI model endpoints; it does not list GPU hardware.

What is Together AI pricing?

Together AI model pricing starts at $0.10 per million input tokens and $0.10 per million output tokens.

Does Together AI offer autoscaling?

Yes, Together AI supports autoscaling for dynamic workload management. Cold start time is negligible.

Together AI

Inference API Provider

Reputation:

78/100

Get your Together AI API key: together.ai

Together AI offers 20 model endpoints with output pricing starting at $0.10/million tokens. Compared to the market average of $1.03/million output tokens across inference API providers, Together AI's entry-level pricing is 90% below average.
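The "90% below average" figure follows from the two prices quoted above. A minimal sketch of the arithmetic, where the helper name `percent_below` is made up for illustration:

```python
def percent_below(value: float, reference: float) -> float:
    """Return how far `value` sits below `reference`, as a percentage of `reference`."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return (reference - value) / reference * 100.0


entry_price = 0.10     # $/M output tokens, cheapest endpoint listed on this page
market_average = 1.03  # $/M output tokens, market average quoted on this page

discount = percent_below(entry_price, market_average)
print(f"{discount:.1f}% below average")
```

The result rounds to the 90% stated on the page.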

Provider Overview

Type

inference

Billing

Per token

Egress

Free

SLA Uptime

99.9%

Autoscaling

Yes

Cold Start

None

Model Pricing (20)

Model	Input $/M	Output $/M	Latency	Throughput	Context
phi-3-mini-128k (Cheapest)	$0.10	$0.10	0.15s	220 t/s	128k
llama-3.1-8b	$0.18	$0.18	0.2s	200 t/s	128k
qwen-2.5-7b	$0.20	$0.20	0.2s	180 t/s	32k
codellama-7b	$0.20	$0.20	0.15s	200 t/s	16k
codellama-13b	$0.22	$0.22	0.2s	150 t/s	16k
gemma-2-9b	$0.30	$0.30	0.2s	160 t/s	8k
phi-4-14b	$0.30	$0.30	0.2s	140 t/s	16k
deepseek-v3	$0.50	$0.50	0.4s	70 t/s	64k
qwen-2.5-32b	$0.50	$0.50	0.3s	110 t/s	32k
qwen-2.5-coder-32b	$0.50	$0.50	0.3s	105 t/s	32k
phi-3-medium-128k	$0.50	$0.50	0.25s	120 t/s	128k
mixtral-8x7b	$0.60	$0.60	0.3s	100 t/s	33k
codellama-34b	$0.78	$0.78	0.4s	70 t/s	16k
gemma-2-27b	$0.80	$0.80	0.3s	85 t/s	8k
llama-3.1-70b	$0.88	$0.88	0.4s	80 t/s	128k
llama-3.3-70b	$0.88	$0.88	0.35s	85 t/s	128k
qwen-2.5-72b	$0.90	$0.90	0.4s	75 t/s	32k
mixtral-8x22b	$1.20	$1.20	0.5s	60 t/s	66k
llama-3.1-405b	$3.50	$3.50	0.8s	35 t/s	128k
deepseek-r1	$3.00	$7.50	2s	30 t/s	64k
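Because billing is per token, the cost of a single request can be estimated from the Input and Output columns. The sketch below copies three rows from the table; the function name `request_cost` and the token counts are illustrative assumptions, and the model names are this page's labels, not necessarily the identifiers the API expects.

```python
# Prices in USD per million tokens (input, output), copied from the table above.
PRICES = {
    "phi-3-mini-128k": (0.10, 0.10),
    "llama-3.3-70b": (0.88, 0.88),
    "deepseek-r1": (3.00, 7.50),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request under per-token billing."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical request: 2,000 prompt tokens and 500 generated tokens.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 2_000, 500):.5f}")
```

For deepseek-r1 the output side matters most, since it is the only row where the output price differs from the input price.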

Reputation Details

Pricing

70/100

Reliability

90/100

Features

75/100

Highlights

Good pricing
99.9%+ SLA
Autoscaling supported
Fast cold start

Compare with Others

Provider	Overall	Pricing	Reliability	Features	Models
Together AI	78	70	90	75	20
Fireworks AI	78	70	90	75	14
Groq	86	90	90	75	10
DeepInfra	86	90	90	75	21
DeepSeek	72	70	70	75	3
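The page does not state how the Overall score is derived from the three sub-scores. One guess that happens to reproduce all five rows above is a 40/30/30 weighting of Pricing, Reliability and Features, with halves rounded up. The sketch below checks that guess; the weights and the function name `weighted_overall` are assumptions inferred from five data points, not a documented formula.

```python
# (overall, pricing, reliability, features), copied from the comparison table above.
ROWS = {
    "Together AI": (78, 70, 90, 75),
    "Fireworks AI": (78, 70, 90, 75),
    "Groq": (86, 90, 90, 75),
    "DeepInfra": (86, 90, 90, 75),
    "DeepSeek": (72, 70, 70, 75),
}


def weighted_overall(pricing: int, reliability: int, features: int) -> int:
    """Hypothetical 40/30/30 weighting, rounding halves up.

    Integer arithmetic is used so the result does not depend on float rounding.
    """
    return (4 * pricing + 3 * reliability + 3 * features + 5) // 10


for provider, (overall, pricing, reliability, features) in ROWS.items():
    guess = weighted_overall(pricing, reliability, features)
    status = "matches" if guess == overall else "differs"
    print(f"{provider}: listed {overall}, guessed {guess} ({status})")
```

Other weightings may fit these rows as well, so treat the match as a consistency check rather than confirmation.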
