Solar Pro 22B
Upstage · dense · 22B parameters · 4,096 context
Parameters
22B
Context Window
4K tokens
Architecture
Dense
Best GPU
H20
Cheapest API
$0.50/M
Intelligence Brief
Solar Pro 22B is a 22B-parameter dense model from Upstage that uses Grouped Query Attention (GQA), with 48 layers and a hidden dimension of 4,096. It has a 4,096-token context window and supports tool use, structured output, code, math, and multilingual text. The most cost-effective API deployment is via Upstage at $0.50/M output tokens. For self-hosted inference, the H20 is the top-rated GPU, reaching roughly 1.1K tok/s at FP8 for $940/month.
Architecture Details
Memory Requirements
BF16 Weights
44.0 GB
FP8 Weights
22.0 GB
INT4 Weights
11.0 GB
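The weight figures above follow from parameter count times bytes per parameter. A minimal sketch, assuming decimal gigabytes (1 GB = 1e9 bytes) and the usual storage widths of 2 bytes for BF16, 1 byte for FP8, and half a byte for INT4; the function name is illustrative and not part of any library:

```python
# Bytes stored per parameter at each precision (assumed standard widths).
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return weight memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weight_memory_gb(22e9, precision):.1f} GB")
```

This covers weights only; KV-cache and activations come on top.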
GPU Compatibility Matrix
Solar Pro 22B is compatible with 74% of GPU configurations across 41 GPUs at 3 precision levels.
GPU Recommendations
| Configuration | Score | Throughput | Latency (ITL) | Est. TTFT | Cost/Month | Cost/M Tokens |
|---|---|---|---|---|---|---|
| FP8 · 1 GPU · tensorrt-llm | 100/100 | 1.1K tok/s | 1.0ms | 0ms | $940 | $0.34 |
| FP8 · 1 GPU · tensorrt-llm | 95/100 | 1.1K tok/s | 1.0ms | 0ms | $1794 | $0.65 |
| FP8 · 1 GPU · tensorrt-llm | 95/100 | 760.5 tok/s | 1.3ms | 0ms | $1794 | $0.90 |
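The Cost/M Tokens figures can be approximated from monthly cost and throughput. The sketch below assumes a 730-hour month and a GPU that generates tokens at the listed rate the whole time; neither assumption is stated on this page, and the function name is illustrative. Under those assumptions the 760.5 tok/s entry comes out at about $0.90, matching the listed value. The entries shown as 1.1K tok/s are rounded, so they reproduce only approximately.

```python
HOURS_PER_MONTH = 730  # assumption: average month length, 100% utilization


def cost_per_million_tokens(
    monthly_cost_usd: float,
    tokens_per_second: float,
    hours_per_month: float = HOURS_PER_MONTH,
) -> float:
    """Return USD per million generated tokens for a fully utilized GPU."""
    tokens_per_month = tokens_per_second * 3600 * hours_per_month
    return monthly_cost_usd / (tokens_per_month / 1e6)


if __name__ == "__main__":
    # Third recommendation: $1794/month at 760.5 tok/s.
    print(f"${cost_per_million_tokens(1794, 760.5):.2f} per M tokens")
```

Real utilization is rarely 100%, so the effective cost per token is usually higher than this estimate.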
Deployment Options
API Deployment
upstage
$0.50/M
output tokens
Single GPU
H20
$940/mo
Min VRAM: 22 GB
Multi-GPU
A100 40GB SXM x2
375.0 tok/s
TP · $1613/mo
API Pricing Comparison
| Provider | Input $/M | Output $/M | Badges |
|---|---|---|---|
| upstage | $0.50 | $0.50 | Cheapest |
Cost Analysis
| Provider | Input $/M | Output $/M | ~Monthly Cost |
|---|---|---|---|
| upstage (Best Value) | $0.50 | $0.50 | $5 |
Cost per 1,000 Requests
Short (500 tok)
$0.35
via upstage
Medium (2K tok)
$1.40
via upstage
Long (8K tok)
$5.00
via upstage
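Per-request cost is input tokens times the input price plus output tokens times the output price. The page does not say how each request size is split between input and output, so the split used below (500 input plus 200 output tokens for a short request) is a hypothetical one, chosen only because it reproduces the $0.35 figure at $0.50/M for both directions:

```python
def cost_per_1000_requests(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float = 0.50,   # USD per million input tokens
    output_price_per_m: float = 0.50,  # USD per million output tokens
) -> float:
    """Return the USD cost of 1,000 identical requests."""
    per_request = (
        input_tokens * input_price_per_m + output_tokens * output_price_per_m
    ) / 1e6
    return per_request * 1000


if __name__ == "__main__":
    # Hypothetical split for a "short" request: 500 input + 200 output tokens.
    print(f"${cost_per_1000_requests(500, 200):.2f} per 1,000 requests")
```

Because input and output are priced the same here, only the total token count per request affects the result.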
Performance Estimates
Throughput by GPU
VRAM Breakdown (H20, FP8)
Precision Impact
bf16
44.0 GB
weights/GPU
fp8
22.0 GB
weights/GPU
~1.1K tok/s
int4
11.0 GB
weights/GPU
Capabilities
Features
Supported Frameworks
Supported Precisions
Where to Deploy Solar Pro 22B
Self-Hosted Infrastructure
Similar Models
Codestral 22B
22B params · dense
Quality: 63
from $0.90/M
Mistral Small 24B
24B params · dense
Quality: 68
from $0.30/M
Mistral Small 3.1 24B
24B params · dense
Quality: 50
from $0.30/M
GigaChat 20B
20B params · dense
Quality: 50
Claude 3.5 Haiku
20B params · dense
Quality: 67
from $4.00/M
Frequently Asked Questions
How much VRAM does Solar Pro 22B need for inference?
Solar Pro 22B requires approximately 44.0 GB of VRAM at BF16 precision, 22.0 GB at FP8, or 11.0 GB at INT4 quantization. Additional VRAM is needed for KV-cache (196608 bytes per token) and activations (~1.50 GB).
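To see how these pieces add up, the sketch below sums weights, KV-cache, and activations using only the figures quoted above. It assumes a simple additive model with no framework overhead or memory fragmentation; the `sequences` parameter (number of concurrent sequences) and the function names are illustrative additions.

```python
KV_BYTES_PER_TOKEN = 196_608  # from the answer above
CONTEXT_TOKENS = 4_096        # full context window
ACTIVATIONS_GB = 1.50         # approximate, from the answer above
WEIGHTS_GB = {"bf16": 44.0, "fp8": 22.0, "int4": 11.0}


def kv_cache_gb(tokens: int, sequences: int = 1) -> float:
    """Return KV-cache size in decimal GB for the given token count."""
    return KV_BYTES_PER_TOKEN * tokens * sequences / 1e9


def total_vram_gb(
    precision: str, tokens: int = CONTEXT_TOKENS, sequences: int = 1
) -> float:
    """Return weights + KV-cache + activations in decimal GB."""
    return WEIGHTS_GB[precision] + kv_cache_gb(tokens, sequences) + ACTIVATIONS_GB


if __name__ == "__main__":
    print(f"KV-cache at full context: {kv_cache_gb(CONTEXT_TOKENS):.2f} GB")
    print(f"Estimated FP8 total: {total_vram_gb('fp8'):.1f} GB")
```

For a single sequence at full context, the KV-cache is under 1 GB, so weights dominate the total. Serving many sequences at once scales the KV-cache term linearly.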
What is the best GPU for Solar Pro 22B?
The top recommended GPU for Solar Pro 22B is the H20 using FP8 precision. It achieves approximately 1.1K tokens/sec at an estimated cost of $940/month ($0.34/M tokens). Score: 100/100.
How much does Solar Pro 22B inference cost?
Solar Pro 22B API inference starts at $0.50/M input tokens and $0.50/M output tokens. Self-hosted inference costs depend on your GPU configuration; use our ROI calculator for a detailed breakdown.