Updated minutes ago · Sources: GPU Pricing, API Token Pricing, Model Registry

Gemini 1.5 Pro

Google · MoE · 175B parameters · 2,097,152-token context

Quality

86.0

Architecture Details

Type: MoE

Total Parameters: 175B

Active Parameters: 40B

Layers: 72

Hidden Dimension: 8,192

Attention Heads: 64

KV Heads: 8

Head Dimension: 128

Vocab Size: 256,000

Total Experts: 16

Active Experts: 2

Memory Requirements

BF16 Weights: 350.0 GB

FP8 Weights: 175.0 GB

INT4 Weights: 87.5 GB

KV-Cache per Token: 147,456 bytes

Activation Estimate: 4.00 GB
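
The weight and KV-cache figures above appear to follow from the architecture details by simple arithmetic. The sketch below reproduces them under two assumptions that the page does not state: 1 GB is taken as 10^9 bytes, and the KV cache stores one byte per element (a 2-byte BF16 cache would double the per-token figure). The function names are illustrative only.

```python
# Sketch: reproduce the listed memory figures from the architecture details.
# Assumptions (not stated on the page): 1 GB = 1e9 bytes, and the KV-cache
# figure uses 1 byte per cached element.

TOTAL_PARAMS = 175e9
LAYERS = 72
KV_HEADS = 8
HEAD_DIM = 128
CONTEXT_TOKENS = 2_097_152

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weight_memory_gb(total_params: float, precision: str) -> float:
    """Weight footprint in decimal gigabytes for a given precision."""
    return total_params * BYTES_PER_PARAM[precision] / 1e9


def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_element: int = 1) -> int:
    """Bytes of keys and values stored per token across all layers."""
    return 2 * layers * kv_heads * head_dim * bytes_per_element


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weight_memory_gb(TOTAL_PARAMS, precision):.1f} GB")
    per_token = kv_cache_bytes_per_token(LAYERS, KV_HEADS, HEAD_DIM)
    print(f"KV cache per token: {per_token} bytes")
    print(f"KV cache at full context: {per_token * CONTEXT_TOKENS / 1e9:.1f} GB")
```

Under these assumptions, a single sequence at the full context length needs far more KV-cache memory than one token's figure suggests, so long-context serving is likely to be bounded by the cache rather than by the weights alone.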

Fits on (single-node)

B200 SXM · INT4
B100 SXM · INT4
GB200 NVL72 (per GPU) · INT4
GB300 NVL72 (per GPU) · INT4
H200 SXM · INT4
H100 NVL 94GB (per GPU pair) · INT4
Instinct MI300X · INT4
Instinct MI325X · FP8

GPU Recommendations

B200 NVL (pair) · optimal

BF16 · 2 GPUs · tensorrt-llm

Score: 98/100

Throughput: 280.0 tok/s

Cost/Month: $19,929

Cost/M Tokens: $27.08

B200 SXM · optimal

BF16 · 4 GPUs · tensorrt-llm

Score: 93/100

Throughput: 280.0 tok/s

Cost/Month: $17,044

Cost/M Tokens: $23.16

H200 SXM · optimal

BF16 · 4 GPUs · tensorrt-llm

Score: 90/100

Throughput: 280.0 tok/s

Cost/Month: $10,211

Cost/M Tokens: $13.88
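
The Cost/M Tokens values for the three configurations are consistent with dividing the monthly cost by the tokens generated in a month at the listed throughput. The sketch below shows that calculation, assuming a 730-hour month (8,760 hours per year divided by 12) and 100% utilization; neither assumption is stated on the page, and the function name is illustrative.

```python
# Sketch: derive cost per million tokens from monthly cost and throughput.
# Assumptions (not stated on the page): 730 hours per month, and the GPUs
# generate tokens at the listed throughput for the whole month.

HOURS_PER_MONTH = 730


def cost_per_million_tokens(monthly_cost_usd: float,
                            tokens_per_second: float,
                            hours_per_month: float = HOURS_PER_MONTH) -> float:
    """Monthly cost divided by millions of tokens produced in that month."""
    tokens_per_month = tokens_per_second * 3600 * hours_per_month
    return monthly_cost_usd / tokens_per_month * 1e6


if __name__ == "__main__":
    configs = {
        "B200 NVL (pair)": 19929,
        "B200 SXM": 17044,
        "H200 SXM": 10211,
    }
    for name, monthly_cost in configs.items():
        cost = cost_per_million_tokens(monthly_cost, 280.0)
        print(f"{name}: ${cost:.2f} per million tokens")
```

Because the figure assumes full utilization, the effective cost per million tokens rises in proportion as real utilization drops; at 50% utilization it would double.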

API Pricing Comparison

Provider	Input $/M	Output $/M	Badges
Google	$1.25	$5.00	Cheapest
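
As a rough illustration of how the per-million-token API prices translate into a bill, the sketch below prices a workload from its input and output token counts. The rates are the ones listed in the table; the workload sizes in the example are hypothetical, and the function name is illustrative.

```python
# Sketch: estimate API spend from the listed per-million-token prices.
# The token counts in the example below are hypothetical.

INPUT_USD_PER_M = 1.25
OUTPUT_USD_PER_M = 5.00


def api_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Total cost in USD for the given input and output token counts."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1e6


if __name__ == "__main__":
    # Hypothetical monthly workload: 500M input tokens, 100M output tokens.
    print(f"${api_cost_usd(500_000_000, 100_000_000):,.2f}")
```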

Quality Benchmarks

MMLU: 86.5

HumanEval: 65.0

GSM8K: 92.0

MT-Bench: 87.0

Capabilities

Features

✓ Tool Use · ✓ Vision · ✓ Code · ✓ Math · ✗ Reasoning · ✓ Multilingual · ✓ Structured Output

Supported Frameworks

Supported Precisions

BF16 (default)

Similar Models

Claude 3 Opus

175B params · dense

Quality: 88

from $75.00/M

Falcon 180B

180B params · dense

Quality: 60

from $2.40/M

Claude Opus 4

200B params · dense

Quality: 94

from $75.00/M

GPT-4o

200B params · MoE

Quality: 91

from $10.00/M

GPT-4 Turbo

200B params · MoE

Quality: 86

from $30.00/M