Sources: GPU Pricing, API Token Pricing, Model Registry

Megatron-Turing NLG 530B

NVIDIA · dense · 530B parameters · 2,048 context

Quality

58.0

Architecture Details

Type: Dense

Total Parameters: 530B

Active Parameters: 530B

Layers: 105

Hidden Dimension: 20,480

Attention Heads: 128

KV Heads: 128

Head Dimension: 160

Vocab Size: 50,257
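
As a sanity check on the figures above, the total can be approximated from the layer count, hidden dimension, and vocabulary size. This is a minimal sketch using the common 12·L·h² rule of thumb for GPT-style dense decoders; it assumes a 4x-wide MLP and a single tied embedding matrix, and it ignores biases, layer norms, and positional embeddings. The function name is illustrative only.

```python
def estimate_dense_params(layers: int, hidden: int, vocab: int) -> int:
    """Rough parameter count for a GPT-style dense decoder (approximation)."""
    # Per layer: 4*h^2 for attention (Q, K, V, output projection)
    # plus 8*h^2 for an MLP with a 4x expansion (assumed).
    per_layer = 12 * hidden * hidden
    # Token embedding matrix, assumed tied with the output projection.
    embedding = vocab * hidden
    return layers * per_layer + embedding


total = estimate_dense_params(layers=105, hidden=20_480, vocab=50_257)
head_dim = 20_480 // 128  # hidden dimension / attention heads

print(f"~{total / 1e9:.1f}B parameters, head dimension {head_dim}")
```

Under these assumptions the estimate lands within about 0.1% of the listed 530B, and the head dimension matches the listed 160.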

Memory Requirements

BF16 Weights

1060.0 GB

FP8 Weights

530.0 GB

INT4 Weights

265.0 GB
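
The three weight figures follow directly from parameter count times bytes per parameter. A short sketch, assuming decimal gigabytes (1 GB = 10^9 bytes) and ignoring any quantization metadata such as scales or zero points:

```python
# Bytes per parameter for each listed precision.
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weight_memory_gb(params: float, precision: str) -> float:
    """Weight footprint in decimal GB (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


PARAMS = 530e9  # total parameters listed for this model

for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_memory_gb(PARAMS, precision):.1f} GB")
```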

KV-Cache per Token: 3,440,640 bytes
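
For reference, the per-token KV-cache size for standard multi-head attention is usually computed as 2 (keys and values) × layers × KV heads × head dimension × bytes per element. The sketch below applies that formula to the architecture listed on this page. The listed 3,440,640 bytes is reproduced with a head dimension of 128 and 1-byte elements; with the listed head dimension of 160 the same formula gives a larger value, so the page's figure may rest on different inputs. The function name is illustrative.

```python
def kv_cache_bytes_per_token(layers: int, kv_heads: int,
                             head_dim: int, bytes_per_elem: int) -> int:
    """Bytes of KV cache stored per token: keys and values across all layers."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem


LAYERS, KV_HEADS = 105, 128

listed_dims_bf16 = kv_cache_bytes_per_token(LAYERS, KV_HEADS, 160, 2)
listed_dims_fp8 = kv_cache_bytes_per_token(LAYERS, KV_HEADS, 160, 1)
page_figure = kv_cache_bytes_per_token(LAYERS, KV_HEADS, 128, 1)

print(f"head_dim=160, BF16: {listed_dims_bf16:,} bytes/token")
print(f"head_dim=160, FP8:  {listed_dims_fp8:,} bytes/token")
print(f"head_dim=128, FP8:  {page_figure:,} bytes/token")

# Cache for one sequence at the full 2,048-token context, BF16.
full_context_gb = listed_dims_bf16 * 2_048 / 1e9
print(f"Full context, BF16: {full_context_gb:.1f} GB per sequence")
```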

Activation Estimate: 12.00 GB

Fits on (single-node)

B200 NVL (pair) INT4
B200 SXM x2 INT4
B100 SXM x2 INT4
GB200 NVL72 (per GPU) x2 INT4
GB300 NVL72 (per GPU) x2 INT4
H100 NVL 94GB (per GPU pair) x2 INT4
Instinct MI300X x2 INT4
Instinct MI325X x2 INT4
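
A fit check of this kind can be approximated by comparing weights plus the activation estimate against the pooled GPU memory. The sketch below is an assumption about how such a check might work, not the page's actual method. It ignores KV cache, framework overhead, and fragmentation, and it takes 192 GB per Instinct MI300X from public specifications. The helper names `fits` and `min_gpus` are hypothetical.

```python
import math

ACTIVATION_GB = 12.0  # activation estimate listed above
WEIGHTS_GB = {"BF16": 1060.0, "FP8": 530.0, "INT4": 265.0}


def fits(precision: str, gpu_mem_gb: float, n_gpus: int) -> bool:
    """True if weights + activations fit in the pooled memory of n_gpus."""
    needed = WEIGHTS_GB[precision] + ACTIVATION_GB
    return needed <= gpu_mem_gb * n_gpus


def min_gpus(precision: str, gpu_mem_gb: float) -> int:
    """Smallest GPU count whose pooled memory covers weights + activations."""
    needed = WEIGHTS_GB[precision] + ACTIVATION_GB
    return math.ceil(needed / gpu_mem_gb)


MI300X_GB = 192.0  # assumed per-GPU memory

print("INT4 on 2x MI300X:", fits("INT4", MI300X_GB, 2))
print("FP8 on 2x MI300X: ", fits("FP8", MI300X_GB, 2))
print("Minimum MI300X count for BF16:", min_gpus("BF16", MI300X_GB))
```

Under these assumptions only INT4 fits on two such GPUs, which is consistent with every entry in the list above being an INT4 configuration.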

GPU Recommendations

B200 NVL (pair) · optimal

FP8 · 2 GPUs · tensorrt-llm

88/100

score

Throughput

140.0 tok/s

Cost/Month

$19,929

Cost/M Tokens

$54.17

B200 SXM · optimal

FP8 · 4 GPUs · tensorrt-llm

83/100

score

Throughput

140.0 tok/s

Cost/Month

$17,044

Cost/M Tokens

$46.33

B100 SXM · optimal

FP8 · 4 GPUs · tensorrt-llm

83/100

score

Throughput

140.0 tok/s

Cost/Month

$17,082

Cost/M Tokens

$46.43
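
The Cost/M Tokens figures can be reproduced from Cost/Month and Throughput. The sketch below assumes a 730-hour month (8,760 hours per year divided by 12) and continuous generation at the listed throughput. Both are inferences from the numbers rather than stated on the page, and real utilization would normally be lower.

```python
HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year / 12


def cost_per_million_tokens(monthly_cost_usd: float,
                            tokens_per_second: float,
                            hours_per_month: float = HOURS_PER_MONTH) -> float:
    """USD per one million generated tokens at 100% utilization."""
    tokens_per_month = tokens_per_second * 3600 * hours_per_month
    return monthly_cost_usd / (tokens_per_month / 1e6)


# Monthly cost (USD) for each recommended configuration, all at 140.0 tok/s.
configs = {
    "B200 NVL (pair)": 19_929,
    "B200 SXM": 17_044,
    "B100 SXM": 17_082,
}

for name, monthly_cost in configs.items():
    cost = cost_per_million_tokens(monthly_cost, 140.0)
    print(f"{name}: ${cost:.2f} per million tokens")
```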

API Pricing Comparison

No API pricing data available for this model.

Quality Benchmarks

MMLU

63.0

HumanEval

30.0

GSM8K

50.0

MT-Bench

70.0

Capabilities

Features

✗ Tool Use · ✗ Vision · ✓ Code · ✓ Math · ✗ Reasoning · ✓ Multilingual · ✗ Structured Output

Supported Frameworks

tensorrt-llm · vllm

Supported Precisions

BF16 (default) · FP8 · INT4

Similar Models

Snowflake Arctic 480B

480B params · moe

Quality: 50

from $1.50/M

Gemini 2.0 Pro

600B params · moe

Quality: 88

from $4.00/M

Grok 3

600B params · moe

Quality: 90

from $15.00/M

Llama 3.1 405B

405B params · dense

Quality: 88

from $3.00/M

Llama 4 Maverick

400B params · moe

Quality: 89

from $1.80/M