Rack-scale Blackwell Ultra: 72 GPUs + 36 Grace CPUs as one giant accelerator
Pros
- Best-in-class rack-scale reasoning throughput
- Single coherent NVLink domain
- Turnkey for AI factories
- Big leap over GB200 NVL72
Cons
- Multi-million-dollar price
- Requires a liquid-cooled datacenter
- Realistically hyperscaler-only
- Long lead times
✓ Where it shines / best for
- AI reasoning models and long-context inference
- Hyperscale LLM training and serving
- Cloud and frontier-lab AI factories
✕ Not the best fit for
- Small or mid-size deployments
- Air-cooled-only datacenters
- Budget-constrained buyers
Features
- ✓ 72 Blackwell Ultra GPUs + 36 Grace CPUs in a fully liquid-cooled rack
- ✓ 1.44 EFLOPS sparse FP4 (~1.1 EFLOPS dense) inference per rack
- ✓ 1.5x dense FP4 FLOPS and 2x attention performance vs standard Blackwell
- ✓ 288GB HBM3e per GPU; 20TB total GPU memory at up to 576TB/s per rack
- ✓ 5th-gen NVLink (130TB/s) unifying 72 GPUs in one NVLink domain
- ✓ ConnectX-8 SuperNICs deliver 800Gb/s networking per GPU
- ✓ Purpose-built for AI reasoning and test-time-scaling inference
- ✓ Up to 50x AI factory output vs Hopper-based platforms
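The rack-level figures in the list above can be cross-checked against each other with simple arithmetic. The sketch below is illustrative only: it uses the numbers quoted in the list, assumes decimal units (1 TB = 1000 GB), and the function names are hypothetical.

```python
# Figures quoted in the feature list above.
GPUS_PER_RACK = 72
HBM_PER_GPU_GB = 288
RACK_HBM_BANDWIDTH_TBS = 576      # TB/s, whole rack
RACK_NVLINK_BANDWIDTH_TBS = 130   # TB/s, whole NVLink domain
RACK_SPARSE_FP4_EFLOPS = 1.44     # EFLOPS, whole rack


def rack_memory_tb(gpus=GPUS_PER_RACK, gb_per_gpu=HBM_PER_GPU_GB):
    """Total HBM across the rack in decimal terabytes (assumes 1 TB = 1000 GB)."""
    return gpus * gb_per_gpu / 1000


def per_gpu_share(rack_total, gpus=GPUS_PER_RACK):
    """Even split of a rack-level figure across all GPUs."""
    return rack_total / gpus


if __name__ == "__main__":
    print(f"Total HBM:            {rack_memory_tb():.1f} TB")
    print(f"HBM bandwidth / GPU:  {per_gpu_share(RACK_HBM_BANDWIDTH_TBS):.1f} TB/s")
    print(f"NVLink / GPU:         {per_gpu_share(RACK_NVLINK_BANDWIDTH_TBS):.2f} TB/s")
    print(f"Sparse FP4 / GPU:     {per_gpu_share(RACK_SPARSE_FP4_EFLOPS) * 1000:.0f} PFLOPS")
```

Under these assumptions, 72 GPUs at 288GB each come to roughly 20.7TB, which is consistent with the "20TB" headline figure.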
Pricing
| Plan | Price | Billing | Notes |
|---|---|---|---|
| Full NVL72 rack | $3,000,000-$4,000,000 | one-time | Estimated rack-scale price; the Blackwell Ultra generation commands a premium over GB200 |
| Cloud rental (per GB300 GPU) | ~$12.00-$30.00 | per hour | Early on-demand neocloud pricing; reserved pricing materially lower |
Pricing verified from the official source. Prices change often — confirm on the vendor's site before buying.
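To compare the two rows in the pricing table, a rough break-even estimate can be sketched as follows. This is a simplification, not a total-cost-of-ownership model: it uses only the price ranges quoted above, assumes all 72 GPUs would otherwise be rented at the on-demand rate, and ignores power, cooling, facilities, staffing, and reserved-pricing discounts. The function name is hypothetical.

```python
GPUS_PER_RACK = 72


def break_even_hours(rack_price_usd, rental_usd_per_gpu_hour, gpus=GPUS_PER_RACK):
    """Hours of full-rack rental whose cost equals the one-time rack price.

    Assumption: every GPU in the rack is rented at the same hourly rate,
    and no operating costs are counted on the ownership side.
    """
    return rack_price_usd / (gpus * rental_usd_per_gpu_hour)


# Cheapest rack against the most expensive rental: shortest break-even.
fastest = break_even_hours(3_000_000, 30.00)
# Most expensive rack against the cheapest rental: longest break-even.
slowest = break_even_hours(4_000_000, 12.00)

if __name__ == "__main__":
    for label, hours in (("fastest", fastest), ("slowest", slowest)):
        print(f"{label}: {hours:,.0f} hours (~{hours / 24:.0f} days of continuous use)")
```

With the quoted ranges, the break-even falls between roughly 1,400 and 4,600 hours of continuous full-rack use, or about two to six months, before any operating costs are considered.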
Specifications
| Spec | Value |
|---|---|
| Use | Exascale AI reasoning factories |
| Power | ~120-140kW per rack |
| Memory | ~20TB+ HBM3e across rack |
| Performance | ~1.1 EFLOPS FP4 inference per rack |
| Architecture | Blackwell Ultra + Grace (rack-scale) |
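The power figure above translates into daily energy use as sketched below. The 120-140kW range is from the table; the electricity price is a purely hypothetical placeholder, not a quoted rate, and the sketch assumes the rack draws its rated power around the clock with no cooling or facility overhead included.

```python
HOURS_PER_DAY = 24

# Hypothetical electricity price for illustration only; substitute your own rate.
ASSUMED_USD_PER_KWH = 0.10


def daily_energy_kwh(rack_power_kw):
    """Energy drawn in one day at constant rack power, in kWh."""
    return rack_power_kw * HOURS_PER_DAY


def daily_energy_cost_usd(rack_power_kw, usd_per_kwh=ASSUMED_USD_PER_KWH):
    """Daily electricity cost at the assumed price, excluding cooling overhead."""
    return daily_energy_kwh(rack_power_kw) * usd_per_kwh


if __name__ == "__main__":
    for power_kw in (120, 140):  # low and high end of the quoted range
        print(
            f"{power_kw} kW: {daily_energy_kwh(power_kw):,.0f} kWh/day, "
            f"~${daily_energy_cost_usd(power_kw):,.0f}/day at the assumed rate"
        )
```

At the quoted range this works out to 2,880 to 3,360 kWh per rack per day before cooling overhead.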