GB200 pod setup (ultra dense) – Total TDP ≈ 2.05 MW / Max power ≈ 2.27 MW

Rack-by-rack AI network design using NVIDIA GB200 NVL72 as the accelerator platform, optimized to pack the maximum number of GB200 GPUs into exactly 18 racks (1 pod) while remaining fully operational (compute + fabric + storage + OOB).

Goal & key assumptions

  • Accelerator choice: NVIDIA GB200 NVL72 rack-scale systems (each integrates 72 Blackwell GPUs + 36 Grace CPUs, liquid-cooled, NVLink inside the rack). (NVIDIA, Supermicro, Aspen Systems)
  • Scale-out fabric: Single, Ethernet-only AI fabric built on NVIDIA Spectrum-X/Spectrum Ethernet (RoCEv2), 400 GbE class today; platform scales to 800 GbE as needed. (Spectrum-X improves AI performance/efficiency vs. traditional Ethernet.) (NVIDIA)
  • Storage & OOB: Shared over the same Ethernet fabric (separate VLANs/VRFs), plus a dedicated 1/10 GbE OOB network.
  • Cooling: Vendor liquid cooling as part of NVL72 (in-row CDU/loop as provided by the NVL72 solution). (Supermicro)

Rack allocation (18 total) — maximize GPUs

  • Racks 1–17 (Compute): 17× NVIDIA GB200 NVL72 (identical). → Total GPUs = 17 × 72 = 1,224 GPUs. (NVIDIA)
  • Rack 18 (Network / Storage / OOB): AI Ethernet leaves & spines, storage front-end switches and servers, OOB gear, and core edge.

This gives you the highest GPU count while still hosting all non-compute infrastructure in a single rack.

High-level topology

  • AI fabric (Ethernet, RoCEv2): Two logical planes (A/B) for fault isolation, each built from leaf–spine Spectrum Ethernet switches. Hosts (NVL72 compute trays) connect at 400 GbE today; the platform supports migration to 800 GbE later. (NVIDIA)
  • Storage: NVMe-oF / NFS over the same Ethernet fabric (dedicated VLANs/queues).
  • OOB: Redundant 1/10 GbE management fabric, separate PDUs, BMC reachability to all trays/switches.
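
For planning hand-offs, the logical separation above can be captured as a small data structure. The sketch below is illustrative only; the VLAN/VRF numbers and names are placeholders I chose for the example, not values from vendor documentation:

```
# Planning sketch of the logical separation described above. VLAN/VRF names and
# numbers are illustrative placeholders, not values taken from vendor documentation.
FABRIC_PLAN = {
    "ai_fabric": {
        "transport": "RoCEv2 over Spectrum-X Ethernet",
        "host_speed_gbe": 400,                       # platform can move to 800 GbE later
        "planes": {"A": {"leaves": 4, "spines": 2},  # two planes for fault isolation
                   "B": {"leaves": 4, "spines": 2}},
    },
    "compute_roce": {"vlan": 100, "vrf": "ai"},      # GPU-to-GPU scale-out traffic
    "storage":      {"vlan": 200, "vrf": "storage",  # NVMe-oF / NFS on the same fabric
                     "protocols": ["NVMe-oF", "NFS"]},
    "oob": {"speed_gbe": "1/10", "physically_separate": True,
            "targets": ["tray BMCs", "switch mgmt", "PDUs"]},
}

switches = sum(p["leaves"] + p["spines"] for p in FABRIC_PLAN["ai_fabric"]["planes"].values())
print(switches, "AI fabric switches across both planes")   # 12
```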

Port math (so you can cable it)

  • Each NVL72 exposes scale-out Ethernet at 400 Gb/s via SuperNICs (per compute tray). For planning, allocate 18× 400 GbE per NVL72 rack (i.e., one per compute tray). 17 racks → 306 × 400 GbE endpoints into the AI fabric. (Supermicro and NVIDIA materials describe 400 Gb/s SuperNIC scale-out on these systems.)
  • Leaves & spines (per fabric plane):
    • Leaves: 4 switches × 64×400 GbE = 256 ports (38–39 downlinks per leaf; remainder for uplinks).
    • Spines: 2 switches × 64×400 GbE = 128 ports available for leaf uplinks.
    • Across A+B: 8 leaves + 4 spines total.
  • Uplinks: ~25–26 uplinks/leaf → ~103 uplinks per fabric to the spines (≈1.5:1 oversubscription at the leaf layer; add leaves/spines if you need closer to non-blocking).
  • Storage: Attach storage front-end NICs to the leaf layer (both fabrics if you want multipath).

If you later decide to dual-home each compute tray at 2× 400 GbE (A+B), increase the leaf count (e.g., 6 leaves per fabric = 12 total) and keep the same two spines per fabric. The single network rack below has space reserved for that growth.
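
A short script makes the port math above reproducible, so the leaf/spine counts can be re-checked if the rack count or homing model changes. This is a minimal sketch under the single-homed assumption (one 400 GbE port per tray); the constants simply restate the bullets above:

```
# Port-math sanity check for the single-homed design (1× 400 GbE per compute tray).
# The constants restate the bullets above; tweak them to re-plan (e.g. for dual-homing).
RACKS, PORTS_PER_RACK = 17, 18          # compute racks × 400 GbE scale-out ports per rack
SWITCH_PORTS = 64                       # 64× 400 GbE per leaf/spine switch
LEAVES_PER_PLANE, SPINES_PER_PLANE = 4, 2

endpoints = RACKS * PORTS_PER_RACK                 # 306 host-facing 400 GbE ports
per_plane = endpoints // 2                         # 153 on Plane A, 153 on Plane B

leaf_ports  = LEAVES_PER_PLANE * SWITCH_PORTS      # 256 leaf ports per plane
uplinks     = leaf_ports - per_plane               # ~103 leaf-to-spine uplinks per plane
spine_ports = SPINES_PER_PLANE * SWITCH_PORTS      # 128 spine ports per plane

assert uplinks <= spine_ports, "spines cannot terminate all leaf uplinks"
print(f"{endpoints} endpoints, {per_plane} per plane, "
      f"~{per_plane / LEAVES_PER_PLANE:.0f} downlinks/leaf, {uplinks} uplinks/plane, "
      f"leaf oversubscription ≈ {per_plane / uplinks:.2f}:1")
```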

Rack elevations

Racks 1–17 — GB200 NVL72 (identical)

Logical elevation (vendor rack; exact U-heights are managed by the NVL72 integration)

  • Top: Ladder/cable managers (fiber trunk exits to Rack 18)
  • NVLink switch trays (internal NVLink fabric; 5th-gen NVLink, fully connects 72 GPUs intra-rack). (NVIDIA Developer)
  • Compute trays (18 trays total) — each tray houses Grace-Blackwell superchips; 1× 400 GbE scale-out port per tray to AI fabric (A or B as you assign). (NVIDIA, Supermicro)
  • Power shelves / bus bars (as provided)
  • Liquid cooling manifolds (as provided; in-row or in-rack CDU per vendor solution). (Supermicro)
  • Bottom: PDUs (A/B)

Per compute rack cabling:

  • 18× 400 GbE to leaves (split ~half to Fabric A, half to Fabric B).
  • OOB 1/10 GbE: 2–4 ports to OOB leaf pair in Rack 18.
  • (Optional) 25/100 GbE host mgmt to storage/OOB VLANs if your NVL72 BOM includes extra ports.

Rack 18 — Network / Storage / OOB (top-to-bottom)

  • U48–U47: Horizontal cable managers
  • U46–U45: Spine A (2× Spectrum Ethernet 64×400 GbE) — AI fabric spine (Plane A). (NVIDIA)
  • U44–U43: Spine B (2× Spectrum Ethernet 64×400 GbE) — AI fabric spine (Plane B). (NVIDIA)
  • U42–U39: Leaf A (4× Spectrum Ethernet 64×400 GbE) — downlinks to compute, uplinks to Spine A. (NVIDIA)
  • U38–U35: Leaf B (4× Spectrum Ethernet 64×400 GbE) — downlinks to compute, uplinks to Spine B. (NVIDIA)
  • U34–U33: Storage front-end Ethernet (2× Spectrum Ethernet; 100–400 GbE as required) — uplinks into both leaf groups (A+B), MLAG/ECMP. (NVIDIA)
  • U32–U27: Storage servers (3× 2U NVMe nodes fit in this span; size to capacity/IOPS target and borrow spare U-space if you need more).
  • U26: Edge/Firewall pair (2× 1U) — north-south to core/WAN (out of scope for GPU count).
  • U25: Mgmt aggregation (1× 48-port 10 GbE; uplinks to core).
  • U24: Mgmt aggregation (redundant).
  • U23–U22: Console server + KVM.
  • U21–U20: Timing (PTP/1PPS/ToD) if needed.
  • U19–U16: Spare space (reserved so you can add +2 leaves per fabric later for dual-homed 2× 400 GbE per tray).
  • U15–U1: PDUs / blanking / cable slack
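
As a quick sanity check on the elevation above, the rack-unit budget can be summed programmatically. The RU figures below restate the U-ranges listed above and assume a standard 48U frame:

```
# U-space budget check for Rack 18; the RU figures restate the elevation above.
# A 48U frame is assumed; adjust if the chosen rack differs.
elevation_ru = {
    "cable managers":               2,   # U48–U47
    "Spine A (2× 64×400G)":         2,   # U46–U45
    "Spine B (2× 64×400G)":         2,   # U44–U43
    "Leaf A (4× 64×400G)":          4,   # U42–U39
    "Leaf B (4× 64×400G)":          4,   # U38–U35
    "storage front-end switches":   2,   # U34–U33
    "storage servers (3× 2U)":      6,   # U32–U27
    "edge/firewall pair":           1,   # U26
    "mgmt aggregation (×2)":        2,   # U25–U24
    "console server + KVM":         2,   # U23–U22
    "timing (PTP)":                 2,   # U21–U20
    "spare for extra leaves":       4,   # U19–U16
    "PDUs / blanking / slack":     15,   # U15–U1
}
used = sum(elevation_ru.values())
assert used == 48, f"elevation does not fill the 48U frame (got {used}U)"
print(f"Rack 18 budget: {used}U of 48U allocated")
```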

Cabling plan (summary)

  • 400 GbE fabric:
    • From each compute rack: 18× 400G → 9 to Leaf-A, 9 to Leaf-B (Rack 18).
    • Leaves uplink to their respective Spines (A or B) using 400G trunks (ECMP).
  • Storage: Storage nodes connect to both Leaf-A and Leaf-B (LAG/ECMP).
  • OOB: Each tray/BMC and each switch has 1–2× 1/10 GbE into the OOB pair in Rack 18.
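
A small generator can turn this plan into a per-cable pull list. The sketch below assumes the 9/9 split per rack (odd trays to Plane A, even trays to Plane B) and spreads trays round-robin across the four leaves in each plane; the device names (R01-tray01, leaf-a1, ...) are illustrative only:

```
# Minimal cable-map sketch for the plan above: 18 tray ports per compute rack,
# odd trays to Plane A, even trays to Plane B, round-robin across the leaves
# in each plane. Naming (R01-tray01, leaf-a1, ...) is illustrative only.
def cable_map(racks: int = 17, trays: int = 18, leaves_per_plane: int = 4):
    runs = []
    for rack in range(1, racks + 1):
        for tray in range(1, trays + 1):
            plane = "a" if tray % 2 else "b"
            leaf = (tray // 2) % leaves_per_plane + 1   # spread trays across leaves
            runs.append((f"R{rack:02d}-tray{tray:02d}", f"leaf-{plane}{leaf}"))
    return runs

runs = cable_map()
print(len(runs), "x 400 GbE runs")   # 306
print(runs[:3])                      # first few cables from Rack 1
```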

Why this maximizes GPU count

  • You dedicate 17 of 18 racks to NVL72 compute, each with 72 GPUs, and consolidate all fabric/storage/OOB into a single shared rack.
  • NVL72 is specifically designed as a rack-scale, liquid-cooled GPU complex with internal NVLink; you only externalize the Ethernet scale-out ports, which keeps non-GPU hardware to a minimum across the row. (NVIDIA, Supermicro)
  • Using Spectrum-X Ethernet avoids a second fabric technology (e.g., InfiniBand) and leverages AI-tuned Ethernet that is power-efficient and predictable at scale, with a roadmap to 800 GbE without redesigning the rack plan. (NVIDIA)

Bill of materials (at a glance)

  • Compute: 17× NVIDIA GB200 NVL72 racks (72 GPUs + 36 Grace per rack). → 1,224 GPUs total. (NVIDIA)
  • AI fabric: 8× 64-port 400 GbE leaves (4 per fabric A/B) + 4× 64-port 400 GbE spines (2 per fabric A/B). (All NVIDIA Spectrum Ethernet family / Spectrum-X capable.)
  • Storage front-end: 2× Spectrum Ethernet (100–400 GbE), 3–6× 2U NVMe storage servers (size per capacity target).
  • OOB: 2× 48-port 1/10 GbE mgmt switches, 1× console server, 1× KVM, PDUs.

Power breakdown for the 18-rack deployment (17× GB200 NVL72 compute + 1× network/storage/OOB rack). The table shows typical TDP versus a worst-case maximum, using vendor specs (or the closest published figures) and reasonable assumptions.

Per-rack / per-device assumptions & key sources

  • Compute racks (17× NVL72)
    • Typical TDP: 120 kW per rack → planning number widely cited. (Sunbird DCIM, Continuum Labs)
    • Max/nameplate: 132 kW per rack (Supermicro 8× 33 kW shelves).
  • AI Fabric (Ethernet, RoCEv2)
    • 12× 64-port 400 GbE switches (8 leaf + 4 spine), per our topology.
    • Typical: 0.9 kW/switch (no optics); Max: 2.08 kW/switch (64 optics). (NVIDIA Docs)
  • Storage
    • 3× 2U NVMe storage nodes (minimal, shared): 0.5 kW typ / 0.722 kW max each. (ServeTheHome)
    • 2× storage front-end Ethernet switches: 0.466 kW typ / 0.60 kW max (per SN4600/C family). (NVIDIA Docs)
  • OOB / Mgmt
    • 2× OOB 1/10 GbE switches: 0.065–0.068 kW each. (HPE Support)
    • 1× console server: 0.03 kW typ / 0.045 kW max. (Opengear)
    • Edge/Firewall pair: 0.078 kW typ (two ~39 W units) / 1.74 kW max (two PA-5220 class). (Exclusive Networks, Router-Switch.com)
    • Misc (KVM, timing, controllers, PDUs overhead): 0.1 kW typ / 0.2 kW max (small, conservative allowance).

Power breakdown (18 racks total)

Subsystem / Gear                                     Qty   Typical TDP                 Max power
NVL72 compute racks (≈120 kW/rack typ; 132 kW max)    17   2,040 kW                    2,244 kW
AI fabric 400 GbE (Spectrum-4 64×400G)                12   10.8 kW (0.9 kW ea)         24.96 kW (2.08 kW ea)
Storage nodes, 2U NVMe                                 3   1.50 kW (0.5 kW ea)         2.17 kW (0.722 kW ea)
Storage FE Ethernet (SN4600/C class)                   2   0.93 kW (0.466 kW ea)       1.20 kW (0.60 kW ea)
OOB mgmt switches (48×1/10G)                           2   0.13 kW                     0.14 kW
Console server (Opengear/Lantronix)                    1   0.03 kW                     0.045 kW
Edge/Firewall (pair)                                   2   0.078 kW                    1.74 kW
Misc (KVM, timing, mgmt appliances, slack)             –   0.10 kW                     0.20 kW
Totals (IT load)                                           ≈ 2,053.6 kW (≈ 2.05 MW)    ≈ 2,274.4 kW (≈ 2.27 MW)
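
The totals can be reproduced from the per-device assumptions with a short calculation; the figures below are the same planning values cited in the assumptions section:

```
# Reproduces the IT-load totals above from the per-device planning assumptions
# (quantity, typical kW, max kW). Values are the ones cited in the assumptions section.
loads = [
    # (item,                           qty, typ_kw, max_kw)
    ("GB200 NVL72 compute rack",        17, 120.0,  132.0),
    ("64×400G fabric switch",           12,   0.9,    2.08),
    ("2U NVMe storage node",             3,   0.5,    0.722),
    ("storage front-end switch",         2,   0.466,  0.60),
    ("OOB mgmt switch",                  2,   0.065,  0.068),
    ("console server",                   1,   0.03,   0.045),
    ("edge/firewall pair (2 units)",     1,   0.078,  1.74),
    ("misc (KVM, timing, PDUs, slack)",  1,   0.10,   0.20),
]
typ = sum(qty * t for _, qty, t, _ in loads)
mx  = sum(qty * m for _, qty, _, m in loads)
print(f"Typical IT load ≈ {typ:,.1f} kW ({typ / 1000:.2f} MW)")   # ≈ 2,053.6 kW (2.05 MW)
print(f"Maximum IT load ≈ {mx:,.1f} kW ({mx / 1000:.2f} MW)")     # ≈ 2,274.4 kW (2.27 MW)
```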