Nvidia introduces a new merged CPU and GPU AI processor

Nvidia has announced two products: the GB200 NVL4, a monster quad-B200 GPU module featuring two Grace CPUs, and the H200 NVL PCIe GPU, aimed at air-cooled data centers.

The GB200 Grace Blackwell NVL4 Superchip is a more potent take on the standard (non-NVL4) dual-GPU variant, featuring four B200 Blackwell GPUs linked to each other with NVLink and two Grace Arm-based CPUs, all on one motherboard. The solution is aimed at HPC and AI-hybrid workloads and offers 1.3TB of coherent memory. Nvidia advertises the GB200 NVL4 as having 2.2X the simulation, 1.8X the training, and 1.8X the inference performance of the Nvidia GH200 NVL4 Grace Hopper Superchip, its direct predecessor.

Nvidia says the GB200 NVL4 Superchip will be available in 2H 2024 from various providers, including MSI, Asus, Gigabyte, Wistron, Pegatron, ASRock Rack, Lenovo, HP Enterprise, and more.

Swinging around to the opposite side of the spectrum, Nvidia's H200 NVL is a dual-slot air-cooled GPU featuring PCIe 5.0 connectivity (128 GB/s). The cooler is optimized for rack-mount solutions, featuring a flow-through design where intake air runs from right to left; there is no blower-style fan.
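
As a rough check on the 128 GB/s figure, the arithmetic can be sketched as follows. The sketch assumes the card uses a full x16 slot, which the article does not state; the 32 GT/s per-lane rate and 128b/130b encoding are the standard PCIe 5.0 parameters. Under those assumptions, 128 GB/s corresponds to the raw bidirectional total, and the usable rate in each direction is a little lower.

```python
# Back-of-the-envelope PCIe 5.0 bandwidth, assuming an x16 link.
GT_PER_S_PER_LANE = 32            # PCIe 5.0 signaling rate per lane
ENCODING_EFFICIENCY = 128 / 130   # 128b/130b line encoding
LANES = 16                        # assumption: full x16 slot


def pcie_bandwidth_gb_s(lanes=LANES):
    """Return (raw, effective) bandwidth per direction in GB/s."""
    raw = GT_PER_S_PER_LANE * lanes / 8        # bits -> bytes
    effective = raw * ENCODING_EFFICIENCY      # after encoding overhead
    return raw, effective


raw, effective = pcie_bandwidth_gb_s()
print(f"raw, per direction:       {raw:.1f} GB/s")
print(f"effective, per direction: {effective:.1f} GB/s")
print(f"raw, both directions:     {2 * raw:.0f} GB/s")
```

Protocol overhead beyond line encoding (packet headers, flow control) is ignored here, so real-world transfer rates would be lower still.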

Performance is slightly lower than that of Nvidia's outgoing H200 in the SXM form factor. The H200 NVL is rated at 30 TFLOPS of FP64 and 60 TFLOPS of FP32. Tensor core performance is rated at 60 TFLOPS of FP64, 835 TFLOPS of TF32, 1,671 TFLOPS of BFLOAT16, 1,671 TFLOPS of FP16, 3,341 TFLOPS of FP8, and 3,341 TOPS of INT8.
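
The tensor core ratings follow a simple pattern: throughput roughly doubles each time the data type narrows. A minimal sketch, using only the figures quoted in this article (the dictionary and function names are illustrative, not part of any Nvidia tool):

```python
# Tensor core ratings quoted above, in trillions of operations per second.
tensor_ratings = {
    "TF32": 835,
    "BF16": 1671,
    "FP16": 1671,
    "FP8": 3341,
    "INT8": 3341,
}


def relative_to(figures, baseline):
    """Express each rating as a multiple of the baseline format."""
    base = figures[baseline]
    return {name: round(value / base, 2) for name, value in figures.items()}


ratios = relative_to(tensor_ratings, "TF32")
for name, ratio in ratios.items():
    print(f"{name:>5}: {ratio:.2f}x TF32")
```

The 16-bit formats come out at about twice the TF32 rate and the 8-bit formats at about four times.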

However, Nvidia says the H200 NVL is much faster than the H100 NVL it replaces. It features 1.5X the memory capacity and 1.2X the memory bandwidth, delivering up to 1.7X faster inference performance and 1.3X faster performance for HPC workloads. Nvidia also drew a brief comparison with Ampere, stating that the H200 NVL is 2.5X faster than Ampere's equivalent GPUs.

The H200 NVL PCIe GPU is optimized for the vast majority of data center configurations, including air-cooled server racks. Nvidia states that according to a survey, roughly 70% of enterprise racks use air cooling and 20kW of power or less. Because the H200 NVL is a PCIe card, data center providers can reuse their existing racks and replace only the GPUs, reducing waste and significantly cutting the cost of upgrading hardware. The H200 NVL is also equipped with NVLink, offering up to 900 GB/s of bandwidth per GPU and enabling system providers to link up to four GPUs in a single system to boost performance.

Nvidia's new air-cooled GPU arrives at a time when Nvidia's Blackwell GPUs are having severe overheating problems. Even with full liquid cooling systems, system integrators have been forced to redesign server racks for Blackwell GPUs because of the enormous heat the GPUs dissipate in racks that consume up to 120kW. The H200 NVL is not even a close competitor to the B200, but Nvidia's air-cooled data center GPU highlights the significant advantages of lower-power, air-cooled GPUs.

The H200 NVL will be available from various providers such as Dell, HP Enterprise, Lenovo, and Supermicro. Additionally, the new GPU will be available in platforms from Aivres, ASRock Rack, Asus, Gigabyte, Ingrasys, Inventec, MSI, Pegatron, QCT, Wistron, and Wiwynn.
