Nvidia CEO Jensen Huang, delivering the opening keynote of the Computex computer technology conference on Monday in Taipei, Taiwan, unveiled a host of new products, including a new kind of ethernet switch dedicated to moving high volumes of data for artificial intelligence (AI) tasks.
“How do we introduce a new ethernet, that is backward compatible with everything, to turn every data center into a generative AI data center?” posed Huang in his keynote. “For the very first time, we are bringing the capabilities of high-performance computing into the ethernet market.”
The Spectrum-X, as the family of ethernet products is known, is “the world’s first high-performance ethernet for AI”, according to Nvidia. A key feature of the technology is that it “doesn’t drop packets”, said Gilad Shainer, the senior vice president of networking, in a media briefing.
The first iteration of Spectrum-X is Spectrum-4, which Nvidia called “the world’s first 51Tb/sec Ethernet switch built specifically for AI networks”. The switch works in conjunction with Nvidia’s BlueField data-processing unit (DPU) chips, which handle data fetching and queueing, and with Nvidia fiber-optic transceivers. The switch can route 128 ports of 400-gigabit ethernet, or 64 ports of 800-gigabit ethernet, from end to end, the company said.
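As a quick sanity check on the port figures quoted above, the two configurations can be multiplied out. This is a minimal sketch: the function name is hypothetical, and it assumes decimal units (1 terabit = 1,000 gigabits) and that every port runs at its full rated speed.

```python
def aggregate_tbps(ports: int, gbps_per_port: int) -> float:
    """Total switching capacity in terabits per second.

    Assumes decimal units: 1 Tb = 1,000 Gb.
    """
    return ports * gbps_per_port / 1000


# The two port configurations cited by Nvidia.
config_400g = aggregate_tbps(ports=128, gbps_per_port=400)
config_800g = aggregate_tbps(ports=64, gbps_per_port=800)

print(f"128 x 400G: {config_400g} Tb/s")
print(f" 64 x 800G: {config_800g} Tb/s")
```

Both configurations come to the same total of 51.2 Tb/s, which lines up with the “51Tb/sec” figure in Nvidia’s description.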
Huang held up the silver Spectrum-4 ethernet switch chip on stage, noting that it’s “gigantic”, consisting of 100 billion transistors in a 90-millimeter by 90-millimeter package, and built with Taiwan Semiconductor Manufacturing’s “4N” process technology. The part consumes 500 watts, said Huang.
Nvidia’s chip has the potential to change the ethernet-networking market. The vast majority of switch silicon is supplied by chip maker Broadcom. Those switches are sold to networking-equipment makers Cisco Systems, Arista Networks, Extreme Networks, Juniper Networks, and others. Those companies have been expanding their equipment to better handle AI traffic.
The Spectrum-X family is built to address the bifurcation of data centers into two forms. One form is what Huang called “AI factories”: facilities costing hundreds of millions of dollars that house the most powerful GPUs, are built on Nvidia’s NVLink and Infiniband, and are used for AI training, serving a small number of very large workloads.
The other type of data center facility is the AI cloud, which is multi-tenant, based on ethernet, and handles hundreds and hundreds of customer workloads simultaneously. It focuses on tasks such as serving predictions to consumers of AI, and it is this type of facility that Spectrum-X is meant to serve.
The Spectrum-X, said Shainer, is able to “spread traffic across the network in the best way”, using “a new mechanism for congestion control”, which averts a pile-up of packets that can happen in the memory buffer of network routers.
“We use advanced telemetry to understand latencies across the network to identify hotspots before they cause anything, to keep it congestion-free.”
Nvidia said in prepared remarks that “the world’s top hyperscalers are adopting NVIDIA Spectrum-X, including industry-leading cloud innovators.”
Nvidia said it is building a test-bed computer at its Israel offices, called Israel-1, a “generative AI supercomputer” that uses Dell PowerEdge XE9680 servers equipped with H100 GPUs, running data across the Spectrum-4 switches.
All the news at Computex is available in Nvidia’s newsroom.
In addition to the announcement of its new ethernet technology, Huang’s keynote featured a new model in the company’s “DGX” series of computers for AI, the DGX GH200, which the company bills as “a new class of large-memory AI supercomputer for giant generative AI models”.
The GH200 is the first system to ship with what the company calls its “superchip”, the Grace Hopper board, which combines on a single circuit board a Hopper GPU and the Grace CPU, a CPU based on the Arm instruction set that is meant to compete with x86 CPUs from Intel and Advanced Micro Devices.
The first iteration of Grace Hopper, the GH200, is “in full production”, said Huang. Nvidia said in a press release that “global hyperscalers and supercomputing centers in Europe and the U.S. are among several customers that will have access to GH200-powered systems.”
The DGX GH200 combines 256 of the superchips, said Nvidia, to achieve a combined 1 exaflops (ten to the power of 18, or one billion billion, floating point operations per second), utilizing 144 terabytes of shared memory. The computer is 500 times as fast as the original DGX A100 machine released in 2020, according to Nvidia.
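To put the system totals in per-chip terms, the headline numbers can simply be divided by the number of superchips. This is a rough sketch: it assumes performance and memory are spread evenly across all 256 superchips and uses decimal units (1 terabyte = 1,000 gigabytes). The constant names are illustrative and not taken from Nvidia.

```python
SUPERCHIPS = 256          # Grace Hopper superchips in one DGX GH200
TOTAL_FLOPS = 1e18        # 1 exaflops, as quoted by Nvidia
TOTAL_MEMORY_TB = 144     # shared memory, in terabytes

# Assumption: an even split across superchips.
petaflops_per_chip = TOTAL_FLOPS / SUPERCHIPS / 1e15
memory_gb_per_chip = TOTAL_MEMORY_TB / SUPERCHIPS * 1000

print(f"Per superchip: {petaflops_per_chip:.2f} petaflops")
print(f"Per superchip: {memory_gb_per_chip:.1f} GB of memory")
```

Under those assumptions, each superchip accounts for a little under 4 petaflops and a bit more than half a terabyte of the shared memory pool.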
The keynote also unveiled MGX, a reference architecture that lets system makers quickly and cost-effectively build more than 100 server variations. The first partners to use the spec are ASRock Rack, ASUS, GIGABYTE, Pegatron, QCT, and Supermicro, with QCT and Supermicro to be first to market with systems in August, said Nvidia.
The entire keynote can be watched as a replay on the Nvidia website.