April 14, 2021
Nvidia debuted Grace, an Arm-based CPU for giant-scale artificial intelligence and high-performance computing applications and the company’s first data center CPU. At Nvidia’s GTC 2021 conference, chief executive Jensen Huang said Grace, which Nvidia says delivers 10 times the performance of today’s fastest servers on the most complex AI and HPC workloads while using energy-efficient Arm cores, will first be used by the Swiss National Supercomputing Centre (CSCS) and the U.S. Department of Energy’s Los Alamos National Laboratory (LANL). The CPU, named for U.S. Navy rear admiral and computer programming pioneer Grace Hopper, is slated for availability in early 2023.
VentureBeat reports that, according to Huang, Grace is “the world’s first CPU designed for terabyte scale computing … [and] is the result of more than 10,000 engineering years of work.” The Grace CPU is aimed at “natural language processing, recommender systems, and AI supercomputing,” advanced applications that “analyze enormous datasets requiring both ultra-fast compute performance and massive memory.”
In addition to the Arm CPU cores, Grace also has “an innovative low-power memory subsystem to deliver high performance with great efficiency.” Nvidia did not disclose the number of transistors in a Grace chip.
Huang noted that “coupled with the GPU and DPU, Grace gives us the third foundational technology for computing and the ability to re-architect the data center to advance AI … Nvidia is now a three-chip company.” Paresh Kharya, Nvidia’s senior director of product management and marketing, said that Nvidia is “not competing with x86” chips from Intel and AMD.
VentureBeat explains that “today’s largest AI models include billions of parameters and are doubling every two and a half months,” which means that training “requires a new CPU that can be tightly coupled with a GPU to eliminate system bottlenecks.” Moor Insights & Strategy analyst Patrick Moorhead pointed out that Grace is “a tightly integrated CPU for over a trillion parameter AI models.”
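To put the quoted doubling rate in perspective, the compounding can be worked out directly. The short Python sketch below is illustrative only: it assumes model size keeps doubling at a steady pace of once every 2.5 months, which the article does not claim will continue, and the function name `growth_factor` is a hypothetical helper, not something from Nvidia or VentureBeat.

```python
def growth_factor(months: float, doubling_period_months: float = 2.5) -> float:
    """Return how many times larger a quantity becomes after `months`,
    assuming (hypothetically) that it doubles every `doubling_period_months`."""
    return 2.0 ** (months / doubling_period_months)


if __name__ == "__main__":
    # Print the implied growth over one doubling period, one year, and two years.
    for months in (2.5, 12, 24):
        print(f"{months:>4} months: about {growth_factor(months):,.0f}x")
```

Under that assumption, a year of growth works out to 2^(12/2.5) = 2^4.8, or roughly a 28-fold increase in model size, which helps explain the emphasis on memory capacity and CPU-to-GPU bandwidth.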
Grace relies on “4th generation Nvidia NVLink interconnect technology, which provides 900 gigabyte-per-second connections between Grace and Nvidia graphics processing units (GPUs) to enable 30 times higher aggregate bandwidth compared to today’s leading servers … [and] will also utilize an innovative LPDDR5x memory subsystem that will deliver twice the bandwidth and 10 times better energy efficiency compared with DDR4 memory.”
The architecture also “provides unified cache coherence with a single memory address space, combining system and HBM GPU memory to simplify programmability.”
Tirias Research analyst Kevin Krewell noted that “the key to Grace is that using the custom ARM CPU, it will be possible to scale to large LPDDR5 DRAM arrays far larger than possible with high-bandwidth memory directly attached to the GPUs.”
Grace will be integrated into the Swiss supercomputer, dubbed Alps, which Hewlett Packard Enterprise will build and which “will feature 20 exaflops of AI processing.” Alps is expected to come online in 2023. Separately, Nvidia will “make its graphics chips available with Amazon Web Services’ Graviton2 Arm-based CPU for data centers for cloud computing.”