Hewlett Packard Enterprise (HPE) announced at ISC High Performance 2024 that it has delivered Aurora, the second exascale supercomputer in the world, to the US Department of Energy’s Argonne National Laboratory in partnership with Intel.
Because Aurora was designed from the start as an AI-capable system, scientists will be able to use generative AI models on it to accelerate scientific discovery.
Early AI-driven research projects carried out on Aurora include deep learning-enhanced high-energy particle physics, machine learning-accelerated drug discovery, and brain mapping to better understand the roughly 80 billion neurons that make up the human brain.
According to the TOP500 list of the world’s most powerful supercomputers, Aurora is now the second-fastest supercomputer in the world, having achieved 1.012 exaflops on the HPL benchmark while running on 87% of the system. Aurora is not only HPE’s second exascale system but also the largest AI-capable system in the world: it took first place on the HPL Mixed Precision (HPL-MxP) benchmark with 10.6 exaflops on 89% of the system.
“We are honored to celebrate another significant milestone in exascale with Aurora, which delivers massive compute capabilities to make breakthrough scientific discoveries and help solve the world’s toughest problems,” said Trish Damkroger, senior vice president and general manager, HPC & AI Infrastructure Solutions at HPE. “We are proud of the strong partnership with the U.S. Department of Energy, Argonne National Laboratory, and Intel to realize a system of this scale and magnitude that was made possible through our joint innovative engineering, multiple teams, and most importantly, shared value of delivering state-of-the-art technology to fuel science and benefit humankind.”
An exascale supercomputer can process one quintillion (10^18) operations per second, a level of computational capability that makes it feasible to tackle some of the most challenging problems facing humanity. Aurora is powered by the HPE Cray EX supercomputer, which was designed specifically to handle the scale and magnitude of exascale. The system is also the largest single-system deployment of HPE Slingshot, an open, Ethernet-based supercomputing interconnect.
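To put that one-quintillion-operations-per-second figure in rough perspective, the short sketch below compares it against a hypothetical laptop sustaining about one teraflop; the laptop figure is an illustrative assumption, not a number from this article.

```python
# Back-of-envelope sketch (not from the article): what exascale means in raw numbers.
EXA_OPS_PER_SEC = 1e18      # one quintillion operations per second
LAPTOP_OPS_PER_SEC = 1e12   # assumed ~1 teraflop for an ordinary laptop (illustrative assumption)

# Time an ordinary laptop would need to match a single second of exascale work.
seconds = EXA_OPS_PER_SEC / LAPTOP_OPS_PER_SEC   # 1,000,000 seconds
print(f"{seconds / 86_400:.1f} days")            # roughly 11.6 days
```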
The Slingshot fabric provides high-speed networking across Aurora’s 10,624 compute blades, 21,248 Intel® Xeon® CPU Max Series processors, and 63,744 Intel® Data Center GPU Max Series units, linking 75,000 compute node endpoints, 2,400 storage and service network endpoints, and 5,600 switches to boost performance. This makes Aurora one of the largest GPU clusters in the world.
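For context, the totals quoted above imply a straightforward per-blade layout; the minimal sketch below does that arithmetic, assuming the processors and GPUs are spread evenly across the blades (the article does not spell out the per-blade configuration).

```python
# Implied per-blade layout, derived from the totals quoted in this article.
# Assumption: CPUs and GPUs are distributed evenly across all compute blades.
blades = 10_624
cpus = 21_248    # Intel Xeon CPU Max Series processors
gpus = 63_744    # Intel Data Center GPU Max Series units

print(cpus / blades)   # 2.0 -> two CPUs per blade
print(gpus / blades)   # 6.0 -> six GPUs per blade
```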
“Aurora is a first-of-its-kind supercomputer and we expect it to be a gamechanger for researchers,” said Rick Stevens, associate laboratory director and distinguished fellow at Argonne National Laboratory. “Reaching this milestone with a second exascale system in the U.S. is an incredibly significant achievement that will advance open science initiatives globally.”
Building the breakthrough engineering needed to advance research demanded sustained co-investment and co-development, and HPE, Intel, the U.S. Department of Energy, and Argonne National Laboratory worked together to establish the public-private partnership that has resulted in the Aurora exascale supercomputer.
Research conducted through the Aurora Early Science Program demonstrates that public-private collaboration is essential to scientific advancement. As part of optimizing and stress-testing the system, researchers have already successfully run a wide variety of programming models, languages, and applications on it.
“The Aurora supercomputer was designed to support the research and science communities within the HPC and AI space,” said Ogi Brkic, Intel vice president and general manager, Data Center AI Solutions. “Our ongoing collaboration with Argonne National Laboratory and HPE has resulted in promising early science success stories. And we’re excited to see what’s to come as we continue to optimize system performance to accelerate the science and march toward what is next.”
Aurora reached exascale on a partial run that tapped 9,234 of the system’s nodes. Aurora is housed at the Argonne Leadership Computing Facility (ALCF), a user facility of the U.S. Department of Energy Office of Science, and operates as an open science system.
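As a rough consistency check, the sketch below relates the 9,234 nodes of the partial run to the 10,624 compute blades cited earlier, assuming the two counts refer to the same units; the full-system figure is a naive linear extrapolation for illustration, not a measured result.

```python
# Consistency check on the figures reported in this article.
# Assumption: the 9,234 "nodes" of the partial run correspond one-to-one to the
# 10,624 compute blades mentioned earlier.
nodes_used = 9_234
total_blades = 10_624
hpl_exaflops = 1.012   # TOP500 HPL result on the partial run

fraction = nodes_used / total_blades
print(f"{fraction:.0%}")                    # ~87%, matching the reported figure
print(f"{hpl_exaflops / fraction:.2f} EF")  # naive linear extrapolation to the full system
```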