In January, we launched our strongest-ever offerings for high-performance computing (HPC) and AI with the 4th Gen Intel® Xeon® Scalable processors, Intel® Xeon® CPU Max Series and Intel® Data Center GPU Max Series. We also introduced the Intel® Data Center GPU Flex Series last year – a flagship product for media streaming, cloud gaming and AI inference – and the Habana® Gaudi®2 deep learning processor for training.
Co-designed with leading cloud service providers, enterprise and supercomputing customers, these products showcase key technical innovations, including the integration of high-bandwidth memory with x86 CPUs and advanced chiplet architectures. Intel’s full data center and AI hardware portfolio, including our Xeon and Habana products, has been developed to help our customers solve the world’s most difficult problems and train the largest AI models.
Accelerated computing and GPUs are among the fastest-growing segments of the computing market and central to Intel’s long-term success. We are seeing great customer support and we continue to demonstrate tremendous performance improvements in real-world HPC and AI workloads on these recently deployed products.
Building on this momentum, with close customer engagement on their requirements, we are simplifying and streamlining our data center GPU roadmap. This enables our customers and the ecosystem to maximize their investments on currently available Max Series and Flex Series GPUs, while ensuring next-generation products deliver significant leaps in performance and developer productivity.
Let me share details related to customer adoption, real-world application performance improvements and roadmap updates.
Early Customer Adoption
Our early efforts to ramp Intel® Xeon® processors and Max Series and Flex Series GPUs in the data center market have been met with a positive reception from customers.
You have probably heard about Argonne National Laboratory, which will be deploying more than 60,000 Max Series GPUs and 20,000 Max Series CPUs to power the Aurora supercomputer this year. Aurora is expected to become the world’s first supercomputer with 2 exaflops of peak performance. Deployment is going well, with Intel collaborating closely on testing and development. Argonne expects the system to be accessible to early researchers by the third quarter of 2023.
Lawrence Livermore National Laboratory (LLNL) and Sandia National Laboratories are installing thousands of nodes of 4th Gen Intel Xeon processors in their CTS-2 systems – the supercomputing workhorse of the Department of Energy (DOE). LLNL’s Intel Xeon-powered predecessor, JADE, recently contributed to the breakthrough in fusion energy, helping to design the optimal target package for laser-driven fusion.
Los Alamos National Laboratory (LANL), another DOE research center, is installing more than 10,000 Max Series CPUs for its Crossroads supercomputer, which will power national security and wildfire research.
The impact of these technologies on science, engineering and industry cannot be overstated.
Performance
GPUs have seen explosive growth in the HPC and AI space, with the aggregate flops contributed by GPUs on the TOP500 list of the world’s fastest supercomputers growing at three times the pace of those from CPUs. With the Max Series GPU, Intel introduced its most sophisticated processor ever, built with the most advanced packaging and manufacturing processes and rich features such as hardware-accelerated ray tracing, RAMBO cache, deep systolic arrays for AI ... the list goes on and on.
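For readers less familiar with the systolic arrays mentioned above, the idea can be sketched in a few lines. The sketch below is a generic illustration of the weight-stationary dataflow used by many AI accelerators – not a description of Intel’s hardware – and the function name and loop structure are illustrative only.

```python
# Generic sketch of a weight-stationary systolic dataflow for C = A @ B.
# Conceptually, each processing element (PE) at grid position (k, j) holds
# one weight B[k][j]; activations are reused across a row of PEs while
# partial sums accumulate down a column, giving one multiply-accumulate
# per PE per step. Cycle-level timing and input skew are omitted here.

def systolic_matmul(A, B):
    """Multiply A (n x m) by B (m x p) using the systolic loop order."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):          # stream each activation row through the array
        for k in range(m):      # activation A[i][k] visits PE row k ...
            a = A[i][k]
            for j in range(p):  # ... and meets the stationary weights B[k][*]
                C[i][j] += a * B[k][j]  # one multiply-accumulate per PE
    return C

# Example: a 2x2 multiply
print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# → [[19, 22], [43, 50]]
```

The appeal of this dataflow in hardware is that weights stay put while data flows, minimizing memory traffic per operation.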
But how does it perform? At this week’s Intel Extreme Performance User Group (IXPUG) meeting, Tim Williams, deputy director of Argonne's Computational Science Division, presented performance data for real-world applications on production Max Series GPUs. For materials science, nuclear engineering, cosmology and plasma physics codes, researchers measured 30% to 260% speedups over leading alternative GPUs.
The Flex Series GPU is also showing leadership in media stream density and visual quality, and initial units are now shipping to cloud service providers and multinational companies, enabling large-scale cloud gaming and media delivery deployments.
These early results give us tremendous confidence that our investments are already paying dividends for our customers and the developer ecosystem – and that our GPU products have the capabilities and scalability needed to help solve the world’s most challenging problems today and tomorrow.
Roadmap
With a goal of maximizing return on investment for customers, we will move to a two-year cadence for data center GPUs. This matches customer expectations for new product introductions and allows time for the ecosystem to mature around each generation.
Building on the momentum of the Max Series GPU, our next product in the Max Series family will be the GPU architecture code-named Falcon Shores. Targeted for introduction in 2025, Falcon Shores’ flexible chiplet-based architecture will address the exponential growth of computing needs for HPC and AI. We are working on variants for this architecture supporting AI, HPC and the convergence of these markets. This foundational architecture will have the flexibility to integrate new IP (including CPU cores and other chiplets) from Intel and customers over time, manufactured using our IDM 2.0 model. Rialto Bridge, which was intended to provide incremental improvements over our current architecture, will be discontinued.
The Flex Series product family will also move to a two-year cadence. We will discontinue the development of Lancaster Sound, which was intended to be an incremental improvement over our current generation. This allows us to accelerate development on Melville Sound, which will be a significant architectural leap from the current generation in terms of performance, features and the workloads it will enable.
In addition to streamlining our roadmap, we are increasing our focus on the software ecosystem. We will provide continuous updates for our Max Series and Flex Series products, with performance improvements, new features, expanded operating system support and new use cases to broaden the benefits of these products.
Accelerating Our Customers’ Work
Our accelerated computing products are in the market and ramping. The oneAPI open software ecosystem is maturing by the day. We have simplified our roadmap with the goal of doing fewer things better and are rapidly rolling out products to our customers. Stay tuned for frequent updates on deployments, workloads and performance. I look forward to sharing more at upcoming events and hope to see you at the International Supercomputing Conference (ISC) in May.
Jeff McVeigh is corporate vice president and interim general manager of the Accelerated Computing Systems and Graphics Group at Intel Corporation.