Running HPC Workloads in the Cloud
Cloud HPC has come a long way in both performance and availability. Organizations that need to run HPC workloads can use cloud services for their most complex and demanding compute- and storage-intensive requirements. HPC in the cloud can also improve time to results, so researchers spend their time researching rather than waiting in a job queue. Many of the Intel® technologies that enable on-premises HPC deployments are also available in the cloud. Intel works closely with cloud service providers (CSPs), using key toolkits and hardware-based security and acceleration to optimize HPC workloads on Intel® architecture.
Cloud HPC Architecture Considerations
In the cloud, customers pay by the hour and prioritize their budget based on time to results. Architecture plays a key role in delivering the performance necessary to crunch HPC workloads within desired time frames, ultimately contributing to the customer’s bottom line. Intel leads the way in working with CSPs to deploy performant architecture where it’s most impactful for HPC cloud instances. Key frameworks and technologies include the following:
- Intel® Xeon® Scalable processors are the beating heart of HPC cloud servers, delivering the performance and memory capacity for the most compute-intensive workloads. The Intel® Xeon® Scalable processor platform also supports several key technologies, enumerated below, that enable specific HPC use cases including AI convergence. 3rd Generation Intel® Xeon® Scalable processors, soon to be deployed in major CSP HPC cloud offerings, will deliver more memory bandwidth¹ and higher instructions per clock² than the previous-generation processors. Intel® Speed Select Technology on Intel® Xeon® Scalable processors allows for multiple configurations in one server to meet the needs of diverse workloads. These enhancements can help customers achieve better cost per performance and time to results.
- Intel® Software Guard Extensions (Intel® SGX) on select Intel® Xeon® Scalable processors helps secure HPC workloads running in the cloud. While many security technologies focus on protecting data at rest, Intel® SGX helps protect data during the critical moment when it is being processed. In multitenant HPC cloud environments, Intel® SGX also helps protect containers and VMs by using memory enclaves to isolate cryptographic keys and data while in use.
- Intel® Advanced Vector Extensions 512 (Intel® AVX-512) is a set of processor instructions that boost vector-intensive compute workload performance. Intel® AVX-512 is especially well suited to vector/matrix operations on large data sets and is a competitive differentiator for Intel® Xeon® Scalable processors. Researchers and data scientists can use Intel® AVX-512 to help increase the performance for AI/DL workloads, DNA sequencing, simulations, financial analytics, and 3D modeling.
- Intel® Deep Learning Boost (Intel® DL Boost) in Intel® Xeon® Scalable processors includes a new set of Vector Neural Network Instructions (VNNI) that extends Intel® AVX-512. VNNI can help reduce the number and complexity of convolutional operations required for AI inference, resulting in lower power and memory requirements for HPC cloud systems.³ Intel® DL Boost can accelerate convolutional neural network loops and increase AI operations in HPC cloud instances, with 3.4x greater performance.³
- Intel® oneAPI is a unified programming model specifically designed for heterogeneous HPC infrastructure. This model includes key performance libraries such as the Intel® Distribution for Python and Intel® MKL, which help optimize and accelerate HPC workloads on Intel architecture. Intel® MPI is a differentiated offering available in many CSP marketplaces that enables developers to easily deploy complex applications to multiple clusters, optimize code for high performance, and use automatic tuning to achieve low latency and high bandwidth. Customers and CSPs use these frameworks to help ensure they achieve the most impact from their HPC investments.
- Intel® HPC Platform Specification is a set of minimum requirements for compute, memory, storage, and fabric, as well as compatible applications for HPC infrastructure. Customers and businesses can rely on this specification for assurance that HPC cloud service provider offerings will satisfy a high standard of quality for their HPC workloads.
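Because instance types differ in which of these processor features they expose, it can be useful to confirm what a given cloud instance actually reports before tuning a workload for it. The following is a minimal sketch, assuming a Linux guest where `/proc/cpuinfo` lists CPU flags; the function names and the feature-to-flag mapping are illustrative choices made for this example, not an Intel or CSP tool. A reported flag indicates only that the guest sees the feature, not that it is configured for use (Intel® SGX in particular typically needs additional platform and driver setup).

```python
from pathlib import Path

# Linux /proc/cpuinfo flag names for the features discussed above.
# The mapping itself is an illustrative assumption for this sketch.
FEATURE_FLAGS = {
    "Intel AVX-512 (foundation)": "avx512f",
    "Intel DL Boost (VNNI)": "avx512_vnni",
    "Intel SGX": "sgx",
}


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the CPU flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (for example, a non-x86 system)


def detect_features(cpuinfo_text: str) -> dict[str, bool]:
    """Map each feature name to whether its flag is reported."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in FEATURE_FLAGS.items()}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        for name, present in detect_features(cpuinfo.read_text()).items():
            print(f"{name}: {'reported' if present else 'not reported'}")
    else:
        print("/proc/cpuinfo not found; this sketch assumes a Linux guest.")
```

Running the script on a candidate instance type gives a quick check before committing to a build that targets Intel® AVX-512 or VNNI.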
Intel-Based HPC Cloud Service Providers
Intel works closely with leading HPC cloud service providers, including AWS, Google Cloud Platform, Microsoft Azure, and Oracle. Each CSP offers its own cloud instances with a strong foundation of Intel® Xeon® Scalable processors, which are optimized for Intel® MPI and offer built-in Intel® DL Boost. Additionally, each CSP has its own marketplace of Intel and third-party solutions to help businesses get started quickly on Intel-based HPC instances.
- Intel-based Amazon Web Services instances use Intel® Xeon® Scalable processors and offer multiple configuration options to help match capacity to HPC requirements. AWS ParallelCluster is another service offering that helps customers orchestrate multiple AWS clusters into a consolidated HPC cloud solution. Intel has also achieved the AWS HPC competency status, a designation that showcases Intel’s deep level of expertise in HPC cloud solutions with AWS.
Learn more about Intel’s partnership with AWS ›
Read the AWS electronic design automation case study ›
Learn more about Intel-enabled Amazon EC2 instances ›
Video: Improving HPC simulation efficiency in the Cloud ›
- Google Cloud Platform N2 and C2 machine types use Intel® Xeon® Scalable processors and Intel® AVX-512 to support intensive HPC workloads in the cloud. N2 instances use Intel® DL Boost and deliver 2.82x higher AI inference performance vs. N1 instances.⁴ In 2021, Google Cloud announced pre-tuned HPC VM images for use on its clusters, featuring the Intel® MPI library as a key optimization.
Learn more about Intel’s partnership with Google Cloud ›
Getting started with genomics analytics on Google Cloud Platform ›
Read about how Intel empowers Google Cloud ›
- Microsoft Azure HC-Series Virtual Machines feature up to 44 Intel® Xeon® Scalable processor cores and capabilities such as Intel® AVX-512 and Intel® MKL. Azure also uses Intel® Arria® 10 FPGAs to accelerate AI and machine learning model training for HPC workloads. Microsoft recently launched the Azure HPC and AI Collaboration Center, with Intel as a key partner, to help disseminate HPC and AI best practices.
Blog: Evaluating Genomics Pipelines on Azure: Intel-based Virtual Machines ›
Video: Microsoft Azure HPC announces the new FX service virtual machines designed for EDA workloads ›
Learn more about Intel and Microsoft Azure partnership ›
Read the Intel and Microsoft Azure HPC guide ›
Read about the business advantages of Azure ›
- Intel-enabled HPC cloud services from Oracle deliver performance that rivals on-premises solutions, with the added benefit of cloud economics and on-demand resources. 3rd Gen Intel® Xeon® Scalable processors in Oracle X9 Generation instances drive a 42 percent performance increase compared to existing X7 Generation instances.⁵
Read the Oracle X9 press release ›
Read the Nissan engineering simulation case study ›
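Because cloud customers pay by the hour, a quoted performance increase affects cost only through runtime and hourly price. The sketch below works through that arithmetic under stated assumptions: a 42 percent performance increase is read as a 1.42x speedup on the customer's own job (which may not hold for every workload), and the baseline runtime and hourly prices are hypothetical placeholders, not any provider's actual rates.

```python
def job_runtime_hours(baseline_hours: float, speedup: float) -> float:
    """Runtime after applying a speedup factor (1.42 means 42 percent faster)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_hours / speedup


def job_cost(runtime_hours: float, hourly_price: float, instances: int = 1) -> float:
    """Cost of one job when billing is per instance-hour."""
    return runtime_hours * hourly_price * instances


def break_even_hourly_price(old_hourly_price: float, speedup: float) -> float:
    """Highest new hourly price at which the faster instance costs no more per job."""
    return old_hourly_price * speedup


if __name__ == "__main__":
    # All numbers below are hypothetical and chosen only for illustration.
    baseline_hours = 10.0   # runtime on the older instance generation
    old_price = 3.00        # assumed hourly price, older generation
    new_price = 3.50        # assumed hourly price, newer generation
    speedup = 1.42          # 42 percent performance increase, read as 1.42x

    new_hours = job_runtime_hours(baseline_hours, speedup)
    print(f"Old: {baseline_hours:.2f} h, cost {job_cost(baseline_hours, old_price):.2f}")
    print(f"New: {new_hours:.2f} h, cost {job_cost(new_hours, new_price):.2f}")
    print(f"Break-even hourly price: {break_even_hourly_price(old_price, speedup):.2f}")
```

The break-even price shows the design point: as long as the newer instance's hourly price stays below the old price multiplied by the speedup, the job finishes sooner and costs less.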
Simplifying Intel-Based CSP Onboarding
As businesses consider using Intel-enabled CSPs, choosing the right HPC cloud offerings can be complex and daunting. Fortunately, third-party cloud service integration partners can help businesses choose the best-fit offerings and streamline the onboarding process. These partners are typically smaller organizations that assist with setting up workloads, enabling nonstandard features, offering unique insights via rich UI dashboards, or even discovering ways to replicate workload processing in a manner consistent with on-premises usage models. Key examples of these types of technology partners include Rescale, RONIN, Six Nines, and OnScale. HPC in the cloud offers immense choice and variety in service offerings, but that same variety can make it harder to select the right solution. Partners help by offering guidance, expertise, and specialization.
HPC Cloud Case Studies
These use cases show how HPC workloads in the cloud help deliver necessary compute resources combined with cloud flexibility and agility for answering questions and solving big problems:
- OnScale makes digital prototyping accessible ›
- University of Victoria builds next gen cloud infrastructure ›
- MicroSeismic deploys 3D visualization in the AWS cloud ›
- ClimaCell delivers innovative weather prediction ›
- Ansys uses Azure virtual machines to fully simulate radio frequency integrated circuits ›
- Learn how to deploy the Genome Analysis Toolkit (GATK) on Intel-enabled cloud infrastructure ›
Leading the Way for HPC Cloud
Many IT decision-makers are aware of the role that Intel fills in providing expertise for designing on-premises HPC architecture. Intel can fill the same role as a trusted adviser for HPC in the cloud. Any organization looking for a point of entry can start with Intel and benefit from a global ecosystem of CSPs and technology partners.