Run Your HPC Applications Faster
Intel’s investment in independent software vendors (ISVs) means commercial HPC applications are engineered to run faster on Intel® architecture.
Why Intel?
Pervasive HPC Portfolio
Intel® solutions for HPC go beyond the processor to include memory, storage, networking, and software. Many engineers and developers are already trained to build for Intel-enabled systems, which leads to fast deployments with low risk and low total cost of ownership (TCO).
Powerful Developer Tools
Intel offers toolkits such as Intel® oneAPI, Intel® MKL, Intel® MPI, and Intel® compilers that are optimized for performance on Intel® architecture.
Open Source Leadership
Intel engages with open source communities for software including GROMACS, LAMMPS, NAMD, WRF, RELION, and OpenFOAM. Intel is a top contributor to Linux and to open source frameworks, and helps make solutions more accessible to the global community.
Performance Through Partnership
As an industry leader in HPC, Intel engages with software vendors like Ansys, Altair, Quantifi, and Dassault, while providing marketing and engineering expertise and resources. These relationships help ensure that popular HPC solutions take full advantage of Intel® hardware and software capabilities so end customers enjoy exceptional power, performance, and price.
HPC Focus Segments
Software vendor HPC applications take advantage of Intel® Xeon® Scalable processors with proven leadership performance¹ and the highest available memory bandwidth of any HPC CPU.² These improvements, combined with revolutionary memory and storage capabilities in Intel® Optane™ technology, help support compute- and storage-intensive workloads for advanced HPC use cases across multiple segments.
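To illustrate why channel count matters for memory-bound workloads, theoretical peak bandwidth per socket can be estimated as channels × transfer rate × 8 bytes per transfer, since a DDR4 channel is 64 bits wide. The sketch below is a back-of-the-envelope calculation, not a measured result. The function name is hypothetical, and DDR4-3200 (the memory speed in the test configurations listed under Product and Performance Information) is applied to both channel counts purely to isolate the effect of channel count; supported memory speeds vary by processor.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth for one socket, in GB/s.

    channels: number of DDR4 memory channels
    mega_transfers_per_s: transfer rate in MT/s (e.g. 3200 for DDR4-3200)
    bus_bytes: bytes moved per transfer on one channel (64-bit bus = 8 bytes)
    """
    return channels * mega_transfers_per_s * bus_bytes / 1000.0


if __name__ == "__main__":
    # Assumption: the same DDR4-3200 rate for both channel counts,
    # so the only variable is the number of channels.
    for channels in (8, 12):
        gbs = peak_bandwidth_gbs(channels, 3200)
        print(f"{channels} channels @ DDR4-3200: {gbs:.1f} GB/s")
```

Under these assumptions, 12 channels yield 307.2 GB/s against 204.8 GB/s for 8 channels, a 1.5x difference. Sustained bandwidth on real workloads is lower than the theoretical peak.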
HPC for Manufacturing
Intel-enabled HPC applications allow for fast rendering and simulation in computer-aided design (CAD), computational fluid dynamics (CFD), finite element analysis (FEA), and other fields. These optimizations help product designers iterate quickly, create better products, accelerate prototyping, and get to market faster for a competitive advantage.
Health and Life Sciences
With enhanced performance to accelerate genomic analysis, sequencing, and medical image segmentation, paired with key storage and memory solutions that handle data sets of increasing size, Intel-enabled HPC applications are pushing the boundaries for healthcare research and discovery.
- Accelerating inference on AMAX deep learning all-in-one systems
- Intel-enabled HPC accelerates single-cell RNA sequencing
- South Africa CHPC mobilizes Intel HPC clusters to fight COVID-19
- HPC improved efficiency for HYHY AI medical imaging
- Intel, TGen, and Dell enable next-generation genomic sequencing
Financial Services
Intel-enabled HPC applications are making an impact in the financial services sector by ramping up capabilities for risk assessment, fraud detection, and commodities trading. Financial firms gain a competitive advantage by handling more data and supporting faster transactions.
HPC-AI Convergence
Designed for convergence and optimized with all major AI frameworks, Intel® Xeon® Scalable processors are the only mainstream HPC CPUs with AI acceleration technology built in. Intel® AVX-512 accelerates AI workloads by reducing computational requirements when models move from FP32 to INT8 data types. Intel® Deep Learning Boost (Intel® DL Boost) delivers up to 11x higher batch AI inference performance on ResNet-50 than the previous generation running a stock FP32 configuration.³
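The idea behind moving from FP32 to INT8 can be sketched in a few lines. The example below shows generic symmetric quantization in plain Python: floating-point values are mapped to 8-bit integers with a single scale factor, the dot product is accumulated in integers, and the result is rescaled once at the end. It is an illustration of the concept only, not how Intel® DL Boost or any framework implements it; all function names are hypothetical. In practice, INT8 quantization is most commonly applied to inference.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale factor for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from quantized integers."""
    return [q * scale for q in quantized]


def int8_dot(a_q, a_scale, b_q, b_scale):
    """Dot product accumulated in integers, rescaled to float once at the end."""
    acc = sum(x * y for x, y in zip(a_q, b_q))
    return acc * a_scale * b_scale


if __name__ == "__main__":
    # Hypothetical example data, chosen only for illustration.
    weights = [0.5, -1.0, 0.25]
    activations = [2.0, 1.0, -4.0]

    w_q, w_scale = quantize_int8(weights)
    a_q, a_scale = quantize_int8(activations)

    fp32_result = sum(w * a for w, a in zip(weights, activations))
    int8_result = int8_dot(w_q, w_scale, a_q, a_scale)
    print("FP32 dot product:", fp32_result)
    print("INT8 dot product (rescaled):", int8_result)
```

The integer result approximates the floating-point one, with a small error introduced by rounding. Each 8-bit value takes a quarter of the storage of a 32-bit float, which is where the reduction in memory traffic and compute comes from.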
Middleware and Tools
Key software vendor offerings also include middleware and tools that make it easier to manage Intel-enabled HPC clusters, onboard data, accelerate results, or provide a rich UI for ease of use. Intel partnerships with middleware vendors help ensure that end users benefit from high levels of optimization on Intel® architecture.
HPC ISV Applications in the Cloud
ISV partner applications optimized on Intel® architecture deliver the same level of performance whether they run on-premises or in the cloud. Many software vendors host applications in the cloud, pair applications with specific cloud instances, or assist with onboarding to major cloud service providers (CSPs).
Continue Your HPC ISV Journey
Explore additional resources related to HPC applications.
Products & Technology
Optimize performance and accelerate key workloads for HPC, AI, and cloud convergence.
Product and Performance Information
1. HPCG: Platinum 8358: 1-node, 2x Intel® Xeon® Platinum 8358 (32C/2.6GHz, 250W TDP) processor on Intel Software Development Platform with 256 GB (16 slots/ 16GB/ 3200) total DDR4 memory, ucode 0x261, HT on, Turbo on, CentOS Linux 8.3.2011, 4.18.0-240.1.1.el8_3.crt1.x86_64, 1x Intel_SSDSC2KG96, App Version: 2019u5 MKL; Build notes: Tools: Intel MKL 2020u4, Intel C Compiler 2020u4, Intel MPI 2019u8; threads/core: 1; Turbo: used; Build knobs: -O3 -ip -xCORE-AVX512. EPYC 7543: 1-node, 2-socket AMD EPYC 7543 (32C/2.8GHz, 240W cTDP) on Dell PowerEdge R7525 server with 1024 GB (16 slots/ 64GB/3200) total DDR4 memory, ucode 0xa001119, SMT on, Boost on, Power deterministic mode, NPS=4, Red Hat Enterprise Linux 8.3, 4.18, 2x Micron 5300 Pro, App Version: 2019u5 MKL; Build notes: Tools: Intel MKL 2020u4, Intel C Compiler 2020u4, Intel MPI 2019u8; threads/core: 1; Turbo: used; Build knobs: -O3 -ip -march=core-avx2, tested by Intel and results as of April 2021.
HPL: Platinum 8358: 1-node, 2x Intel® Xeon® Platinum 8358 (32C/2.6GHz, 250W TDP) processor on Intel Software Development Platform with 256 GB (16 slots/ 16GB/ 3200) total DDR4 memory, ucode 0x261, HT on, Turbo on, CentOS Linux 8.3.2011, 4.18.0-240.1.1.el8_3.crt1.x86_64, 1x Intel_SSDSC2KG96, App Version: The Intel Distribution for LINPACK Benchmark; Build notes: Tools: Intel MPI 2019u7; threads/core: 1; Turbo: used; Build: build script from Intel Distribution for LINPACK package; 1 rank per NUMA node: 1 rank per socket. EPYC 7543: 1-node, 2-socket AMD EPYC 7543 (32C/2.8GHz, 240W cTDP) on Dell PowerEdge R7525 server with 1024 GB (16 slots/ 64GB/3200) total DDR4 memory, ucode 0xa001119, SMT on, Boost on, Power deterministic mode, NPS=4, Red Hat Enterprise Linux 8.3, 4.18, 2x Micron 5300 Pro, App Version: AMD official HPL 2.3 MT version with BLIS 2.1; Build notes: Tools: hpc-x 2.7.0; threads/core: 1; Turbo: used; Build: pre-built binary (gcc built) from https://developer.amd.com/amd-aocl/blas-library/; 1 rank per L3 cache, 4 threads per rank, tested by Intel and results as of April 2021.
STREAM Triad: Platinum 8358: 1-node, 2x Intel® Xeon® Platinum 8358 (32C/2.6GHz, 250W TDP) processor on Intel Software Development Platform with 256 GB (16 slots/ 16GB/ 3200) total DDR4 memory, ucode 0x261, HT on, Turbo on, CentOS Linux 8.3.2011, 4.18.0-240.1.1.el8_3.crt1.x86_64, 1x Intel_SSDSC2KG96, App Version: McCalpin_STREAM_OMP-version; Build notes: Tools: Intel C Compiler 2019u5; threads/core: 1; Turbo: used; BIOS settings: HT=on Turbo=On SNC=On. EPYC 7543: 1-node, 2-socket AMD EPYC 7543 (32C/2.8GHz, 240W cTDP) on Dell PowerEdge R7525 server with 1024 GB (16 slots/ 64GB/3200) total DDR4 memory, ucode 0xa001119, SMT on, Boost on, Power deterministic mode, NPS=4, Red Hat Enterprise Linux 8.3, 4.18, 2x Micron 5300 Pro, App Version: McCalpin_STREAM_OMP-version; Build notes: Tools: Intel C Compiler 2019u5; threads/core: 1; Turbo: used; BIOS settings: HT=on Turbo=On SNC=On, tested by Intel and results as of April 2021.
WRF Geomean of Conus-12km, Conus-2.5km, NWSC-3 NA-3km: Platinum 8358: 1-node, 2x Intel® Xeon® Platinum 8358 (32C/2.6GHz, 250W TDP) processor on Intel Software Development Platform with 256 GB (16 slots/ 16GB/ 3200) total DDR4 memory, ucode 0x261, HT on, Turbo on, CentOS Linux 8.3.2011, 4.18.0-240.1.1.el8_3.crt1.x86_64, 1x Intel_SSDSC2KG96, App Version: 4.2.2; Build notes: Intel Fortran Compiler 2020u4, Intel MPI 2020u4; threads/core: 1; Turbo: used; Build knobs: -ip -w -O3 -xCORE-AVX2 -vec-threshold0 -ftz -align array64byte -qno-opt-dynamic-align -fno-alias $(FORMAT_FREE) $(BYTESWAPIO) -fp-model fast=2 -fimf-use-svml=true -inline-max-size=12000 -inline-max-total-size=30000. EPYC 7543: 1-node, 2-socket AMD EPYC 7543 (32C/2.8GHz, 240W cTDP) on Dell PowerEdge R7525 server with 1024 GB (16 slots/ 64GB/3200) total DDR4 memory, ucode 0xa001119, SMT on, Boost on, Power deterministic mode, NPS=4, Red Hat Enterprise Linux 8.3, 4.18, 2x Micron 5300 Pro, App Version: 4.2.2; Build notes: Intel Fortran Compiler 2020u4, Intel MPI 2020u4; threads/core: 1; Turbo: used; Build knobs: -ip -w -O3 -march=core-avx2 -ftz -align all -fno-alias $(FORMAT_FREE) $(BYTESWAPIO) -fp-model fast=2 -inline-max-size=12000 -inline-max-total-size=30000, tested by Intel and results as of April 2021.
Binomial Options: Platinum 8358: 1-node, 2x Intel® Xeon® Platinum 8358 (32C/2.6GHz, 250W TDP) processor on Intel Software Development Platform with 256 GB (16 slots/ 16GB/ 3200) total DDR4 memory, ucode 0x261, HT on, Turbo on, CentOS Linux 8.3.2011, 4.18.0-240.1.1.el8_3.crt1.x86_64, 1x Intel_SSDSC2KG96, App Version: v1.0; Build notes: Tools: Intel C Compiler 2020u4, Intel Threading Building Blocks; threads/core: 2; Turbo: used; Build knobs: -O3 -xCORE-AVX512 -qopt-zmm-usage=high -fimf-domain-exclusion=31 -fimf-accuracy-bits=11 -no-prec-div -no-prec-sqrt. EPYC 7543: 1-node, 2-socket AMD EPYC 7543 (32C/2.8GHz, 240W cTDP) on Dell PowerEdge R7525 server with 1024 GB (16 slots/ 64GB/3200) total DDR4 memory, ucode 0xa001119, SMT on, Boost on, Power deterministic mode, NPS=4, Red Hat Enterprise Linux 8.3, 4.18, 2x Micron 5300 Pro, App Version: v1.0; Build notes: Tools: Intel C Compiler 2020u4, Intel Threading Building Blocks; threads/core: 2; Turbo: used; Build knobs: -O3 -march=core-avx2 -fimf-domain-exclusion=31 -fimf-accuracy-bits=11 -no-prec-div -no-prec-sqrt, tested by Intel and results as of April 2021.
Monte Carlo: Platinum 8358: 1-node, 2x Intel® Xeon® Platinum 8358 (32C/2.6GHz, 250W TDP) processor on Intel Software Development Platform with 256 GB (16 slots/ 16GB/ 3200) total DDR4 memory, ucode 0x261, HT on, Turbo on, CentOS Linux 8.3.2011, 4.18.0-240.1.1.el8_3.crt1.x86_64, 1x Intel_SSDSC2KG96, App Version: v1.1; Build notes: Tools: Intel MKL 2020u4, Intel C Compiler 2020u4, Intel Threading Building Blocks 2020u4; threads/core: 1; Turbo: used; Build knobs: -O3 -xCORE-AVX512 -qopt-zmm-usage=high -fimf-precision=low -fimf-domain-exclusion=31 -no-prec-div -no-prec-sqrt. EPYC 7543: 1-node, 2-socket AMD EPYC 7543 (32C/2.8GHz, 240W cTDP) on Dell PowerEdge R7525 server with 1024 GB (16 slots/ 64GB/3200) total DDR4 memory, ucode 0xa001119, SMT on, Boost on, Power deterministic mode, NPS=4, Red Hat Enterprise Linux 8.3, 4.18, 2x Micron 5300 Pro, App Version: v1.1; Build notes: Tools: Intel MKL 2020u4, Intel C Compiler 2020u4, Intel Threading Building Blocks 2020u4; threads/core: 2; Turbo: used; Build knobs: -O3 -march=core-avx2 -fimf-precision=low -fimf-domain-exclusion=31 -no-prec-div -no-prec-sqrt, tested by Intel and results as of April 2021.
Ansys Fluent Geomean of aircraft_wing_14m, aircraft_wing_2m, combustor_12m, combustor_16m, combustor_71m, exhaust_system_33m, fluidized_bed_2m, ice_2m, landing_gear_15m, oil_rig_7m, pump_2m, rotor_3m, sedan_4m: Platinum 8358: 1-node, 2x Intel® Xeon® Platinum 8358 (32C/2.6GHz, 250W TDP) processor on Intel Software Development Platform with 256 GB (16 slots/ 16GB/ 3200) total DDR4 memory, ucode 0x261, HT on, Turbo on, CentOS Linux 8.3.2011, 4.18.0-240.1.1.el8_3.crt1.x86_64, 1x Intel_SSDSC2KG96, App Version: 2021 R1; Build notes: One thread per core; Multi-threading Enabled; Turbo Boost Enabled; Intel FORTRAN Compiler 19.5.0; Intel C/C++ Compiler 19.5.0; Intel Math Kernel Library 2020.0.0; Intel MPI Library 2019 Update 8. EPYC 7543: 1-node, 2-socket AMD EPYC 7543 (32C/2.8GHz, 240W cTDP) on Dell PowerEdge R7525 server with 1024 GB (16 slots/ 64GB/3200) total DDR4 memory, ucode 0xa001119, SMT on, Boost on, Power deterministic mode, NPS=4, Red Hat Enterprise Linux 8.3, 4.18, 2x Micron 5300 Pro, App Version: 2021 R1; Build notes: One thread per core; Multi-threading Enabled; Turbo Boost Enabled; Intel FORTRAN Compiler 19.5.0; Intel C/C++ Compiler 19.5.0; Intel Math Kernel Library 2020.0.0; Intel MPI Library 2019 Update 8, tested by Intel and results as of April 2021.
Ansys LS-DYNA Geomean of car2car-120ms, ODB_10M-30ms: Platinum 8358: 1-node, 2x Intel® Xeon® Platinum 8358 (32C/2.6GHz, 250W TDP) processor on Intel Software Development Platform with 256 GB (16 slots/ 16GB/ 3200) total DDR4 memory, ucode 0x261, HT on, Turbo on, CentOS Linux 8.3.2011, 4.18.0-240.1.1.el8_3.crt1.x86_64, 1x Intel_SSDSC2KG96, App Version: R11; Build notes: Tools: Intel Compiler 2019u5 (AVX512), Intel MPI 2019u9; threads/core: 1; Turbo: used. EPYC 7543: 1-node, 2-socket AMD EPYC 7543 (32C/2.8GHz, 240W cTDP) on Dell PowerEdge R7525 server with 1024 GB (16 slots/ 64GB/3200) total DDR4 memory, ucode 0xa001119, SMT on, Boost on, Power deterministic mode, NPS=4, Red Hat Enterprise Linux 8.3, 4.18, 2x Micron 5300 Pro, App Version: R11; Build notes: Tools: Intel Compiler 2019u5 (AMDAVX2), Intel MPI 2019u9; threads/core: 1; Turbo: used, tested by Intel and results as of April 2021.
OpenFOAM 42M_cell_motorbike: Platinum 8358: 1-node, 2x Intel® Xeon® Platinum 8358 (32C/2.6GHz, 250W TDP) processor on Intel Software Development Platform with 256 GB (16 slots/ 16GB/ 3200) total DDR4 memory, ucode 0x261, HT on, Turbo on, CentOS Linux 8.3.2011, 4.18.0-240.1.1.el8_3.crt1.x86_64, 1x Intel_SSDSC2KG96, App Version: v8; Build notes: Tools: Intel FORTRAN Compiler 2020u4, Intel C Compiler 2020u4, Intel MPI 2019u8; threads/core: 1; Turbo: used; Build knobs: -O3 -ip -xCORE-AVX512. EPYC 7543: 1-node, 2-socket AMD EPYC 7543 (32C/2.8GHz, 240W cTDP) on Dell PowerEdge R7525 server with 1024 GB (16 slots/ 64GB/3200) total DDR4 memory, ucode 0xa001119, SMT on, Boost on, Power deterministic mode, NPS=4, Red Hat Enterprise Linux 8.3, 4.18, 2x Micron 5300 Pro, App Version: v8; Build notes: Tools: Intel FORTRAN Compiler 2020u4, Intel C Compiler 2020u4, Intel MPI 2019u8; threads/core: 1; Turbo: used; Build knobs: -O3 -ip -march=core-avx2, tested by Intel and results as of April 2021.
LAMMPS Geomean of Polyethylene, Stillinger-Weber, Tersoff, Water: Platinum 8358: 1-node, 2x Intel® Xeon® Platinum 8358 (32C/2.6GHz, 250W TDP) processor on Intel Software Development Platform with 256 GB (16 slots/ 16GB/ 3200) total DDR4 memory, ucode 0x261, HT on, Turbo on, CentOS Linux 8.3.2011, 4.18.0-240.1.1.el8_3.crt1.x86_64, 1x Intel_SSDSC2KG96, App Version: v2020-10-29; Build notes: Tools: Intel MKL 2020u4, Intel C Compiler 2020u4, Intel Threading Building Blocks 2020u4, Intel MPI 2019u8; threads/core: 2; Turbo: used; Build knobs: -O3 -ip -xCORE-AVX512 -qopt-zmm-usage=high. EPYC 7543: 1-node, 2-socket AMD EPYC 7543 (32C/2.8GHz, 240W cTDP) on Dell PowerEdge R7525 server with 1024 GB (16 slots/ 64GB/3200) total DDR4 memory, ucode 0xa001119, SMT on, Boost on, Power deterministic mode, NPS=4, Red Hat Enterprise Linux 8.3, 4.18, 2x Micron 5300 Pro, App Version: v2020-10-29; Build notes: Tools: Intel MKL 2020u4, Intel C Compiler 2020u4, Intel Threading Building Blocks 2020u4, Intel MPI 2019u8; threads/core: 2; Turbo: used; Build knobs: -O3 -ip -march=core-avx2, tested by Intel and results as of April 2021.
NAMD Geomean of Apoa1, STMV: Platinum 8358: 1-node, 2x Intel® Xeon® Platinum 8358 (32C/2.6GHz, 250W TDP) processor on Intel Software Development Platform with 256 GB (16 slots/ 16GB/ 3200) total DDR4 memory, ucode 0x261, HT on, Turbo on, CentOS Linux 8.3.2011, 4.18.0-240.1.1.el8_3.crt1.x86_64, 1x Intel_SSDSC2KG96, App Version: 2.15-Alpha1 (includes AVX tiles algorithm); Build notes: Tools: Intel MKL, Intel C Compiler 2020u4, Intel MPI 2019u8, Intel Threading Building Blocks 2020u4; threads/core: 2; Turbo: used; Build knobs: -ip -fp-model fast=2 -no-prec-div -qoverride-limits -qopenmp-simd -O3 -xCORE-AVX512 -qopt-zmm-usage=high. EPYC 7543: 1-node, 2-socket AMD EPYC 7543 (32C/2.8GHz, 240W cTDP) on Dell PowerEdge R7525 server with 1024 GB (16 slots/ 64GB/3200) total DDR4 memory, ucode 0xa001119, SMT on, Boost on, Power deterministic mode, NPS=4, Red Hat Enterprise Linux 8.3, 4.18, 2x Micron 5300 Pro, App Version: 2.15-Alpha1 (includes AVX tiles algorithm); Build notes: Tools: Intel MKL, AOCC 2.2.0, gcc 9.3.0, Intel MPI 2019u8; threads/core: 2; Turbo: used; Build knobs: -O3 -fomit-frame-pointer -march=znver1 -ffast-math, tested by Intel and results as of April 2021.
RELION Plasmodium Ribosome: Platinum 8358: 1-node, 2x Intel® Xeon® Platinum 8358 (32C/2.6GHz, 250W TDP) processor on Intel Software Development Platform with 256 GB (16 slots/ 16GB/ 3200) total DDR4 memory, ucode 0x261, HT on, Turbo on, CentOS Linux 8.3.2011, 4.18.0-240.1.1.el8_3.crt1.x86_64, 1x Intel_SSDSC2KG96, App Version: 3_1_1; Build notes: Tools: Intel C Compiler 2020u4, Intel MPI 2019u9; threads/core: 2; Turbo: used; Build knobs: -O3 -ip -g -debug inline-debug-info -xCOMMON-AVX512 -qopt-report=5 -restrict. EPYC 7543: 1-node, 2-socket AMD EPYC 7543 (32C/2.8GHz, 240W cTDP) on Dell PowerEdge R7525 server with 1024 GB (16 slots/ 64GB/3200) total DDR4 memory, ucode 0xa001119, SMT on, Boost on, Power deterministic mode, NPS=4, Red Hat Enterprise Linux 8.3, 4.18, 2x Micron 5300 Pro, App Version: 3_1_1; Build notes: Tools: Intel C Compiler 2020u4, Intel MPI 2019u9; threads/core: 2; Turbo: used; Build knobs: -O3 -ip -g -debug inline-debug-info -march=core-avx2 -qopt-report=5 -restrict, tested by Intel and results as of April 2021.
2. Highest available memory bandwidth of any HPC CPU (Intel® Xeon® Platinum 9200 processors) with 12 DDR4 channels for powering memory-bound workloads. As of July 21, 2021, Intel offers up to 12 DDR4 memory channels for Intel® Xeon® processors, as compared to 8 for AMD EPYC.
3. 11X higher batch AI inference performance with Intel-optimized TensorFlow vs. stock Cascade Lake FP32 configuration. New: 8380: 1-node, 2x Intel Xeon Platinum 8380 processor on Coyote Pass with 512 GB (16 slots/32GB/3200) total DDR4 memory, ucode X261, HT on, Turbo on, Ubuntu 20.04 LTS, 5.4.0-65-generic, 1x Intel_SSDSC2KG96, Intel SSDPE2KX010T8, ResNet-50 v1.5, gcc-9.3.0, oneDNN 1.6.4, BS=128 FP32, INT8, Optimized model: TensorFlow 2.4.1 with Intel optimizations for 3rd Gen Intel Xeon Scalable processor, upstreamed to TensorFlow-2.5 (container-intel/intel-optimized-tensorflow:tf-r2.5-icx-b631821f), Model zoo: https://github.com/IntelAI/models/tree/icx-launch-public/quickstart/, Unoptimized model: TensorFlow-2.4.1, Model zoo: https://github.com/IntelAI/models -b master, tested by Intel on 3/12/2021. Baseline: 8280: 1-node, 2x Intel Xeon Platinum 8280 processor on Wolf Pass with 384 GB (12 slots/32GB/2933) total DDR4 memory, ucode 0x5003003, HT on, Turbo on, Ubuntu 20.04 LTS, 5.4.0-48-generic, 1x Samsung_SSD_860, Intel SSDPE2KX040T8, ResNet-50 v1.5, gcc-9.3.0, oneDNN 1.6.4, BS=128 FP32, INT8, Optimized model: TensorFlow 2.4.1 with Intel optimizations for 3rd Gen Intel Xeon Scalable processor, upstreamed to TensorFlow-2.5 (container-intel/intel-optimized-tensorflow:tf-r2.5-icx-b631821f), Model zoo: https://github.com/IntelAI/models/tree/icx-launch-public/quickstart/, Unoptimized model: TensorFlow-2.4.1, Model zoo: https://github.com/IntelAI/models -b master, tested by Intel on 2/17/2021.