I am pleased to announce the latest Intel® oneAPI tools release (2022.3) in support of open accelerated computing. We look forward to your feedback.
The oneAPI initiative, and its open specification, are focused on enabling an open, multiarchitecture world with strong support for software developers—a world of open choice without sacrificing performance or functionality.
These oneAPI tools are built on decades of Intel software products that have been used globally to create myriad applications, and they incorporate multiple open-source projects to which Intel is proud to contribute.
Our oneAPI tools emphasize support for standards including C, C++, Fortran, MPI, OpenMP, and SYCL. We also provide support for accelerating Python (an open source de facto standard).
GROMACS: The Promise of SYCL + oneAPI, Delivered
SYCL and oneAPI aim to provide a vendor-neutral way to deliver performance and portability across devices from multiple vendors using a single source base and a single build.
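As an illustrative sketch of that single-source model, here is a minimal SYCL 2020 vector addition. It requires a SYCL compiler (for example DPC++ via `icpx -fsycl`); the device it runs on is selected at run time, so the same binary can target a GPU or fall back to the CPU:

```cpp
// Sketch only: minimal SYCL 2020 vector add.
#include <sycl/sycl.hpp>
#include <iostream>
#include <vector>

int main() {
  constexpr size_t n = 1024;
  std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);

  // The default selector picks whatever device is available (GPU, CPU, ...).
  sycl::queue q{sycl::default_selector_v};
  {
    sycl::buffer<float> ba{a.data(), sycl::range<1>(n)};
    sycl::buffer<float> bb{b.data(), sycl::range<1>(n)};
    sycl::buffer<float> bc{c.data(), sycl::range<1>(n)};
    q.submit([&](sycl::handler& h) {
      sycl::accessor xa{ba, h, sycl::read_only};
      sycl::accessor xb{bb, h, sycl::read_only};
      sycl::accessor xc{bc, h, sycl::write_only};
      h.parallel_for(sycl::range<1>(n),
                     [=](sycl::id<1> i) { xc[i] = xa[i] + xb[i]; });
    });
  }  // buffer destructors copy results back to the host vectors

  std::cout << c[0] << '\n';
}
```

The same source compiles once and dispatches to whichever backend the runtime finds, which is the portability property the GROMACS work relies on.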
GROMACS has embraced SYCL for exactly this reason. GROMACS is a molecular dynamics code, and it has been estimated that 4–5% of supercomputing cycles go to running it.
In a recent GROMACS demo shown at Intel Innovation, a single binary was run on a machine with GPUs from multiple vendors. Thanks to SYCL and oneAPI, the binary was able to use all the GPUs and the CPU, and each was used efficiently for impressive performance. This was not a stunt; it is the outcome we aim for when using SYCL and oneAPI.
The decision by GROMACS, and other projects, to use vendor- and architecture-agnostic approaches instead of being locked into proprietary ones highlights why SYCL and oneAPI continue to receive so much attention. Join us as they grow in capabilities.
oneAPI tools from Intel, including support for SYCL 2020, continue to grow.
We are very appreciative of all the support and feedback from users of open accelerated computing that helps guide this work. We are blessed with an enthusiastic group of developers who want open to win.
The C++ compiler adds more SYCL 2020 features (built from the LLVM SYCL project known as DPC++) to improve developer productivity when programming various hardware accelerators, including GPUs and FPGAs. Migration of CUDA code continues to get easier thanks to the SYCLomatic project (the prebuilt, Intel-supported version is still called 'Intel® DPC++ Compatibility Tool'). This migration tool has added support for CUDA 11.7 header files, CUDA runtime and driver APIs, cuDNN, NCCL, Thrust, cuBLAS, and cuFFT to help migrate the latest CUDA code.
For ease of coding common functions, the Intel® oneAPI DPC++ Library expands support for the C++ standard library in SYCL kernels with nine additional heap and sorting algorithms. The Intel® oneAPI Math Kernel Library adds features for more efficient GPU resource allocation as well as faster detection of and recovery from exceptions. The Intel® oneAPI Video Processing Library now includes the ability to provide extensive data about what is encoded, opening opportunities for quality improvement and algorithm innovation. It was also upstreamed to the popular FFmpeg open source project, achieving high-quality streaming on Intel GPUs.
LLVM + Fortran = Bright Future for Fortran
For several years we have offered our very popular 'classic' Fortran compiler while we began a migration to LLVM. The LLVM version offers a bright future for this favorite Fortran implementation. We will continue supporting the 'classic' version as users transition to the LLVM-based Fortran. We are happy to announce that our LLVM-based Fortran has caught up with the classic compiler in terms of functionality. Because the LLVM-based Fortran is superior in its support for offload, it is ahead in important ways. Performance tuning remains, but in many cases the LLVM-based compiler may offer performance advantages as well. We are eager for feedback.
The Intel® Fortran Compiler, based on modern LLVM technology, now has full support for Fortran 2003, 2008, and 2018, including coarrays, DLLImport/DLLExport, and DO CONCURRENT offload support. We believe our Fortran continues to offer by far the best Fortran support in the world, a legacy we inherited from DEC VMS Fortran, whose frontend came to us along with the engineers who created it.
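As an illustrative sketch (not taken from the release notes), a `DO CONCURRENT` loop of the kind the LLVM-based compiler can offload to a GPU looks like standard Fortran; the iterations declare no ordering, which is what lets the compiler map them to accelerator hardware when the appropriate offload options are used:

```fortran
! Sketch: a SAXPY loop written with DO CONCURRENT.
! With standard options this runs on the CPU; with the compiler's
! GPU offload options (see the compiler documentation), the same
! loop can be mapped to an accelerator.
program saxpy
  implicit none
  integer, parameter :: n = 1024
  real :: x(n), y(n), a
  integer :: i
  a = 2.0
  x = 1.0
  y = 0.0
  do concurrent (i = 1:n)
     y(i) = y(i) + a * x(i)
  end do
  print *, y(1)   ! prints 2.0
end program saxpy
```

Because `DO CONCURRENT` is part of the Fortran standard, this portability comes without directives or vendor extensions in the source.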
Fortran continues to power more than half the computation cycles in the HPC (High Performance Computing) world, and the number of Fortran developers worldwide continues to grow. Scientists and engineers need the strengths of Fortran, and its computer-science-oriented shortcomings matter little to them. In fact, there is a resurgence in recognizing the dangers of pointers (consider Rust vs. C++), and it is worth noticing that Fortran avoided pointers from the start in 1956. Shout out to Jim Cownie for his recent blog highlighting Fortran's importance and tackling the myth that Fortran is dying (or dead). At Intel, we could not agree more that Fortran remains an important part of accelerated computing alongside C, C++, OpenMP, SYCL, MPI, Python, and more. We invest heavily in them all.
Our C++ and Fortran implementations offer improved portability and compatibility by adding extended OpenMP cluster offload capability, along with continued strong support for OpenMP accelerator offload.
Performance is fashionable everywhere.
You cannot love performance for software and not include Fortran. Loving Fortran is a love for speed that carries into our work on C++, Python, and all our AI tools. We believe in open foundations that offer multivendor and multiarchitecture support, an open future with the best portability and performance together.
AI Support Where It Is Needed
The optimizations for AI workloads are nothing short of amazing. The AI support in 2022.3 is a direct way to ensure you get the best, but because our work is already being adopted into projects, builds, and many other places, there is a good chance it is already reaching you through other channels. It never hurts to grab it from Intel as well, just to be sure. The AI optimizations from Intel are an important success story in helping AI where it happens most: on Intel processors.
Optimizations for PyTorch now include automatic int8 quantization, and the MultiStreamModule feature further boosts throughput in the offline inference scenario. The Intel® Neural Compressor improves your productivity with a lighter binary size and a new quantization accuracy diagnostic feature supported by a GUI. It offers a TensorFlow quantization API, QDQ quantization for ITEX, mixed precision enhancement, DyNAS, training for block-wise structured sparsity, and an op-type-wise tuning strategy.
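As a conceptual sketch of what int8 quantization does, the snippet below maps floating-point values onto 8-bit integers with a scale and zero point, then maps them back. The real tools automate this per layer and calibrate ranges from data; the helper names here are ours for illustration, not a PyTorch or Neural Compressor API:

```python
# Conceptual sketch of affine int8 quantization (illustrative names,
# not an actual PyTorch / Intel Neural Compressor API).

def quantize_params(xs, qmin=-128, qmax=127):
    """Compute scale and zero point mapping the value range onto int8.

    The range is widened to include 0.0 so that zero is exactly
    representable, a common requirement for inference kernels.
    """
    lo, hi = min(min(xs), 0.0), max(max(xs), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(xs, scale, zero_point, qmin=-128, qmax=127):
    # Round to the nearest int8 code, clamping to the representable range.
    return [max(qmin, min(qmax, round(x / scale + zero_point))) for x in xs]

def dequantize(qs, scale, zero_point):
    # Recover approximate floats; error is bounded by scale/2 per element.
    return [(q - zero_point) * scale for q in qs]

xs = [0.0, 0.5, 1.0, -0.25]
scale, zp = quantize_params(xs)
qs = quantize(xs, scale, zp)
back = dequantize(qs, scale, zp)
```

Storing and computing on the int8 codes instead of 32-bit floats is where the memory and throughput wins come from; the tooling's job is picking scales and zero points that keep the round-trip error from hurting model accuracy.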
Get oneAPI with Updates Now
If you are new to oneAPI, or looking for updates, start with a visit to the Intel oneAPI tools page on the Intel Developer Zone.
The tools are also accessible on the Intel DevCloud, which includes very useful oneAPI training resources. This in-the-cloud environment is a fantastic place to develop, test, and optimize code, for free, across an array of Intel CPUs and accelerators without having to set up your own systems.
About oneAPI
oneAPI is an open, unified, and cross-architecture programming model for heterogeneous computing. oneAPI and SYCL support open accelerated computing by embracing an open, multivendor, multiarchitecture approach to programming tools (libraries, compilers, debuggers, analyzers, frameworks, etc.). Based on standards, the programming model simplifies software development and delivers uncompromised performance for accelerated compute without proprietary lock-in, while enabling the integration of legacy code.