Visible to Intel only — GUID: GUID-E7A31C29-ABD9-43A2-8FC5-9CE095A9D903
References
For more information, see:
Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference
Intel® Fortran Compiler Classic and Intel® Fortran Compiler Developer Guide and Reference
OpenMP Features and Extensions Supported in Intel® oneAPI DPC++/C++ Compiler
Fortran Language and OpenMP Features Implemented in Intel® Fortran Compiler (Beta)
Developer Reference for Intel® oneAPI Math Kernel Library - C