Kernels
A kernel is the unit of computation in the oneAPI offload model. When you submit a kernel over an iteration space, you request that its computation be applied to the specified data objects at every point of that space.
This section covers topics related to coding, submitting, and executing kernels:
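As a concrete illustration (a minimal sketch, not taken from this guide), a kernel submitted over a one-dimensional iteration space in SYCL might look like the following vector addition; the variable names are illustrative, and a SYCL-aware compiler such as Intel's DPC++ is assumed:

```cpp
#include <sycl/sycl.hpp>
#include <iostream>
#include <vector>

int main() {
  constexpr size_t N = 1024;
  std::vector<float> a(N, 1.0f), b(N, 2.0f), c(N, 0.0f);

  sycl::queue q;  // default device selection; a GPU if one is available
  {
    sycl::buffer<float> buf_a(a), buf_b(b), buf_c(c);
    q.submit([&](sycl::handler &h) {
      sycl::accessor A(buf_a, h, sycl::read_only);
      sycl::accessor B(buf_b, h, sycl::read_only);
      sycl::accessor C(buf_c, h, sycl::write_only, sycl::no_init);
      // The lambda body is the kernel: it executes once for each of the
      // N points in the iteration space, identified by the id `i`.
      h.parallel_for(sycl::range<1>(N), [=](sycl::id<1> i) {
        C[i] = A[i] + B[i];
      });
    });
  }  // buffers go out of scope here, copying results back to the host

  std::cout << c[0] << "\n";  // each element of c is now 3.0
  return 0;
}
```

The runtime maps the `range<1>(N)` iteration space onto the device's hardware threads; later topics in this section discuss how that mapping, work-group sizing, and kernel launch overhead affect performance.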
- Sub-groups and SIMD Vectorization
- Removing Conditional Checks
- Registerization and Avoiding Register Spills
- Shared Local Memory
- Pointer Aliasing and the Restrict Directive
- Synchronization among Threads in a Kernel
- Considerations for Selecting Work-group Size
- Reduction
- Kernel Launch
- Executing Multiple Kernels on the Device at the Same Time
- Submitting Kernels to Multiple Queues
- Avoid Redundant Queue Construction