SYCL* Thread and Memory Hierarchy

Intel® oneAPI Programming Guide

ID 771723

Date 11/08/2023

Thread Hierarchy

The SYCL* execution model exposes an abstract view of GPU execution. The SYCL thread hierarchy consists of a 1-, 2-, or 3-dimensional grid of work-items. These work-items are grouped into equal-sized thread groups called work-groups. The work-items in a work-group are further divided into equal-sized vector groups called sub-groups.
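The index arithmetic behind this hierarchy can be sketched in plain host-side C++. This is a minimal illustration for the 1-dimensional case, not SYCL API code: the `WorkItemPosition` type and the `locate` and `global_id_of` helpers are hypothetical names introduced here. It assumes that work-items are assigned to sub-groups in linear order within a work-group. In a real SYCL implementation that mapping is implementation-defined, and a kernel would instead read these values from its `nd_item` and `sub_group` objects.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: where one work-item sits in a 1-D hierarchy.
struct WorkItemPosition {
    std::size_t group_id;      // index of the work-group
    std::size_t local_id;      // index of the work-item inside its work-group
    std::size_t sub_group_id;  // index of the sub-group inside the work-group
    std::size_t lane;          // index of the work-item inside its sub-group
};

// Decompose a global work-item index into its hierarchy coordinates.
// Assumption: equal-sized work-groups and sub-groups, with work-items
// assigned to sub-groups in linear order (implementation-defined in SYCL).
inline WorkItemPosition locate(std::size_t global_id,
                               std::size_t work_group_size,
                               std::size_t sub_group_size) {
    assert(work_group_size > 0 && sub_group_size > 0);
    // Sub-groups must tile the work-group exactly for this model to hold.
    assert(work_group_size % sub_group_size == 0);

    WorkItemPosition pos;
    pos.group_id = global_id / work_group_size;
    pos.local_id = global_id % work_group_size;
    pos.sub_group_id = pos.local_id / sub_group_size;
    pos.lane = pos.local_id % sub_group_size;
    return pos;
}

// Inverse mapping: rebuild the global index from group and local indices.
inline std::size_t global_id_of(const WorkItemPosition& pos,
                                std::size_t work_group_size) {
    return pos.group_id * work_group_size + pos.local_id;
}
```

For example, with work-groups of 16 work-items and sub-groups of 8, `locate(45, 16, 8)` places global index 45 in work-group 2 at local index 13, which is lane 5 of sub-group 1, and `global_id_of` maps that position back to 45.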

To learn more about how this hierarchy works with a GPU or a CPU with Intel® UHD Graphics, see SYCL* Thread Mapping and GPU Occupancy in the oneAPI GPU Optimization Guide.

Memory Hierarchy

The General Purpose GPU (GPGPU) compute model consists of a host connected to one or more compute devices. Each compute device consists of many GPU Compute Engines (CE), also known as Execution Units (EU) or Xe Vector Engines (XVE). A compute device may also include caches, shared local memory (SLM), high-bandwidth memory (HBM), and so on, as shown in the figure below. Applications are then built as a combination of host software (per the host framework) and kernels submitted by the host to run on the XVEs with a predefined decoupling point.

To learn more about the memory hierarchy within the GPGPU compute model, see Execution Model Overview in the oneAPI GPU Optimization Guide.

Using Data Prefetching to Reduce Memory Latency in GPUs

Data prefetching can reduce the number of write-backs, lower memory latency, and improve performance on Intel® GPUs.

To learn more about how prefetching works with oneAPI, see Prefetching in the oneAPI GPU Optimization Guide.
