Synchronization among Threads in a Kernel
The work-items in a kernel can synchronize in several ways to exchange data, update data, or cooperate with each other to accomplish a task in a specific order:
- Accessor classes
  Accessor classes specify acquisition and release of buffer and image data structures. Depending on where they are created and destroyed, the runtime generates appropriate data transfers and synchronization primitives.
- Atomic operations
  SYCL devices support a restricted subset of C++ atomics.
- Fences
  Fence primitives are used to order loads and stores. Fences can have acquire semantics, release semantics, or both.
- Barriers
  Barriers are used to synchronize sets of work-items within individual groups.
- Hierarchical parallel dispatch
  In the hierarchical parallelism model of describing computations, synchronization within a work-group is made explicit through multiple instances of the parallel_for_work_item function call rather than through explicit work-group barrier operations.
- Device event
  Events are used inside kernel functions to wait for asynchronous operations to complete.
In many cases, any of the preceding synchronization mechanisms can be used to achieve the same functionality, but with significant differences in efficiency and performance.
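The trade-off between mechanisms can be illustrated with a minimal host-side sketch in standard C++. This is an analogy, not SYCL kernel code: std::thread workers stand in for work-items, std::atomic stands in for device atomics, and joining the threads stands in for a barrier. The function names sum_with_atomics and sum_with_partials are hypothetical and exist only for this illustration. Both functions compute the same sum, but the first synchronizes on every element while the second synchronizes once per worker.

```cpp
#include <atomic>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Strategy 1 (hypothetical helper): every worker adds each element to one
// shared counter with an atomic read-modify-write. Correct, but all workers
// contend on the same memory location for every element.
long sum_with_atomics(const std::vector<int>& data, unsigned num_threads) {
    std::atomic<long> total{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&data, &total, t, num_threads] {
            for (std::size_t i = t; i < data.size(); i += num_threads) {
                // Relaxed ordering suffices: only the final value is read,
                // and join() below orders that read after all updates.
                total.fetch_add(data[i], std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    return total.load();
}

// Strategy 2 (hypothetical helper): each worker accumulates into a private
// variable and publishes one partial result. A single synchronization point
// (join, standing in for a barrier) precedes the final combine.
long sum_with_partials(const std::vector<int>& data, unsigned num_threads) {
    std::vector<long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&data, &partial, t, num_threads] {
            long local = 0;
            for (std::size_t i = t; i < data.size(); i += num_threads) {
                local += data[i];
            }
            partial[t] = local;  // each worker writes only its own slot
        });
    }
    for (auto& w : workers) w.join();
    return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

Both functions return the same result for the same input. The second typically performs far fewer synchronizing operations, which is the kind of efficiency difference to weigh when choosing among atomics, fences, and barriers in a kernel. Actual costs depend on the device and should be measured.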