Intel® Inspector User Guide for Linux* OS

ID 767796
Date 3/22/2024
Public

Asynchronous Buffer

Occurs when operations between a program executing on the host and a kernel executing on the device are not synchronized.

Problem type: Uninitialized memory access

ID | Code Location   | Description
1  | Allocation site | Represents the source location where data is passed from the host program to a buffer without synchronization.
2  | Read            | Represents the source location where data is copied from a buffer to the host program while kernel execution is not yet complete.

If passing data to a buffer and copying the calculated data from this buffer back to the host are not synchronized, the program may copy data from the buffer before the kernel completes execution. As a result, the host receives the initially passed data instead of the values calculated by the kernel.

DPC++ Example

queue.submit([&](cl::sycl::handler &cgh)
{
    cgh.parallel_for<class my_task>(cl::sycl::range<1>{ N }, [=](cl::sycl::id<1> idx)
    {
        // We compute squares
        deviceData[idx] *= deviceData[idx];
    });
});
// queue.wait();   // Synchronization is missing: the loop below may run before the kernel completes

for (int i = 0; i < N; i++) std::cout << deviceData[i] << " ";

Possible Correction Strategies

To copy the correct data from the buffer after kernel execution, do the following:

  • In an OpenCL™ host program, use the following calls to order host reads after kernel execution:
    • clGetEventInfo() returns information about an event, including the current execution status of the associated kernel command.
    • clFinish() blocks until all commands previously queued to the command queue, including the kernel, have completed.
  • In a Data Parallel C++ (DPC++) program, call queue.wait() to wait until kernel execution ends before copying the calculated data to the host.
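
The wait-before-read pattern in the strategies above can also be sketched in standard C++, without an OpenCL or DPC++ toolchain. This is an analogy only, not Intel Inspector or SYCL code. std::async stands in for submitting the kernel, future::wait() stands in for queue.wait() or clFinish(), and a zero-timeout wait_for() stands in for a status query such as clGetEventInfo(). The names square_in_place, is_complete, and square_with_wait are hypothetical and chosen for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <future>
#include <vector>

// Hypothetical stand-in for a device kernel: squares every element in place.
void square_in_place(std::vector<int>& data) {
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] *= data[i];
    }
}

// Non-blocking status query, loosely analogous to checking an event's
// execution status with clGetEventInfo().
bool is_complete(const std::future<void>& done) {
    return done.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}

// Starts the work asynchronously and blocks until it finishes before the
// host reads the data, as queue.wait() or clFinish() would.
std::vector<int> square_with_wait(std::vector<int> data) {
    std::future<void> done =
        std::async(std::launch::async, square_in_place, std::ref(data));
    done.wait();   // the host must not read `data` before this returns
    return data;   // safe: the asynchronous work has completed
}
```

Calling square_with_wait({1, 2, 3}) yields 1, 4, 9 because the read happens only after the wait. Removing the done.wait() line would reproduce the unsynchronized read this problem type describes. On some Linux toolchains, programs that use std::async need to be built with the -pthread option.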