Visible to Intel only — GUID: GUID-FB3198A4-421A-4842-ABE1-43D4F64ECE10
DPCT1065
Message
Consider replacing sycl::<...>::barrier() with sycl::<...>::barrier(sycl::access::fence_space::local_space) for better performance if there is no access to global memory.
Detailed Help
The function sycl::<...>::barrier() ensures correct memory access ordering in both the global and local address spaces. If the kernel function does not access global memory, it is safe to replace sycl::<...>::barrier() with sycl::<...>::barrier(sycl::access::fence_space::local_space), which only has to order local memory accesses and can therefore perform better.
Suggestions to Fix
Replace sycl::<...>::barrier() with sycl::<...>::barrier(sycl::access::fence_space::local_space).
For example, this original CUDA* code:
struct Data_t {
  float *host_data;
  float *device_data;
};

__global__ void k(Data_t *data) {
  auto tid = threadIdx.x + blockDim.x * blockIdx.x;
  only_read_data(data[tid].device_data);
  __syncthreads();
  only_read_data(data[tid].device_data);
}
results in the following migrated SYCL* code:
struct Data_t {
  float *host_data;
  float *device_data;
};

void k(Data_t *data, const sycl::nd_item<3> &item_ct1) {
  auto tid = item_ct1.get_local_id(2) +
             item_ct1.get_local_range(2) * item_ct1.get_group(2);
  only_read_data(data[tid].device_data);
  /*
  DPCT1065:0: Consider replacing sycl::nd_item::barrier() with
  sycl::nd_item::barrier(sycl::access::fence_space::local_space) for better
  performance if there is no access to global memory.
  */
  item_ct1.barrier();
  only_read_data(data[tid].device_data);
}
which is rewritten to:
struct Data_t {
  float *host_data;
  float *device_data;
};

void k(Data_t *data, const sycl::nd_item<3> &item_ct1) {
  auto tid = item_ct1.get_local_id(2) +
             item_ct1.get_local_range(2) * item_ct1.get_group(2);
  only_read_data(data[tid].device_data);
  // global_local_space can be replaced with local_space if the access
  // of the global memory after the barrier does not depend on (read-after-write or
  // write-after-read or write-after-write) the access of the same global memory
  // before the barrier among work-items in the current work-group.
  item_ct1.barrier(sycl::access::fence_space::local_space);
  only_read_data(data[tid].device_data);
}