Visible to Intel only — GUID: GUID-4F1139BF-7987-45FC-B081-5DB025D5633A
Visible to Intel only — GUID: GUID-4F1139BF-7987-45FC-B081-5DB025D5633A
DPCT1049
Message
The work-group size passed to the SYCL kernel may exceed the limit. To get the device limit, query info::device::max_work_group_size. Adjust the work-group size if needed.
Detailed Help
The work-group size passed to the SYCL* kernel for SYCL device has a limit (see SYCL 2020 standard, 4.6.4.2 Device information descriptors).
This warning appears if dimensions of the local range could not all be evaluated, or if the product of the dimensions of the local range is more than 256.
Suggestions to Fix
Query info::device::max_work_group_size to define the work-group size limit for the device you use. If the work-group size used in the code is below the limit, you can ignore this warning. Otherwise, you need to decrease the work-group size.
For example, this original CUDA* code:
__global__ void k() {}
void foo() {
k<<<1, 2048>>>();
}
results in the following migrated SYCL code:
void k() {}
void foo() {
/*
DPCT1049:0: The work-group size passed to the SYCL kernel may exceed the
limit. To get the device limit, query info::device::max_work_group_size.
Adjust the work-group size if needed.
*/
dpct::get_default_queue().parallel_for(
sycl::nd_range<3>(sycl::range<3>(1, 1, 2048), sycl::range<3>(1, 1, 2048)),
[=](sycl::nd_item<3> item_ct1) {
k();
});
}
which is rewritten to:
void k() {}
void foo() {
size_t max_work_group_size =
dpct::get_default_queue()
.get_device()
.get_info<sycl::info::device::max_work_group_size>();
size_t work_group_size = 2048;
if (work_group_size > max_work_group_size) {
work_group_size = max_work_group_size;
}
size_t work_group_num = std::ceil((float)2048 / (float)work_group_size);
dpct::get_default_queue().parallel_for(
sycl::nd_range<3>(sycl::range<3>(1, 1, work_group_num * work_group_size),
sycl::range<3>(1, 1, work_group_size)),
[=](sycl::nd_item<3> item_ct1) { k(); });
}