DPCT1018

Intel® DPC++ Compatibility Tool Developer Guide and Reference

ID 768918

Date 6/24/2024

Public

Message

The <API name> was migrated, but due to <reason>, the generated code performance may be sub-optimal.

Detailed Help

This warning appears in the following cases:

Migration of the cublasSetMatrix function. Intel® DPC++ Compatibility Tool replaces cublasSetMatrix with a memory copy from the host to the device. When the rows parameter of cublasSetMatrix is smaller than the lda parameter, the generated code copies more data (lda*cols elements) than the matrix actually contains (rows*cols elements). To improve performance, consider changing the values of lda and ldb. If the rows parameter is greater than or equal to lda, no action is required for this code.

Migration of the cublasSetVector function. Intel® DPC++ Compatibility Tool replaces cublasSetVector with a memory copy from the host to the device. When the incx parameter of cublasSetVector equals the incy parameter but is greater than 1, the generated code copies more data (incx*n elements) than the vector actually contains (n elements). To improve performance, consider changing the values of incx and incy.
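
The overhead in the two cases above can be estimated with simple arithmetic. The following is a minimal sketch that uses only the element counts stated above; the helper names (bytes_copied, bytes_needed, vector_bytes_copied) are hypothetical and are not part of the dpct helper headers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration; they are not part of the dpct headers.

// Bytes transferred when each of `cols` columns is copied at its full
// leading dimension `ld`, padding included.
std::size_t bytes_copied(std::size_t ld, std::size_t cols,
                         std::size_t elem_size) {
  return ld * cols * elem_size;
}

// Bytes that actually hold matrix data.
std::size_t bytes_needed(std::size_t rows, std::size_t cols,
                         std::size_t elem_size) {
  assert(rows > 0);
  return rows * cols * elem_size;
}

// A vector with increment `inc` is treated here as a 1-row matrix whose
// leading dimension is `inc`, so n elements occupy inc * n element slots.
std::size_t vector_bytes_copied(std::size_t n, std::size_t inc,
                                std::size_t elem_size) {
  return bytes_copied(inc, n, elem_size);
}
```

Under these assumptions, a vector of 128 four-byte elements with increments of 128 transfers 128 * 128 * 4 = 65536 bytes to deliver 128 * 4 = 512 bytes of data. When lda equals rows, or when the increment is 1, both counts are identical and there is nothing to gain.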

Suggestions to Fix

If the rows parameter of cublasSetMatrix is smaller than the lda parameter and you observe performance issues, consider changing the values of lda and ldb.

If the incx parameter of cublasSetVector equals the incy parameter but is greater than 1, and you observe performance issues, consider changing the values of incx and incy.

For example, this original CUDA* code:

void foo() {
  const int element_num = 128;
  const int h_inc = 128;
  const int d_inc = 128;
  cublasSetVector(element_num, sizeof(float), data, h_inc, d_data, d_inc);
}

results in the following migrated SYCL* code:

void foo() {
  const int element_num = 128;
  const int h_inc = 128;
  const int d_inc = 128;
  /*
  DPCT1018:0: The cublasSetVector was migrated, but due to parameter h_inc
  equals to parameter d_inc but greater than 1, the generated code performance
  may be sub-optimal.
  */
  dpct::matrix_mem_copy((void *)d_data, (void *)data, d_inc, h_inc, 1,
                        element_num, sizeof(float));
}

which is rewritten to:

void foo() {
  const int element_num = 128;

  // Store the data contiguously in d_data and change h_inc and d_inc from 128 to 1.
  const int h_inc = 1;
  const int d_inc = 1;

  // Now there is no padding between each element, so memcpy can be used directly.
  dpct::get_default_queue().memcpy(d_data, data, sizeof(float) * element_num).wait();
}
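
The cublasSetMatrix case can be approached in a similar way when the host matrix is stored with padding (rows smaller than lda). The following is a minimal host-side sketch, not the tool's output: it assumes a column-major matrix of float elements and uses a hypothetical helper, pack_columns, to produce a buffer whose leading dimension equals rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; it is not part of the dpct headers.
// Repacks a column-major matrix stored with leading dimension `lda` into a
// contiguous buffer whose leading dimension equals `rows`.
std::vector<float> pack_columns(const std::vector<float> &src,
                                std::size_t rows, std::size_t cols,
                                std::size_t lda) {
  assert(lda >= rows);
  // The last column only needs `rows` elements, not a full `lda`.
  assert(cols == 0 || src.size() >= (cols - 1) * lda + rows);

  std::vector<float> packed(rows * cols);
  for (std::size_t c = 0; c < cols; ++c) {
    for (std::size_t r = 0; r < rows; ++r) {
      // Skip the (lda - rows) padding elements at the end of each column.
      packed[c * rows + r] = src[c * lda + r];
    }
  }
  return packed;
}
```

The packed buffer holds exactly rows*cols elements, so it could be transferred with a single memcpy of sizeof(float) * rows * cols bytes, as in the vector rewrite above. Any device code that reads the matrix would then need to use rows as its leading dimension. Repacking costs host time of its own, so it is likely to pay off only when the padding is large or the same matrix is transferred repeatedly.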
