Developer Reference

Migrating OpenCL™ FPGA Designs to SYCL*

ID 767849
Date 5/08/2024
Public

Device Code Modification

This topic describes how to convert your OpenCL device code to SYCL*. Assume that you have a simple OpenCL kernel like the one in the following code snippet:

__kernel void copy_kernel(__global int *in, __global int *out, int N) {
  for (int i = 0; i < N; i++)
    out[i] = in[i];
}

The following sections describe the two most basic options for converting your code:

Method 1: Change Your .cl File to a .hpp File

  1. Make your .cl source file a header file by renaming the extension from .cl to .hpp.
  2. Add #include guards to the new header file so that it cannot be included twice in the same translation unit. The main .cpp file includes this header.
  3. Remove the OpenCL keywords __kernel and __global from the function declaration.
  4. Include the new header file in your main file.

Suppose your kernel file is now called copy_kernel.hpp (previously kernel.cl) and your main file is called main.cpp. The converted code looks like this:

//// copy_kernel.hpp

#ifndef __COPY_KERNEL_HPP__
#define __COPY_KERNEL_HPP__

void copy_kernel(int *in, int *out, int N) {
  for (int i = 0; i < N; i++)
    out[i] = in[i];
}

#endif

//// main.cpp
#include <sycl/sycl.hpp>
#include <sycl/ext/intel/fpga_extensions.hpp>

using namespace sycl;

#include "copy_kernel.hpp"

// Forward declare the kernel name in the global scope to reduce
// name mangling. This is an FPGA best practice that makes it
// easier to identify the kernel in the optimization reports.
class CopyKernel;

int main() {
  // Create the device queue using the FPGA device selector.
  queue device_queue(ext::intel::fpga_selector_v);

  // Number of elements to copy. The value 1024 is an arbitrary example.
  constexpr int N = 1024;

  // The input and output data on the host.
  int in_data[N], out_data[N];

  // Allocate memory on the device.
  int *in = malloc_device<int>(N, device_queue);
  int *out = malloc_device<int>(N, device_queue);

  // Copy input data to the device and wait for it to finish.
  device_queue.memcpy(in, in_data, N * sizeof(int)).wait();

  // Submit kernel to the device queue.
  event my_event = device_queue.single_task<CopyKernel>([=] {
    copy_kernel(in, out, N);
  });
  my_event.wait();  // wait on the kernel to finish

  // Copy output data back from the device and wait for it to finish.
  device_queue.memcpy(out_data, out, N * sizeof(int)).wait();

  // Release the device allocations.
  sycl::free(in, device_queue);
  sycl::free(out, device_queue);
}
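
Once the OpenCL keywords are removed, the function in copy_kernel.hpp is ordinary C++, so it can be called on the host before any FPGA compile. The following is a minimal host-only sketch of such a check, not part of the migration flow itself. The helper names run_copy_on_host and check_copy_kernel are hypothetical, and the inline keyword is an added assumption that lets the header be included from more than one .cpp file. Passing this check says nothing about how the kernel behaves or performs on the FPGA.

```cpp
// Host-only sketch: no SYCL headers or FPGA toolchain are needed here.
#include <cassert>
#include <vector>

// Same body as the function in copy_kernel.hpp, repeated so that this
// block is self-contained. 'inline' is an addition (assumption) that
// avoids duplicate definitions if several .cpp files include the header.
inline void copy_kernel(int *in, int *out, int N) {
  for (int i = 0; i < N; i++)
    out[i] = in[i];
}

// Hypothetical helper: runs the kernel function on host memory and
// returns the output buffer.
std::vector<int> run_copy_on_host(std::vector<int> input) {
  std::vector<int> output(input.size(), 0);
  copy_kernel(input.data(), output.data(), static_cast<int>(input.size()));
  return output;
}

// Hypothetical check: the output must equal the input element by element.
void check_copy_kernel() {
  std::vector<int> input = {3, 1, 4, 1, 5};
  std::vector<int> output = run_copy_on_host(input);
  assert(output == input);
}
```

A check like this can catch indexing mistakes introduced during the conversion quickly, because it compiles with any C++ compiler.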

Method 2: Embed Your Kernel Code Inside the Lambda

In this method, you copy the body of your OpenCL kernel directly into the lambda function that you pass to single_task, as shown in the following example:

//// main.cpp
#include <sycl/sycl.hpp>
#include <sycl/ext/intel/fpga_extensions.hpp>

using namespace sycl;

// Forward declare the kernel name in the global scope to reduce
// name mangling. This is an FPGA best practice that makes it
// easier to identify the kernel in the optimization reports.
class CopyKernel;

int main() {
  // Create the device queue using the FPGA device selector.
  queue device_queue(ext::intel::fpga_selector_v);

  // Number of elements to copy. The value 1024 is an arbitrary example.
  constexpr int N = 1024;

  // The input and output data on the host.
  int in_data[N], out_data[N];

  // Allocate memory on the device.
  int *in = malloc_device<int>(N, device_queue);
  int *out = malloc_device<int>(N, device_queue);

  // Copy input data to the device and wait for it to finish.
  device_queue.memcpy(in, in_data, N * sizeof(int)).wait();

  // Submit kernel to the device queue.
  event my_event = device_queue.single_task<CopyKernel>([=] {
    for (int i = 0; i < N; i++)
      out[i] = in[i];
  });
  my_event.wait();  // wait on the kernel to finish

  // Copy output data back from the device and wait for it to finish.
  device_queue.memcpy(out_data, out, N * sizeof(int)).wait();

  // Release the device allocations.
  sycl::free(in, device_queue);
  sycl::free(out, device_queue);
}
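
The two methods differ only in where the loop lives: in a function that the lambda calls, or in the lambda body itself. In both cases, the [=] capture copies the pointer values and N into the lambda, not the data they point to. The following host-only sketch illustrates this under stated assumptions: run_single_task is a hypothetical stand-in that simply invokes the callable once on the host, and it is not the SYCL queue::single_task API. The other helper names are also hypothetical.

```cpp
// Host-only sketch: no SYCL headers or FPGA toolchain are needed here.
#include <cassert>
#include <vector>

// Hypothetical stand-in for submitting a single-task kernel: it calls
// the callable once, synchronously, on the host.
template <typename KernelFunc>
void run_single_task(KernelFunc kernel) {
  kernel();
}

// The converted kernel function from Method 1.
inline void copy_kernel(int *in, int *out, int N) {
  for (int i = 0; i < N; i++)
    out[i] = in[i];
}

// Method 1 style: the lambda calls the kernel function.
std::vector<int> copy_with_function(std::vector<int> input) {
  std::vector<int> output(input.size(), 0);
  int *in = input.data();
  int *out = output.data();
  const int N = static_cast<int>(input.size());
  // [=] copies the two pointers and N by value.
  run_single_task([=] { copy_kernel(in, out, N); });
  return output;
}

// Method 2 style: the loop is written directly in the lambda body.
std::vector<int> copy_with_inline_body(std::vector<int> input) {
  std::vector<int> output(input.size(), 0);
  int *in = input.data();
  int *out = output.data();
  const int N = static_cast<int>(input.size());
  run_single_task([=] {
    for (int i = 0; i < N; i++)
      out[i] = in[i];
  });
  return output;
}

// Hypothetical check: both styles must produce the same output.
void check_methods_agree() {
  std::vector<int> input = {10, 20, 30, 40};
  assert(copy_with_function(input) == copy_with_inline_body(input));
}
```

Which method to prefer is largely a matter of code organization: a separate header keeps kernels reusable across source files, while an embedded body keeps short kernels next to the code that launches them.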