Device Code Modification
This topic describes how to convert your OpenCL device code to SYCL*. Suppose you have a simple OpenCL kernel such as the following:
__kernel void copy_kernel(__global int *in, __global int *out, int N) {
  for (int i = 0; i < N; i++)
    out[i] = in[i];
}
The following sections describe the two most basic options for converting your code:
Method 1: Change Your .cl File to a .hpp File
- Turn your .cl source file into a header file by changing its extension from .cl to .hpp.
- Add include guards to the new header file to prevent double inclusion, because the header is included in the main .cpp file.
- Remove the OpenCL keywords __kernel and __global from the function.
- Include the new header file in your main file.
Suppose your kernel file is called copy_kernel.hpp (previously copy_kernel.cl) and your main file is called main.cpp. The two files then look like this:
//// copy_kernel.hpp
// Guard name avoids double underscores, which are reserved identifiers in C++.
#ifndef COPY_KERNEL_HPP
#define COPY_KERNEL_HPP
void copy_kernel(int *in, int *out, int N) {
  for (int i = 0; i < N; i++)
    out[i] = in[i];
}
#endif
//// main.cpp
#include <sycl/sycl.hpp>
#include <sycl/ext/intel/fpga_extensions.hpp>
using namespace sycl;
#include "copy_kernel.hpp"
// Forward declare the kernel name in the global scope to reduce
// name mangling. This is an FPGA best practice that makes it
// easier to identify the kernel in the optimization reports.
class CopyKernel;
// Number of elements to copy. Example value; the original does not define N.
constexpr int N = 1024;
int main() {
  // Create the device queue using the FPGA device selector.
  // Brace initialization keeps this from parsing as a function declaration.
  queue device_queue{ext::intel::fpga_selector{}};
  // The input and output data on the host.
  int in_data[N], out_data[N];
  // Allocate memory on the device.
  int *in = malloc_device<int>(N, device_queue);
  int *out = malloc_device<int>(N, device_queue);
  // Copy input data to the device and wait for it to finish.
  device_queue.memcpy(in, in_data, N * sizeof(int)).wait();
  // Submit kernel to the device queue.
  event my_event = device_queue.single_task<CopyKernel>([=] {
    copy_kernel(in, out, N);
  });
  my_event.wait(); // wait on the kernel to finish
  // Copy output data back from the device and wait for it to finish.
  device_queue.memcpy(out_data, out, N * sizeof(int)).wait();
  // Release the device allocations.
  free(in, device_queue);
  free(out, device_queue);
  return 0;
}
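One consequence of Method 1 is that the converted function is ordinary C++, so it can be sanity-checked on the host before starting an FPGA compile. The following is a minimal sketch under that assumption and is not part of the original flow: copy_kernel_host_check is a hypothetical helper name, the test pattern is arbitrary, and the function body is repeated inline (instead of being included from copy_kernel.hpp) only to keep the snippet self-contained.

```cpp
#include <vector>

// Converted kernel body, as in copy_kernel.hpp: plain C++ once the
// __kernel and __global qualifiers are removed.
void copy_kernel(int *in, int *out, int N) {
  for (int i = 0; i < N; i++)
    out[i] = in[i];
}

// Hypothetical host-only helper (an assumption, not an Intel API):
// runs copy_kernel on host memory and reports whether the output
// matches the input.
bool copy_kernel_host_check(int n) {
  std::vector<int> in_data(n), out_data(n, -1);
  for (int i = 0; i < n; i++)
    in_data[i] = i * 3 + 1;  // arbitrary, recognizable test pattern
  copy_kernel(in_data.data(), out_data.data(), n);
  return in_data == out_data;
}
```

A small test program can call copy_kernel_host_check with a few sizes. This checks only the logic of the function, not its behavior or performance on the FPGA.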
Method 2: Embed Your Kernel Code Inside the Lambda
In this method, you copy the body of your OpenCL kernel directly into the lambda passed to single_task, as shown in the following example:
//// main.cpp
#include <sycl/sycl.hpp>
#include <sycl/ext/intel/fpga_extensions.hpp>
using namespace sycl;
// Forward declare the kernel name in the global scope to reduce
// name mangling. This is an FPGA best practice that makes it
// easier to identify the kernel in the optimization reports.
class CopyKernel;
// Number of elements to copy. Example value; the original does not define N.
constexpr int N = 1024;
int main() {
  // Create the device queue using the FPGA device selector.
  // Brace initialization keeps this from parsing as a function declaration.
  queue device_queue{ext::intel::fpga_selector{}};
  // The input and output data on the host.
  int in_data[N], out_data[N];
  // Allocate memory on the device.
  int *in = malloc_device<int>(N, device_queue);
  int *out = malloc_device<int>(N, device_queue);
  // Copy input data to the device and wait for it to finish.
  device_queue.memcpy(in, in_data, N * sizeof(int)).wait();
  // Submit kernel to the device queue.
  event my_event = device_queue.single_task<CopyKernel>([=] {
    for (int i = 0; i < N; i++)
      out[i] = in[i];
  });
  my_event.wait(); // wait on the kernel to finish
  // Copy output data back from the device and wait for it to finish.
  device_queue.memcpy(out_data, out, N * sizeof(int)).wait();
  // Release the device allocations.
  free(in, device_queue);
  free(out, device_queue);
  return 0;
}
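A detail of Method 2 that is easy to miss is that the [=] capture copies the pointers into the lambda, not the data they point to, so writes through out remain visible after the kernel finishes. The host-only sketch below illustrates this with standard C++ and no SYCL runtime. The names run_single_task_on_host and copy_with_embedded_lambda are hypothetical stand-ins introduced for illustration; the former only mimics the "invoke once" shape of single_task and does not model device execution.

```cpp
#include <vector>

// Hypothetical host-side stand-in for queue::single_task (an assumption,
// not a SYCL API): it invokes the callable once, synchronously.
template <typename KernelFn>
void run_single_task_on_host(KernelFn kernel) {
  kernel();
}

// Embeds the copy loop in a lambda, as in Method 2, and returns the output.
std::vector<int> copy_with_embedded_lambda(const std::vector<int> &input) {
  std::vector<int> output(input.size(), 0);
  const int *in = input.data();
  int *out = output.data();
  const int n = static_cast<int>(input.size());

  // [=] copies the pointers and n into the lambda, not the arrays,
  // so the loop writes straight into the storage owned by output.
  run_single_task_on_host([=] {
    for (int i = 0; i < n; i++)
      out[i] = in[i];
  });
  return output;
}
```

On a real device the same capture rule applies, which is why the pointers must come from an allocation the device can access, such as malloc_device in the listing above.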