Visible to Intel only — GUID: GUID-5F4625B8-55F7-4D1F-A36E-8DCDAF2EB978
Visible to Intel only — GUID: GUID-5F4625B8-55F7-4D1F-A36E-8DCDAF2EB978
Pipelined Kernels
By default, SYCL* task kernels are not pipelined. A kernel blocks further kernel invocations from starting until the current invocation has completed.
However, kernels with a ready/valid handshaking invocation interface can be pipelined. For information about kernels with a ready/valid handshaking invocation interface, refer to Ready/Valid Handshaking Kernel Invocation Interface.
You can pipeline kernels by using the pipelined kernel property.
Separate invocations of a kernel are independent. The independence of separate kernel invocation means that you cannot make assumptions about memory ordering or memory dependencies between kernel invocations.
Ensure that you use synchronization mechanisms such as the .wait() function or SYCL* atomic operations to avoid race conditions.
If you want to guarantee sequential equivalence, write your kernel with a while(1) loop in the kernel body instead of using a pipelined kernel.
This restriction can affect you if you are migrating code from the Intel® HLS Compiler to SCYL* code. Repeatedly-invoked kernel code that would have worked correctly when compiled by the Intel® HLS Compiler might result in undefined behavior in SYCL* code and might not function as you expect.
The pipelined Kernel Property
To pipeline a streaming kernel, add the pipelined<N> property to your streaming kernel.
The pipelined<N> property takes the following values for N:
Property | Description |
---|---|
pipelined<-1> pipelined<> |
The compiler generates hardware that allows kernel invocations to execute in a pipelined fashion, while attempting to achieve the lowest possible II (initiation interval) at the targeted fMAX. This is the default behavior if no value for N is specified. |
pipelined<N> where N > 0 |
Where N represents the desired II value. The compiler generates hardware that allows kernel invocations to execute in a pipelined fashion, while attempting to achieve the specified II (N) at the targeted fMAX. |
pipelined<0> | The compiler does not generate hardware to allows kernel invocations to execute in a pipelined fashion. This is the equivalent of not specifying the pipelined kernel property. |
To use the pipelined kernel property:
- Include the following header file in your code:
sycl/ext/intel/fpga_extensions.hpp
- Label your kernel with the pipelined property as follows:
- Functor Model:
- Add a member function named get to the functor. Have the get function take an argument of type properties_tag and a return type auto.
- Create a properties object in the new function with the streaming_interface and pipelined properties and return it.
Functor Model pipelined Kernel Property Code Example#include <sycl/sycl.hpp> #include <sycl/ext/intel/fpga_extensions.hpp> using namespace sycl; using namespace sycl::ext::intel::experimental; using namespace sycl::ext::oneapi::experimental; struct MyFunctorIP { int *input_a, *input_b, *input_c; void operator()() const { *input_c = *input_a + *input_b } } auto get(properties_tag) { return properties{streaming_interface<>, pipelined<>}; } }; /* To exercise the pipelined nature of the kernel in simulation, you must queue up multiple instances of the functions before you call the wait() function. The following code example shows how to exercise a pipelined kernel: */ for (int i = 0; i < kN; i++) { q.single_task(MyFunctorIP{functor_input_A, functor_input_B, functor_input_C}); } q.wait();
- Lambda Model:
- Pass a properties object that contains the streaming_interface and pipelined properties to your q.single_task call.
Lambda Model pipelined Kernel Property Code Example#include <sycl/sycl.hpp> #include <sycl/ext/intel/fpga_extensions.hpp> using namespace sycl; using namespace sycl::ext::intel::experimental; using namespace sycl::ext::oneapi::experimental; class MyLambdaIP; // Create a properties object containing the kernel invocation // interface property properties kernel_properties{streaming_interface_remove_downstream_stall, pipelined<>}; ... /* To exercise the pipelined nature of the kernel in simulation, you must queue up multiple instances of the functions before you call the wait() function. The following code example shows how to exercise a pipelined kernel: */ for (int i = 0; i < kN; i++) { q.single_task<MyLambdaIP>(kernel_properties, [=] { lambda_input_C[i] = lambda_input_A[i] + lambda_input_B[i]; }); } q.wait();
- Functor Model: