Pipelined Kernels
By default, SYCL* task kernels are not pipelined. A kernel blocks further kernel invocations from starting until the current invocation has completed.
However, kernels with a ready/valid handshaking invocation interface can be pipelined. For information about kernels with a ready/valid handshaking invocation interface, refer to Ready/Valid Handshaking Kernel Invocation Interface.
You can pipeline kernels in the following ways:
The pipelined Kernel Property
To pipeline a streaming kernel, add the pipelined<N> property to your streaming kernel.
The pipelined<N> property takes the following values for N:
Property | Description |
---|---|
pipelined<-1>, pipelined<> | The compiler generates hardware that allows kernel invocations to execute in a pipelined fashion, while attempting to achieve the lowest possible II (initiation interval) at the targeted fMAX. This is the default behavior if no value for N is specified. |
pipelined<N> where N > 0 | N is the desired II value. The compiler generates hardware that allows kernel invocations to execute in a pipelined fashion, while attempting to achieve the specified II (N) at the targeted fMAX. |
pipelined<0> | The compiler does not generate hardware that allows kernel invocations to execute in a pipelined fashion. This is equivalent to not specifying the pipelined kernel property. |
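To make the effect of the II value concrete, the following is a minimal sketch of an idealized cycle-count model. It is not part of the SYCL* API or the compiler output. The function names cycles_not_pipelined and cycles_pipelined and the latency values are hypothetical, and the model assumes that the compiler achieves the requested II exactly and that no invocation ever stalls.

```cpp
#include <cstdint>

// Idealized model only: `latency` (cycles from the start to the end of one
// invocation) and `ii` (cycles between the starts of consecutive invocations)
// are hypothetical inputs, not values reported by the compiler.

// Non-pipelined kernel (the default, or pipelined<0>): each invocation must
// complete before the next one can start.
constexpr std::uint64_t cycles_not_pipelined(std::uint64_t invocations,
                                             std::uint64_t latency) {
  return invocations * latency;
}

// Pipelined kernel: a new invocation starts every `ii` cycles, so the last
// invocation starts at cycle (invocations - 1) * ii and completes `latency`
// cycles after that.
constexpr std::uint64_t cycles_pipelined(std::uint64_t invocations,
                                         std::uint64_t latency,
                                         std::uint64_t ii) {
  return invocations == 0 ? 0 : (invocations - 1) * ii + latency;
}
```

Under these assumptions, 8 invocations of a kernel with a 10-cycle latency take 80 cycles without pipelining, 17 cycles at an II of 1, and 24 cycles at an II of 2. Real hardware can differ, because the compiler only attempts to achieve the requested II.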
To use the pipelined kernel property:
- Include the following header file in your code:
sycl/ext/intel/fpga_extensions.hpp
- Label your kernel with the pipelined property as follows:
- Functor Model:
- Add a member function named get to the functor. The get function must take an argument of type properties_tag and have a return type of auto.
- Create a properties object in the new function with the streaming_interface and pipelined properties and return it.
Functor Model pipelined Kernel Property Code Example

    #include <sycl/sycl.hpp>
    #include <sycl/ext/intel/fpga_extensions.hpp>

    using namespace sycl;
    using namespace sycl::ext::intel::experimental;
    using namespace sycl::ext::oneapi::experimental;

    struct MyFunctorIP {
      int *input_a, *input_b, *input_c;

      void operator()() const {
        *input_c = *input_a + *input_b;
      }

      // Return the kernel properties: streaming interface, pipelined
      auto get(properties_tag) {
        return properties{streaming_interface<>, pipelined<>};
      }
    };

    /* To exercise the pipelined nature of the kernel in simulation,
       you must queue up multiple instances of the functions before you
       call the wait() function. The following code example shows how to
       exercise a pipelined kernel: */
    for (int i = 0; i < kN; i++) {
      q.single_task(MyFunctorIP{functor_input_A, functor_input_B, functor_input_C});
    }
    q.wait();
- Lambda Model:
- Pass a properties object that contains the streaming_interface and pipelined properties to your q.single_task call.
Lambda Model pipelined Kernel Property Code Example

    #include <sycl/sycl.hpp>
    #include <sycl/ext/intel/fpga_extensions.hpp>

    using namespace sycl;
    using namespace sycl::ext::intel::experimental;
    using namespace sycl::ext::oneapi::experimental;

    class MyLambdaIP;

    // Create a properties object containing the kernel invocation
    // interface property
    properties kernel_properties{streaming_interface_remove_downstream_stall, pipelined<>};
    ...
    /* To exercise the pipelined nature of the kernel in simulation,
       you must queue up multiple instances of the functions before you
       call the wait() function. The following code example shows how to
       exercise a pipelined kernel: */
    for (int i = 0; i < kN; i++) {
      q.single_task<MyLambdaIP>(kernel_properties, [=] {
        lambda_input_C[i] = lambda_input_A[i] + lambda_input_B[i];
      });
    }
    q.wait();
(Deprecated) The streaming_pipelined_interface Macro
Deprecated: The streaming_pipelined_interface macro is deprecated and might be removed in a future release. Use the pipelined kernel property instead.
Streaming kernels can be optionally pipelined by using the streaming_pipelined_interface macro, as shown in the following example:
struct MyIP {
  conduit int *input;
  MyIP(int *inp_a_) : input(inp_a_) {}
  streaming_pipelined_interface void operator()() const {
    int temp = *input;
    *input = something_complicated(temp);
  }
};

/* To exercise the pipelined nature of the kernel in simulation,
   you must queue up multiple instances of the functions before you
   call the wait() function. The following code example shows how to
   exercise a pipelined kernel: */
for (int i = 0; i < kN; i++) {
  q.single_task(MyIP{&input_array[i]});
}
q.wait();