Pipelined Kernels

Developer Guide

Intel® oneAPI DPC++/C++ Compiler Handbook for FPGAs

Download PDF

ID 785441

Date 5/05/2025

Version

Public

Pipelined Kernels

By default, SYCL* task kernels are not pipelined. A kernel blocks further kernel invocations from starting until the current invocation has completed.

However, kernels with a ready/valid handshaking invocation interface can be pipelined. For information about kernels with a ready/valid handshaking invocation interface, refer to Ready/Valid Handshaking Kernel Invocation Interface.

You can pipeline kernels by using the pipelined kernel property.

IMPORTANT:

Separate invocations of a kernel are independent. The independence of separate kernel invocation means that you cannot make assumptions about memory ordering or memory dependencies between kernel invocations.

Ensure that you use synchronization mechanisms such as the .wait() function or SYCL* atomic operations to avoid race conditions.

If you want to guarantee sequential equivalence, write your kernel with a while(1) loop in the kernel body instead of using a pipelined kernel.

This restriction can affect you if you are migrating code from the Intel® HLS Compiler to SCYL* code. Repeatedly-invoked kernel code that would have worked correctly when compiled by the Intel® HLS Compiler might result in undefined behavior in SYCL* code and might not function as you expect.

The `pipelined` Kernel Property

To pipeline a streaming kernel, add the pipelined<N> property to your streaming kernel.

The pipelined<N> property takes the following values for N:

Property	Description
`pipelined<-1>` `pipelined<>`	The compiler generates hardware that allows kernel invocations to execute in a pipelined fashion, while attempting to achieve the lowest possible II (initiation interval) at the targeted f_MAX. This is the default behavior if no value for `N` is specified.
`pipelined<N>` where `N` > 0	Where `N` represents the desired II value. The compiler generates hardware that allows kernel invocations to execute in a pipelined fashion, while attempting to achieve the specified II (`N`) at the targeted f_MAX.
`pipelined<0>`	The compiler does not generate hardware to allows kernel invocations to execute in a pipelined fashion. This is the equivalent of not specifying the `pipelined` kernel property.

Property

Description

pipelined<-1>

pipelined<>

The compiler generates hardware that allows kernel invocations to execute in a pipelined fashion, while attempting to achieve the lowest possible II (initiation interval) at the targeted f_MAX.

This is the default behavior if no value for N is specified.

pipelined<N>

where N > 0

Where N represents the desired II value.

The compiler generates hardware that allows kernel invocations to execute in a pipelined fashion, while attempting to achieve the specified II (N) at the targeted f_MAX.

pipelined<0>

The compiler does not generate hardware to allows kernel invocations to execute in a pipelined fashion.

This is the equivalent of not specifying the pipelined kernel property.

To use the pipelined kernel property:

Include the following header file in your code:
```
sycl/ext/intel/fpga_extensions.hpp
```

Label your kernel with the pipelined property as follows:

Functor Model:

Add a member function named get to the functor. Have the get function take an argument of type properties_tag and a return type auto.
Create a properties object in the new function with the streaming_interface and pipelined properties and return it.

Functor Model pipelined Kernel Property Code Example

#include <sycl/sycl.hpp>
#include <sycl/ext/intel/fpga_extensions.hpp>

using namespace sycl;
using namespace sycl::ext::intel::experimental;
using namespace sycl::ext::oneapi::experimental;

struct MyFunctorIP {
  int *input_a, *input_b, *input_c;

  void operator()() const {
    *input_c = *input_a + *input_b
    }
  }

  auto get(properties_tag) {
    return properties{streaming_interface<>, pipelined<>};
  }
};

/* To exercise the pipelined nature of the kernel in simulation,
you must queue up multiple instances of the functions before you
call the wait() function. The following code example shows how to
exercise a pipelined kernel: */

for (int i = 0; i < kN; i++) {
  q.single_task(MyFunctorIP{functor_input_A, functor_input_B, functor_input_C});
}
q.wait();

Lambda Model:

Pass a properties object that contains the streaming_interface and pipelined properties to your q.single_task call.

Lambda Model pipelined Kernel Property Code Example

#include <sycl/sycl.hpp>
#include <sycl/ext/intel/fpga_extensions.hpp>

using namespace sycl;
using namespace sycl::ext::intel::experimental;
using namespace sycl::ext::oneapi::experimental;

class MyLambdaIP;

// Create a properties object containing the kernel invocation 
// interface property
properties kernel_properties{streaming_interface_remove_downstream_stall, pipelined<>};

...

/* To exercise the pipelined nature of the kernel in simulation,
you must queue up multiple instances of the functions before you
call the wait() function. The following code example shows how to
exercise a pipelined kernel: */

for (int i = 0; i < kN; i++) {
  q.single_task<MyLambdaIP>(kernel_properties, [=] {
    lambda_input_C[i] = lambda_input_A[i] + lambda_input_B[i];
  });
}
q.wait();

Select Your Language

Using Intel.com Search

Quick Links

Recent Searches

Advanced Search

Only search in

Intel® oneAPI DPC++/C++ Compiler Handbook for FPGAs

Pipelined Kernels

The `pipelined` Kernel Property

Using Intel.com Search

Quick Links

Recent Searches

Advanced Search

Only search in

Intel® oneAPI DPC++/C++ Compiler Handbook for FPGAs

Pipelined Kernels

The pipelined Kernel Property

The `pipelined` Kernel Property