Pipelined Kernels

Developer Guide

Intel oneAPI FPGA Handbook

Download PDF

ID 785441

Date 2/07/2024

Version

Public

A newer version of this document is available. Customers should click here to go to the newest version.

Visible to Intel only — GUID: GUID-5F4625B8-55F7-4D1F-A36E-8DCDAF2EB978

View Details

Pipelined Kernels

By default, SYCL* task kernels are not pipelined. A kernel blocks further kernel invocations from starting until the current invocation has completed.

However, kernels with a ready/valid handshaking invocation interface can be pipelined. For information about kernels with a ready/valid handshaking invocation interface, refer to Ready/Valid Handshaking Kernel Invocation Interface.

You can pipeline kernels in the following ways:

Using the pipelined kernel property
(Deprecated)Using the streaming_pipelined_interface macro

The `pipelined` Kernel Property

To pipeline a streaming kernel, add the pipelined<N> property to your streaming kernel.

The pipelined<N> property takes the following values for N:

Property	Description
`pipelined<-1>` `pipelined<>`	The compiler generates hardware that allows kernel invocations to execute in a pipelined fashion, while attempting to achieve the lowest possible II (initiation interval) at the targeted f_MAX. This is the default behavior if no value for `N` is specified.
`pipelined<N>` where `N` > 0	Where `N` represents the desired II value. The compiler generates hardware that allows kernel invocations to execute in a pipelined fashion, while attempting to achieve the specified II (`N`) at the targeted f_MAX.
`pipelined<0>`	The compiler does not generate hardware to allows kernel invocations to execute in a pipelined fashion. This is the equivalent of not specifying the `pipelined` kernel property.

Property

Description

pipelined<-1>

pipelined<>

The compiler generates hardware that allows kernel invocations to execute in a pipelined fashion, while attempting to achieve the lowest possible II (initiation interval) at the targeted f_MAX.

This is the default behavior if no value for N is specified.

pipelined<N>

where N > 0

Where N represents the desired II value.

The compiler generates hardware that allows kernel invocations to execute in a pipelined fashion, while attempting to achieve the specified II (N) at the targeted f_MAX.

pipelined<0>

The compiler does not generate hardware to allows kernel invocations to execute in a pipelined fashion.

This is the equivalent of not specifying the pipelined kernel property.

To use the pipelined kernel property:

Include the following header file in your code:
```
sycl/ext/intel/fpga_extensions.hpp
```

Label your kernel with the pipelined property as follows:

Functor Model:

Add a member function named get to the functor. Have the get function take an argument of type properties_tag and a return type auto.
Create a properties object in the new function with the streaming_interface and pipelined properties and return it.

Functor Model pipelined Kernel Property Code Example

#include <sycl/sycl.hpp>
#include <sycl/ext/intel/fpga_extensions.hpp>

using namespace sycl;
using namespace sycl::ext::intel::experimental;
using namespace sycl::ext::oneapi::experimental;

struct MyFunctorIP {
  int *input_a, *input_b, *input_c;

  void operator()() const {
    *input_c = *input_a + *input_b
    }
  }

  auto get(properties_tag) {
    return properties{streaming_interface<>, pipelined<>};
  }
};

/* To exercise the pipelined nature of the kernel in simulation,
you must queue up multiple instances of the functions before you
call the wait() function. The following code example shows how to
exercise a pipelined kernel: */

for (int i = 0; i < kN; i++) {
  q.single_task(MyFunctorIP{functor_input_A, functor_input_B, functor_input_C});
}
q.wait();

Lambda Model:

Pass a properties object that contains the streaming_interface and pipelined properties to your q.single_task call.

Lambda Model pipelined Kernel Property Code Example

#include <sycl/sycl.hpp>
#include <sycl/ext/intel/fpga_extensions.hpp>

using namespace sycl;
using namespace sycl::ext::intel::experimental;
using namespace sycl::ext::oneapi::experimental;

class MyLambdaIP;

// Create a properties object containing the kernel invocation 
// interface property
properties kernel_properties{streaming_interface_remove_downstream_stall, pipelined<>};

...

/* To exercise the pipelined nature of the kernel in simulation,
you must queue up multiple instances of the functions before you
call the wait() function. The following code example shows how to
exercise a pipelined kernel: */

for (int i = 0; i < kN; i++) {
  q.single_task<MyLambdaIP>(kernel_properties, [=] {
    lambda_input_C[i] = lambda_input_A[i] + lambda_input_B[i];
  });
}
q.wait();

(Deprecated) The `streaming_pipelined_interface` Macro

Deprecated: The streaming_pipelined_interface macro is deprecated and might be removed in a future release. Use the pipelined kernel property instead.

Streaming kernels can be optionally pipelined by using the streaming_pipelined_interface macro, as shown in the following example:


struct MyIP {
  conduit int *input;
  MyIP(int *inp_a_) : input(inp_a_) {}
  streaming_pipelined_interface void operator()() const {
    int temp = *input;
    *input = something_complicated(temp);
  }
};
/* To exercise the pipelined nature of the kernel in simulation,
   you must queue up multiple instances of the functions before you
   call the wait() function. The following code example shows how to
   exercise a pipelined kernel: */
for (int i = 0; i < kN; i++) {
  q.single_task(MyIP{&input_array[i]});
}
q.wait();

Select Your Language

Using Intel.com Search

Quick Links

Recent Searches

Advanced Search

Only search in

Intel oneAPI FPGA Handbook

Pipelined Kernels

The `pipelined` Kernel Property

(Deprecated) The `streaming_pipelined_interface` Macro

Using Intel.com Search

Quick Links

Recent Searches

Advanced Search

Only search in

Intel oneAPI FPGA Handbook

Pipelined Kernels

The pipelined Kernel Property

(Deprecated) The streaming_pipelined_interface Macro

The `pipelined` Kernel Property

(Deprecated) The `streaming_pipelined_interface` Macro