Developer Guide

FPGA Optimization Guide for Intel® oneAPI Toolkits

ID 767853
Date 7/13/2023
Public

The pipe Class and its Use

The pipe API exposed by the FPGA implementation is equivalent to the following class declaration:

template <class name,
          class dataT,
          size_t min_capacity = 0>
class pipe {
public:
  // Blocking
  static dataT read();
  static void write(dataT data);
  // Non-blocking
  static dataT read(bool &success_code);
  static void write(dataT data, bool &success_code);
};

The following table describes the template parameters:

Template Parameters

| Parameter | Description |
|---|---|
| name | The type that is the basis of a pipe identification. It is typically a user-defined class, in a user namespace. Forward declaration of the type is enough, and the type need not be defined. |
| dataT | The type of data packet contained within a pipe. This is the data type that is read during a successful pipe read() operation, or written during a successful pipe write() operation. The type must have a standard layout and be trivially copyable. |
| min_capacity | User-defined minimum number of words (in units of dataT) that the pipe must be able to store without any being read out. The compiler may create a pipe with a larger capacity due to performance considerations. |
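The effect of the template parameters can be checked on the host with standard C++ type traits. The following is a minimal sketch, not the Intel implementation: model_pipe, is_valid_pipe_payload, Packet, and Polymorphic are hypothetical names introduced only for illustration. It shows that two aliases name the same type only when every template argument matches, and that the dataT requirements (standard layout, trivially copyable) can be tested at compile time.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in with the same template parameter list as the pipe
// class. It is NOT ext::intel::pipe; it only models type identity.
template <class name, class dataT, std::size_t min_capacity = 0>
struct model_pipe {};

// Compile-time check of the dataT requirements stated in the table.
template <class T>
struct is_valid_pipe_payload
    : std::integral_constant<bool, std::is_standard_layout<T>::value &&
                                       std::is_trivially_copyable<T>::value> {};

// Forward declarations are enough for the name parameter.
class some_pipe;
class other_pipe;

using pipe_a = model_pipe<some_pipe, int, 8>;
using pipe_b = model_pipe<some_pipe, int, 8>;   // same arguments: same type
using pipe_c = model_pipe<some_pipe, int, 4>;   // different min_capacity
using pipe_d = model_pipe<other_pipe, int, 8>;  // different name

static_assert(std::is_same<pipe_a, pipe_b>::value, "identical arguments");
static_assert(!std::is_same<pipe_a, pipe_c>::value, "capacity differs");
static_assert(!std::is_same<pipe_a, pipe_d>::value, "name differs");

// A plain struct satisfies both dataT requirements.
struct Packet {
  int id;
  float value;
};

// A type with a virtual function satisfies neither requirement.
struct Polymorphic {
  virtual ~Polymorphic() = default;
  int id;
};

static_assert(is_valid_pipe_payload<Packet>::value, "usable as dataT");
static_assert(!is_valid_pipe_payload<Polymorphic>::value, "not usable");
```

Because the checks are static_assert declarations, a mismatch shows up as a compile error rather than as two kernels silently communicating over different pipes.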

The pipe class exposes static methods for writing a data word to a pipe and reading a data word from a pipe. Reads and writes are blocking or non-blocking depending on the overload you call: the read() and write() overloads that take a bool &success_code argument are non-blocking, and the overloads without it are blocking.

NOTE:

A data word in this context is the data type that the pipe contains (dataT pipe template argument).

Example Code Using Blocking Inter-Kernel Pipes

When writing code with SYCL* pipes, declare each pipe type once with a C++ type alias (using) and refer to the pipe only through that alias. Spelling out the pipe type at each use risks slightly different template arguments, which inadvertently creates separate pipes. The following code sample shows how to use pipes with blocking accessors to transfer data between two kernels:

#include <sycl/sycl.hpp>
#include <array>
using namespace sycl;
constexpr int N = 3;
// Specialize a pipe type
using my_pipe = ext::intel::pipe<class some_pipe, int, 8>;
void producer(const std::array<int, N> &src) {
  queue q;
  // Launch the producer kernel
  buffer<int> src_buf = {std::begin(src), std::end(src)};
  q.submit([&](handler &cgh) {
    // Get read access to src array
    accessor rd_src_buf(src_buf, cgh, read_only);
    cgh.single_task<class producer>([=]() {
      for (int i = 0; i < N; i++) {
        // Blocking write an int to the pipe
        my_pipe::write(rd_src_buf[i]);
      }
    });
  });
}
void consumer(std::array<int, N> &dst) {
  queue q;
  // Launch the consumer kernel
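  // Construct the buffer from a host pointer so that the results are copied
  // back to dst when dst_buf is destroyed (a buffer constructed from an
  // iterator pair is not written back)
  buffer<int> dst_buf(dst.data(), range<1>(N));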
  q.submit([&](handler &cgh) {
    // Get write access to dst array
    accessor wr_dst_buf(dst_buf, cgh, write_only); 
    cgh.single_task<class consumer>([=]() {
      for (int i = 0; i < N; i++) {
        // Blocking read an int from the pipe
        wr_dst_buf[i] = my_pipe::read();
      }
    });
  });
}

The pipe data packet is of type int, and the pipe has a minimum capacity of 8 words, as specified by the template parameters of the my_pipe type. The pipe read() call blocks only when the pipe is empty, and the pipe write() call blocks only when the pipe is full.
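The blocking rule can be illustrated with a small cycle-stepped model in standard C++. This is a sketch under stated assumptions, not a description of the generated hardware: the function simulate_blocking, the StallCounts struct, the one-operation-per-cycle timing, and the start delays are all hypothetical, and the model uses the requested capacity exactly even though the compiler may build a deeper pipe.

```cpp
#include <cassert>
#include <cstddef>

// Cycles each side spent blocked in the model.
struct StallCounts {
  std::size_t producer_stalls;  // blocking write waited on a full pipe
  std::size_t consumer_stalls;  // blocking read waited on an empty pipe
};

// Hypothetical model: each kernel attempts one pipe operation per cycle once
// it has started. The producer is evaluated before the consumer in a cycle.
inline StallCounts simulate_blocking(std::size_t items, std::size_t capacity,
                                     std::size_t producer_delay,
                                     std::size_t consumer_delay) {
  assert(capacity > 0 && "the model needs at least one slot");
  StallCounts stalls{0, 0};
  std::size_t written = 0, read = 0, occupancy = 0;
  for (std::size_t cycle = 0; read < items; ++cycle) {
    // Producer: a blocking write proceeds only if a slot is free.
    if (cycle >= producer_delay && written < items) {
      if (occupancy < capacity) {
        ++occupancy;
        ++written;
      } else {
        ++stalls.producer_stalls;
      }
    }
    // Consumer: a blocking read proceeds only if the pipe holds data.
    if (cycle >= consumer_delay) {
      if (occupancy > 0) {
        --occupancy;
        ++read;
      } else {
        ++stalls.consumer_stalls;
      }
    }
  }
  return stalls;
}
```

In this model, the values from the sample above (3 items, capacity 8) never stall the producer, however late the consumer starts, because all items fit in the pipe. A call such as simulate_blocking(12, 8, 0, 10) does report producer stalls, because the pipe fills before the consumer begins to read.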

NOTE:

The SYCL specification does not guarantee concurrent kernel execution. However, the Intel® oneAPI DPC++/C++ Compiler supports concurrent execution of kernels. You can modify your host application and kernel program to take advantage of this capability, which can increase the throughput of your application.

Example Code Using Non-Blocking Inter-Kernel Pipes

The code samples (Sample 1 and Sample 2) in this section illustrate how to use pipes with non-blocking writes and reads to transfer data between two concurrently running kernels:

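//Sample 1
// The Producer kernel reads data from a SYCL buffer and writes it to
// a pipe. This transfers the input data from the host to the Consumer kernel
// that is running concurrently.
// ProducerTutorial (the kernel name) and ProducerToConsumerPipe (the pipe
// alias) are assumed to be declared elsewhere in the sample.
event Producer(queue &q, buffer<int, 1> &input_buffer) {
  std::cout << "Enqueuing producer...\n";

  auto e = q.submit([&](handler &h) {
    accessor input_accessor(input_buffer, h, read_only);
    size_t num_elements = input_buffer.size();

    h.single_task<ProducerTutorial>([=]() {
      for (size_t i = 0; i < num_elements; ++i) {
        // Non-blocking write: valid is set to false if the pipe is full,
        // so retry until the write succeeds and no element is dropped
        bool valid = false;
        do {
          ProducerToConsumerPipe::write(input_accessor[i], valid);
        } while (!valid);
      }
    });
  });

  return e;
}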

In Sample 2, both pipes carry data packets of type int. pipe1 and pipe2 are distinct pipes because their first template parameter differs. The non-blocking pipe write() and read() calls never stall the kernel. Instead, each call sets its boolean success_code argument to indicate whether the data was written to the pipe (that is, the pipe was not full) or read from the pipe (that is, the pipe was not empty).
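The success flag semantics can be tried out on the host with a small model. The sketch below is for illustration only and is not the ext::intel::pipe implementation: ModelPipe, demo_pipe_id, DemoPipe, and demo_success_codes are hypothetical names, and the model uses an exact capacity, whereas a real pipe may be deeper than min_capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Host-side model that imitates the shape of the non-blocking calls.
template <class Name, class DataT, std::size_t Capacity>
class ModelPipe {
public:
  // Non-blocking write: never waits. success_code is false if the FIFO is
  // full, and the data is then not stored.
  static void write(const DataT &data, bool &success_code) {
    success_code = fifo().size() < Capacity;
    if (success_code) fifo().push_back(data);
  }
  // Non-blocking read: never waits. success_code is false if the FIFO is
  // empty, in which case the returned value must be ignored.
  static DataT read(bool &success_code) {
    success_code = !fifo().empty();
    if (!success_code) return DataT{};
    DataT data = fifo().front();
    fifo().pop_front();
    return data;
  }

private:
  // One FIFO per template specialization, like one pipe per pipe type.
  static std::deque<DataT> &fifo() {
    static std::deque<DataT> storage;
    return storage;
  }
};

class demo_pipe_id;  // hypothetical pipe name; a forward declaration suffices
using DemoPipe = ModelPipe<demo_pipe_id, int, 2>;

// Walks through the success flag on a two-slot model pipe.
inline void demo_success_codes() {
  bool ok = false;
  DemoPipe::write(1, ok);
  assert(ok);   // first slot used
  DemoPipe::write(2, ok);
  assert(ok);   // second slot used
  DemoPipe::write(3, ok);
  assert(!ok);  // full: 3 is not stored
  int first = DemoPipe::read(ok);
  assert(ok && first == 1);
  int second = DemoPipe::read(ok);
  assert(ok && second == 2);
  DemoPipe::read(ok);
  assert(!ok);  // empty: nothing to read
}
```

Calling demo_success_codes() leaves the model pipe empty. The pattern to carry over to kernel code is that the result of a non-blocking call is meaningful only after checking the flag.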

Use non-blocking pipe writes in applications where a write to a full FIFO buffer should not stall the kernel until a slot becomes free. Consider an application with one data producer and two identical workers that consume the data. Assume the time each worker takes to process a message varies with the contents of the data. In this case, one worker might be busy while the other is free. With non-blocking writes, the producer can offer a message to one worker and, if that worker's pipe is full, pass the message to the other worker instead, so that both workers stay busy.

Similarly, use non-blocking reads in applications where data is not always available and other operations need not wait for the data to become available.
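The work distribution scenario can be sketched on the host as follows. This is a simplified model under stated assumptions rather than kernel code: WorkerFifo, try_write, and dispatch are hypothetical names, each worker is represented only by its input FIFO, and the producer always prefers worker 0.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical host-side stand-in for the pipe that feeds one worker.
struct WorkerFifo {
  std::size_t capacity;
  std::deque<int> slots;

  // Mirrors a non-blocking write: returns false instead of waiting when the
  // FIFO is full.
  bool try_write(int message) {
    assert(capacity > 0 && "the model needs at least one slot");
    if (slots.size() >= capacity) return false;
    slots.push_back(message);
    return true;
  }
};

// Offers the message to worker 0 first and falls back to worker 1.
// Returns the index of the worker that accepted the message, or -1 if both
// FIFOs were full, in which case the caller can retry later.
inline int dispatch(WorkerFifo &worker0, WorkerFifo &worker1, int message) {
  if (worker0.try_write(message)) return 0;
  if (worker1.try_write(message)) return 1;
  return -1;
}
```

With a blocking write, the producer would wait on worker 0 while worker 1 sat idle. In the model, the fallback keeps the producer moving, and the -1 result leaves the retry policy to the caller.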

NOTE:

You can mix blocking and non-blocking accessors for writing or reading data to or from pipes. For example, you can write data to a pipe using a blocking pipe write() call and read it from the other end using a non-blocking pipe read() call, and vice versa.

//Sample 2
#include <sycl/sycl.hpp>
#include <array>
using namespace sycl;
constexpr size_t N = 16;
// Specialize the two pipe types, differentiated based on their first template
// parameter
using pipe1 = ext::intel::pipe<class some_pipe, int>;
using pipe2 = ext::intel::pipe<class other_pipe, int>;
// The producer kernels and the user-defined process() function are not shown
void consumer(std::array<int, N> &dst) {
  queue q;
  // Launch the consumer kernel. The buffer is constructed from a host pointer
  // so that the results are copied back to dst when dst_buf is destroyed.
  buffer<int> dst_buf(dst.data(), range<1>(N));
  q.submit([&](handler &cgh) {
    // Get write access to dst array
    accessor wr_dst_buf(dst_buf, cgh, write_only);
    cgh.single_task<class consumer>([=]() {
      size_t i = 0;
      while (i < N) {
        bool valid0 = false, valid1 = false;
        // Non-blocking reads: each flag reports whether data was available
        auto data0 = pipe1::read(valid0);
        auto data1 = pipe2::read(valid1);
        if (valid0) {
          wr_dst_buf[i++] = process(data0);
        }
        // Guard against writing past the end of dst when both reads succeed
        // on the final iteration
        if (valid1 && i < N) {
          wr_dst_buf[i++] = process(data1);
        }
      }
    });
  });
}
NOTE:

For additional information, refer to the FPGA tutorial samples "Pipe Array" and "Pipes" listed in the Intel® oneAPI Samples Browser on Linux* or Windows*, or access the code samples on GitHub.