Developer Guide

FPGA Optimization Guide for Intel® oneAPI Toolkits

ID 767853
Date 7/13/2023
Public

Host Pipe API

Host pipes expose read and write interfaces that each transfer a single element to or from the pipe in FIFO order. These interfaces are static class methods on the templated classes described later in this section and in Host Pipe Declaration.
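To make the "static class methods on a templated class" idea concrete, the following is a minimal host-only sketch in standard C++. It is not the oneAPI host pipe class: ModelPipe, PipeAId, PipeBId, and fifo_order_holds are hypothetical names, and the model asserts on an empty read instead of blocking. It only illustrates that each instantiation acts as its own pipe and that elements come out in the order they went in.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical host-only model of a pipe; NOT the oneAPI host pipe class.
// The Id tag type makes each instantiation a distinct pipe with its own storage.
template <typename Id, typename T>
class ModelPipe {
 public:
  // Append one element at the tail of the FIFO.
  static void write(const T& value) { storage().push_back(value); }

  // Remove and return the oldest element.
  // This model asserts on an empty pipe instead of blocking.
  static T read() {
    assert(!storage().empty() && "model cannot block; pipe must hold data");
    T value = storage().front();
    storage().pop_front();
    return value;
  }

  static std::size_t size() { return storage().size(); }

 private:
  // One queue per (Id, T) instantiation, shared by all static calls.
  static std::deque<T>& storage() {
    static std::deque<T> fifo;
    return fifo;
  }
};

// Tag types that name two independent pipes (hypothetical names).
struct PipeAId;
struct PipeBId;
using PipeA = ModelPipe<PipeAId, int>;
using PipeB = ModelPipe<PipeBId, int>;

// Writes 1, 2, 3 to PipeA and returns true if they are read back in the
// same order while PipeB stays empty.
bool fifo_order_holds() {
  PipeA::write(1);
  PipeA::write(2);
  PipeA::write(3);
  bool b_untouched = (PipeB::size() == 0);
  bool ordered = PipeA::read() == 1 && PipeA::read() == 2 && PipeA::read() == 3;
  return ordered && b_untouched;
}
```

Because the methods are static, no pipe object is passed around: the type alias alone identifies which pipe a call targets.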

Blocking Write

The host pipe write interface writes a single element of the given data type (int in the examples that follow) to the host pipe. On the host side, this class method accepts a reference to a SYCL* device queue as its first argument and the element being written as its second argument.

queue q(...);
...
int data_element = ...;

// blocking write from host to pipe
MyPipeInstance::write(q, data_element);
...

In the kernel, writes to a host pipe accept a single argument, which is the element being written.

float data_element = ...;

// blocking write from device to pipe
AnotherPipeInstance::write(data_element);

Non-blocking Write

Non-blocking writes take an additional bool argument, passed by reference, in both the host and device APIs. The call sets this argument to true if the write succeeded and to false if it did not.

On the host:

queue q(...);
...
int data_element = ...;

// variable to hold write success or failure
bool success = false;

// attempt non-blocking write from host to pipe until successful
while (!success) MyPipeInstance::write(q, data_element, success);

On the device:

float data_element = ...;
            
// variable to hold write success or failure
bool success = false;

// attempt non-blocking write from device to pipe until successful
while (!success) AnotherPipeInstance::write(data_element, success);
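The success flag can also drive a bounded retry instead of an unbounded loop. The following sketch models that pattern in standard C++ under stated assumptions: BoundedModelPipe and write_with_retry are hypothetical names, the fixed capacity is an assumption of the model, and none of this is the oneAPI host pipe API.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical bounded FIFO used only to model a pipe that can be full.
// This is NOT the oneAPI host pipe class.
template <typename T, std::size_t Capacity>
class BoundedModelPipe {
 public:
  // Non-blocking write: stores the value and reports true, or leaves the
  // FIFO unchanged and reports false when it is already full.
  void write(const T& value, bool& success) {
    success = fifo_.size() < Capacity;
    if (success) fifo_.push_back(value);
  }

  std::size_t size() const { return fifo_.size(); }

 private:
  std::deque<T> fifo_;
};

// Retries a non-blocking write up to max_attempts times (hypothetical helper).
// Returns true once a write is accepted, false if every attempt failed.
template <typename Pipe, typename T>
bool write_with_retry(Pipe& pipe, const T& value, int max_attempts) {
  assert(max_attempts > 0);
  bool success = false;
  for (int attempt = 0; attempt < max_attempts && !success; ++attempt) {
    pipe.write(value, success);
  }
  return success;
}
```

A loop of the form while (!success) behaves like a blocking write. Bounding the number of attempts lets the caller do other work when the pipe stays full.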

Blocking Read

The host pipe read interface reads a single element of a given data type from the host pipe. Like the write interface, the host read interface takes a SYCL* device queue as its argument. The device read interface is the read class method called with no arguments.

On the host:

// blocking read in host code
float read_element = AnotherPipeInstance::read(q);

On the device:

// blocking read in device code
int read_element = FirstPipeInstance::read();

Non-blocking Read

Like non-blocking writes, non-blocking reads take an additional bool argument, passed by reference, in both the host and device APIs. The call sets this argument to true if the read succeeded and to false if it did not.

On the host:

// variable to hold read success or failure
bool success = false;

// attempt non-blocking read until successful in host code
float read_element;
while (!success) read_element = SecondPipeInstance::read(q, success);

On the device:

// variable to hold read success or failure
bool success = false;

// attempt non-blocking read until successful in device code
int read_element;
while (!success) read_element = FirstPipeInstance::read(success);
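A non-blocking read still returns a value when it fails, so check the success flag before you use the result. The following standard C++ sketch models that rule. The names try_read and drain_available are hypothetical, a std::deque stands in for the pipe, and returning a default-constructed value on failure is an assumption of this model rather than documented oneAPI behavior.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical stand-in for a non-blocking pipe read; NOT the oneAPI API.
// Returns the oldest element with success == true, or a default-constructed
// T with success == false when the FIFO is empty.
template <typename T>
T try_read(std::deque<T>& fifo, bool& success) {
  success = !fifo.empty();
  if (!success) return T{};
  T value = fifo.front();
  fifo.pop_front();
  return value;
}

// Collects every element that is currently available, in FIFO order,
// without waiting for more data to arrive.
template <typename T>
std::vector<T> drain_available(std::deque<T>& fifo) {
  std::vector<T> out;
  bool success = true;
  while (true) {
    T value = try_read(fifo, success);
    if (!success) break;  // the returned value is meaningless on failure
    out.push_back(value);
  }
  assert(fifo.empty());
  return out;
}
```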

Host Pipe Connections

Host pipe connections for a particular host pipe are inferred by the compiler from the presence of read and write calls to that host pipe in your code.

A host pipe can be connected only between the host and a single kernel. You cannot call the same host pipe from different kernels.

Host pipes operate in only one direction: either host-to-kernel (only writes on the host, only reads on the device) or kernel-to-host (only writes on the device, only reads on the host).

For a given host pipe, the host code can contain only writes or only reads to that pipe, and the corresponding kernel code can contain only the opposite transaction.
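The connection rules above can be restated as a small check over the call sites that touch one host pipe. The following standard C++ sketch is only an illustration of the rules: Side, Op, Access, and connection_is_valid are hypothetical names, and the sketch does not describe how the compiler actually infers or validates connections.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

enum class Side { Host, Kernel };
enum class Op { Read, Write };

// One call site that touches a given host pipe (hypothetical record type).
struct Access {
  Side side;
  Op op;
  std::string kernel;  // kernel name; unused for host accesses
};

// Returns true if the accesses to ONE host pipe satisfy the rules in the text:
// a single kernel, a single direction, and opposite transactions on each side.
bool connection_is_valid(const std::vector<Access>& accesses) {
  std::set<std::string> kernels;
  bool host_reads = false, host_writes = false;
  bool kernel_reads = false, kernel_writes = false;

  for (const Access& a : accesses) {
    if (a.side == Side::Host) {
      if (a.op == Op::Read) host_reads = true; else host_writes = true;
    } else {
      assert(!a.kernel.empty() && "kernel accesses must name their kernel");
      kernels.insert(a.kernel);
      if (a.op == Op::Read) kernel_reads = true; else kernel_writes = true;
    }
  }

  if (kernels.size() > 1) return false;             // one kernel per host pipe
  if (host_reads && host_writes) return false;      // host: only reads or only writes
  if (kernel_reads && kernel_writes) return false;  // kernel: only reads or only writes
  if (host_writes && kernel_writes) return false;   // sides must do opposite transactions
  if (host_reads && kernel_reads) return false;
  return true;
}
```

For example, a host write paired with a read in one kernel passes the check, while adding a read from a second kernel or a read on the host makes it fail.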

Kernel Invocation Order and Host Read/Writes

When you create a testbench for a oneAPI kernel that you intend to compile as an IP component, write all your data to the host pipe before invoking the kernel.

This order of operation helps ensure that the simulation gives you the most accurate estimate of your kernel performance.

By buffering all the writes to the pipe, the host can supply new data to the kernel every clock cycle when it begins running.

The following code example shows this recommended method. This method ensures that the host keeps up with the kernel in the simulation flow, and thus gives a more accurate representation of the IP component performance.

queue q(...);
...
int data_element = ...;
// Host performs all writes to the host pipe
for (int i = 0; i < ITERATIONS; i++) {
	MyPipeInstance::write(q, data_element);
}
// Invoke the kernel which reads from the host pipe
q.single_task<class MyKernel>([=]() {
	...
	for (int i = 0; i < ITERATIONS; i++) {
		MyPipeInstance::read();
	}
	...
});

The following code example might cause the kernel to stall while it waits for the host to supply data, because the host produces data more slowly than the kernel can consume it. As a result, the simulation might not be representative of the maximum performance of the kernel.

queue q(...);
...
int data_element = ...;
// Invoke the kernel which reads from the host pipe
q.single_task<class MyKernel>([=]() {
	...
	for (int i = 0; i < ITERATIONS; i++) {
		MyPipeInstance::read();
	}
	...
});
// Host performs all writes to the host pipe
for (int i = 0; i < ITERATIONS; i++) {
	MyPipeInstance::write(q, data_element);
}
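The effect of the two orderings can be illustrated with a toy cycle-count model in standard C++. This is a sketch under simplifying assumptions, not a simulation of real hardware or of the oneAPI simulation flow: count_stall_cycles is a hypothetical function, the consumer is assumed to want one element per cycle, and the host is assumed to deliver one element every host_period cycles.

```cpp
#include <cassert>
#include <cstddef>

// Counts the cycles in which a consumer that wants one element per cycle
// finds the FIFO empty (toy model, hypothetical function).
//   prefilled:   elements written before the consumer starts
//   total:       elements the consumer must read in all
//   host_period: cycles the modeled host needs per additional write
std::size_t count_stall_cycles(std::size_t prefilled, std::size_t total,
                               std::size_t host_period) {
  assert(host_period > 0);
  assert(prefilled <= total);
  std::size_t available = prefilled;  // elements waiting in the FIFO
  std::size_t produced = prefilled;   // elements written so far
  std::size_t consumed = 0;
  std::size_t stalls = 0;

  for (std::size_t cycle = 1; consumed < total; ++cycle) {
    // The host delivers one element every host_period cycles until done.
    if (produced < total && cycle % host_period == 0) {
      ++available;
      ++produced;
    }
    // The consumer reads if data is present; otherwise it stalls this cycle.
    if (available > 0) {
      --available;
      ++consumed;
    } else {
      ++stalls;
    }
  }
  return stalls;
}
```

In this model, count_stall_cycles(8, 8, 4) returns 0 because all eight elements are buffered before the consumer starts, while count_stall_cycles(0, 8, 4) returns 24 because the consumer waits three cycles for each element.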