Intel® FPGA SDK for OpenCL™ Pro Edition: Programming Guide

ID 683846
Date 3/28/2022
Public

5.4.4. Restrictions in the Implementation of Intel® FPGA SDK for OpenCL™ Channels Extension

Certain design restrictions apply to the implementation of channels in your OpenCL™ application.

Multiple Channel Call Site

A kernel can read from the same channel multiple times. However, multiple kernels cannot read from the same channel. Similarly, a kernel can write to the same channel multiple times but multiple kernels cannot write to the same channel.

The following kernel is valid because all three reads of channel1 come from a single kernel:

    __kernel void k1() {
        read_channel_intel(channel1);
        read_channel_intel(channel1);
        read_channel_intel(channel1);
    }

The Intel® FPGA SDK for OpenCL™ Offline Compiler cannot compile the following code, in which two kernels write to the same channel, and issues an error:

    __kernel void k1() {
        write_channel_intel(channel1, 1);
    }

    __kernel void k2() {
        write_channel_intel(channel1, 2);
    }
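
One possible way to stay within this restriction is to give each writing kernel its own channel and let a single kernel read from all of them. The following is a minimal sketch under that assumption, not a prescribed pattern. The channel names from_k1 and from_k2, the kernel name merge, and the out buffer are hypothetical and do not appear elsewhere in this guide.

```opencl
#pragma OPENCL EXTENSION cl_intel_channels : enable

// Hypothetical channels: one per writing kernel, so each channel
// has exactly one writer kernel and one reader kernel.
channel int from_k1;
channel int from_k2;

__kernel void k1() {
    write_channel_intel(from_k1, 1);
}

__kernel void k2() {
    write_channel_intel(from_k2, 2);
}

// Hypothetical consumer: the only kernel that reads either channel.
__kernel void merge(__global int *restrict out) {
    out[0] = read_channel_intel(from_k1);
    out[1] = read_channel_intel(from_k2);
}
```

Because every channel in this sketch has one writing kernel and one reading kernel, none of the call sites conflict with the restriction described above.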

Feedback and Feed-forward Channels

Performance of a kernel that has multiple accesses (reads or writes) to the same channel might be poor.

Static Indexing

The Intel® FPGA SDK for OpenCL™ channels extension supports dynamic indexing into arrays of channel IDs, but dynamic indexing leads to inefficient hardware.

Consider the following example:

    channel int ch[WORKGROUP_SIZE];

    __kernel void consumer() {
        int gid = get_global_id(0);
        int value = read_channel_intel(ch[gid]);
        // statements
    }

Compilation of this example generates the following warning message:

Compiler Warning: Dynamic access into channel array ch was expanded into predicated 
static accesses on every channel of the array.

If the access is dynamic and you know that only a subset of the channels in the array can be accessed, you can generate slightly more efficient hardware with a switch statement:

    channel int ch[WORKGROUP_SIZE];

    __kernel void consumer() {
        int gid = get_global_id(0);
        int value;

        switch (gid) {
            case 0: value = read_channel_intel(ch[0]); break;
            case 2: value = read_channel_intel(ch[2]); break;
            case 3: value = read_channel_intel(ch[3]); break;
            // statements
            case WORKGROUP_SIZE-1: value = read_channel_intel(ch[WORKGROUP_SIZE-1]); break;
        }
        // statements
    }
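
For contrast, the sketch below shows one way an access into a channel array can be static: the index comes from a loop with a compile-time trip count that is fully unrolled, so each read_channel_intel call site refers to a fixed element of the array. It assumes the offline compiler fully unrolls the loop. The names NUM_CH, consumer_static, and out are hypothetical. Note that the behavior differs from the examples above: a single work-item reads every channel once, rather than each work-item reading one channel.

```opencl
#pragma OPENCL EXTENSION cl_intel_channels : enable

#define NUM_CH 4   // hypothetical array size, known at compile time

channel int ch[NUM_CH];

// Hypothetical kernel, intended to run as a single work-item.
// After full unrolling, the body contains NUM_CH separate reads,
// each with a constant index: ch[0], ch[1], ch[2], ch[3].
__kernel void consumer_static(__global int *restrict out) {
    #pragma unroll
    for (int i = 0; i < NUM_CH; i++) {
        out[i] = read_channel_intel(ch[i]);
    }
}
```

Whether this restructuring suits a design depends on how the producing kernels write to the array, so treat it as an illustration of static indexing rather than a drop-in replacement.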

Kernel Vectorization Support

You cannot vectorize kernels that use channels; that is, do not include the num_simd_work_items kernel attribute in your kernel code. Vectorizing a kernel that uses channels creates multiple channel accesses inside the same kernel and requires arbitration, which negates the advantages of vectorization. As a result, the SDK's channel extension does not support kernel vectorization.

Instruction-Level Parallelism on read_channel_intel and write_channel_intel Calls

If no data dependencies exist between read_channel_intel and write_channel_intel calls, the offline compiler attempts to execute these instructions in parallel. As a result, the offline compiler might execute these read_channel_intel and write_channel_intel calls in an order that does not follow the sequence expressed in the OpenCL kernel code.

Consider the following code sequence:

    in_data1 = read_channel_intel(channel1);
    in_data2 = read_channel_intel(channel2);
    in_data3 = read_channel_intel(channel3);

Because there are no data dependencies between the read_channel_intel calls, the offline compiler can execute them in any order.
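
If a design depends on the channel calls occurring in program order, a memory fence between the calls is one way to request that ordering. The sketch below places mem_fence calls with the CLK_CHANNEL_MEM_FENCE flag between the three reads. It assumes your SDK version supports that flag for channel ordering, so confirm this against the documentation for your release. The kernel name ordered_reads and the out buffer are hypothetical.

```opencl
#pragma OPENCL EXTENSION cl_intel_channels : enable

channel int channel1;
channel int channel2;
channel int channel3;

// Hypothetical kernel: each fence asks the compiler to keep the
// channel call before it ahead of the channel call after it.
__kernel void ordered_reads(__global int *restrict out) {
    int in_data1 = read_channel_intel(channel1);
    mem_fence(CLK_CHANNEL_MEM_FENCE);

    int in_data2 = read_channel_intel(channel2);
    mem_fence(CLK_CHANNEL_MEM_FENCE);

    int in_data3 = read_channel_intel(channel3);

    out[0] = in_data1;
    out[1] = in_data2;
    out[2] = in_data3;
}
```

Enforcing the order removes the parallelism that the offline compiler would otherwise exploit, so add fences only where the order of the channel calls matters for correctness.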