Intel® FPGA SDK for OpenCL™ Standard Edition: Programming Guide

ID 683342
Date 4/22/2019
Public

5.4.5.8.1. Defining Memory Consistency Across Kernels When Using Channels

According to the OpenCL™ Specification version 1.0, memory behavior is undefined unless a kernel completes execution. A kernel must finish executing before other kernels can observe its changes to memory. However, kernels that use channels can share data through common global memory buffers and synchronized memory accesses. To ensure that data written to a channel is visible to the read channel after execution passes a memory fence, define memory consistency across kernels with respect to memory fences.
To create a control flow dependency between the channel synchronization calls and the memory operations, add the CLK_GLOBAL_MEM_FENCE flag to the mem_fence call.
For example:

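#pragma OPENCL EXTENSION cl_intel_channels : enable

// Channel declarations assumed for this example.
channel uint c0;
channel uint c1;

__kernel void producer( __global const uint * src,
                        const uint iterations )
{
    for (uint i = 0; i < iterations; i++)
    {
        write_channel_intel(c0, src[2*i]);
        // Order the c0 write and the src[2*i] load before the c1 write and the src[2*i+1] load.
        mem_fence(CLK_CHANNEL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);
        write_channel_intel(c1, src[2*i+1]);
    }
}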

In this kernel, the mem_fence function ensures that the write operation to c0 and memory access to src[2*i] occur before the write operation to c1 and memory access to src[2*i+1]. This allows data written to c0 to be visible to the read channel before data is written to c1.
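
The receiving side is not shown in this section. As a minimal sketch only, a counterpart kernel might mirror the producer's ordering with read_channel_intel calls. The kernel name consumer and the output buffer dst are hypothetical, and the channels are assumed to be the same file-scope c0 and c1 that the producer writes to. If both kernels are in the same .cl file, declare the channels and the extension pragma only once.

```c
#pragma OPENCL EXTENSION cl_intel_channels : enable

// Assumed to be the same file-scope channels that the producer writes to.
channel uint c0;
channel uint c1;

// Hypothetical counterpart to producer(); dst is an assumed output buffer
// that holds at least 2 * iterations elements.
__kernel void consumer( __global uint * dst,
                        const uint iterations )
{
    for (uint i = 0; i < iterations; i++)
    {
        // Blocking read of the value that the producer wrote to c0.
        dst[2*i] = read_channel_intel(c0);
        // Order the c0 read and the store to dst[2*i] before the
        // c1 read and the store to dst[2*i+1].
        mem_fence(CLK_CHANNEL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);
        dst[2*i+1] = read_channel_intel(c1);
    }
}
```

Under these assumptions, the fence in the consumer plays the same role as in the producer: the channel read from c0 and its associated global memory access are ordered before the channel read from c1.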