Intel® FPGA SDK for OpenCL™ Pro Edition: Programming Guide

ID 683846
Date 3/28/2022
Public


A.1.1. OpenCL 1.0 C Programming Language Implementation

OpenCL™ is based on C99 with some limitations. Section 6 of the OpenCL Specification version 1.0 describes the OpenCL C programming language. The Intel® FPGA SDK for OpenCL™ conforms to the OpenCL C programming language, with clarifications and exceptions. The table below summarizes the support status of features in the OpenCL programming language implementation. Features that are supported with no additional clarifications are not shown.

Support Status column legend:

Symbol Description
The feature is supported, and there might be a clarification for the supported feature in the Notes column
The feature is supported with exceptions identified in the Notes column.
X The feature is not supported.
Section Feature Support Status Notes
6.1.1 Built-in Scalar Data Types
double precision float Preliminary support for all double precision float built-in scalar data types. This feature might not conform with the OpenCL Specification version 1.0.

Currently, the following double precision floating-point functions are expected to conform with the OpenCL Specification version 1.0:

add / subtract / multiply / divide / ceil / floor / rint / trunc / fabs / fmax / fmin / sqrt / rsqrt / exp / exp2 / exp10 / log / log2 / log10 / sin / cos / asin / acos / sinh / cosh / tanh / asinh / acosh / atanh / pow / pown / powr / atan / atan2 / ldexp / log1p / sincos
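
As a rough illustration of a kernel that stays within the operations listed above, the following sketch computes a dot product using only double precision multiply and add. The kernel name dot_fp64, the length LEN, and the host-build shim are assumptions made for this example. The cl_khr_fp64 pragma is the standard OpenCL mechanism for enabling the optional double type; this table does not state whether the offline compiler requires it.

```c
#ifdef __OPENCL_VERSION__
/* Standard OpenCL way to enable the optional double type. */
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#else
/* Host-build shim (an assumption of this sketch, not part of the SDK):
   lets the kernel compile as plain C for a quick functional check. */
#define __kernel
#define __global
#endif

#define LEN 2  /* hypothetical vector length */

/* Single work-item kernel: dot product of two LEN-element double vectors.
   Uses only multiply and add. */
__kernel void dot_fp64(__global const double *a,
                       __global const double *b,
                       __global double *result)
{
    double acc = 0.0;

    for (int i = 0; i < LEN; i++) {
        acc += a[i] * b[i];
    }

    *result = acc;
}
```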

half precision float Support for scalar addition, subtraction and multiplication. Support for conversions to and from single-precision floating point. This feature might not conform with the OpenCL Specification version 1.0.

This feature is supported in the Emulator.

6.1.2 Built-in Vector Data Types

Preliminary support for vectors with three elements. Three-element vector support is a supplement to the OpenCL Specification version 1.0.

6.1.3 Other Built-in Data Types The SDK does not support image or sampler types because the SDK does not support images.
6.2.1 Implicit Conversions Refer to Section 6.2.6: Usual Arithmetic Conversions in the OpenCL Specification version 1.2 for an important clarification of implicit conversions between scalar and vector types.
6.2.2 Explicit Casts The SDK allows scalar data casts to a vector with a different element type.
6.5 Address Space Qualifiers Function-scope __constant variables are not supported.
6.6 Image Access Qualifiers X The SDK does not support images.
6.7 Function Qualifiers
6.7.2 Optional Attribute Qualifiers Refer to the Intel® FPGA SDK for OpenCL™ Best Practices Guide for tips on using reqd_work_group_size to improve kernel performance.

The SDK parses but ignores the vec_type_hint and work_group_size_hint attribute qualifiers.

6.9 Preprocessor Directives and Macros
#pragma directive: #pragma unroll The Intel® FPGA SDK for OpenCL™ Offline Compiler supports only #pragma unroll. You may assign an integer argument to the unroll directive to control the extent of loop unrolling.

For example, #pragma unroll 4 unrolls four iterations of a loop.

By default, an unroll directive with no unroll factor causes the offline compiler to attempt to unroll the loop fully.

Refer to the Intel® FPGA SDK for OpenCL™ Best Practices Guide for tips on using #pragma unroll to improve kernel performance.
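
The unroll factor described above can be sketched as follows. The kernel name sum_unrolled, the trip count N, and the host-build shim are assumptions made for this example and do not come from the SDK. A host C compiler may ignore #pragma unroll or warn about it, so the host build checks only the result, not the unrolling.

```c
/* Host-build shim (an assumption of this sketch, not part of the SDK):
   lets the kernel compile as plain C for a quick functional check. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#endif

#define N 16  /* hypothetical fixed trip count */

/* Single work-item kernel: sums N integers. */
__kernel void sum_unrolled(__global const int *in, __global int *out)
{
    int acc = 0;

    /* Request an unroll factor of 4: the loop body is replicated
       four times, so the unrolled loop runs N / 4 trips. */
    #pragma unroll 4
    for (int i = 0; i < N; i++) {
        acc += in[i];
    }

    *out = acc;
}
```

Writing #pragma unroll with no factor in the same position would instead ask the offline compiler to attempt a full unroll of the loop.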

__ENDIAN_LITTLE__ defined to be value 1 The target FPGA is little-endian.
__IMAGE_SUPPORT__ X __IMAGE_SUPPORT__ is undefined; the SDK does not support images.
6.10 Attribute Qualifiers—The offline compiler parses attribute qualifiers as follows:
6.10.3 Specifying Attributes of Variables: endian X
6.10.4 Specifying Attributes of Blocks and Control-Flow-Statements X
6.10.5 Extending Attribute Qualifiers The offline compiler can parse attributes on various syntactic structures. It reserves some attribute names for its own internal use.

Refer to the Intel® FPGA SDK for OpenCL™ Best Practices Guide for tips on how to optimize kernel performance using these kernel attributes.

6.11.2 Math Functions
built-in math functions Preliminary support for double precision floating-point built-in math functions that might not conform with the OpenCL Specification version 1.0.
built-in half_ and native_ math functions Preliminary support for double precision floating-point built-in half_ and native_ math functions that might not conform with the OpenCL Specification version 1.0.
6.11.5 Geometric Functions Preliminary support for double precision floating-point built-in geometric functions. These functions might not conform with the OpenCL Specification version 1.0.

Refer to Argument Types for Built-in Geometric Functions for a list of built-in geometric functions supported by the SDK.

6.11.8 Image Read and Write Functions X The SDK does not support images.
6.11.9 Synchronization Functions—the barrier synchronization function Clarifications and exceptions:
  • If a kernel specifies the reqd_work_group_size or max_work_group_size attribute, barrier supports the corresponding number of work-items.
  • If neither attribute is specified, a barrier is instantiated with a default limit of 256 work-items.

The work-item limit is the maximum supported work-group size for the kernel; this limit is enforced by the runtime.

6.11.11 Async Copies from Global to Local Memory, Local to Global Memory, and Prefetch The implementation is naive:

Work-item (0,0,0) performs the copy, and wait_group_events is implemented as a barrier.

  • If a kernel specifies the reqd_work_group_size or max_work_group_size attribute, wait_group_events supports the corresponding number of work-items.
  • If neither attribute is specified, wait_group_events is instantiated with a default limit of 256 work-items.
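
The naive copy scheme described above can be modeled in plain C as follows. This is a minimal host-side sketch of the stated behavior, not the SDK's implementation; the helper names is_first_work_item and naive_group_copy are hypothetical. Each call stands for one work-item reaching the copy, and the barrier that follows (the role played by wait_group_events) is not modeled.

```c
#include <stddef.h>

/* Returns 1 only for the work-item with local ID (0,0,0). */
int is_first_work_item(const size_t lid[3])
{
    return lid[0] == 0 && lid[1] == 0 && lid[2] == 0;
}

/* Models one work-item's part in the group copy: work-item (0,0,0)
   moves all n elements, every other work-item moves nothing.
   Returns the number of elements this work-item copied. */
size_t naive_group_copy(int *dst, const int *src, size_t n,
                        const size_t lid[3])
{
    if (!is_first_work_item(lid)) {
        return 0;
    }

    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
    }

    return n;
}
```

Because a single work-item moves every element in this model, the copy itself gains nothing from a larger work-group.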