Using Floating Point for Calculations

OpenCL™ Developer Guide for Intel® Processor Graphics

ID 773088

Date 3/20/2019

Version 2019.4

The Intel® Graphics device performs floating-point operations such as add, sub, and mul much faster than the same operations on the int type.

For example, consider the following code that performs calculations in type uint4:

__kernel void amp (__constant uchar4* src, __global uchar4* dst)
{
        …
        uint4 tempSrc = convert_uint4(src[offset]); // load one RGBA8 pixel
        // some processing
        uint4 value = (tempSrc.z + tempSrc.y + tempSrc.x);
        uint4 tempDst = value + (tempSrc - value) * nSaturation;
        // store
        dst[offset] = convert_uchar4(tempDst);
}

Below is its float4 equivalent:

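__kernel void amp (__constant uchar4* src, __global uchar4* dst)
{
        …
        // Convert to float4 on load: OpenCL C has no implicit uint4-to-float4 conversion
        float4 tempSrc = convert_float4(src[offset]); // load one RGBA8 pixel
        // some processing
        float4 value = (tempSrc.z + tempSrc.y + tempSrc.x);
        // mad takes three arguments of the same type, so fSaturation is assumed to be a float4
        float4 tempDst = mad(tempSrc - value, fSaturation, value);
        // store
        dst[offset] = convert_uchar4(tempDst);
}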
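
To check the output of a kernel like the float4 version above, it can help to have a scalar reference on the host. The following is a minimal sketch in plain C, not part of the original guide: the names amp_reference and to_u8_sat are hypothetical, the saturation factor is assumed to be a single float applied to all four channels, and the clamp to [0, 255] is an assumption of this sketch (the kernel uses the non-saturating convert_uchar4).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp to [0, 255] and truncate toward zero. The clamp is an assumption of
 * this sketch; the kernel uses the non-saturating convert_uchar4. */
static uint8_t to_u8_sat(float v)
{
    if (!(v > 0.0f)) return 0;      /* also catches NaN */
    if (v >= 255.0f) return 255;
    return (uint8_t)v;
}

/* Hypothetical host-side reference for one RGBA8 pixel.
 * Channels are stored in x, y, z, w order. */
void amp_reference(const uint8_t src[4], float saturation, uint8_t dst[4])
{
    assert(src != NULL && dst != NULL);

    /* Same "some processing" step as the kernel: z + y + x. */
    float value = (float)src[2] + (float)src[1] + (float)src[0];

    for (int i = 0; i < 4; ++i) {
        /* Same shape as mad(tempSrc - value, fSaturation, value):
         * a multiply followed by an add. */
        dst[i] = to_u8_sat(((float)src[i] - value) * saturation + value);
    }
}
```

With a saturation factor of 1.0 the expression reduces to the source channel, so the pixel passes through unchanged; with 0.0 every channel collapses to the summed value.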

Intel® Advanced Vector Extensions (Intel® AVX) support, where available, accelerates floating-point calculations on modern CPUs, so floating-point data types are preferable for the CPU OpenCL device as well.

NOTE:

The compiler can automatically fuse multiplies and additions. Use the -cl-mad-enable compiler flag to enable this optimization when compiling for either Intel® Graphics or CPU devices. However, explicit use of the mad built-in ensures that the operation is mapped directly to the efficient instruction.
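
Build options such as -cl-mad-enable are passed as a string through the options argument of clBuildProgram. The following is a minimal host-side sketch in plain C of assembling that string; the helper name make_build_options is hypothetical, and the clBuildProgram call appears only in a comment (with placeholder program and device variables) so that the snippet does not depend on the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes the OpenCL build options into buf.
 * Returns 0 on success, -1 if the buffer is too small. */
int make_build_options(char *buf, size_t size, int enable_mad)
{
    assert(buf != NULL && size > 0);
    int n = snprintf(buf, size, "%s", enable_mad ? "-cl-mad-enable" : "");
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Intended use with the OpenCL host API (not compiled here; program and
 * device are placeholders for objects created elsewhere):
 *
 *   char options[64];
 *   if (make_build_options(options, sizeof options, 1) == 0)
 *       clBuildProgram(program, 1, &device, options, NULL, NULL);
 */
```

The flag only permits the compiler to fuse a * b + c; kernels that call mad explicitly do not rely on it.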
