Vectorization Basics for Intel® Architecture Processors

OpenCL™ Developer Guide for Intel® Core™ and Intel® Xeon® Processors

ID 773005

Date 10/30/2018

Version 2018

Public

Intel® Architecture Processors provide performance acceleration using Single Instruction Multiple Data (SIMD) instruction sets, which include:

Intel® Streaming SIMD Extensions (Intel® SSE)
Intel® Advanced Vector Extensions (Intel® AVX) instructions
Intel® Advanced Vector Extensions 2 (Intel® AVX2) instructions
Intel® Advanced Vector Extensions 512 (Intel® AVX-512) instructions: Foundation, Conflict Detection, Doubleword and Quadword, Byte and Word, and Vector Length Extensions

By processing multiple data elements in a single instruction, these ISA extensions enable data parallelism.

With SIMD instructions, a vector register holds a group of data elements of the same type, such as float or char. The number of elements that fit in one register depends on the register width supported by the microarchitecture and on the width of the data type. For example, on a CPU with 512-bit vector registers, each vector (ZMM) register can hold sixteen float values, sixteen 32-bit integers, and so on.
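The relation between register width and element count can be sketched as a small C helper. This is only an illustration of the arithmetic described above; the function name lanes_per_register is hypothetical and is not part of any Intel or OpenCL API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of elements of a given width that fit in
   one vector register. Both widths are given in bits. */
size_t lanes_per_register(size_t register_bits, size_t element_bits)
{
    /* The element must have a nonzero width and fit in the register. */
    assert(element_bits > 0 && element_bits <= register_bits);
    return register_bits / element_bits;
}
```

For a 512-bit register, lanes_per_register(512, 32) gives 16 for float or 32-bit integer elements, and lanes_per_register(512, 8) gives 64 for char elements.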

With the Single Program Multiple Data (SPMD) technique, the Intel® OpenCL™ implementation can map work-items to the hardware in one of the following ways:

Scalar code, when work-items execute one by one
SIMD elements, when several work-items fit into one register to run simultaneously
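The two mappings can be illustrated with a plain C simulation. This is a conceptual sketch, not the way the Intel implementation generates code: kernel_body, run_scalar, run_packed, and SIMD_WIDTH are hypothetical names, and the inner lane loop merely stands in for what a single SIMD instruction would do across a register.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: sixteen 32-bit floats per 512-bit register. */
#define SIMD_WIDTH 16

/* Hypothetical kernel body: the work done by one work-item,
   identified by its global id. */
static float kernel_body(const float *a, const float *b, size_t gid)
{
    return a[gid] * 2.0f + b[gid];
}

/* First mapping: scalar code, work-items execute one by one. */
void run_scalar(const float *a, const float *b, float *out, size_t global_size)
{
    for (size_t gid = 0; gid < global_size; ++gid) {
        out[gid] = kernel_body(a, b, gid);
    }
}

/* Second mapping: SIMD_WIDTH consecutive work-items are grouped, one per
   register lane. The lane loop models lanes that real SIMD hardware
   would process in a single instruction. */
void run_packed(const float *a, const float *b, float *out, size_t global_size)
{
    /* Simplification: no remainder handling in this sketch. */
    assert(global_size % SIMD_WIDTH == 0);
    for (size_t base = 0; base < global_size; base += SIMD_WIDTH) {
        for (size_t lane = 0; lane < SIMD_WIDTH; ++lane) {
            out[base + lane] = kernel_body(a, b, base + lane);
        }
    }
}
```

Both functions produce identical results for the same inputs. The difference is only in how work-items are grouped for execution, which is the choice the implementation makes when mapping work-items to hardware.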

The Intel® SDK for OpenCL™ Applications contains an implicit vectorization module, which implements the second method by packing several work-items into SIMD elements. Depending on the kernel code, implicit vectorization might be subject to some limitations. If the vectorization module optimization is disabled, the Intel SDK for OpenCL Applications uses the first method and executes work-items as scalar code.
