Task-Parallel Programming Model Hints
The task-parallel programming model is a general-purpose model that enables you to express parallelism by enqueuing multiple tasks. You can apply this model in the following scenarios:
- Performing different tasks concurrently in multiple threads. In this scenario, choose a task granularity that is sufficient for optimal load balancing.
- Adding an extra queue (besides the conventional data-parallel pipeline) for tasks that occur less frequently and asynchronously, for example, scheduled events.
If your tasks are independent, consider using an out-of-order queue, as shown in the sketch below.
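The following host-side sketch illustrates both points: it creates an out-of-order command queue and enqueues two independent single work-item tasks, so the runtime is free to execute them concurrently. The kernel names kernel_a and kernel_b and the surrounding setup are illustrative assumptions, not part of this guide; error handling is reduced for brevity.

```c
/* Minimal sketch: enqueue two independent tasks into an out-of-order queue.
   Assumes the context, device, and both kernels were created elsewhere. */
#define CL_TARGET_OPENCL_VERSION 120
#include <CL/cl.h>

void enqueue_independent_tasks(cl_context context, cl_device_id device,
                               cl_kernel kernel_a, cl_kernel kernel_b)
{
    cl_int err = CL_SUCCESS;

    /* Out-of-order queue: commands without dependencies may overlap. */
    cl_command_queue queue = clCreateCommandQueue(
        context, device, CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, &err);

    cl_event events[2];

    /* Each task is a single work-item kernel (task-parallel model). */
    clEnqueueTask(queue, kernel_a, 0, NULL, &events[0]);
    clEnqueueTask(queue, kernel_b, 0, NULL, &events[1]);

    /* Wait for both independent tasks to complete. */
    clWaitForEvents(2, events);

    clReleaseEvent(events[0]);
    clReleaseEvent(events[1]);
    clReleaseCommandQueue(queue);
}
```

If one task must observe the results of another, pass the earlier task's event in the event wait list of the later enqueue instead of relying on queue order, since an out-of-order queue makes no ordering guarantees on its own.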