Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 7/13/2023
Public

Processor Targeting

The manual processor dispatch feature allows you to target processors manually. You can control processor dispatching in a number of ways, including:

  • Use the cpu_specific and cpu_dispatch keywords (attributes on Linux* or __declspecs on Windows*) to write one or more versions of a function, each of which executes only on the specified types of Intel® processors. You can also write a generic version that executes on other Intel or non-Intel processors. The Intel processor type is detected at runtime, and the corresponding function version is executed. This feature is available only for Intel processors based on IA-32 or Intel® 64 architecture. It is not available for non-Intel processors. Applications built using the manual processor dispatch feature may be more highly optimized for Intel processors than for non-Intel processors.

    For more information, see Using cpu_dispatch for Manual Processor Dispatch Programming below.

  • Use the optimization_parameter pragma.

    For more information, see Using Pragmas to Target Processors Manually below.

  • On Linux, in addition to the Intel-defined attributes cpu_specific and cpu_dispatch, C++ compilations with GNU Compiler Collection (GCC*) compatibility set to version 4.8 or higher support creating multiple function versions with the target attribute.

    For more information, see the GCC documentation on Function Multiversioning.
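
As a rough illustration of the target attribute approach mentioned in the last item, the following sketch defines two versions of one function. It is not taken from the GCC or Intel documentation: the names selected_version and report_version are hypothetical, and the preprocessor guard assumes an x86-64 Linux system with glibc, since this form of multiversioning relies on platform support that may be missing elsewhere. On other systems the sketch falls back to a single plain function.

```cpp
#include <cstdio>

// Assumption: target-attribute multiversioning is only attempted on
// x86-64 Linux with glibc and a GCC-compatible compiler.
#if defined(__x86_64__) && defined(__linux__) && defined(__GLIBC__) && defined(__GNUC__)

// Fallback version, used when no more specific version matches the processor.
__attribute__((target("default")))
const char *selected_version() { return "default"; }

// Version compiled for, and selected on, processors that support AVX2.
__attribute__((target("avx2")))
const char *selected_version() { return "avx2"; }

#else

// Single plain version for platforms outside the assumption above.
const char *selected_version() { return "default"; }

#endif

// Callers use the function name as usual; the version is chosen at runtime.
void report_version() {
    std::printf("Running the %s version\n", selected_version());
}
```

Calling report_version() prints which version was selected on the machine running the program. Unlike cpu_dispatch, this approach needs no separate stub: the versions are distinguished only by their target attribute.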

Using cpu_dispatch for Manual Processor Dispatch Programming

Use the __declspec(cpu_dispatch(cpuid, cpuid,...)) syntax in your code to provide a list of targeted processors, followed by a function stub with an empty body. Use the __declspec(cpu_specific(cpuid)) syntax to declare each function version that targets a particular type, or types, of processor.

For a list of the values for cpuid, see the list in the cpu_dispatch, cpu_specific topic.

NOTE:
If no other matching Intel processor type is detected, the generic version of the function is executed. If you want the program to execute on non-Intel processors, a generic function version must be provided. You can control the degree of optimization of the generic function version and the processor features that it assumes.

The cpuid values are not case sensitive. The body of a function declared with __declspec(cpu_dispatch) must be empty; such a function is referred to as a stub.

The following example illustrates how the cpu_dispatch and cpu_specific keywords can be used to create function versions for the 2nd generation Intel® Core™ processor family with support for Intel® Advanced Vector Extensions (Intel® AVX), for the Intel® Core™ processor family, for the Intel® Core™2 Duo processor family, and for other Intel and compatible, non-Intel processors. Each processor-specific function body might contain processor-specific intrinsic functions, or it might be placed in a separate source file and compiled with a processor-specific compiler option.

Example

#include <stdio.h> 
// Specific function versions are created for the following processors:
__declspec(cpu_dispatch(core_2nd_gen_avx, core_i7_sse4_2, core_2_duo_ssse3, generic))
void dispatch_func() {}       // empty stub that calls the appropriate specific function version

__declspec(cpu_specific(core_2nd_gen_avx)) 
void dispatch_func() {  
  printf("\nCode for 2nd generation Intel Core processors with support for Intel AVX goes here\n"); 
} 

__declspec(cpu_specific(core_i7_sse4_2)) 
void dispatch_func() { 
  printf("\nCode for Intel Core processors with support for SSE4.2 goes here\n"); 
} 

__declspec(cpu_specific(core_2_duo_ssse3)) 
void dispatch_func() { 
  printf("\nCode for Intel Core 2 Duo processors with support for SSSE3 goes here\n"); 
} 

__declspec(cpu_specific(generic)) 
void dispatch_func() { 
  printf("\nCode for non-Intel processors and generic Intel processors goes here\n"); 
} 

int main() { 
  dispatch_func(); 
  printf("Return from dispatch_func\n"); 
  return 0;
}

Considerations

Before using manual dispatch, consider whether the benefits outweigh the additional effort and possible performance issues. You may encounter one or both of the following issues when using manual processor dispatch in your code:

  • Your code and executable sizes will increase.

  • Additional performance overhead may be introduced because of additional function calls.

Test your application on all targeted platforms before release.

Using Pragmas to Target Processors Manually

You can use #pragma intel optimization_parameter target_arch to flag the routines in your code that you want to execute on specified types of processors. This pragma controls the -m or /arch option at the routine level. It accepts the same values as the -m or /arch option and overrides the value specified on the command line. The following example illustrates how to use the pragma to target a routine bar() to execute only on processors that support Intel® AVX, regardless of what the command line specifies.

Example

#define N 1024

double x[N], y[N], z[N];
#pragma intel optimization_parameter target_arch=AVX

void bar() {
    int i;
    for (i = 0; i < N; i++) {
        z[i] = x[i] * y[i];
    }
} 

You can also use the _allow_cpu_features intrinsic to tell the compiler that a code region may be targeted at processors with the specified features, and the _may_i_use_cpu_feature intrinsic to query the processor dynamically at the source level to determine whether processor-specific features are available.

Example

#include <immintrin.h>
#define N 1024
 
double x[N], y[N], z[N];
 
void VectorMultiply(int allow_avx)
{
    int i;
    if (allow_avx) {
        _allow_cpu_features(_FEATURE_AVX);
        for (i = 0; i < N; i++) {
            z[i] = x[i] * y[i];
        }
    }
    else {
        for (i = 0; i < N; i++) {
            z[i] = x[i] * y[i];
        }
    }
} 
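
The VectorMultiply example above leaves open how the caller obtains the allow_avx value. One possible way, sketched here, is to query the processor at runtime and pass the result along. The helper names cpu_has_avx, multiply, and multiply_with_runtime_check are hypothetical, and multiply is a simplified variant of VectorMultiply that takes its arrays as arguments. When the code is not built with the Intel compiler, the sketch assumes the GCC/Clang builtin __builtin_cpu_supports as a stand-in for _may_i_use_cpu_feature and omits the _allow_cpu_features call.

```cpp
#include <cstddef>

#if defined(__INTEL_COMPILER)
#include <immintrin.h>
#endif

// Returns nonzero when the running processor reports Intel AVX support.
int cpu_has_avx() {
#if defined(__INTEL_COMPILER)
    // Intel compiler: use the intrinsic named in this section.
    return _may_i_use_cpu_feature(_FEATURE_AVX) != 0;
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    // Assumption: GCC/Clang builtin used as a stand-in on x86.
    return __builtin_cpu_supports("avx") != 0;
#else
    // Unknown compiler or architecture: report no AVX support.
    return 0;
#endif
}

// Hypothetical variant of VectorMultiply that takes its arrays as arguments.
void multiply(const double *x, const double *y, double *z,
              std::size_t n, int allow_avx) {
    if (allow_avx) {
#if defined(__INTEL_COMPILER)
        // Permit AVX code generation for this region only.
        _allow_cpu_features(_FEATURE_AVX);
#endif
        for (std::size_t i = 0; i < n; i++) {
            z[i] = x[i] * y[i];
        }
    } else {
        for (std::size_t i = 0; i < n; i++) {
            z[i] = x[i] * y[i];
        }
    }
}

// Queries the processor, then selects the code path accordingly.
void multiply_with_runtime_check(const double *x, const double *y,
                                 double *z, std::size_t n) {
    multiply(x, y, z, n, cpu_has_avx());
}
```

In a program that calls the routine repeatedly, the result of cpu_has_avx() could be computed once and cached, since processor features do not change while the program runs.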

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.

Notice revision #20201201