Ahead of Time Compilation

Intel® oneAPI DPC++/C++ Compiler

Developer Guide and Reference

ID 767253

Date 3/31/2025

Public

Ahead of Time (AOT) compilation is useful both during development and when you distribute your application. When you know in advance which target device the application will run on, AOT provides the following benefits:

No additional compilation time is spent when you run your application.
No just-in-time (JIT) compilation bugs surface at run time. Compilation problems for the target are found and resolved during the AOT build.
Your final code, executing on the target device, can be tested as-is before you deliver it to end-users.

A program built with AOT compilation for specific target device(s) will not run on different device(s). You must detect the proper target device at runtime and report an error if the targeted device is not present. The use of exception handling with an asynchronous exception handler is recommended.

SYCL supports AOT compilation for the following targets: Intel® CPUs, Intel® Processor Graphics, and Intel® FPGA. For details on AOT compilation for Intel FPGAs, refer to the Intel® oneAPI FPGA Handbook.

OpenMP supports AOT compilation for the following targets: Intel® Processor Graphics.

Prerequisites

To target a GPU with the AOT feature, you must have the OpenCL™ Offline Compiler (OCLOC) tool installed. OCLOC can generate binaries that use OpenCL™ (SYCL only) or the Intel® oneAPI Level Zero (Level Zero) backend.

OCLOC is not packaged with the compiler and must be installed separately. To install OCLOC, install the GPU drivers, whether or not you have an Intel GPU on your system. Refer to Installing GPU Drivers for instructions.

Requirements for Accelerators

GPUs:

Intel® UHD Graphics for 11th generation Intel processors or newer
Intel® Iris® X^e graphics
Intel® Arc™ graphics
Intel® Data Center GPU Flex Series
Intel® Data Center GPU Max Series

AOT Compilation Supported Options for OpenMP

Use the following options to target a specific device for AOT compilation for OpenMP:

-fopenmp-targets to specify the device target
-Xopenmp-target-backend to pass options to the backend tool

Option -Xopenmp-target-backend is a general device target option. If multiple targets are desired (for example: -fopenmp-targets=spir64,spir64_gen), the options specified with -Xopenmp-target-backend apply to all targets.

For multiple targets, you can add specificity by using, for example, -Xopenmp-target-backend=spir64_gen <option>.

When using Ahead of Time (AOT) compilation, the options passed with -Xopenmp-target-backend are not compiler options, but rather options to pass to OCLOC.

To see a list of the options you can pass with -Xopenmp-target-backend when using AOT, specify -fsycl-help=gen on the command line.

AOT Compilation Supported Options for SYCL

Use the following options to target a specific device for AOT compilation for SYCL:

-fsycl-targets to specify the device target
-Xsycl-target-backend to pass options to the backend tool

Option -Xsycl-target-backend is a general device target option. If multiple targets are desired (for example: -fsycl-targets=spir64_gen,spir64_x86_64), the options specified with -Xsycl-target-backend apply to all targets.

For multiple targets, you can add specificity by using, for example, -Xsycl-target-backend=spir64_gen <option>.

When using Ahead of Time (AOT) compilation, the options passed with -Xsycl-target-backend are not compiler options, but rather options to pass to the backend tool for the target.

To see a list of the options you can pass with -Xsycl-target-backend when using AOT, specify -fsycl-help=gen, -fsycl-help=x86_64, or -fsycl-help=fpga on the command line.
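The per-target form described above, for both -Xsycl-target-backend and -Xopenmp-target-backend, can be sketched as a small helper that prints one backend flag per target. This is only an illustration: the function name backend_flags and its TARGET=OPTIONS argument format are hypothetical and not part of the compiler, and the helper prints text without invoking any tool.

```shell
# backend_flags MODEL TARGET=OPTIONS...
# MODEL is "sycl" or "openmp". Each following argument pairs a target with
# the option string that should reach only that target's backend tool.
# Hypothetical helper for illustration; it only prints the flags.
backend_flags() {
    model=$1
    shift
    for pair in "$@"; do
        target=${pair%%=*}   # text before the first '='
        opts=${pair#*=}      # text after the first '='
        printf '%s=%s "%s"\n' "-X${model}-target-backend" "$target" "$opts"
    done
}

# One flag for the GPU target and one for the CPU target.
backend_flags sycl "spir64_gen=-device skl" "spir64_x86_64=-march=avx2"
```

Each printed line can be appended to a command that lists the same targets in -fsycl-targets (or -fopenmp-targets for the openmp model), so that every backend tool receives only its own options.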

Use AOT for the Target Device (Intel® CPUs)

NOTE:

SYCL compilation is only available with the C/C++ compiler.

However, you can link SYCL-generated objects with the Fortran compiler. The use of -fsycl with ifx allows this, though it is restricted to the spir64, spir64_gen, and spir64_x86_64 targets.

Use the following option arguments to specify Intel® CPUs as the target device for AOT compilation:

-fsycl-targets=spir64_x86_64

-Xsycl-target-backend "-march=<arch>", where <arch> is one of the following:

Switch	Display Name
avx	Intel® Advanced Vector Extensions (Intel® AVX)
avx2	Intel® Advanced Vector Extensions 2 (Intel® AVX2)
avx512	Intel® Advanced Vector Extensions 512 (Intel® AVX-512)
sse4.2	Intel® Streaming SIMD Extensions 4.2 (Intel® SSE4.2)

The following examples tell the compiler to generate code that uses Intel® AVX2 instructions:

Linux*

icpx -fsycl -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx2" main.cpp

Windows*

icx -fsycl /EHsc -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx2" main.cpp
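As a rough sketch of how a build script might choose <arch> from the table above, the helper below maps a list of CPU feature flags to the widest matching switch. It assumes the flag names used by Linux /proc/cpuinfo (sse4_2, avx, avx2, avx512f) and that the list describes the machine that will run the application. The function name pick_march is hypothetical, and the final line only prints the command rather than running the compiler.

```shell
# pick_march FLAGS: print the widest -march switch from the table above
# that the given space-separated CPU flag list supports.
# Assumption: flag names follow Linux /proc/cpuinfo conventions.
pick_march() {
    case " $1 " in
        *" avx512f "*) echo avx512 ;;
        *" avx2 "*)    echo avx2 ;;
        *" avx "*)     echo avx ;;
        *" sse4_2 "*)  echo sse4.2 ;;
        *) echo "no supported switch found" >&2; return 1 ;;
    esac
}

# Dry run with a sample flag list: print the command instead of executing it.
flags="fpu sse4_2 avx avx2"
echo icpx -fsycl -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "\"-march=$(pick_march "$flags")\"" main.cpp
```

With the sample flag list, the printed command matches the Linux Intel AVX2 example above.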

Build an Application with Multiple Source Files for CPU Targeting

NOTE:

This section is for SYCL only.

Compile your normal files (with no SYCL kernels) to create host objects. Then compile the file with the kernel code and link it with the rest of the application.

Linux

The following shows an example of Linux compilation code:

icpx -c main.cpp      # This creates the host object that is used below
icpx -fsycl -c -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx2" mandel.cpp
icpx -fsycl -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx2" mandel.o main.o

Windows

The following shows an example of Windows compilation code:

icx /EHsc -c main.cpp
icx -fsycl /EHsc -c -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx2" mandel.cpp
icx -fsycl -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx2" mandel.obj main.obj

Use AOT for Integrated Graphics (Intel® GPU)

Use the following option arguments to specify Intel® GPU as the target device for AOT compilation:

OpenMP

Option -Xopenmp-target-backend is a general-purpose option; any arguments supplied with -Xopenmp-target-backend are applied to all offline compilation invocations. These are the relevant options and arguments:

-Xopenmp-target-backend "-device <arch>", where <arch> is the target device
-fopenmp-targets=spir64_gen
-fopenmp-device-code-split=<value> to perform an OpenMP device code split. The <value> is:
- per_kernel, which creates a device code module for each OpenMP kernel

SYCL

Option -Xsycl-target-backend is a general-purpose option; any arguments supplied with -Xsycl-target-backend are applied to all offline compilation invocations. These are the relevant options and arguments:

-Xsycl-target-backend "-device <arch>", where <arch> is the target device
-fsycl-targets=spir64_gen
-fsycl-device-code-split=<value> to perform a SYCL device code split. The <value> can be:
- per_kernel, which creates a device code module for each SYCL kernel
- per_source, which creates a device code module for each source (translation unit)
- off, which disables device code split
- auto, which tells the compiler to use a heuristic to select the best way of splitting device code
  
  This is the default, and it is the same as specifying -fsycl-device-code-split with no <value>.

To see the complete list of supported target device types for your installed version of OCLOC, run:

ocloc compile --help

To find the supported devices, look for -device <device_type> in the help output.

If multiple target devices are listed in the compile command, the compiler will compile for each of these targets and create a fat-binary that contains all the device binaries produced this way.
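When the set of target devices comes from a build configuration, the comma-separated -device argument can be assembled programmatically. The following is a minimal sketch under that assumption. The function name gen_device_arg is hypothetical, and the final line prints the resulting command as a dry run rather than invoking the compiler.

```shell
# gen_device_arg DEVICE...: join the device names with commas and print
# the option string that follows -Xsycl-target-backend or
# -Xopenmp-target-backend. Hypothetical helper for illustration.
# The body runs in a subshell so the IFS change does not leak out.
gen_device_arg() (
    IFS=,
    printf '%s %s\n' -device "$*"
)

# Dry run: print the compile command instead of invoking the compiler.
backend_arg=$(gen_device_arg skl icllp)
echo icpx -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend "\"$backend_arg\"" vector-add.cpp
```

Each device added to the list adds one more device binary to the resulting fat binary, so build time and binary size grow with the list.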

Examples of supported -device patterns:

OpenMP for Linux

To compile for a single target, using skl as an example, use:

icpx -fiopenmp -fopenmp-targets=spir64_gen -Xopenmp-target-backend "-device skl" vector-add.cpp

To compile for two targets, using skl and icllp as examples, use:

icpx -fiopenmp -fopenmp-targets=spir64_gen -Xopenmp-target-backend "-device skl,icllp" vector-add.cpp

To compile for all the targets known to OCLOC, use:

icpx -fiopenmp -fopenmp-targets=spir64_gen -Xopenmp-target-backend "-device *" vector-add.cpp

icpx -fiopenmp -fopenmp-targets=spir64_gen -Xopenmp-target-backend=spir64_gen "-device *" vector-add.cpp

SYCL for Linux

To compile for a single target, using skl as an example, use:

icpx -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend "-device skl" vector-add.cpp

To compile for two targets, using skl and icllp as examples, use:

icpx -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend "-device skl,icllp" vector-add.cpp

To compile for all the targets known to OCLOC, use:

icpx -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend "-device *" vector-add.cpp

To pass multiple options to OCLOC, use:

-Xs options:

icpx -fsycl -fsycl-targets=spir64_gen -Xs "-device tgllp --format zebin -options <-user-option1> -options <-user-option2>" vector-add.cpp

-Xsycl-target-backend options:

icpx -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend=spir64_gen "-device tgllp --format zebin -options <-user-option1> -options <-user-option2>" vector-add.cpp

SYCL for Windows

To compile for a single target, using skl as an example, use:

icx -fsycl /EHsc -fsycl-targets=spir64_gen -Xsycl-target-backend "-device skl" vector-add.cpp

To compile for two targets, using skl and icllp as examples, use:

icx -fsycl /EHsc -fsycl-targets=spir64_gen -Xsycl-target-backend "-device skl,icllp" vector-add.cpp

To compile for all the targets known to OCLOC, use:

icx -fsycl /EHsc -fsycl-targets=spir64_gen -Xsycl-target-backend "-device *" vector-add.cpp

icx -fsycl /EHsc -fsycl-targets=spir64_gen -Xsycl-target-backend=spir64_gen "-device *" vector-add.cpp

Build an Application with Multiple Source Files for GPU Targeting

Compile your normal files (with no SYCL kernels) to create host objects. Then compile the file with the kernel code and link it with the rest of the application.

Linux

icpx -c main.cpp
icpx -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend=spir64_gen "-device *" mandel.cpp main.o

Windows

icx /c main.cpp
icx -fsycl /EHsc -fsycl-targets=spir64_gen -Xsycl-target-backend=spir64_gen "-device *" mandel.cpp main.obj

Use AOT in Microsoft Visual Studio

NOTE:

This section is for SYCL only.

You can use Microsoft Visual Studio for compiling and linking. Set the following flags to use AOT compilation for CPU or GPU:

CPU:

To compile, in the dialog box, select: Configuration Properties > DPC++ > General > Specify SYCL offloading targets for AOT compilation.
To link, in the dialog box, select: Configuration Properties > Linker > General > Specify CPU Target Device for AOT compilation.

GPU:

To compile, in the dialog box, select: Configuration Properties > DPC++ > General > Specify SYCL offloading targets for AOT compilation.
To link, in the dialog box, select: Configuration Properties > Linker > General > Specify GPU Target Device for AOT compilation.

Available GPU Platforms

GPU Model Name	Vertical Segment	Product Code Name	AOT Compilation Device Name	Compatible Targets
Intel® Arc™ 140T GPU (Integrated in Intel® Core™ Ultra 9 Processor 285H, Intel® Core™ Ultra 7 Processor 265H, Intel® Core™ Ultra 5 Processor 235H)	Mobile	Arrow Lake-H	arl-h
Intel® Graphics (Integrated in Intel® Core™ Ultra 9 Processor 285T, Intel® Core™ Ultra 9 Processor 285K, Intel® Core™ Ultra 9 Processor 285HX, Intel® Core™ Ultra 9 Processor 285, Intel® Core™ Ultra 7 Processor 265T, Intel® Core™ Ultra 7 Processor 265K, Intel® Core™ Ultra 7 Processor 265HX, Intel® Core™ Ultra 7 Processor 265, Intel® Core™ Ultra 5 Processor 245T, Intel® Core™ Ultra 5 Processor 245K, Intel® Core™ Ultra 5 Processor 245HX, Intel® Core™ Ultra 5 Processor 245, Intel® Core™ Ultra 5 Processor 235T, Intel® Core™ Ultra 5 Processor 235)	Desktop/Mobile	Arrow Lake-S	mtl-u (or arl-u, arl-s)	mtl
Intel® Graphics (Integrated in Intel® Core™ Ultra 7 Processor 265U, Intel® Core™ Ultra 5 Processor 235U)	Mobile	Arrow Lake-U	mtl-u (or arl-u, arl-s)	mtl
Intel® Arc™ B580 Graphics, Intel® Arc™ B570 Graphics	Desktop	Battlemage	bmg-g21	bmg
Intel® Arc™ graphics 140V (Integrated in Intel® Core™ Ultra 9 Processor 288V, Intel® Core™ Ultra 7 Processor 268V, Intel® Core™ Ultra 7 Processor 266V, Intel® Core™ Ultra 7 Processor 258V, Intel® Core™ Ultra 7 Processor 256V)	Mobile	Lunar Lake	lnl-m	bmg
Intel® Arc™ graphics 130V (Integrated in Intel® Core™ Ultra 5 Processor 238V, Intel® Core™ Ultra 5 Processor 236V, Intel® Core™ Ultra 5 Processor 228V, Intel® Core™ Ultra 5 Processor 226V)	Mobile	Lunar Lake	lnl-m	bmg
Intel® Arc™ graphics (Integrated in Intel® Core™ Ultra 9 Processor 185H, Intel® Core™ Ultra 7 Processor 165H, Intel® Core™ Ultra 7 Processor 155H, Intel® Core™ Ultra 5 Processor 135H, Intel® Core™ Ultra 5 Processor 125H)	Mobile	Meteor Lake-H	mtl-h	mtl
Intel® Arc™ graphics (Integrated in Intel® Core™ Ultra 7 Processor 165HL, Intel® Core™ Ultra 7 Processor 155HL, Intel® Core™ Ultra 5 Processor 135HL, Intel® Core™ Ultra 5 Processor 125HL)	Embedded	Meteor Lake-H	mtl-h	mtl
Intel® Graphics (Integrated in Intel® Core™ Ultra 7 Processor 165U, Intel® Core™ Ultra 7 Processor 164U, Intel® Core™ Ultra 7 Processor 155U, Intel® Core™ Ultra 5 Processor 135U, Intel® Core™ Ultra 5 Processor 134U, Intel® Core™ Ultra 5 Processor 125U)	Mobile	Meteor Lake-U	mtl-u (or arl-u, arl-s)	mtl
Intel® Graphics (Integrated in Intel® Core™ Ultra 7 Processor 165UL, Intel® Core™ Ultra 7 Processor 155UL, Intel® Core™ Ultra 5 Processor 135UL, Intel® Core™ Ultra 5 Processor 125UL, Intel® Core™ Ultra 3 Processor 105UL)	Embedded	Meteor Lake-U	mtl-u (or arl-u, arl-s)	mtl
Intel® MAX® 1550, Intel® MAX® 1100	Data Center	Ponte Vecchio	pvc
Intel® Flex 170	Data Center	Arctic Sound	ats-m150	dg2
Intel® Flex 140	Data Center	Arctic Sound	ats-m75	dg2
Intel® Arc™ A770, Intel® Arc™ A750, Intel® Arc™ A580	Desktop	Alchemist	acm-g10 (or dg2-g10, ats-m150)	dg2
Intel® Arc™ A770M, Intel® Arc™ A730M, Intel® Arc™ A550M	Mobile	Alchemist	acm-g10 (or dg2-g10, ats-m150)	dg2
Intel® Arc™ A380, Intel® Arc™ A310, Intel® Arc™ Pro A40/A50	Desktop	Alchemist	acm-g11 (or dg2-g11, ats-m75)	dg2
Intel® Arc™ A370M, Intel® Arc™ A350M, Intel® Arc™ Pro A30M	Mobile	Alchemist	acm-g11 (or dg2-g11, ats-m75)	dg2
Intel® Arc™ A380E, Intel® Arc™ A370E, Arc™ A350E, Intel® Arc™ A310E	Embedded	Alchemist	acm-g11 (or dg2-g11, ats-m75)	dg2
Intel® UHD Graphics	Mobile	Alder Lake-N	adl-n
Intel® UHD Graphics, Intel® Iris® X^e graphics	Mobile	Alder Lake-P	adl-p
Intel® UHD Graphics 770/730/710	Mobile	Alder Lake-S	adl-s
Intel® UHD Graphics 617/615	Mobile	Amber Lake	aml
Intel® HD Graphics, Intel® HD Graphics 505/500	Mobile	Apollo Lake, Broxton	apl (or bxt)
Intel® Iris® Plus graphics 655/645, Intel® UHD Graphics 630/610/P630	Mobile	Coffee Lake	cfl
Intel® UHD Graphics	Mobile	Comet Lake	cml
Intel® Iris® X^e MAX graphics, Intel® Iris® X^e graphics, Intel® Iris® X^e MAX 100, Intel® Server GPU SG-18M	Mobile/Server	DG1	dg1
Intel® UHD Graphics	Mobile	Elkhart Lake, Jasper Lake	ehl (or jsl)
Intel® UHD Graphics 605/600	Mobile	Gemini Lake	glk
Intel® HD Graphics, Intel® UHD Graphics, Intel® Iris® Plus Graphics	Mobile	Ice Lake	icllp
Intel® HD Graphics 635, Intel® Iris® Plus Graphics 650/640, Intel® HD Graphics 630/620/P630/615/610, Intel® UHD Graphics 617/615	Mobile	Kaby Lake	kbl
Intel® UHD Graphics 750/730/P750	Mobile	Rocket Lake	rkl
Intel® Iris® X^e Graphics, Intel® UHD Graphics	Mobile	Raptor Lake-P	rpl-p
Intel® UHD Graphics 770/730/710	Mobile	Raptor Lake-S	rpl-s
Intel® HD Graphics 535/530/520/515/510/P530, Intel® Iris® Pro Graphics 580/P580, Intel® Iris® Graphics 555/550/540/P555	Mobile	Intel® microarchitecture code name Skylake	skl
Intel® UHD Graphics, Intel® Iris® X^e Graphics	Mobile	Tiger Lake	tgllp
Intel® UHD Graphics, Intel® UHD Graphics 620	Mobile	Whiskey Lake	whl

Use AOT with Non-Intel GPUs

SYCL

In addition to targeting Intel GPUs, SYCL applications can be compiled once and run on a variety of hardware, including AMD* and NVIDIA* GPUs. You can create a single binary that incorporates device code capable of running on AMD GPUs, NVIDIA GPUs, or any device supporting SPIR-V*, including Intel GPUs.

To see the environment setup:

See Set Up Your Environment for compiling with AMD GPUs.
See Set Up Your Environment for compiling with NVIDIA GPUs.

SYCL on NVIDIA GPUs for Linux

For example, to compile SYCL code for an NVIDIA GPU with the sm_80 architecture, use:

icpx -fsycl -fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend=nvptx64-nvidia-cuda --offload-arch=sm_80 -o sycl-app sycl-app.cpp

-Xsycl-target-backend=nvptx64-nvidia-cuda tells the flag parser that the following flag should be passed only to the compiler backend for the nvptx64-nvidia-cuda target.
--offload-arch=sm_80 specifies the NVIDIA GPU architecture (compute capability) sm_80 for the AOT compilation.

NOTE:

The --offload-arch=<arch> syntax, used here, is different from the -device <intel-arch> syntax, which is required for the compiler toolchain used for Intel targets.

SYCL on AMD GPUs for Linux

For example, to compile SYCL code for an AMD GPU with the gfx90a architecture, use:

icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx90a -o sycl-app sycl-app.cpp

-Xsycl-target-backend=amdgcn-amd-amdhsa tells the flag parser that the following flag should be passed only to the compiler backend for the amdgcn-amd-amdhsa target.
--offload-arch=gfx90a specifies the AMD GPU architecture gfx90a for the AOT compilation.

Using Alias Targets

The compiler driver offers alias targets for each target and architecture pair to make the command line shorter and more readable. When you use aliases, you can omit the -Xsycl-target-backend flags. The following example shows how to produce a single binary whose device code runs without any JIT compilation on AMD GPUs, NVIDIA GPUs, and Ponte Vecchio Intel GPUs, or with JIT compilation on any device that supports SPIR-V (for example, on Intel integrated GPUs).

icpx -fsycl -fsycl-targets=intel_gpu_pvc,amd_gpu_gfx90a,nvidia_gpu_sm_80 \
      -o sycl-app sycl-app.cpp

The previous command is equivalent to:

icpx -fsycl -fsycl-targets=spir64_gen,amdgcn-amd-amdhsa,nvptx64-nvidia-cuda \
-Xsycl-target-backend=spir64_gen '-device pvc' \
-Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx90a \
-Xsycl-target-backend=nvptx64-nvidia-cuda --offload-arch=sm_80 \
-o sycl-app sycl-app.cpp
-o sycl-app sycl-app.cpp
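The equivalence shown above can be written out as a lookup from alias to target triple and backend option. The sketch below covers only the three aliases used in this section and makes no claim about how other aliases expand or how the compiler driver implements the mapping. The function name expand_alias is hypothetical.

```shell
# expand_alias ALIAS: print "<target triple> <backend option>" for one of
# the three aliases used in the example above. Any other alias is reported
# as unknown instead of being guessed. Hypothetical helper for illustration.
expand_alias() {
    case $1 in
        intel_gpu_pvc)    echo "spir64_gen -device pvc" ;;
        amd_gpu_gfx90a)   echo "amdgcn-amd-amdhsa --offload-arch=gfx90a" ;;
        nvidia_gpu_sm_80) echo "nvptx64-nvidia-cuda --offload-arch=sm_80" ;;
        *) echo "unknown alias: $1" >&2; return 1 ;;
    esac
}

# Print the expansion of each alias from the example command.
for name in intel_gpu_pvc amd_gpu_gfx90a nvidia_gpu_sm_80; do
    expand_alias "$name"
done
```

Each printed line pairs a triple for -fsycl-targets with the option that the long form passes through -Xsycl-target-backend=<triple>.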
