Ahead of Time Compilation
Ahead of Time (AOT) compilation is useful both during development and when preparing an application for distribution. When you know in advance which target device the application will run on, AOT compilation provides the following benefits:
No additional compilation is performed when your application runs.
No just-in-time (JIT) compilation bugs are encountered on the target. Any such bugs surface during AOT compilation, where they can be resolved.
Your final code, executing on the target device, can be tested as-is before you deliver it to end-users.
A program built with AOT compilation for specific target device(s) will not run on different device(s). You must detect the proper target device at runtime and report an error if the targeted device is not present. The use of exception handling with an asynchronous exception handler is recommended.
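As a complement to the in-program check described above, a launcher script can look for the targeted device class before starting the application. The sketch below is only an illustration: it assumes the sycl-ls utility from the oneAPI toolkit is on PATH and that its listing tags GPU devices with the substring ":gpu" (the listing format differs between releases, so treat the pattern as an assumption). has_device_class is a hypothetical helper name. It does not replace runtime detection and exception handling inside the application.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the device listing read from stdin contains
# an entry tagged with the given device class (for example "gpu").
# Assumption: device lines carry a tag such as "[level_zero:gpu]" or "[opencl:gpu:0]".
has_device_class() {
    grep -q ":$1"
}

if command -v sycl-ls >/dev/null 2>&1; then
    if sycl-ls | has_device_class gpu; then
        echo "GPU device found: the AOT-compiled binary can be launched"
    else
        echo "error: no GPU device found for the AOT-compiled binary" >&2
    fi
else
    echo "sycl-ls not found: cannot pre-check devices" >&2
fi
```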
SYCL supports AOT compilation for the following targets: Intel® CPUs, Intel® Processor Graphics, and Intel® FPGA. For details on AOT compilation for Intel FPGAs, refer to the Intel® oneAPI FPGA Handbook.
OpenMP supports AOT compilation for the following targets: Intel® Processor Graphics.
For additional information, watch two videos for a quick overview on how to apply the JIT and AOT compilation options:
- Debug Just-in-Time and Ahead-of-Time GPU Code with Intel® Distribution for GDB*
- Compilation Options and Debugging: Just-in-Time and Ahead-of-Time GPU Code with Intel® Distribution for GDB*
Prerequisites
To target a GPU with the AOT feature, you must have the OpenCL™ Offline Compiler (OCLOC) tool installed. OCLOC can generate binaries that use OpenCL™ (SYCL only) or the Intel® oneAPI Level Zero (Level Zero) backend.
OCLOC is not packaged with the compiler and must be installed separately. To install OCLOC, install the GPU drivers, whether or not you have an Intel GPU on your system. Refer to Installing GPU Drivers for instructions.
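Because OCLOC is installed separately, a build script may want to confirm that it is available before attempting a GPU AOT build. The following is a minimal sketch, assuming the tool is exposed on PATH under the name ocloc; have_tool is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the named tool can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool ocloc; then
    echo "ocloc found: GPU AOT builds can proceed"
else
    echo "ocloc not found: install the GPU drivers before building for a GPU target" >&2
fi
```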
Requirements for Accelerators
GPUs:
Intel® UHD Graphics for 11th generation Intel processors or newer
Intel® Iris® Xe graphics
Intel® Arc™ graphics
Intel® Data Center GPU Flex Series
Intel® Data Center GPU Max Series
AOT Compilation Supported Options for OpenMP
Use the following options to target a specific device for AOT compilation for OpenMP:
-fopenmp-targets to specify the device target
-Xopenmp-target-backend to pass options to the backend tool
Option -Xopenmp-target-backend is a general device target option. If multiple targets are desired (for example: -fopenmp-targets=spir64,spir64_gen), the options specified with -Xopenmp-target-backend apply to all targets.
For multiple targets, you can restrict the options to one target by naming it, for example, -Xopenmp-target-backend=spir64_gen <option>.
When using Ahead of Time (AOT) compilation, the options passed with -Xopenmp-target-backend are not compiler options, but rather options to pass to OCLOC.
To see a list of the options you can pass with -Xopenmp-target-backend when using AOT, specify -fsycl-help=gen on the command line.
AOT Compilation Supported Options for SYCL
Use the following options to target a specific device for AOT compilation for SYCL:
-fsycl-targets to specify the device target
-Xsycl-target-backend to pass options to the backend tool
Option -Xsycl-target-backend is a general device target option. If multiple targets are desired (for example: -fsycl-targets=spir64,spir64_gen), the options specified with -Xsycl-target-backend apply to all targets.
For multiple targets, you can restrict the options to one target by naming it, for example, -Xsycl-target-backend=spir64_gen <option>.
When using Ahead of Time (AOT) compilation, the options passed with -Xsycl-target-backend are not compiler options, but rather options to pass to the backend tool, such as OCLOC for GPU targets.
To see a list of the options you can pass with -Xsycl-target-backend when using AOT, specify -fsycl-help=gen on the command line.
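To illustrate the target-specific form, the sketch below prints (without running) a command that builds for both a CPU and a GPU target and gives each backend its own option. The source file app.cpp, the device skl, and the helper name aot_command are placeholders chosen for this example. The combination of flags is an assumption pieced together from the options described above, so verify it against your compiler version.

```shell
#!/bin/sh
# Dry run: print the compile command instead of executing it.
# app.cpp and the "skl" device are placeholders for this sketch.
targets="spir64_x86_64,spir64_gen"

# Hypothetical helper: print a two-target AOT command in which each
# backend receives only the option intended for it.
aot_command() {
    printf '%s\n' "icpx -fsycl -fsycl-targets=$1 -Xsycl-target-backend=spir64_x86_64 \"-march=avx2\" -Xsycl-target-backend=spir64_gen \"-device skl\" app.cpp"
}

aot_command "$targets"
```

Without the =spir64_x86_64 and =spir64_gen qualifiers, both option strings would be applied to both targets.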
Use AOT for the Target Device (Intel® CPUs)
SYCL compilation is only available with the C/C++ compiler.
However, you can link SYCL-generated objects with the Fortran compiler. The use of -fsycl with ifx allows this, though it is restricted to the spir64, spir64_gen, and spir64_x86_64 targets.
Use the following option argument to specify Intel® CPUs as the target device for AOT compilation:
-fsycl-targets=spir64_x86_64
The following examples tell the compiler to generate code that uses Intel® AVX2 instructions:
Linux
ifx -fsycl -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx2" main.o
Windows
ifx -fsycl -fsycl-targets=spir64_x86_64 -Xsycl-target-backend=spir64_x86_64 "-march=avx2" main.obj
Build an Application with Multiple Source Files for CPU Targeting
Compile your normal files (with no SYCL kernels) to create host objects. Then compile the file with the kernel code and link it with the rest of the application.
Linux
The following shows an example of C/C++ Linux* compilation code:
icpx -c main.cpp
icpx -c -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx2" mandel.cpp
The first command creates the host object (main.o) that is used in the link step below.
For C/C++, this would be the next step:
icpx -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx2" mandel.o main.o
Note that Fortran can use the -c compiled variant as follows:
ifx -fsycl -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx2" mandel.o main.o
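The compile-then-link sequence above can be collected into a small build script. The following is a sketch only: the run wrapper and the DRY_RUN variable are hypothetical conveniences that print each command and execute it only when DRY_RUN=0 is set, and -fsycl is spelled out explicitly as an assumption about how SYCL is enabled on your compiler version.

```shell
#!/bin/sh
# Print each command; execute it only when the caller sets DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Flags shared by the kernel compile step and the link step.
AOT_FLAGS="-fsycl -fsycl-targets=spir64_x86_64"

# 1. Host-only source: no SYCL kernels, so no AOT flags are passed.
run icpx -c main.cpp
# 2. Kernel source: compiled with the CPU AOT target and backend option.
run icpx -c $AOT_FLAGS -Xsycl-target-backend "-march=avx2" mandel.cpp
# 3. Link: the same AOT flags are passed again with both objects.
run icpx $AOT_FLAGS -Xsycl-target-backend "-march=avx2" mandel.o main.o
```

Keeping the shared flags in one variable helps the compile and link steps stay consistent when the target changes.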
Windows
The following shows an example of C/C++ Windows* compilation code:
icx /EHsc -c main.cpp
icx /EHsc -c -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx2" mandel.cpp
For C/C++, this would be the next step:
icx -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx2" mandel.obj main.obj
Note that Fortran can use the -c compiled variant as follows:
ifx -fsycl -fsycl-targets=spir64_x86_64 -Xsycl-target-backend "-march=avx2" mandel.obj main.obj
Use AOT for Integrated Graphics (Intel® GPU)
Use the following option arguments to specify Intel® GPU as the target device for AOT compilation:
OpenMP
Option -Xopenmp-target-backend is a general-purpose option; any arguments supplied with it are applied to all offline compilation invocations. These are the relevant options and arguments:
-Xopenmp-target-backend "-device <arch>", where <arch> is the target device
-fopenmp-targets=spir64_gen
-fopenmp-device-code-split=<value> to perform an OpenMP device code split. The <value> is:
per_kernel, which creates a device code module for each OpenMP kernel
SYCL
Option -Xsycl-target-backend is a general-purpose option; any arguments supplied with it are applied to all offline compilation invocations. These are the relevant options and arguments:
-Xsycl-target-backend "-device <arch>", where <arch> is the target device
-fsycl-targets=spir64_gen
-fsycl-device-code-split=<value> option to perform SYCL device code split. The <value> can be:
per_kernel, which creates a device code module for each SYCL kernel
per_source, which creates a device code module for each source (translation unit)
off, which disables device code split
auto, which tells the compiler to use a heuristic to select the best way of splitting device code
This is the default, and it is the same as specifying -fsycl-device-code-split with no <value>.
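As a small illustration of the split modes listed above, the sketch below validates a mode name before composing a GPU AOT command and only prints the command. split_flag is a hypothetical helper, and vector-add.cpp and the skl device are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print the split flag for a valid mode, fail otherwise.
split_flag() {
    case "$1" in
        per_kernel|per_source|off|auto)
            echo "-fsycl-device-code-split=$1" ;;
        *)
            echo "unknown device code split mode: $1" >&2
            return 1 ;;
    esac
}

# Dry run: print the command only when the mode is valid.
flag=$(split_flag per_kernel) &&
    echo "icpx -fsycl -fsycl-targets=spir64_gen $flag -Xsycl-target-backend \"-device skl\" vector-add.cpp"
```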
To see the complete list of supported target device types for your installed version of OCLOC, run:
ocloc compile --help
To find the supported devices, look for -device <device_type> in the help output.
If multiple target devices are listed in the compile command, the compiler compiles for each of these targets and creates a fat binary that contains all the resulting device binaries.
Examples of supported -device patterns:
OpenMP for Linux
- To compile for a single target, using skl as an example, use:
ifx -fiopenmp -fopenmp-targets=spir64_gen -Xopenmp-target-backend "-device skl" vector-add.f90
- To compile for two targets, using skl and icllp as examples, use:
ifx -fiopenmp -fopenmp-targets=spir64_gen -Xopenmp-target-backend "-device skl,icllp" vector-add.f90
- To compile for all the targets known to OCLOC, use:
ifx -fiopenmp -fopenmp-targets=spir64_gen -Xopenmp-target-backend=spir64_gen "-device *" vector-add.f90
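When the device list comes from a build configuration rather than being typed by hand, the comma-separated -device argument shown above can be assembled by a script. This is a minimal sketch: join_devices is a hypothetical helper, and the command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper: join device names with commas for the "-device" argument.
join_devices() {
    old_ifs=$IFS
    IFS=,
    echo "$*"      # "$*" joins the arguments with the first character of IFS
    IFS=$old_ifs
}

devices=$(join_devices skl icllp)
# Dry run: print the two-target OpenMP AOT command.
echo "ifx -fiopenmp -fopenmp-targets=spir64_gen -Xopenmp-target-backend \"-device $devices\" vector-add.f90"
```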
SYCL for Linux
Consider the following C/C++ command:
icpx -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend "-device *" vector-add.cpp
If vector-add.cpp is compiled with option -c to create vector-add.o, then Fortran can use this SYCL-based fat object with the following command:
ifx -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend "-device *" vector-add.o
SYCL for Windows
Consider the following C/C++ command:
icx -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend "-device *" vector-add.cpp
If vector-add.cpp is compiled with option -c to create vector-add.obj, then Fortran can use this SYCL-based fat object with the following command:
ifx -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend "-device *" vector-add.obj
Build an Application with Multiple Source Files for GPU Targeting
Compile your normal files (with no SYCL kernels) to create host objects. Then compile the file with the kernel code and link it with the rest of the application.
Linux
Consider the following C/C++ command:
icpx -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend=spir64_gen "-device *" mandel.cpp main.o
Assuming that mandel.o has been built by the C/C++ compiler, Fortran can use this SYCL-based fat object with the following command:
ifx -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend=spir64_gen "-device *" main.o mandel.o
Windows
Consider the following C/C++ command:
icx -fsycl /EHsc -fsycl-targets=spir64_gen -Xsycl-target-backend=spir64_gen "-device *" -c mandel.cpp
Assuming that mandel.obj has been built by the C/C++ compiler, Fortran can use this SYCL-based fat object with the following command:
ifx -fsycl -fsycl-targets=spir64_gen -Xsycl-target-backend=spir64_gen "-device *" mandel.obj main.obj
Available GPU Platforms
GPU Model Name | Vertical Segment | Product Code Name | AOT Compilation Device Name | Compatible Targets |
---|---|---|---|---|
Intel® Arc™ graphics 140V (Integrated in Intel® Core™ Ultra 9 Processor 288V, Intel® Core™ Ultra 7 Processor 268V, Intel® Core™ Ultra 7 Processor 266V, Intel® Core™ Ultra 7 Processor 258V, Intel® Core™ Ultra 7 Processor 256V) | Mobile | Lunar Lake | lnl-m | |
Intel® Arc™ graphics 130V (Integrated in Intel® Core™ Ultra 5 Processor 238V, Intel® Core™ Ultra 5 Processor 236V, Intel® Core™ Ultra 5 Processor 228V, Intel® Core™ Ultra 5 Processor 226V) | Mobile | Lunar Lake | lnl-m | |
Intel® Arc™ graphics (Integrated in Intel® Core™ Ultra 9 Processor 185H, Intel® Core™ Ultra 7 Processor 165H, Intel® Core™ Ultra 7 Processor 155H, Intel® Core™ Ultra 5 Processor 135H, Intel® Core™ Ultra 5 Processor 125H) | Mobile | Meteor Lake-H | mtl-h | mtl |
Intel® Arc™ graphics (Integrated in Intel® Core™ Ultra 7 Processor 165HL, Intel® Core™ Ultra 7 Processor 155HL, Intel® Core™ Ultra 5 Processor 135HL, Intel® Core™ Ultra 5 Processor 125HL) | Embedded | Meteor Lake-H | mtl-h | mtl |
Intel® Graphics (Integrated in Intel® Core™ Ultra 7 Processor 165U, Intel® Core™ Ultra 7 Processor 164U, Intel® Core™ Ultra 7 Processor 155U, Intel® Core™ Ultra 5 Processor 135U, Intel® Core™ Ultra 5 Processor 134U, Intel® Core™ Ultra 5 Processor 125U) | Mobile | Meteor Lake-U, Arrow Lake-U/S | mtl-u (or arl-u, arl-s) | mtl |
Intel® Graphics (Integrated in Intel® Core™ Ultra 7 Processor 165UL, Intel® Core™ Ultra 7 Processor 155UL, Intel® Core™ Ultra 5 Processor 135UL, Intel® Core™ Ultra 5 Processor 125UL, Intel® Core™ Ultra 3 Processor 105UL) | Embedded | Meteor Lake-U, Arrow Lake-U/S | mtl-u (or arl-u, arl-s) | mtl |
Intel® MAX® 1550, Intel® MAX® 1100 | Data Center | Ponte Vecchio | pvc | |
Intel® Flex 170 | Data Center | Arctic Sound | ats-m150 | dg2 |
Intel® Flex 140 | Data Center | Arctic Sound | ats-m75 | dg2 |
Intel® Arc™ A770, Intel® Arc™ A750, Intel® Arc™ A580 | Desktop | Alchemist | acm-g10 (or dg2-g10, ats-m150) | dg2 |
Intel® Arc™ A770M, Intel® Arc™ A730M, Intel® Arc™ A550M | Mobile | Alchemist | acm-g10 (or dg2-g10, ats-m150) | dg2 |
Intel® Arc™ A380, Intel® Arc™ A310, Intel® Arc™ Pro A40/A50 | Desktop | Alchemist | acm-g11 (or dg2-g11, ats-m75) | dg2 |
Intel® Arc™ A370M, Intel® Arc™ A350M, Intel® Arc™ Pro A30M | Mobile | Alchemist | acm-g11 (or dg2-g11, ats-m75) | dg2 |
Intel® Arc™ A380E, Intel® Arc™ A370E, Intel® Arc™ A350E, Intel® Arc™ A310E | Embedded | Alchemist | acm-g11 (or dg2-g11, ats-m75) | dg2 |
Intel® UHD Graphics | Mobile | Alder Lake-N | adl-n | |
Intel® UHD Graphics, Intel® Iris® Xe graphics | Mobile | Alder Lake-P | adl-p | |
Intel® UHD Graphics 770/730/710 | Mobile | Alder Lake-S | adl-s | |
Intel® UHD Graphics 617/615 | Mobile | Amber Lake | aml | |
Intel® HD Graphics, Intel® HD Graphics 505/500 | Mobile | Apollo Lake, Broxton | apl (or bxt) | |
Intel® Iris® Plus graphics 655/645, Intel® UHD Graphics 630/610/P630 | Mobile | Coffee Lake | cfl | |
Intel® UHD Graphics | Mobile | Comet Lake | cml | |
Intel® Iris® Xe MAX graphics, Intel® Iris® Xe graphics, Intel® Iris® Xe MAX 100, Intel® Server GPU SG-18M | Mobile/Server | DG1 | dg1 | |
Intel® UHD Graphics | Mobile | Elkhart Lake, Jasper Lake | ehl (or jsl) | |
Intel® UHD Graphics 605/600 | Mobile | Gemini Lake | glk | |
Intel® HD Graphics, Intel® UHD Graphics, Intel® Iris® Plus Graphics | Mobile | Ice Lake | icllp | |
Intel® HD Graphics 635, Intel® Iris® Plus Graphics 650/640, Intel® HD Graphics 630/620/P630/615/610, Intel® UHD Graphics 617/615 | Mobile | Kaby Lake | kbl | |
Intel® UHD Graphics 750/730/P750 | Mobile | Rocket Lake | rkl | |
Intel® Iris® Xe Graphics, Intel® UHD Graphics | Mobile | Raptor Lake-P | rpl-p | |
Intel® UHD Graphics 770/730/710 | Mobile | Raptor Lake-S | rpl-s | |
Intel® HD Graphics 535/530/520/515/510/P530, Intel® Iris® Pro Graphics 580/P580, Intel® Iris® Graphics 555/550/540/P555 | Mobile | Intel® microarchitecture code name Skylake | skl | |
Intel® UHD Graphics, Intel® Iris® Xe Graphics | Mobile | Tiger Lake | tgllp | |
Intel® UHD Graphics, Intel® UHD Graphics 620 | Mobile | Whiskey Lake | whl | |