Visible to Intel only — GUID: GUID-FA36D5B9-8291-4D34-97C7-ED0F5F9CB228
Separating Device and Host Code Compilation
The Intel® oneAPI DPC++/C++ Compiler supports only ahead-of-time (AoT) compilation for FPGA hardware and simulation, which means that the FPGA device image is generated at compile time. Generating the FPGA device image can take hours.
If you change only the host code, you can recompile just the host code and reuse the existing FPGA device image, which skips the time-consuming device compilation.
The Intel® oneAPI DPC++/C++ Compiler provides the following mechanisms to separate device code and host code compilation:
Using the -reuse-exe Flag
If the device code and options affecting the device have not changed since the previous compilation, passing the -reuse-exe=<exe_name> flag instructs the compiler to extract the compiled FPGA hardware or simulation image from the existing executable and package it into the new executable, saving the device compilation time.
Sample use:
# Initial compilation
icpx -fintelfpga -Xshardware <files.cpp> -o out.fpga
The initial compilation generates an FPGA device image, which takes several hours. Suppose you now make some changes to the host code.
# Subsequent recompilation
icpx -fintelfpga -Xshardware <files.cpp> -o out.fpga -reuse-exe=out.fpga
The command takes one of the following actions:
- If the out.fpga file does not exist, the -reuse-exe flag is ignored, and the FPGA device image is regenerated. This is always the case the first time you compile a project.
- If the out.fpga file is found, the compiler verifies that no change affecting the FPGA device code has been made since the last compilation. If it detects no such change, the compiler reuses the existing FPGA device image and recompiles only the host code. The recompilation process takes a few minutes to complete.
- If the out.fpga file is found, but the compiler cannot prove that the FPGA device code will yield a result identical to the last compilation, a warning is printed, and the FPGA device code is fully recompiled. Since the compiler checks must be conservative, spurious recompilations can sometimes occur when using the -reuse-exe flag.
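Because the flag is ignored when the named executable does not exist, the same command line can serve for both the initial build and every later rebuild. The following is a minimal sketch of a wrapper built on that behavior. The function name fpga_build and the RUN variable are hypothetical helpers introduced here for illustration, not part of the compiler. RUN defaults to echo, so the command is only printed unless you opt in to running it.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the compiler): one command line for the
# first build and for all later rebuilds.
# RUN defaults to "echo", so the command is only printed (dry run).
# Set RUN to the empty string (RUN= ./build.sh) to really invoke icpx.
RUN=${RUN-echo}

# usage: fpga_build <output executable> <source files...>
fpga_build() {
    out=$1
    shift
    # -reuse-exe names the previously built executable. As described above,
    # the flag is ignored if that file does not exist yet, so passing it on
    # the very first build is harmless.
    $RUN icpx -fintelfpga -Xshardware "$@" -o "$out" "-reuse-exe=$out"
}
```

For example, fpga_build out.fpga main.cpp kernel.cpp prints the recompilation command from the sample above, and with RUN set to the empty string it executes that command instead.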
Using the Device Link Method
Suppose the program is separated into two files, main.cpp and kernel.cpp, where only the kernel.cpp file contains the device code.
In the normal compilation process, FPGA device image generation happens at link time.
# normal compile command
icpx -fintelfpga -Xshardware main.cpp kernel.cpp -o link.fpga
As a result, any change to either main.cpp or kernel.cpp triggers regeneration of the FPGA hardware image.
The following graph depicts this compilation process:
If you want to iterate on the host code and avoid a long compile time for your FPGA device, consider using a device link to separate the device and host compilation:
# device link command
icpx -fintelfpga -fsycl-link=image <input files> [options]
The compilation is a three-step process:
- Compile the device code.
icpx -fintelfpga -Xshardware -fsycl-link=image kernel.cpp -o dev_image.a
Input files must include all files that contain the device code. This step might take several hours to complete.
- Compile the host code.
icpx -fintelfpga main.cpp -c -o host.o
Input files should include all source files that contain only the host code. These files must not contain any source code that executes on the device but may contain setup and tear-down code, for example, parsing command-line options and reporting results. This step takes seconds to complete.
- Create the device link.
icpx -fintelfpga host.o dev_image.a -o fast_recompile.fpga
This step takes seconds to complete. The input should include one or more host object files (.o) and exactly one device image file (.a). When linking a static library (.a file), always list the library after the object files that use it. Otherwise, the library’s functions are discarded. For additional information about static library linking, refer to Library order in static linking.
The following diagram illustrates the device link process:
Refer to the fast_recompile tutorial for an example using the device link method.
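The three steps above can be sketched as a small script that reruns the device compilation only when kernel.cpp is newer than the existing device image. This is an illustrative sketch, not the fast_recompile tutorial's build system. The function names stale and rebuild and the RUN variable are hypothetical, and the file names are the ones used in the steps above. The timestamp check is deliberately simple and does not track headers that kernel.cpp includes, so a real project would more likely express these dependencies in a build tool.

```shell
#!/bin/sh
# Hypothetical helper script (not shipped with the compiler).
# RUN defaults to "echo" (dry run); set RUN to the empty string to execute.
RUN=${RUN-echo}

# Succeeds if target $1 is missing or older than source $2.
# (-nt is not required by POSIX test, but bash, dash, and ksh support it.)
stale() {
    [ ! -e "$1" ] || [ "$2" -nt "$1" ]
}

rebuild() {
    # Step 1: regenerate the device image only if kernel.cpp changed.
    # This is the step that might take several hours.
    if stale dev_image.a kernel.cpp; then
        $RUN icpx -fintelfpga -Xshardware -fsycl-link=image kernel.cpp -o dev_image.a || return 1
    fi
    # Step 2: recompile the host object only if main.cpp changed.
    if stale host.o main.cpp; then
        $RUN icpx -fintelfpga main.cpp -c -o host.o || return 1
    fi
    # Step 3: link. The device image (.a) is listed after the host object
    # that uses it, as the static linking note above requires.
    $RUN icpx -fintelfpga host.o dev_image.a -o fast_recompile.fpga
}
```

Calling rebuild in the default dry-run mode prints the commands it would run, which is a cheap way to confirm that a host-only edit will not trigger the device compilation before you run it with RUN set to the empty string.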
Using the -fsycl-device-code-split[=value] Option
The -fsycl-device-code-split[=value] option tells the compiler how to split your design into device code modules.
For details about this option, refer to the -fsycl-device-code-split option description in the Other Supported FPGA Flags tables in "FPGA Compilation Flags".
Which Mechanism to Use?
Of the mechanisms described above, the -reuse-exe flag mechanism is easier to use than the device link mechanism. The flag also allows you to keep your host and device code as a single source, which is preferred for small programs. For larger and more complex projects, the device link method gives you more control over the compiler’s behavior.
However, the -reuse-exe flag has some drawbacks compared to compiling separate files. Consider the following when using the -reuse-exe flag:
- The compiler must spend time partially recompiling and then analyzing the device code to ensure that it is unchanged. This takes several minutes for larger designs. Compiling separate files does not incur this extra time.
- You might occasionally encounter a false positive where the compiler incorrectly believes it must recompile your device code. In a single source file, the device and host code are coupled, so certain changes to the host code can change the compiler’s view of the device code. The compiler always behaves conservatively and triggers a full recompilation if it cannot prove that reusing the previous FPGA binary is safe. Compiling separate files eliminates this possibility.