Graph Compiler
oneDNN Graph Compiler is an experimental backend for the oneDNN Graph API. It can generate optimized implementations for complex computational graphs, including multi-head attention (MHA), multi-layer perceptron (MLP), and convolution residual blocks, over typical data types for both inference and training. It also improves performance by providing more flexible operator fusion.
Use of oneDNN Graph Compiler is transparent for applications, as it does not involve API or programming model changes.
Build-Time Controls
The following build-time options take effect only when ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_BACKEND is ON.
CMake Option | Supported values (defaults in bold) | Description |
---|---|---|
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_JIT | llvm, c, builtin | Selects the CPU codegen and JIT to be built by graph compiler backend. Multiple codegen approaches can be used simultaneously. See the example for setting multiple codegen methods. |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_LLVM_CONFIG | **AUTO**, path to llvm-config binary | Defines the method for detecting and configuring LLVM. |
Codegen and JIT Options
The graph compiler backend supports several codegen and JIT options: C, LLVM, and builtin (xbyak). Users can build a subset of the available options by setting the ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_JIT option to a semicolon-separated list.
cmake .. -DONEDNN_BUILD_GRAPH=ON -DONEDNN_EXPERIMENTAL_GRAPH_COMPILER_BACKEND=ON -DONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_JIT="c;builtin"
This builds only the c and builtin codegen options.
cmake .. -DONEDNN_BUILD_GRAPH=ON -DONEDNN_EXPERIMENTAL_GRAPH_COMPILER_BACKEND=ON -DONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_JIT="llvm;c;builtin"
This builds all three codegen options.
C
C codegen generates temporary cpp files and invokes g++ to compile them into executable code. It is useful for debugging because the generated code is easier for developers to read.
LLVM
LLVM codegen generates LLVM-IR in memory. It provides the best performance among all supported codegen methods. Choosing LLVM codegen adds a dependency on LLVM; if LLVM cannot be found, CMake reports an error.
Users can set ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_LLVM_CONFIG to specify which LLVM installation to integrate. By default, it is set to AUTO, which auto-detects an existing LLVM in the environment. If auto-detection fails, or the user wants to select a specific LLVM version explicitly, set it to the path of the llvm-config binary.
Users can build and install LLVM from source by following the LLVM project's guidelines, or download and install a pre-built binary.
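As a sketch of the explicit configuration described above, the following helper checks that an llvm-config binary exists before passing it to CMake. The function name configure_with_llvm is hypothetical and introduced only for illustration; the CMake options themselves are the ones listed in Build-Time Controls.

```shell
# Hypothetical helper: configure the build against a specific llvm-config binary.
# $1 is the path to llvm-config (supply the path of your own LLVM installation).
configure_with_llvm() {
    llvm_config_bin="$1"
    # Fail early with a readable message instead of waiting for a CMake error.
    if [ ! -x "$llvm_config_bin" ]; then
        echo "llvm-config not found or not executable: $llvm_config_bin" >&2
        return 1
    fi
    # Run from the build directory, as in the cmake examples above.
    cmake .. \
        -DONEDNN_BUILD_GRAPH=ON \
        -DONEDNN_EXPERIMENTAL_GRAPH_COMPILER_BACKEND=ON \
        -DONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_JIT="llvm;c;builtin" \
        -DONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_LLVM_CONFIG="$llvm_config_bin"
}
```

Call it from the build directory with the path to your llvm-config binary as the only argument.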
Builtin
The builtin codegen and JIT method is implemented with xbyak. Unlike C or LLVM codegen, it has no extra dependency.
Environment Variables
The following environment variables are introduced by the graph compiler backend.
Environment Variable | Value | Description |
---|---|---|
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_JIT | llvm | Uses LLVM as codegen and JIT method |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_JIT | builtin | Uses builtin as codegen and JIT method |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_JIT | c | Uses C as codegen and JIT method |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_OPT_LEVEL | 0 | Turns off optimization passes and sets the compilation optimization level to 0 in C and LLVM JIT |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_OPT_LEVEL | 1, 2, 3 | Sets the compilation optimization level of C and LLVM JIT |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_KERNEL_TRACE | 0 | No kernel execution trace output |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_KERNEL_TRACE | 1, 1,stderr or 1,filename.json | Generates a kernel execution trace in Chrome tracing format, written to the default file, to stderr, or to the file specified by the given filename |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_PRINT_PASS_RESULT | 0 | No IR output after each graph or tensor IR pass |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_PRINT_PASS_RESULT | 1 | Prints the output IR of each graph and tensor IR pass |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_VERBOSE | 0 | No verbose output |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_VERBOSE | 1 | Prints warning messages during compilation |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_VERBOSE | 2 | Prints warning messages and info logs (e.g. fusion-related information) during compilation |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_DUMP_GENCODE | path_to_dump | Dumps the generated kernel in C |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_C_INCLUDE | path_to_c_codegen_header | Specifies the C codegen header for JIT compilation |
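Several of these variables can be combined for a single run. The wrapper below is a minimal sketch that enables verbose logging and per-pass IR printing for one command only, leaving the calling shell's environment untouched. The function name run_with_gc_diagnostics is hypothetical; the variable names and values come from the table above.

```shell
# Hypothetical helper: run a command with graph compiler diagnostics enabled.
# env sets the variables only for the launched command, not for the calling shell.
run_with_gc_diagnostics() {
    env ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_VERBOSE=2 \
        ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_PRINT_PASS_RESULT=1 \
        "$@"
}
```

For example, `run_with_gc_diagnostics ./application` would run the application with warnings, info logs, and the IR after each pass enabled.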
Enable Tracing
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_KERNEL_TRACE=1 ./application
This produces a kernel execution trace in JSON format, stored at the default destination: ./sctrace.json.
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_KERNEL_TRACE=1,stderr ./application
This dumps a kernel execution trace to the stderr stream.
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_KERNEL_TRACE=1,/tmp/filename.json ./application
This produces a kernel execution trace in JSON format, stored at the user-specified path /tmp/filename.json.
Switch Between Different Codegen Methods
By default, codegen methods are prioritized, from highest to lowest, as llvm, c, builtin. When multiple codegen and JIT methods are enabled at the build stage, the method with the highest priority is used at runtime by default.
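The priority rule can be illustrated with a small shell function. This is only a sketch of the selection logic as described above, not the backend's actual implementation; the function name default_jit is hypothetical, and its argument is the same semicolon-separated list passed to ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_JIT at build time.

```shell
# Hypothetical illustration: given the methods enabled at build time
# (e.g. "c;builtin"), print the one that would be the runtime default
# under the stated priority order llvm, c, builtin.
default_jit() {
    for method in llvm c builtin; do
        case ";$1;" in
            # Surrounding semicolons ensure whole-name matches only.
            *";$method;"*) echo "$method"; return 0 ;;
        esac
    done
    # No known method was found in the list.
    return 1
}
```

Under this rule, a build configured with "c;builtin" would default to c, while a build with all three methods would default to llvm.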
Users can switch to a different codegen method at runtime by setting ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_JIT.
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_JIT=builtin ./application
This will switch the CPU codegen and JIT method to builtin (xbyak).
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_JIT=c ./application
This will switch the CPU codegen and JIT method to c.
When the C codegen option is used, the generated C code relies on the runtime function declarations in cpu_include.hpp. The ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_C_INCLUDE environment variable specifies the corresponding include path. Normally, the include path is set automatically at the CMake build stage. If the error message "environment variable ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_C_INCLUDE is not set" appears, set ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_C_INCLUDE manually to /path_to_onednn_repo/src/graph/backend/graph_compiler/core/src.
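If the include path has to be set manually, one way is to derive it from the location of the oneDNN source tree. The following is a minimal sketch; the function name run_with_c_codegen is hypothetical, and the first argument stands in for /path_to_onednn_repo, which must be replaced with the actual checkout location.

```shell
# Hypothetical helper: run a command with C codegen selected and the
# C include path derived from the oneDNN repository root given as $1.
run_with_c_codegen() {
    onednn_repo="$1"
    shift
    env ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_CPU_JIT=c \
        ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_C_INCLUDE="$onednn_repo/src/graph/backend/graph_compiler/core/src" \
        "$@"
}
```

For example, `run_with_c_codegen /path_to_onednn_repo ./application` would launch the application with both variables set for that run only.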
Enable Code Dumping
Users can set the ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_DUMP_GENCODE variable to generate offline C kernels.
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_DUMP_GENCODE="./dump_code" ./application
This dumps the generated C kernels to the dump_code folder.