Where to Find the Release
Intel® Distribution for GDB* is available as part of the Intel® oneAPI Base Toolkit. To download the Intel® oneAPI Base Toolkit and learn more about toolkits, visit the Intel® Toolkits main page.
Visit Intel® Toolkit and Component Versioning Schema for more information about semantic versioning and how it is used with Intel oneAPI.
Release Notes for Linux* OS
Major Features
- Multi-target: The debugger can orchestrate multiple targets of different architectures. This feature allows you to debug the "host" portion and the "kernel" of a DPC++ program in the same GDB* session.
- Auto-attach: The debugger automatically creates an inferior that attaches itself to the Intel® Graphics Technology target to be able to receive events and control the GPU for native debugging.
- Thread SIMD lanes: The debugger displays SIMD lane information for the GPU threads on the command line interface. You can switch between lanes.
- Support for debugging a kernel offloaded to a CPU, GPU, or FPGA-emulation device.
- The debugger is based on the GDB* 10.2 release.
Key Capabilities
- Support for Intel® Iris® Xe graphics.
- Automatically detecting JIT-compiled, or dynamically loaded, kernel code for debugging.
- Defining breakpoints (both inside and outside of a kernel) to halt the execution of the program.
- Inspecting and changing the values of variables in the application.
- Inspecting and changing register values.
- Listing the threads; switching the current thread context.
- Listing active SIMD lanes; switching the current SIMD lane context per thread.
- Evaluating and printing the values of expressions in multiple threads and SIMD lane contexts.
- Disassembling the machine instructions.
- Displaying and navigating the function call-stack.
- Source- and instruction-level stepping.
- Non-stop and all-stop debug mode.
- Recording the execution using Intel® Processor Trace.
- Printing of Intel® Processor Trace PTWRITE payloads in the instruction history and function-call history.
- Converting Intel® Transactional Synchronization Extensions (Intel® TSX) abort reasons into human-readable format.
- Displaying the shadow stack backtrace and Intel® Control-flow Enforcement Technology Debugging status information.
System Requirements
- General hardware requirements: Intel® oneAPI Base Toolkit System Requirements.
- Specific system requirements: Intel® Distribution for GDB* System Requirements.
Documentation
- To set up the debugger, refer to the Get Started with Debugging Data Parallel C++.
- To follow basic debugging scenarios, refer to the Tutorial.
- To see common Intel® Distribution for GDB* commands, refer to the Reference Sheet.
Changes in the 2021.4 Release
- The debugger has been rebased on the GDB 10.2 release.
- In the output of the info threads command, GPU threads that are running or have exited appear to have a program counter (i.e. instruction pointer) of 0x0000.
- Two pseudo-registers, ip and framedesc, are available in the info registers command output.
- The inferior that represents the GPU kernel no longer appears if the GPU is not used by the program.
- The "default" thread that used to appear in the GPU inferior (for example, the thread with id 1610612736) is eliminated.
- Additional virtual debug registers can be inspected for GPU offload, for example, btbase and scrbase. These registers do not have a hardware counterpart and expose an internal GPU state.
- Basic debugging is supported for the OpenMP* Fortran CPU/GPU offload scenario.
- For OpenMP CPU offload, when the Intel® oneAPI Level Zero (Level Zero) loader libraries are installed, set the following environment variable to use the OpenCL™ platform CPU back end: LIBOMPTARGET_PLUGIN=opencl.
- A new convenience variable, $_simd_lane, defines the selected SIMD lane of the current thread.
- A new code sample, Jacobi, is available. It contains a few intentionally added bugs that you can find with the debugger.
- Debugging a kernel offloaded to a GPU in Microsoft Visual Studio Code* is experimentally supported.
- Various bugs are fixed:
- Scoped local variables inside inlined functions are no longer displayed multiple times.
- A host thread no longer disappears during stepping.
- The "jump" command is no longer refused when used inside a kernel.
- Support for Intel® Advanced Matrix Extensions (Intel® AMX) in GDB for the CPU formerly code named Golden Cove and the platform formerly code named Sapphire Rapids. This support enables Intel® AMX registers to be observed as matrices in GDB. For more information about Intel® AMX, see the Intel® AMX chapter in the Intel® Architecture Instruction Set Extensions Programming Reference. Known Intel® AMX limitations:
- The jump command, an inferior call, or any manual change of RIP during AMX execution might not work.
- For code compiled with the -m32 flag, core files are missing three registers: fs_base, gs_base, and orig_rax.
- Support for debugging programs that offload kernels to multiple GPUs. The multi-GPU feature is not available in non-stop mode.
- Support for producing and debugging Intel® Control-Flow Enforcement Technology (Intel® CET) core dumps. For more details on Intel CET, see the Intel® 64 and IA-32 Architectures Software Developer Manuals.
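The OpenMP CPU offload item in the list above names one environment variable, LIBOMPTARGET_PLUGIN=opencl. As a minimal sketch of how that setting might be applied before a debug session: the launch command gdb-oneapi and the program name ./my_omp_app are assumptions, not taken from these notes, so the launch line is left commented out.

```shell
# Select the OpenCL CPU back end for OpenMP CPU offload, as described in the
# release note above. This applies to every command started from this shell.
export LIBOMPTARGET_PLUGIN=opencl

# Assumed launch line: gdb-oneapi is taken to be the debugger executable and
# ./my_omp_app is a placeholder program name. Uncomment and adapt as needed.
# gdb-oneapi --args ./my_omp_app

# Confirm what the debugged program would inherit.
echo "LIBOMPTARGET_PLUGIN=${LIBOMPTARGET_PLUGIN}"
```

Because the variable is exported, it stays set for the rest of the shell session; run unset LIBOMPTARGET_PLUGIN to return to the default plugin selection.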
Known Issues and Limitations
- For OpenMP #pragma omp single blocks in C++, private variables cannot be inspected and breakpoints may occasionally not hit the expected line.
- During expression evaluation, an element of an accessor object cannot be accessed using the multi-dimensional access syntax. For example, the following expression produces an error:
(gdb) print anAccessor[5][3][4]
Cannot resolve function operator[] to any overloaded instance
You can use the id object instead:
(gdb) print workItemId
$1 = cl::sycl::id<3> = {5, 3, 4}
(gdb) print anAccessor[workItemId]
$2 = 1234
On GPU devices:
- On Intel® Core™ processors with an Intel® Iris® Xe graphics device, when EU fusion takes place, only one of the two fused threads is available for debugging. EU fusion can be disabled by using the following environment variables:
$ export NEOReadDebugKeys=1
$ export CFEFusedEUDispatch=1
For details about these environment variables, please see Frequently asked questions on compute-runtime.
- Stepping over the last instruction of a GPU kernel may occasionally cause slow or no response of the debugger.
- Inferior calls (invocation of kernel functions from inside the debugger for expression evaluation) are not supported.
- GDB might occasionally return the message "Cannot execute this command while the target is running". Ignore the message as it should not affect further debugging.
- If you define a breakpoint at a location before a kernel (inside the host code), the breakpoint is also defined at the start of the kernel. This is similar to defining a breakpoint at a comment line or an empty line: in these cases, the breakpoint is defined for the next source line.
- If the current SIMD lane becomes inactive within a thread, the thread might silently switch its current SIMD lane to its first active SIMD lane.
- For Gen12 devices, the number of SIMD lanes is incorrectly shown.
- The next and step commands may take several seconds to complete execution.
- Multi-GPU debugging is not supported in the non-stop mode.
- Inspecting shared-local-memory (SLM) is not supported.
- Applications that use unified shared memory (USM) may appear to raise a SIGSEGV when USM-allocated memory is accessed. The runtime uses this mechanism to trigger memory migration. In such cases, sending the signal back to the application resumes the program. To do this, use GDB's signal SIGSEGV command.
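The EU fusion workaround in the list above uses export, which keeps both variables set for every later command in the same shell. As a minimal sketch, assuming you would rather limit them to a single command, a small wrapper can pass them through env. The function name run_without_eu_fusion is hypothetical and not part of the product.

```shell
# Hypothetical helper: run one command with the two EU fusion variables from
# the release note set only for that command. The calling shell's own
# environment is left unchanged because env applies them to the child only.
run_without_eu_fusion() {
    env NEOReadDebugKeys=1 CFEFusedEUDispatch=1 "$@"
}

# Demonstration with a harmless child process that prints what it inherited.
run_without_eu_fusion sh -c 'echo "NEOReadDebugKeys=$NEOReadDebugKeys CFEFusedEUDispatch=$CFEFusedEUDispatch"'
```

The same wrapper could be placed in front of the debugger launch command instead of the demonstration line. Whether scoping the variables this way suits your workflow depends on how the application under debug is started.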
Release Notes for Windows* OS
Major Features
- Support for debugging a kernel offloaded to a CPU, GPU, or FPGA-emulation device.
- Integration into Microsoft Visual Studio* interface to enable GPU remote debugging. This feature allows you to debug the "host" portion and the "kernel" of a DPC++ program in the same Visual Studio remote debugging session.
Key Capabilities
- Inserting a breakpoint inside a kernel and stopping when the breakpoint is hit by a thread.
- Inspecting local variables.
- Source-level stepping.
- Backtracing function calls.
- Examining threads.
- Reading registers.
- Disassembling.
- Automatic launch of gdbserver-gt.
- Support for debugging OpenCL and DPC++ programs.
- Support for debugging OpenMP C++ kernel code for CPU and GPU offloads.
System Requirements
- General hardware requirements: Intel® oneAPI Base Toolkit System Requirements.
- Specific system requirements: Intel® Distribution for GDB* System Requirements.
- GPU driver: You can download it at Intel® Graphics - Windows® 10 DCH Driver.
Documentation
To set up the debugger, refer to the Get Started with Debugging Data Parallel C++. If you prefer a video format, refer to the getting started video.
Changes in the 2021.3 Release
- Additional virtual debug registers can be inspected for GPU offload, for example, btbase and scrbase. These registers do not have a hardware counterpart and expose an internal GPU state.
- A new convenience variable, $_simd_lane, defines the selected SIMD lane of the current thread.
- A new code sample, Jacobi, is available. It contains a few intentionally added bugs that you can find with the debugger.
- Various bugs are fixed:
- In the Visual Studio plugin, a message box is no longer displayed after the kernel offload.
- In the Visual Studio plugin, for remote offload debugging, the connection failure message box is shown on the top now. Previously, it was hidden in the Windows taskbar.
- The undefined pointer/array issue for allocated arrays is fixed in the FEE Visual Studio extension.
- Support for MMX and SSE registers (XMM0..7) is added to the FEE Visual Studio extension. The only limitation so far is that the values of SSE registers are not expandable in the Watch window.
Known Issues and Limitations
On GPU devices:
- Stepping over the last instruction of a GPU kernel may occasionally cause slow or no response of the debugger.
- The target system might lose the network connection after the debugging session is started. In this case, reboot the target system.
- The Microsoft Visual Studio* interface does not support viewing SIMD lanes.
- In Microsoft Visual Studio*, the following message might be displayed: "Cannot execute this command while the target is running". Ignore the message as it should not affect further debugging.
- If you define a breakpoint at a location before the kernel (inside the host code), the breakpoint is also defined at the start of the kernel. This is similar to defining a breakpoint at a comment line or an empty line: in these cases, the breakpoint is defined for the next source line.
- After hitting a breakpoint defined before a kernel, not all threads hit a breakpoint defined inside the kernel.
- Single steps on the last line of a kernel lead to the termination of the program being debugged.
- For debugging on a GPU, you should use a grid size of 256 or fewer work items.
- Inferior calls (invocation of kernel functions from inside the debugger for expression evaluation) are not supported.
- Kernel functions are inlined by the compiler. Breakpoints on calls to inlined functions may not be hit. Try placing breakpoints before the call or inside the called function.
- For OpenMP, you need to start gdbserver-gt manually with an IPv4 address:
$ gdbserver-gt --hostpid=1 --attach 127.0.0.1:1234 1
On CPU and GPU devices:
- Debugging OpenMP kernel code for Fortran (#pragma omp target) is not supported.
- For OpenMP #pragma omp single blocks in C++, private variables cannot be inspected and breakpoints may occasionally not hit the expected line.
On FPGA emulator devices:
- For debugging on an FPGA emulator, ahead-of-time (AoT) compilation is not supported. Ensure that Enable FPGA Workflows under Project Properties > DPC++ is set to No.
- For debugging on an FPGA emulator, OpenMP is not supported.
On CPU devices and FPGA emulator devices:
- It is recommended to disable the GPU debugger for smooth debugging on CPU and FPGA emulator devices.
Notices and Disclaimers
Intel technologies may require enabled hardware, software or service activation.
No product or component can be absolutely secure.
Your costs and results may vary.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.