Debug Options
Auto-Attach
The auto-attach feature enables listening to debug events from the GPU.
When the feature is enabled, the debugger launches a gdbserver-ze process to listen to GPU debug events and connects to it. For each device on the system, an inferior is created on the gdbserver-ze connection. This lets you debug kernels offloaded to the GPU.
The auto-attach feature is enabled by default.
The feature does not affect debugging on the CPU device. To eliminate the extra output that it produces, you can turn it off by setting the INTELGT_AUTO_ATTACH_DISABLE environment variable. Run the following command in the shell before starting gdb-oneapi:
export INTELGT_AUTO_ATTACH_DISABLE=1
To enable the feature again:
unset INTELGT_AUTO_ATTACH_DISABLE
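If you prefer not to change the environment of your shell, the variable can also be set for a single debugger process only. The following Python sketch illustrates one way to do this; the helper names env_without_auto_attach and launch_debugger are hypothetical, and the sketch assumes that gdb-oneapi is on the PATH and accepts the standard GDB --args option.

```python
import os
import subprocess


def env_without_auto_attach(base_env=None):
    """Return a copy of the environment with auto-attach disabled.

    The caller's environment (or base_env) is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["INTELGT_AUTO_ATTACH_DISABLE"] = "1"
    return env


def launch_debugger(program, *program_args):
    """Start gdb-oneapi on `program` with auto-attach disabled for this run.

    Assumption: gdb-oneapi is on the PATH. Returns the debugger's exit code.
    """
    cmd = ["gdb-oneapi", "--args", program, *program_args]
    return subprocess.call(cmd, env=env_without_auto_attach())
```

Because only the child process receives the modified environment, later debugger sessions started from the same shell still have auto-attach enabled.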
Reducing Overhead
The info threads command can take a noticeable amount of time to complete when debugging GPUs, because it fetches data for a large number of threads. To reduce the overhead, run info threads without printing frame arguments. This can be achieved by changing the setting globally using
set print frame-arguments none
or by using
with print frame-arguments none -- info threads
for a single command execution.
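If you use the single-command form often, it can be wrapped in a user-defined command through the GDB Python API. The sketch below is one possible approach; the command name info-threads-brief and the helper brief_info_threads are hypothetical, and the file is assumed to be loaded in the debugger with the source command.

```python
try:
    import gdb  # Only available inside the debugger's Python interpreter.
except ImportError:
    gdb = None  # Allows the helper below to be imported outside the debugger.


def brief_info_threads(args=""):
    """Build the command line that lists threads without frame arguments."""
    return ("with print frame-arguments none -- info threads " + args).strip()


if gdb is not None:

    class InfoThreadsBrief(gdb.Command):
        """List threads without printing frame arguments (hypothetical command)."""

        def __init__(self):
            super().__init__("info-threads-brief", gdb.COMMAND_STATUS)

        def invoke(self, args, from_tty):
            # Extra arguments such as -stopped are passed on to info threads.
            gdb.execute(brief_info_threads(args), from_tty)

    InfoThreadsBrief()
```

After loading the file, running info-threads-brief -stopped would execute info threads -stopped with frame arguments suppressed, without changing the global setting.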
Pretty-Printing
The pretty-printing feature simplifies the display of complex objects. If a pretty-printer is registered for the type of the value you print, the debugger uses it to format the output. Otherwise, the debugger prints the value in its raw form.
Intel® Distribution for GDB* supports pretty-printing for the SYCL* types accessor, buffer, device, exception, handler, id, item, local_accessor, queue and range from the sycl namespace.
You can write your own pretty-printer for any type. Refer to Writing a Pretty Printer for more information.
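As a minimal sketch of what a custom pretty-printer can look like, the following Python file registers a printer through the GDB Python API. The type Point3D (assumed to be a struct with integer members x, y, and z) and the names Point3DPrinter and lookup_point3d are hypothetical and serve only as an illustration.

```python
try:
    import gdb  # Only available inside the debugger's Python interpreter.
except ImportError:
    gdb = None  # Allows the classes below to be imported outside the debugger.


class Point3DPrinter:
    """Pretty-printer for a hypothetical `struct Point3D { int x, y, z; };`."""

    def __init__(self, val):
        self.val = val

    def to_string(self):
        # Members are read by name from the value being printed.
        return "Point3D(%s, %s, %s)" % (self.val["x"], self.val["y"], self.val["z"])


def lookup_point3d(val):
    """Return a printer for Point3D values, or None for any other type."""
    if val.type.strip_typedefs().tag == "Point3D":
        return Point3DPrinter(val)
    return None


if gdb is not None:
    # Register the lookup function in the list of global pretty-printers.
    gdb.pretty_printers.append(lookup_point3d)
```

Once the file is loaded in the debugger with the source command, printing a Point3D value would be expected to show the compact form returned by to_string instead of the raw member list.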
To list the available pretty-printers, run the following command:
info pretty-printer
Example output:
global pretty-printers:
  builtin
    workitem
  libsycl
    sycl::_V1::accessor
    sycl::_V1::buffer
    sycl::_V1::device
    sycl::_V1::exception
    sycl::_V1::handler
    sycl::_V1::id
    sycl::_V1::item
    sycl::_V1::local_accessor
    sycl::_V1::queue
    sycl::_V1::range
Pretty-printing is enabled by default. For example, print the sycl::id<2> variable wiID:
print wiID
The output is the following:
$1 = sycl::id = {29, 16}
To bypass pretty-printing for a single print command and display the raw content instead, use the /r flag:
print /r wiID
Example output:
$2 = {<sycl::_V1::detail::array<2>> = {common_array = {29, 16}}, static dimensions = <optimized out>}
To disable all pretty-printers, use the following command:
disable pretty-printer
To enable pretty-printers:
enable pretty-printer
Prettify Frames
Some C++ template and SYCL constructs produce multi-line function names, which make the output of info threads and backtrace hard to read.
You can use frame filters to change how frames are printed by the backtrace command. For details, refer to the Frame Filter API section of the GDB documentation.
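The following Python sketch shows one possible frame filter that shortens long function names by collapsing template argument lists. It follows the structure described in the GDB Frame Filter API; the names collapse_template_args, ShortNameDecorator, ShortNameFilter, and the filter name short-template-names are hypothetical.

```python
try:
    import gdb  # Only available inside the debugger's Python interpreter.
    from gdb.FrameDecorator import FrameDecorator
except ImportError:
    gdb = None
    FrameDecorator = object  # Placeholder so the file can be imported outside GDB.


def collapse_template_args(name):
    """Replace each outermost template argument list with <...>.

    Limitation of this sketch: names containing operator< or operator<<
    are not handled specially.
    """
    out = []
    depth = 0
    for ch in name:
        if ch == "<":
            if depth == 0:
                out.append("<...>")
            depth += 1
        elif ch == ">":
            if depth > 0:
                depth -= 1
            else:
                out.append(ch)  # A stray '>' such as in operator->.
        elif depth == 0:
            out.append(ch)
    return "".join(out)


class ShortNameDecorator(FrameDecorator):
    """Frame decorator that reports a shortened function name."""

    def function(self):
        name = super().function()
        # function() may return an address instead of a name; leave it as-is.
        return collapse_template_args(name) if isinstance(name, str) else name


class ShortNameFilter:
    """Frame filter that wraps every frame in ShortNameDecorator."""

    def __init__(self):
        self.name = "short-template-names"
        self.priority = 100
        self.enabled = True
        gdb.frame_filters[self.name] = self

    def filter(self, frame_iter):
        return map(ShortNameDecorator, frame_iter)


if gdb is not None:
    ShortNameFilter()
```

Collapsing only the outermost argument lists keeps the function and class names visible while removing the nested type details that usually cause the line wrapping.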
You can also change how frames are printed globally with the print frame-info setting. For example, the source-line value prints the line number and the source line of each frame instead of the function name:
(gdb) set print frame-info source-line
(gdb) info threads -stopped
  Id              Target Id    Frame
  <...>
  2.1:[0-15]      ZE 0.0.0.0   53    size_t id0 = GetDim(index, 0);
* 2.9:[*0 1-15]   ZE 0.0.1.0   53    size_t id0 = GetDim(index, 0);
  2.33:[0-15]     ZE 0.0.4.0   53    size_t id0 = GetDim(index, 0);
  2.41:[0-15]     ZE 0.0.5.0   53    size_t id0 = GetDim(index, 0);
  <...>
Refer to the GDB documentation for more information.
Print Settings for Kernel Data
Consider the following excerpt from the sample program array-transform.cpp:
18 using namespace std;
19 using namespace sycl;
[...]
26 int main(int argc, char *argv[]) {
27 constexpr size_t length = 64;
28 int input[length];
29 int output[length];
30
31 // Initialize the input
32 for (int i = 0; i < length; i++)
33 input[i] = i + 100;
34
35 try {
36 queue q(default_selector_v, dpc_common::exception_handler);
[...]
43 range data_range{length};
44 buffer buffer_in{input, data_range};
45 buffer buffer_out{output, data_range};
46
47 q.submit([&](auto &h) {
48 accessor in(buffer_in, h, read_only);
49 accessor out(buffer_out, h, write_only);
50
51 // kernel-start
52 h.parallel_for(data_range, [=](id<1> index) {
53 size_t id0 = GetDim(index, 0);
54 int element = in[index]; // breakpoint-here
Inside the kernel, data in the host's sycl::buffer must be accessed through a sycl::accessor. To review the contents of the sycl::buffer object buffer_in, which holds 64 elements, from inside the kernel, print the sycl::accessor object named in. Its pretty-printer provides easy access to the data:
(gdb) p in
$1 = sycl::accessor read range 64 = {100, 101, 102, 103, 104, 105, 106,
107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120,
121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134,
135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148,
149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162,
163}
Use the print elements setting to limit the number of elements that are printed:
(gdb) set print elements 10
(gdb) p in
$2 = sycl::accessor read range 64 = {100, 101, 102, 103, 104, 105, 106,
107, 108, 109...}
If the output contains repeated elements, consider the print repeats setting, which sets the threshold for collapsing runs of identical elements:
(gdb) show print repeats
Threshold for repeated print elements is 10.