Call Description Line
Call Description Line for CPU
In Intel® oneAPI Math Kernel Library (oneMKL) Verbose mode, each verbose-enabled function called from your application prints a call description line. The line begins with the MKL_VERBOSE character string and uses spaces as delimiters. The format of the rest of the line is subject to change in a future release.
The following table lists the information contained in a call description line for CPU applications in Verbose mode, with links to more information where available:
Information | Description | Related Links |
---|---|---|
The name of the function. | Although the name printed may differ from the name used in the source code of the application (for example, the cblas_ prefix of CBLAS functions is not printed), you can easily recognize the function by the printed name. | |
Values of the arguments. | | |
Time taken by the function. | | |
Value of the MKL_CBWR environment variable. | The value printed is prefixed with CNR: | |
Value of the MKL_DYNAMIC environment variable. | The value printed is prefixed with Dyn: | |
Status of the Intel® oneAPI Math Kernel Library (oneMKL) memory manager. | The value printed is prefixed with FastMM: | Avoiding Memory Leaks in oneMKL, for a description of the Intel® oneAPI Math Kernel Library (oneMKL) memory manager |
OpenMP* thread number of the calling thread. | The value printed is prefixed with TID: | |
Values of Intel® oneAPI Math Kernel Library (oneMKL) environment variables defining the general and domain-specific numbers of threads, separated by a comma. | The first value printed is prefixed with NThr: | |
The following is an example of a call description line (with OpenMP threading):
MKL_VERBOSE DGEMM(n,n,1000,1000,240,0x7ffff708bb30,0x7ff2aea4c000,1000,0x7ff28e92b000,240,0x7ffff708bb38,0x7ff28e08d000,1000) 1.66ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:16
The following is an example of a call description line (with TBB threading):
MKL_VERBOSE DGEMM(n,n,1000,1000,240,0x7ffff708bb30,0x7ff2aea4c000,1000,0x7ff28e92b000,240,0x7ffff708bb38,0x7ff28e08d000,1000) 1.66ms CNR:OFF Dyn:1 FastMM:1
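The layout of the two example lines above can be illustrated with a small log parser. This is a minimal sketch in Python, not part of oneMKL: the function name `parse_call_line`, the regular expressions, and the set of time units are assumptions inferred from the examples in this section, and because the format after the MKL_VERBOSE prefix is subject to change, the patterns may need adjusting for other releases.

```python
import re

# Assumed layout, inferred from the example lines above:
#   MKL_VERBOSE NAME(arg,arg,...) TIME KEY:VALUE KEY:VALUE ...
_CALL_RE = re.compile(r"^MKL_VERBOSE\s+(?P<name>\w+)\((?P<args>.*)\)\s*(?P<rest>.*)$")

# Only "ms" and "us" appear in the examples; "ns" and "s" are assumed.
_TIME_RE = re.compile(r"^(?P<value>\d+(?:\.\d+)?)(?P<unit>ns|us|ms|s)$")
_UNIT_TO_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}


def parse_call_line(line):
    """Split one call description line into its parts.

    Returns None for lines that do not look like a call description line.
    """
    match = _CALL_RE.match(line.strip())
    if match is None:
        return None

    result = {
        "function": match.group("name"),
        "args": match.group("args").split(","),
        "seconds": None,   # stays None if no time token is found
        "fields": {},      # prefixed values such as CNR, Dyn, FastMM, TID, NThr
    }
    for token in match.group("rest").split():
        time_match = _TIME_RE.match(token)
        if time_match:
            scale = _UNIT_TO_SECONDS[time_match.group("unit")]
            result["seconds"] = float(time_match.group("value")) * scale
        elif ":" in token:
            # Keep the raw text after the first colon; NThr may carry
            # several comma-separated values in one token.
            key, _, value = token.partition(":")
            result["fields"][key] = value
    return result


if __name__ == "__main__":
    example = (
        "MKL_VERBOSE DGEMM(n,n,1000,1000,240,0x7ffff708bb30,0x7ff2aea4c000,"
        "1000,0x7ff28e92b000,240,0x7ffff708bb38,0x7ff28e08d000,1000) "
        "1.66ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:16"
    )
    print(parse_call_line(example))
```

Because the prefixed fields are collected into a dictionary rather than read by position, the same sketch handles the TBB example, where TID: and NThr: are absent.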
The following information is not printed because of limitations of Intel® oneAPI Math Kernel Library (oneMKL) Verbose mode:
- Input values of parameters passed by reference if the values were changed by the function. For example, if a LAPACK function is called with a workspace query, that is, the value of the lwork parameter equals -1 on input, the call description line prints the result of the query and not -1.
- Return values of functions. For example, the value returned by the function ilaenv is not printed.
- Floating-point scalars passed by reference.
Call Description Line for GPU
In Intel® oneAPI Math Kernel Library (oneMKL) Verbose mode, each verbose-enabled function called from your application prints a call description line. The line begins with the MKL_VERBOSE character string and uses spaces as delimiters. The format of the rest of the line may change in a future release.
The following table lists the information contained in a call description line for GPU applications in Verbose mode.
Information | Description |
---|---|
The name of the function | Although the name printed may differ from the name used in the source code of the application, you can easily recognize the function by the printed name. |
The values of the arguments | |
Time taken by the function | |
Device index | The index of the GPU device on which the kernel is executed, printed after the character string "GPU" (for example, GPU0, GPU1, GPU2). Use the index to find the specific device in the GPU information lines. If the kernel is executed on the host CPU, this field is empty. |
The following is an example of a call description line:
MKL_VERBOSE FFT(dcfi64) 224.30us GPU0
For GPU applications, call description lines may be printed out of order; that is, the order of the lines in the verbose output may differ from the order in which the kernels are submitted. This can happen in the following two cases:
- Verbose is enabled without timing and the kernel executions stay asynchronous.
- The kernel is not executed on one of the GPU devices, but on the host CPU (the device index will not be printed in this case).
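Because the printed order is not guaranteed to match the submission order, it can be more useful to group GPU call description lines by device than to read them sequentially. The following Python sketch shows one way to do that. The helper name `group_by_device` and the regular expression are assumptions based on the single example line above, and the shape of a line printed without timing (the time token simply missing) is also an assumption.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the example line above:
#   MKL_VERBOSE NAME(args) [TIME] [GPU<index>]
# The time token is treated as optional to cover output without timing.
_GPU_LINE_RE = re.compile(
    r"^MKL_VERBOSE\s+(?P<name>\w+)\((?P<args>.*)\)"
    r"(?:\s+(?P<time>\d+(?:\.\d+)?[a-z]+))?"
    r"(?:\s+GPU(?P<device>\d+))?\s*$"
)


def group_by_device(lines):
    """Group call description lines by GPU device index.

    Calls without a device index (executed on the host CPU) are stored
    under the key None. Lines that do not match are skipped.
    """
    groups = defaultdict(list)
    for line in lines:
        match = _GPU_LINE_RE.match(line.strip())
        if match is None:
            continue
        device = match.group("device")
        key = int(device) if device is not None else None
        groups[key].append((match.group("name"), match.group("time")))
    return dict(groups)


if __name__ == "__main__":
    log = [
        "MKL_VERBOSE FFT(dcfi64) 224.30us GPU0",
        "MKL_VERBOSE FFT(dcfi64) 224.30us",  # hypothetical host-executed call
    ]
    for device, calls in group_by_device(log).items():
        label = "host CPU" if device is None else f"GPU{device}"
        print(label, calls)
```

Grouping this way does not recover the submission order; it only separates the lines by where the kernel ran, which is the one piece of ordering-independent information the line provides.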