Developer Guide for Intel® oneAPI Math Kernel Library Windows*

ID 766692
Date 10/31/2024
Public

Using oneMKL Verbose Mode

When building applications that call Intel® oneAPI Math Kernel Library (oneMKL) functions, it may be useful to determine:

  • which computational functions are called
  • what parameters are passed to them
  • how much time is spent to execute the functions
  • (for GPU applications) which GPU device the kernel is executed on

You can get an application to print this information to a standard output device by enabling Intel® oneAPI Math Kernel Library (oneMKL) Verbose. Functions that can print this information are referred to as verbose-enabled functions.

When Verbose mode is active in an Intel® oneAPI Math Kernel Library (oneMKL) domain, every call of a verbose-enabled function finishes by printing a human-readable line describing the call. However, if your application is terminated during the function call, no information for that function is printed. The first call to a verbose-enabled function also prints a version information line.

For GPU applications, the first call to a verbose-enabled function also prints one or more GPU information lines. These follow the version information line, which is printed for the host CPU. If more than one GPU is detected, each GPU device is printed on a separate line.

Verbose is implemented differently for CPU applications and GPU applications. With CPU applications, oneMKL Verbose has two modes: disabled (default) and enabled. With GPU applications, oneMKL Verbose has three modes: disabled (default), enabled without timing, and enabled with synchronous timing.

To change the verbose mode, either set the environment variable MKL_VERBOSE:

                         CPU application      GPU application
Set MKL_VERBOSE to 0     to disable Verbose   to disable Verbose
Set MKL_VERBOSE to 1     to enable Verbose    to enable Verbose without timing
Set MKL_VERBOSE to 2     to enable Verbose    to enable Verbose with synchronous timing

or call the support function mkl_verbose(int mode):

                         CPU application      GPU application
Call mkl_verbose(0)      to disable Verbose   to disable Verbose
Call mkl_verbose(1)      to enable Verbose    to enable Verbose without timing
Call mkl_verbose(2)      to enable Verbose    to enable Verbose with synchronous timing
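
The two tables above can be restated as a small lookup, which may help when logging which mode a run was started with. The following is a minimal sketch in C, not part of oneMKL: the helper names describe_verbose_mode and verbose_mode_from_env are hypothetical, and the code only mirrors the tables; it does not query the library.

```c
#include <stdlib.h>

/* Hypothetical helper (not a oneMKL function): returns the meaning of a
   verbose mode value according to the tables above.
   is_gpu is nonzero for a GPU application and zero for a CPU application. */
const char *describe_verbose_mode(int mode, int is_gpu)
{
    switch (mode) {
    case 0:
        return "Verbose disabled";
    case 1:
        return is_gpu ? "Verbose enabled without timing"
                      : "Verbose enabled";
    case 2:
        return is_gpu ? "Verbose enabled with synchronous timing"
                      : "Verbose enabled";
    default:
        return "value not listed in the tables";
    }
}

/* Hypothetical helper: reads MKL_VERBOSE from the environment and returns
   its integer value. Returns 0 when the variable is unset, matching the
   documented default (disabled). How oneMKL itself treats malformed values
   is not covered here; atoi simply yields 0 for non-numeric text. */
int verbose_mode_from_env(void)
{
    const char *value = getenv("MKL_VERBOSE");
    return value ? atoi(value) : 0;
}
```

A caller could pass describe_verbose_mode(verbose_mode_from_env(), 1) to its own logging routine to record how a GPU run would interpret the current environment setting.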

Verbose with CPU Applications

For CPU applications, Verbose output consists of a version information line and call description lines.

For CPU applications, you can enable oneMKL Verbose mode in these domains:

  • BLAS (and BLAS-like extensions)
  • LAPACK
  • ScaLAPACK (selected functionality)
  • FFT
  • RNG

Verbose with GPU Applications

The verbose feature is available for GPU applications that use the DPC++ API or the C/Fortran API with OpenMP offload. For GPU applications, the verbose mode setting also controls whether execution time is measured. Timing is taken synchronously, so if verbose is enabled with timing, kernel executions become synchronous (a previous kernel blocks later kernels).

For GPU applications, Verbose output consists of a version information line, GPU information lines, and call description lines.

NOTE:
Timing for GPU applications is reported for overall execution. For selected functionality, device execution time can also be reported if the input queue was created with profiling information (see the oneAPI GPU Optimization Guide).

For GPU applications, you can enable oneMKL Verbose mode in these domains:

  • BLAS (and BLAS-like extensions)
  • LAPACK
  • FFT
  • RNG (DPC++ API only)

For Both CPU and GPU Verbose

Enabling or disabling Verbose mode with the function call takes precedence over the environment setting. For a full description of the mkl_verbose function, see either the Intel® oneAPI Math Kernel Library Developer Reference for C or the Intel® oneAPI Math Kernel Library Developer Reference for Fortran. Both references are available in the Intel® Software Documentation Library.
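
Because the function call overrides the environment setting, an application can switch Verbose on only around a region of interest. The following is a minimal sketch in C under stated assumptions: it assumes that mkl_verbose returns the previous mode on success and a negative value on failure (check the Developer Reference entry before relying on this), the helper names verbose_begin and verbose_end are hypothetical, and HAVE_MKL is a hypothetical build flag. Without that flag, a local stand-in replaces the library function so that the sketch compiles without oneMKL.

```c
#ifdef HAVE_MKL                /* hypothetical build flag, an assumption */
#include <mkl.h>               /* declares mkl_verbose in the C interface */
#else
/* Stand-in with the same shape as mkl_verbose(int mode). It only records
   the mode and returns the previous one; it is NOT the library code. */
static int stub_mode = 0;
static int mkl_verbose(int mode)
{
    int previous = stub_mode;
    stub_mode = mode;
    return previous;
}
#endif

/* Hypothetical helper: switches Verbose to `mode` and returns the value
   reported by mkl_verbose, assumed to be the previous mode (or negative
   on failure), so that the caller can restore it later. */
int verbose_begin(int mode)
{
    return mkl_verbose(mode);
}

/* Hypothetical helper: restores the mode saved by verbose_begin.
   A negative value means the earlier call failed, so nothing is restored. */
void verbose_end(int previous)
{
    if (previous >= 0) {
        mkl_verbose(previous);
    }
}

/* Usage sketch:
 *     int saved = verbose_begin(1);
 *     ... calls to verbose-enabled functions ...
 *     verbose_end(saved);
 */
```

Since the mode is shared by the whole process, a pattern like this is best applied from a single thread while no other thread is changing the mode.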

Intel® oneAPI Math Kernel Library (oneMKL) Verbose mode is a global state, not a thread-local one. Consequently, if an application changes the mode from multiple threads, the result is undefined.

WARNING:
The performance of an application may degrade with the Verbose mode enabled, especially when the number of calls to verbose-enabled functions is large, because every call to a verbose-enabled function requires an output operation.