Developer Guide

Developer Guide for Intel® oneAPI Math Kernel Library Windows*

ID 766692
Date 6/24/2024
Public

MKL_NUM_STRIPES

The MKL_NUM_STRIPES environment variable controls the Intel® oneAPI Math Kernel Library (oneMKL) threading algorithm for ?gemm functions. When MKL_NUM_STRIPES is set to a positive integer value nstripes, oneMKL tries to use nstripes partitions along the leading dimension of the output matrix.

The following table explains how the value nstripes of MKL_NUM_STRIPES defines the partitioning algorithm that oneMKL uses for the ?gemm output matrix; max_threads_for_mkl denotes the maximum number of OpenMP threads for oneMKL:

| Value of MKL_NUM_STRIPES | Partitioning Algorithm |
|---|---|
| 1 < nstripes < (max_threads_for_mkl/2) | 2D partitioning with the number of partitions equal to nstripes: horizontal for column-major ordering, vertical for row-major ordering. |
| nstripes = 1 | 1D partitioning algorithm along the opposite direction of the leading dimension. |
| nstripes ≥ (max_threads_for_mkl/2) | 1D partitioning algorithm along the leading dimension. |
| nstripes < 0 | The default Intel® oneAPI Math Kernel Library (oneMKL) threading algorithm. |
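
The table can be read as a small decision procedure. The following C sketch encodes that reading; it is not oneMKL code. The names partition_mode and classify_num_stripes are hypothetical. Two points are assumptions, because the table does not settle them: max_threads_for_mkl/2 is compared as a real number rather than with integer division, and the nstripes = 1 row is checked before the nstripes ≥ (max_threads_for_mkl/2) row when both would match. The table does not cover nstripes = 0, so the sketch reports that case as unspecified.

```c
#include <assert.h>

/* Hypothetical labels for the rows of the MKL_NUM_STRIPES table. */
typedef enum {
    PARTITION_DEFAULT,      /* nstripes < 0: default threading algorithm     */
    PARTITION_1D_OPPOSITE,  /* nstripes = 1: 1D, opposite to leading dim     */
    PARTITION_2D,           /* 1 < nstripes < max_threads_for_mkl/2          */
    PARTITION_1D_LEADING,   /* nstripes >= max_threads_for_mkl/2             */
    PARTITION_UNSPECIFIED   /* not covered by the table (nstripes = 0)       */
} partition_mode;

/* Map (nstripes, max_threads_for_mkl) to a table row.
 * Assumption: the threshold max_threads_for_mkl/2 is a real-valued
 * comparison, and the nstripes = 1 row takes precedence on overlap. */
partition_mode classify_num_stripes(int nstripes, int max_threads_for_mkl)
{
    assert(max_threads_for_mkl > 0);

    if (nstripes < 0) {
        return PARTITION_DEFAULT;
    }
    if (nstripes == 0) {
        return PARTITION_UNSPECIFIED;
    }
    if (nstripes == 1) {
        return PARTITION_1D_OPPOSITE;
    }
    if ((double)nstripes < max_threads_for_mkl / 2.0) {
        return PARTITION_2D;
    }
    return PARTITION_1D_LEADING;
}
```

Under these assumptions, classify_num_stripes(3, 8) yields PARTITION_2D, while classify_num_stripes(4, 8) yields PARTITION_1D_LEADING, because 4 is not less than 8/2.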

The following figure shows the partitioning of an output matrix for nstripes = 4 and a total number of 8 OpenMP threads for column-major and row-major orderings:



You can use support functions mkl_set_num_stripes and mkl_get_num_stripes to set and query the number of stripes, respectively.
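
The following is a minimal C sketch of how these two support functions might be used around a block of ?gemm calls. mkl_set_num_stripes and mkl_get_num_stripes are the functions named above, declared in mkl.h; the helper with_num_stripes and its callback parameter are hypothetical. So that the sketch also compiles on a machine without oneMKL, it falls back to stand-in stubs when mkl.h is not found; the stubs are placeholders and say nothing about oneMKL behavior. Restoring the earlier setting by passing the queried value back to mkl_set_num_stripes is an assumption, not something this section states.

```c
#include <assert.h>
#include <stddef.h>

/* Use the real oneMKL declarations when mkl.h is available. */
#if defined(__has_include)
#  if __has_include(<mkl.h>)
#    include <mkl.h>
#    define SKETCH_HAVE_MKL 1
#  endif
#endif

#ifndef SKETCH_HAVE_MKL
/* Stand-in stubs (NOT oneMKL code) so the sketch builds without oneMKL.
 * The initial value below is arbitrary, not a documented default. */
static int sketch_nstripes = -1;
static void mkl_set_num_stripes(int nstripes) { sketch_nstripes = nstripes; }
static int mkl_get_num_stripes(void) { return sketch_nstripes; }
#endif

/* Hypothetical helper: request nstripes, run the caller's work (for
 * example, a series of ?gemm calls), then put the earlier setting back.
 * Returns the stripe count reported while the work was running. */
int with_num_stripes(int nstripes, void (*work)(void))
{
    int previous;
    int in_effect;

    previous = mkl_get_num_stripes();   /* remember the current setting */
    mkl_set_num_stripes(nstripes);      /* request the new stripe count */
    in_effect = mkl_get_num_stripes();  /* query what is now reported   */

    if (work != NULL) {
        work();
    }

    mkl_set_num_stripes(previous);      /* assumption: this restores it */
    return in_effect;
}
```

A caller might pass a function that performs its matrix multiplications, for example with_num_stripes(4, run_gemm_batch), where run_gemm_batch is a hypothetical user-defined function.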
