Visible to Intel only — GUID: GUID-4AA9082D-F781-4145-950F-091994B2A3A7
Getting Help and Support
What's New
Notational Conventions
Related Information
Getting Started
Structure of the Intel® oneAPI Math Kernel Library
Linking Your Application with the Intel® oneAPI Math Kernel Library
Managing Performance and Memory
Language-specific Usage Options
Obtaining Numerically Reproducible Results
Coding Tips
Managing Output
Working with the Intel® oneAPI Math Kernel Library Cluster Software
Managing Behavior of the Intel® oneAPI Math Kernel Library with Environment Variables
Configuring Your Integrated Development Environment to Link with Intel® oneAPI Math Kernel Library
Intel® oneAPI Math Kernel Library Benchmarks
Appendix A: Intel® oneAPI Math Kernel Library Language Interfaces Support
Appendix B: Support for Third-Party Interfaces
Appendix C: Directory Structure in Detail
Notices and Disclaimers
OpenMP* Threaded Functions and Problems
Functions Threaded with Intel® Threading Building Blocks
Avoiding Conflicts in the Execution Environment
Techniques to Set the Number of Threads
Setting the Number of Threads Using an OpenMP* Environment Variable
Changing the Number of OpenMP* Threads at Run Time
Using Additional Threading Control
Calling oneMKL Functions from Multi-threaded Applications
Using Intel® Hyper-Threading Technology
Managing Multi-core Performance
Managing Performance with Heterogeneous Cores
Getting Started with Conditional Numerical Reproducibility
Specifying Code Branches
Reproducibility Conditions
Setting the Environment Variable for Conditional Numerical Reproducibility
Code Examples
C Example of CNR
Fortran Example of CNR
Use of CNR with Unaligned Data in C
Use of CNR with Unaligned Data in Fortran
Overview of the Intel® Distribution for LINPACK* Benchmark
Contents of the Intel® Distribution for LINPACK* Benchmark
Building the Intel® Distribution for LINPACK* Benchmark for a Customized MPI Implementation
Building the Netlib HPL from Source Code
Configuring Parameters
Ease-of-use Command-line Parameters
Running the Intel® Distribution for LINPACK* Benchmark
Heterogeneous Support in the Intel® Distribution for LINPACK* Benchmark
Environment Variables
Improving Performance of Your Cluster
Static Libraries in the <span class='filepath'>lib</span> <span class='filepath'>/</span> <span class='filepath'>ia32</span> <span class='filepath'>_lin</span> Directory
Dynamic Libraries in the <span class='filepath'>lib</span> <span class='filepath'>/</span> <span class='filepath'>ia32</span> <span class='filepath'>_lin</span> Directory
Static Libraries in the <span class='filepath'>lib</span> <span class='filepath'>/</span> <span class='filepath'>intel64</span> <span class='filepath'>_lin</span> Directory
Dynamic Libraries in the <span class='filepath'>lib</span> <span class='filepath'>/</span> <span class='filepath'>intel64</span> <span class='filepath'>_lin</span> Directory
Visible to Intel only — GUID: GUID-4AA9082D-F781-4145-950F-091994B2A3A7
Code Examples
The following simple programs show how to obtain reproducible results from run to run of Intel® oneAPI Math Kernel Library functions. See theIntel® oneAPI Math Kernel Library Developer Reference for more examples.
C Example of CNR
#include <mkl.h> int main(void) { int my_cbwr_branch; /* Align all input/output data on 64-byte boundaries */ /* "for best performance of Intel® oneAPI Math Kernel Library */ void *darray; int darray_size=1000; /* Set alignment value in bytes */ int alignment=64; /* Allocate aligned array */ darray = mkl_malloc (sizeof(double)*darray_size, alignment); /* Find the available MKL_CBWR_BRANCH automatically */ my_cbwr_branch = mkl_cbwr_get_auto_branch(); /* User code without oneMKL calls */ /* Piece of the code where CNR of oneMKL is needed */ /* The performance of oneMKL functions might be reduced for CNR mode */ /* If the "IF" statement below is commented out, Intel® oneAPI Math Kernel Library will run in a regular mode, */ /* and data alignment will allow you to get best performance */ if (mkl_cbwr_set(my_cbwr_branch)) { printf("Error in setting MKL_CBWR_BRANCH! Aborting…\n”); return; } /* CNR calls to oneMKL + any other code */ /* Free the allocated aligned array */ mkl_free(darray); }
Fortran Example of CNR
PROGRAM MAIN INCLUDE 'mkl.fi' INTEGER*4 MY_CBWR_BRANCH ! Align all input/output data on 64-byte boundaries ! "for best performance of Intel® oneAPI Math Kernel Library ! Declare oneMKL memory allocation routine #ifdef _IA32 INTEGER MKL_MALLOC #else INTEGER*8 MKL_MALLOC #endif EXTERNAL MKL_MALLOC, MKL_FREE DOUBLE PRECISION DARRAY POINTER (P_DARRAY,DARRAY(1)) INTEGER DARRAY_SIZE PARAMETER (DARRAY_SIZE=1000) ! Set alignment value in bytes INTEGER ALIGNMENT PARAMETER (ALIGNMENT=64) ! Allocate aligned array P_DARRAY = MKL_MALLOC (%VAL(8*DARRAY_SIZE), %VAL(ALIGNMENT)); ! Find the available MKL_CBWR_BRANCH automatically MY_CBWR_BRANCH = MKL_CBWR_GET_AUTO_BRANCH() ! User code without oneMKL calls ! Piece of the code where CNR of oneMKL is needed ! The performance of oneMKL functions may be reduced for CNR mode ! If the "IF" statement below is commented out, Intel® oneAPI Math Kernel Library will run in a regular mode, ! and data alignment will allow you to get best performance IF (MKL_CBWR_SET (MY_CBWR_BRANCH) .NE. MKL_CBWR_SUCCESS) THEN PRINT *, 'Error in setting MKL_CBWR_BRANCH! Aborting…' RETURN ENDIF ! CNR calls to oneMKL + any other code ! Free the allocated aligned array CALL MKL_FREE(P_DARRAY) END
Use of CNR with Unaligned Data in C
#include <mkl.h> int main(void) { int my_cbwr_branch; /* If it is not possible to align all input/output data on 64-byte boundaries */ /* to achieve performance, use unaligned IO data with possible performance */ /* penalty */ /* Using unaligned IO data */ double *darray; int darray_size=1000; /* Allocate array, malloc aligns data on 8/16-byte boundary only */ darray = (double *)malloc (sizeof(double)*darray_size); /* Find the available MKL_CBWR_BRANCH automatically */ my_cbwr_branch = mkl_cbwr_get_auto_branch(); /* User code without oneMKL calls */ /* Piece of the code where CNR of oneMKL is needed */ /* The performance of oneMKL functions might be reduced for CNR mode */ /* If the "IF" statement below is commented out, oneMKL will run in a regular mode, */ /* and you will NOT get best performance without data alignment */ if (mkl_cbwr_set(my_cbwr_branch)) { printf("Error in setting MKL_CBWR_BRANCH! Aborting…\n"); return; } /* CNR calls to oneMKL + any other code */ /* Free the allocated array */ free(darray);
Use of CNR with Unaligned Data in Fortran
PROGRAM MAIN INCLUDE 'mkl.fi' INTEGER*4 MY_CBWR_BRANCH ! If it is not possible to align all input/output data on 64-byte boundaries ! to achieve performance, use unaligned IO data with possible performance ! penalty DOUBLE PRECISION, DIMENSION(:), ALLOCATABLE :: DARRAY INTEGER DARRAY_SIZE, STATUS PARAMETER (DARRAY_SIZE=1000) ! Allocate array with undefined alignment ALLOCATE(DARRAY(DARRAY_SIZE)); ! Find the available MKL_CBWR_BRANCH automatically MY_CBWR_BRANCH = MKL_CBWR_GET_AUTO_BRANCH() ! User code without oneMKL calls ! Piece of the code where CNR of oneMKL is needed ! The performance of oneMKL functions might be reduced for CNR mode ! If the "IF" statement below is commented out, oneMKL will run in a regular mode, ! and you will NOT get best performance without data alignment IF (MKL_CBWR_SET(MY_CBWR_BRANCH) .NE. MKL_CBWR_SUCCESS) THEN PRINT *, 'Error in setting MKL_CBWR_BRANCH! Aborting…' RETURN ENDIF ! CNR calls to oneMKL + any other code ! Free the allocated array DEALLOCATE(DARRAY) END
Parent topic: Obtaining Numerically Reproducible Results