Example of Data Alignment
To get the best performance from Intel® oneAPI Math Kernel Library (oneMKL), or to obtain reproducible results from run to run of oneMKL functions, you need to align your data arrays. The following example shows how to align an array on 64-byte boundaries. To do this, use mkl_malloc() in place of system-provided memory allocators, as shown in the code example below.
Aligning Addresses on 64-byte Boundaries
// ******* C language *******
...
#include <stdlib.h>
#include <mkl.h>
...
void *darray;
int workspace;
// Set value of alignment
int alignment=64;
...
// Allocate aligned workspace
darray = mkl_malloc( sizeof(double)*workspace, alignment );
...
// call the program using oneMKL
mkl_app( darray );
...
// Free workspace
mkl_free( darray );
! ******* Fortran language *******
...
! Set value of alignment
integer alignment
parameter (alignment=64)
...
! Declare oneMKL routines
integer*8 mkl_malloc
external mkl_malloc, mkl_free, mkl_app
...
double precision darray
pointer (p_wrk,darray(1))
integer workspace
...
! Allocate aligned workspace
p_wrk = mkl_malloc( %val(8*workspace), %val(alignment) )
...
! call the program using oneMKL
call mkl_app( darray )
...
! Free workspace
call mkl_free(p_wrk)
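It can be useful to confirm at run time that a buffer really sits on the requested boundary. The following is a minimal sketch of such a check, not part of oneMKL: the helper names is_aligned, alloc_aligned_doubles, and check_alignment_demo are hypothetical, and the C11 function aligned_alloc() is used as a stand-in for mkl_malloc() so that the snippet compiles without the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if ptr lies on a multiple of
 * `alignment` bytes, 0 otherwise. */
int is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0);  /* guard against division by zero */
    return ((uintptr_t)ptr % alignment) == 0;
}

/* Hypothetical stand-in for mkl_malloc(): allocates room for n doubles
 * on an `alignment`-byte boundary using C11 aligned_alloc().
 * Returns NULL for n == 0 or an alignment that is not a power of two. */
double *alloc_aligned_doubles(size_t n, size_t alignment)
{
    if (n == 0 || alignment == 0 || (alignment & (alignment - 1)) != 0)
        return NULL;

    /* C11 aligned_alloc() expects the size to be a multiple of the
     * alignment, so round the byte count up. */
    size_t bytes = n * sizeof(double);
    size_t rounded = (bytes + alignment - 1) / alignment * alignment;
    return aligned_alloc(alignment, rounded);
}

/* Hypothetical demo: allocate, verify the alignment, and free.
 * Returns 1 if the buffer was allocated and correctly aligned. */
int check_alignment_demo(size_t n, size_t alignment)
{
    double *darray = alloc_aligned_doubles(n, alignment);
    if (darray == NULL)
        return 0;

    int ok = is_aligned(darray, alignment);
    free(darray);  /* memory from aligned_alloc() is released with free() */
    return ok;
}
```

With oneMKL linked, the same is_aligned() check can be applied to a pointer returned by mkl_malloc(). Memory obtained from mkl_malloc() must be released with mkl_free(), not free().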