Intel® VTune™ Profiler

Performance Analysis Tutorial for Linux* OS

ID 762029
Date 12/20/2024
Public

Resolve Memory Access Issue

Edit the source code to resolve the memory access issue identified in the previous step, then recompile the matrix application. The steps are explained in detail below. Before you proceed, note the optimization setting for your compiler.

Compiler Optimization

This tutorial uses the Intel® oneAPI DPC++/C++ Compiler. Using a different compiler can affect your results and your workflow.

The procedures in this section call for the compiler optimization level to be set to Maximum Optimization (Favor Speed) (-O2).

Edit and Recompile the Sample Code

To edit and recompile the sample code using the Intel® oneAPI DPC++/C++ Compiler:

  1. In the /opt/intel/oneapi/compiler/latest/env folder, set the environment variables for the compiler. Run this command:

    source vars.sh

  2. Locate the matrix sample application folder on your machine. By default, it is placed in:

    $HOME/intel/vtune/samples/matrix

  3. Using a text editor of your choice, open the Makefile in the ../matrix/linux/ folder.

  4. Change line 41 from:

    ICC = icc

    to

    ICC = icx

  5. Change line 42 from:

    CFLAGS  = -g -O3 -fno-asm

    to

    CFLAGS  = -g -O2

  6. Change line 43 from:

    OPTFLAGS = -xSSE3 

    to

    OPTFLAGS =

  7. Save and close the Makefile.

  8. Open the multiply.h header file located in the ../matrix/src folder with a text editor.

  9. Change line 36 from:

    #define MULTIPLY multiply1

    to

    #define MULTIPLY multiply2

    This changes the program to use the multiply2 function from the multiply.c source file. The multiply2 function applies the loop interchange technique, which resolves the memory access problem.

  10. Save and close the multiply.h file.

  11. Open the ../matrix/linux folder and recompile the application. Run this command:

    make icc
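
To illustrate the loop interchange technique mentioned in step 9, here is a minimal sketch in C. It is not the sample's code: the function names multiply_ijk and multiply_ikj, the size N, and the helper verify_orders_agree are hypothetical, and the sample's own multiply1 and multiply2 functions have different signatures. The sketch assumes row-major storage, as C uses for two-dimensional arrays, and shows how swapping the two inner loops turns a strided walk through the second matrix into a contiguous one without changing the result.

```c
#include <assert.h>
#include <stddef.h>

#define N 4 /* hypothetical size, kept small for illustration */

/* Original loop order (i, j, k). The innermost loop reads b[k][j] with k
   varying, so successive reads are a full row (N doubles) apart. */
void multiply_ijk(double a[N][N], double b[N][N], double c[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            for (size_t k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
}

/* Interchanged loop order (i, k, j). The innermost loop now varies j, so
   b[k][j] and c[i][j] are both read along a row, in contiguous memory. */
void multiply_ikj(double a[N][N], double b[N][N], double c[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t k = 0; k < N; k++)
            for (size_t j = 0; j < N; j++)
                c[i][j] += a[i][k] * b[k][j];
}

/* Check that both orders give the same product. The inputs are small
   integer values, so every sum is exact in double precision. */
void verify_orders_agree(void)
{
    double a[N][N], b[N][N];
    double c1[N][N] = {{0}}, c2[N][N] = {{0}};

    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++) {
            a[i][j] = (double)(i + j);
            b[i][j] = (double)(i * N + j);
        }

    multiply_ijk(a, b, c1);
    multiply_ikj(a, b, c2);

    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            assert(c1[i][j] == c2[i][j]);
}
```

Calling verify_orders_agree() from a main function confirms that the interchange changes only the order in which memory is traversed, not the computed product. At this size the performance difference would not be measurable. The benefit is expected to show up only for matrices much larger than the cache.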