Intel® VTune™ Profiler

Performance Analysis Tutorial for Windows* OS

ID 762031
Date 12/20/2024
Public

Resolve Memory Access Issue

Edit the source code to resolve the memory access issue that was identified in the previous step. Then, recompile the matrix application. The steps are explained in detail below. Before you proceed, note the optimization setting for your compiler.

Compiler Optimization

This tutorial uses the Intel® oneAPI DPC++/C++ Compiler. Your choice of a different compiler can affect your results and your workflow.

The procedures in this section call for the compiler optimization level to be set to Maximum Optimization (Favor Size) (/O1).

Profiling software performance with maximum optimization may seem desirable. However, the default version of the matrix sample sets the optimization option to /Od (optimization disabled). This setting is used deliberately, as an example of how you use Intel® VTune™ Profiler to detect issues caused by non-obvious behavior of compiler options.

These issues can occur in real-world applications. Their causes range from a simple typographical error to complicated situations where a limited understanding of compiler options degrades performance.

Edit and Recompile the Sample Code

To edit and recompile the sample code in Microsoft Visual Studio* using the Intel® oneAPI DPC++/C++ Compiler:

  1. Locate the matrix sample application folder on your machine. The default location is:

    [Documents]\VTune\samples\matrix

  2. In the ..\matrix\vc15 folder, open the matrix.sln Visual Studio solution.

  3. Make sure you build the application with the Release configuration and x64 platform enabled.

  4. In the Solution Explorer, right-click the matrix project and select Properties.

  5. Set these options:

    Menu                                Option                         Setting
    ----------------------------------  -----------------------------  ----------------------------------------
    Configuration Properties > General  Platform Toolset               Intel C++ Compiler <version>
    Configuration Properties > vcpkg    Use Vcpkg                      No
    C/C++ > General                     Debug Information Format       Program Database (/Zi)
    C/C++ > Optimization                Optimization                   Maximum Optimizations (Favor Size) (/O1)
    C/C++ > Diagnostics [Intel C++]     Optimization Diagnostic Level  Level 2 (/Qopt-report:2)

  6. In the multiply.h header file, on line 36, change the following line:

    #define MULTIPLY multiply1

    to

    #define MULTIPLY multiply2

    This change causes the program to use the multiply2 function from the multiply.c source file. The multiply2 function implements the loop interchange technique, which resolves the memory access problem. The use of /O1 also enables automatic vectorization.

  7. Build the project.
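
The loop interchange mentioned in step 6 can be illustrated with a minimal sketch. This is not the sample's actual source. The names multiply_baseline, multiply_interchanged, kernels_agree, and the size N are hypothetical and exist only for this illustration. The sketch assumes row-major double matrices and shows how swapping the two inner loops turns a strided read of the second matrix into a unit-stride one, with a one-line MULTIPLY macro selecting the kernel in the same style as the multiply.h edit.

```c
#include <assert.h> /* for the assert() call shown in the usage note */

#define N 64 /* hypothetical matrix size, chosen only for this sketch */

/* Baseline order (i, j, k): the innermost loop reads b[k][j] with k varying,
   so consecutive iterations touch elements that are N doubles apart. */
void multiply_baseline(double a[N][N], double b[N][N], double c[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            c[i][j] = 0.0;
            for (int k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
        }
}

/* Interchanged order (i, k, j): the innermost loop now varies j, so
   b[k][j] and c[i][j] are both accessed at unit stride. */
void multiply_interchanged(double a[N][N], double b[N][N], double c[N][N])
{
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++)
            c[i][j] = 0.0;
        for (int k = 0; k < N; k++)
            for (int j = 0; j < N; j++)
                c[i][j] += a[i][k] * b[k][j];
    }
}

/* One-line switch in the style of the tutorial's multiply.h edit. */
#define MULTIPLY multiply_interchanged

/* Returns 1 when the selected kernel matches the baseline exactly.
   Inputs are small integers, so every product and sum is exact in double. */
int kernels_agree(void)
{
    static double a[N][N], b[N][N], expected[N][N], actual[N][N];

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            a[i][j] = (double)((i + j) % 7);
            b[i][j] = (double)((i * j) % 5);
        }

    multiply_baseline(a, b, expected);
    MULTIPLY(a, b, actual);

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            if (expected[i][j] != actual[i][j])
                return 0;
    return 1;
}
```

Calling assert(kernels_agree()); from a main function checks that the interchange changes only the access order, not the result. Both kernels do the same arithmetic, so any speedup comes from the memory access pattern and from whatever vectorization the compiler applies to the unit-stride inner loop.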