Intel® VTune™ Profiler

User Guide

ID 766319
Date 3/22/2024
Public

Compiler Switches for Performance Analysis on Windows* Targets

Intel® VTune™ Profiler can analyze most native binaries on Windows* target systems. However, the settings below are recommended to make performance analysis easier and more productive:

| Use This Switch | To Do This |
| --- | --- |
| /Zi (highly recommended) | Generate the symbol information required to associate addresses with source lines and to walk the call stack properly in user-mode sampling and tracing analysis types (Hotspots and Threading). |
| Release build (highly recommended) | Enable maximum compiler optimization so that VTune Profiler focuses on performance problems that the compiler cannot optimize away. |
| /MD or /MDd | Identify C runtime calls as system functions and differentiate them from user code when a proper Call stack mode is applied to the VTune Profiler collection result. |
| /D "TBB_USE_THREADING_TOOLS" | Enable full support for Intel® oneAPI Threading Building Blocks (oneTBB) in VTune Profiler. Without TBB_USE_THREADING_TOOLS set, VTune Profiler does not properly identify concurrency issues related to oneTBB constructs. |
| /Qopenmp (highly recommended; Intel C++ Compiler) | Enable VTune Profiler to identify parallel regions created by OpenMP* pragmas. |
| /Qopenmp-link:dynamic (Intel C++ Compiler) | Make the Intel Compiler choose the dynamic version of the OpenMP runtime libraries, which is instrumented for VTune Profiler. The Intel Compiler usually enables this option by default. |
| /Qparallel-source-info=2 (Intel C++ Compiler) | Control source location emission when OpenMP or auto-parallelism code is generated. Level 2 tells the compiler to emit path, file, routine name, and line information. |
| -gline-tables-only and -fdebug-info-for-profiling (Intel oneAPI DPC++ Compiler) | Generate debug information for GPU analysis of a SYCL application. |
| -Xsprofile (Intel oneAPI DPC++ Compiler) | Enable source-level mapping of performance data for FPGA application analysis. |
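
As an illustration of how several of these switches might be combined, the sketch below shows a small C++ workload with one OpenMP parallel region. The function name sum_of_squares, the file name in the comment, the icx driver name, and the use of /O2 as a stand-in for a Release build are assumptions made for this example and do not come from this guide.

```cpp
// Hypothetical file: hotspot_demo.cpp
// One possible Windows build line (driver name "icx" and /O2 are assumptions;
// the remaining switches are the ones recommended in the table above):
//   icx /Zi /O2 /MD /Qopenmp /Qparallel-source-info=2 hotspot_demo.cpp
#include <cstddef>
#include <vector>

// Returns 0^2 + 1^2 + ... + (n-1)^2, computed over a vector so that the loop
// does real memory work and is visible as a hotspot.
double sum_of_squares(std::size_t n) {
    std::vector<double> values(n);
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = static_cast<double>(i);
    }

    double sum = 0.0;
    // A signed loop index keeps the loop acceptable to older OpenMP
    // implementations that require signed integral loop variables.
    const long long count = static_cast<long long>(n);

    // With OpenMP enabled, this loop becomes a parallel region.
    // Without OpenMP, the pragma is ignored and the loop runs serially.
    #pragma omp parallel for reduction(+ : sum)
    for (long long i = 0; i < count; ++i) {
        sum += values[i] * values[i];
    }
    return sum;
}
```

Calling sum_of_squares with a large n from main gives the profiler a region to attribute samples to. Building with /Zi is what allows those samples to be mapped back to the loop's source lines.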

The following libraries are recommended or not recommended for the user-mode sampling and tracing analysis types:

| Library | Recommended | Not Recommended |
| --- | --- | --- |
| OpenMP Runtime (supplied by the Intel Compiler) | libiomp5md.dll or libguide40.dll | libiomp5mt.lib, libguide.lib, vcomp80.dll/vcomp90.dll, or vcomp80d.dll/vcomp90d.dll |
| C Runtime | msvcr90.dll, msvcr80.dll, msvcr90d.dll, or msvcr80d.dll | libcmt.lib |
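
The library recommendations above can be expressed as a small lookup, for example to screen the runtime names found in a project's link settings or a dependency listing. This is a minimal sketch: the names RuntimeAdvice, to_lower, and classify_runtime are hypothetical, and only the library file names come from the table.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Advice categories; Unlisted means the table above does not mention the library.
enum class RuntimeAdvice { Recommended, NotRecommended, Unlisted };

// Lower-cases a file name, since Windows file names are case-insensitive.
std::string to_lower(std::string text) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return text;
}

// Classifies a runtime library name according to the table above.
RuntimeAdvice classify_runtime(const std::string& file_name) {
    static const std::vector<std::string> recommended = {
        "libiomp5md.dll", "libguide40.dll",
        "msvcr90.dll", "msvcr80.dll", "msvcr90d.dll", "msvcr80d.dll"};
    static const std::vector<std::string> not_recommended = {
        "libiomp5mt.lib", "libguide.lib",
        "vcomp80.dll", "vcomp90.dll", "vcomp80d.dll", "vcomp90d.dll",
        "libcmt.lib"};

    const std::string key = to_lower(file_name);
    if (std::find(recommended.begin(), recommended.end(), key) != recommended.end()) {
        return RuntimeAdvice::Recommended;
    }
    if (std::find(not_recommended.begin(), not_recommended.end(), key) != not_recommended.end()) {
        return RuntimeAdvice::NotRecommended;
    }
    return RuntimeAdvice::Unlisted;
}
```

A library that the table does not mention is reported as Unlisted rather than guessed at, so the helper never claims more than the table states.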

Avoid These Switches

The following compiler settings are NOT recommended:

| Do Not Use This Switch | Because Of This |
| --- | --- |
| /debug:parallel | Enables the Intel® Parallel Debugger Extension for the Intel Compiler, which VTune Profiler does not use. |
| /Qopenmp-link:static | Chooses the static version of the OpenMP runtime libraries for the Intel Compiler. This version of the OpenMP runtime library does not contain the instrumentation data required for VTune Profiler analysis. |
| /Qopenmp_stubs | Prevents OpenMP code from running in parallel. |
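
A build script could screen a compiler command line for the discouraged switches before a profiling run. The sketch below is one hedged way to do that; the names strip_prefix, SwitchReview, and review_switches are hypothetical, and the check is a plain whitespace split rather than a full command-line parser.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Removes one leading '/' or '-' so that "/Zi" and "-Zi" compare equal.
std::string strip_prefix(const std::string& token) {
    if (!token.empty() && (token[0] == '/' || token[0] == '-')) {
        return token.substr(1);
    }
    return token;
}

// Result of reviewing one command line.
struct SwitchReview {
    std::vector<std::string> discouraged_found;  // tokens exactly as they appeared
    bool has_debug_symbols = false;              // true when /Zi is present
};

// Splits the command line on whitespace and compares each token with the
// switches listed in this section. Quoted arguments containing spaces are
// not handled; this is only a sketch.
SwitchReview review_switches(const std::string& command_line) {
    // Stored without the leading '/' so either spelling matches.
    static const std::vector<std::string> discouraged = {
        "debug:parallel", "Qopenmp-link:static", "Qopenmp_stubs"};

    SwitchReview review;
    std::istringstream tokens(command_line);
    std::string token;
    while (tokens >> token) {
        const std::string name = strip_prefix(token);
        if (name == "Zi") {
            review.has_debug_symbols = true;
        }
        for (const std::string& bad : discouraged) {
            if (name == bad) {
                review.discouraged_found.push_back(token);
            }
        }
    }
    return review;
}
```

For a command line such as "icx /Zi /O2 /Qopenmp-link:static main.cpp", the review reports that debug symbols are enabled and lists /Qopenmp-link:static as a switch to remove.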
