Intel® Advisor User Guide

ID 766448
Date 3/22/2024
Public

Miscellaneous Techniques to Minimize Analysis Overhead

Issue

Running your target application with Intel® Advisor can take substantially longer than running it without Intel® Advisor. The overhead added to your application execution time depends on the accuracy level and the analyses you choose for a perspective. For example:

Runtime overhead by analysis (target application runtime with Intel® Advisor compared to runtime without Intel® Advisor):

  • Survey: 1.1x longer

  • Characterization: 2 - 55x longer

  • Dependencies: 5 - 100x longer

  • Memory Access Patterns (MAP): 5 - 20x longer
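As a rough illustration of what these factors mean in practice, the sketch below multiplies a baseline runtime by the overhead ranges listed above. The 10-second baseline and the estimate helper are hypothetical. Actual overhead depends on your application and the analysis settings.

```shell
#!/bin/sh
# Hypothetical baseline: runtime in seconds without Intel Advisor.
baseline_s=10

# estimate <analysis> <min factor> <max factor>
# Prints the expected runtime range for that analysis.
estimate() {
    awk -v name="$1" -v base="$baseline_s" -v lo="$2" -v hi="$3" \
        'BEGIN { printf "%s: %.0f to %.0f s\n", name, base * lo, base * hi }'
}

estimate "Survey" 1.1 1.1
estimate "Characterization" 2 55
estimate "Dependencies" 5 100
estimate "MAP" 5 20
```

With these numbers, a Dependencies analysis of a 10-second run may take anywhere from under a minute to more than a quarter of an hour, which is why the techniques below focus on that end of the range.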

Solutions

The following techniques may help minimize overhead without limiting collection scope.

Disable Cache Simulation

Minimize collection overhead.

Applicable analyses:

  • Memory Access Patterns (base simulation functionality)

  • Characterization with Trip Counts and FLOP collection enabled (enhanced simulation functionality)

Implement these techniques when cache modeling information is not important to you:

NOTE:

The default setting for all the properties/options in the table below is disabled.

From the Analysis Workflow pane, disable Characterization > Enable CPU cache simulation for the Characterization analysis.

From the Project Properties (path: Project Properties > Analysis Target...):

  • Project property: Disable the Memory Access Patterns Analysis > Advanced > Enable Memory-Level Roofline with cache simulation checkbox.
    CLI action option: --no-enable-cache-simulation
    Description: Do not model cache misses, cache misses and cache line utilization, or cache misses and loop footprint.

  • Project property: Disable the Trip Counts and FLOP Analysis > Advanced > Enable CPU cache simulation checkbox.
    CLI action option: --no-enable-cache-simulation
    Description: Do not model multiple levels of cache for data (such as counts of loaded or stored bytes for each loop), and do not create simulations for specific cache hierarchy configurations.
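On the command line, the option described above can be passed explicitly to either collection. The following is a minimal sketch, assuming a project directory of ./advi_results and an executable named ./myApplication (both placeholders). The run helper is hypothetical and only prints each command unless DRY_RUN=0 is set. Loop selection options for Memory Access Patterns are omitted for brevity.

```shell
#!/bin/sh
# Placeholders: replace with your own project directory and executable.
PROJECT_DIR=./advi_results
APP=./myApplication

# Print each command by default; set DRY_RUN=0 to execute it instead.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Memory Access Patterns without cache simulation.
run advisor --collect=map --no-enable-cache-simulation \
    --project-dir="$PROJECT_DIR" -- "$APP"

# Characterization (Trip Counts and FLOP) without cache simulation.
run advisor --collect=tripcounts --flop --no-enable-cache-simulation \
    --project-dir="$PROJECT_DIR" -- "$APP"
```

Because cache simulation is disabled by default, the explicit option mainly documents intent and protects against a script or project setting that enabled it earlier.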

Limit Reported Data

Applicable analysis: Memory Access Patterns.

Implement these techniques when the additional data is not important to you.

NOTE:

The default setting for all the properties/options in the table below is enabled.

Path: Project Properties > Analysis Target > Memory Access Patterns Analysis > Advanced

  • Project property: Disable the Report stack variables checkbox.
    CLI action option: --no-record-stack-frame
    Description: Do not report stack variables for which memory access strides are detected.

  • Project property: Disable the Report heap allocated variables checkbox.
    CLI action option: --no-record-mem-allocations
    Description: Do not report heap-allocated variables for which memory access strides are detected.
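A Memory Access Patterns collection with both kinds of variable reporting turned off might look like the sketch below. The project directory and executable name are placeholders, and the run helper is hypothetical (it prints the command unless DRY_RUN=0 is set). The heap option shown, --no-record-mem-allocations, reflects my reading of the Advisor CLI reference. Confirm both option names with advisor --help collect for your version.

```shell
#!/bin/sh
# Placeholders: replace with your own project directory and executable.
PROJECT_DIR=./advi_results
APP=./myApplication

# Print the command by default; set DRY_RUN=0 to execute it instead.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# MAP collection that skips stack-variable and heap-variable reporting.
# Option names are assumptions to verify against your Advisor version.
run advisor --collect=map \
    --no-record-stack-frame \
    --no-record-mem-allocations \
    --project-dir="$PROJECT_DIR" -- "$APP"
```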

Minimize Data Set

Minimize collection overhead.

Applicable analyses: All, but especially Dependencies, Memory Access Patterns.

When you run an analysis, Intel® Advisor executes the target against the supplied data set. Data set size and workload have a direct impact on target application execution time and analysis speed.

For example, it takes longer to process a 1000x1000 pixel image than a 100x100 pixel image. A possible reason: You may have loops with an iteration space of 1...1000 for the larger image, but only 1...100 for the smaller image. The exact same code paths may be executed in both cases. The difference is the number of times these code paths are repeated.

You can control analysis cost without sacrificing completeness by minimizing this kind of unnecessary repetition from target application execution.

Instead of choosing large, repetitive data sets, choose small, representative data sets that minimize the number of instructions executed within a loop while thoroughly exercising target application control flow paths.

Your objective: In as short a runtime period as possible, execute as many paths as you can afford, while minimizing the repetitive computation within each task to the bare minimum needed for good code coverage.

Data sets that run in about ten seconds or less are ideal. You can always create additional data sets to ensure all your code is checked.
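One way to check a candidate data set against the ten-second guideline is to time the application on it before starting an analysis. The helper below is a hypothetical sketch, not part of Intel® Advisor. It uses whole-second resolution from date, and sleep 1 stands in for your application and its reduced data set.

```shell
#!/bin/sh
# fits_budget <budget in seconds> <command> [args...]
# Runs the command once, without Intel Advisor, and reports whether
# it finished within the budget. Returns 0 if within budget, 1 if it
# ran too long, 2 if the command itself failed.
fits_budget() {
    budget_s=$1
    shift
    start_s=$(date +%s)
    "$@" >/dev/null 2>&1 || { echo "command failed: $*"; return 2; }
    end_s=$(date +%s)
    elapsed_s=$((end_s - start_s))
    if [ "$elapsed_s" -le "$budget_s" ]; then
        echo "ok: ${elapsed_s}s is within the ${budget_s}s budget"
    else
        echo "too long: ${elapsed_s}s exceeds the ${budget_s}s budget"
        return 1
    fi
}

# Stand-in for: fits_budget 10 ./myApplication <small data set>
fits_budget 10 sleep 1
```

If the run exceeds the budget, shrink the data set (for example, fewer loop iterations or a smaller input image) while keeping the inputs that exercise distinct control flow paths.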

Temporarily Disable Finalization

Minimize finalization overhead.

Applicable analyses: Survey, Characterization with Trip Counts and FLOP collection enabled.

Use when you plan to view collected analysis data on a different machine. Finalization automatically occurs when a result is opened in the GUI or a report is generated from the result.

Note: In the commands below, replace myApplication with the path and name of your application executable before running a command. If your application requires additional command line options, add them after the executable name.

To implement, do one of the following while running an analysis:

  • When the analysis Finalizing data... phase begins, click the associated Cancel button.


  • Use the advisor CLI action option --no-auto-finalize when you run the desired analysis. For example:

    advisor --collect=survey --project-dir=./advi_results --no-auto-finalize -- ./myApplication

The next time you open the result in the GUI, the result is finalized automatically.

To refinalize the result from the CLI, use the --refinalize-survey option with the --report action. For example:

advisor --report=survey --search-dir src:=./src bin:=./bin --refinalize-survey --project-dir=./advi_results -- ./myApplication