Intel® Advisor User Guide

ID 766448
Date 3/22/2024
Public

Customize Offload Modeling Perspective

Customize the perspective flow to better fit your goal and your application.

If you change any of the analysis settings from the Analysis Workflow tab, the accuracy level changes to Custom automatically. With this accuracy level, you can customize the perspective flow and/or analysis properties.

To change the properties of a specific analysis:

  1. Expand the analysis details on the Analysis Workflow pane with the expand icon.
  2. Select desired settings.
  3. For more detailed customization, click the gear icon. The Project Properties dialog box opens for the selected analysis.
  4. Select desired properties and click OK.

For a full set of available properties, click the icon on the left-side pane or go to File > Project Properties.

The following tables cover project properties applicable to the analyses in the Offload Modeling perspective.

Common Properties

Use This

To Do This

Inherit settings from Visual Studio project checkbox and field (Visual Studio* IDE only)

Inherit Intel Advisor project properties from the Visual Studio* startup project (enable).

If enabled, the Application, Application parameters, and Working directory fields are pre-filled and cannot be modified.

NOTE:
In Visual Studio* 2022, Intel Advisor provides lightweight integration. You can configure and compile your application, then open the standalone Intel Advisor interface from Visual Studio for further analysis. The standalone Intel Advisor project inherits all your settings.

Application field and Browse... button

Select an analysis target executable or script.

If you specify a script in this field, consider specifying the executable in the Advanced > Child application field (required for Dependencies analysis).

Application parameters field and Modify... button

Specify runtime arguments to use when performing analysis (equivalent to command line arguments).

Use application directory as working directory checkbox

Automatically use the value in the Application directory to pre-fill the Working directory value (enable).

Working directory field and Browse... button

Select the working directory.

User-defined environment variables field and Modify... button

Specify environment variables to use during analysis.

Managed code profiling mode drop-down

  • Automatically detect the type of target executable as Native or Managed, and switch to that mode (choose Auto).

  • Collect data for native code and do not attribute data to managed code (choose Native).

  • Collect data for both native and managed code, and attribute data to managed code as appropriate (choose Mixed). Consider using this option when analyzing a native executable that makes calls to managed code.

  • Collect data for both native and managed code, resolve samples attributed to native code, and attribute data to managed source only (choose Managed). The call stack in the analysis result displays data for managed code only.

Child application field

Analyze a file that is not the starting application. For example: Analyze an executable (identified in this field) called by a script (identified in the Application field).

Invoking this property could decrease analysis overhead.

NOTE:

For the Dependencies Analysis Type: If you specify a script file in the Application field, you must specify the target executable in the Child application field.

Modules radio buttons, field, and Modify... button

  • Analyze specific modules and disable analysis of all other modules (click the Include only the following module(s) radio button and choose the modules).

  • Disable analysis of specific modules and analyze all other modules (click the Exclude only the following module(s) radio button and choose the modules).

Including/excluding modules could minimize analysis overhead.

Use MPI launcher checkbox

Generate a command line (enable) that appears in the Get command line field based on the following parameters:

  • Select MPI Launcher - Intel or another vendor

  • Number of ranks - Number of instances of the application

  • Profile ranks - All or a range of ranks to profile

Automatically stop collection after (sec) checkbox and field

Stop collection after a specified number of seconds (enable and specify seconds).

Invoking this property could minimize analysis overhead.

Survey Analysis Properties

Use This

To Do This

Automatically resume collection after (sec) checkbox and field

Start running your target application with collection paused, then resume collection after a specified number of seconds (enable and specify seconds).

Invoking this property could decrease analysis overhead.

TIP:

The corresponding CLI action option is --resume-after=<integer>, where the integer argument is in milliseconds, not seconds.
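
Because the GUI field takes seconds while the CLI option takes milliseconds, the value must be converted when moving a setting from one to the other. A minimal sketch in Python, assuming only the option format given in the tip above; the helper name `resume_after_option` is hypothetical and not part of Intel Advisor:

```python
def resume_after_option(seconds: float) -> str:
    """Format the --resume-after option for a delay given in seconds.

    Hypothetical helper: it only builds the option string, it does not
    run Intel Advisor.
    """
    if seconds < 0:
        raise ValueError("delay must be non-negative")
    milliseconds = round(seconds * 1000)  # the CLI expects an integer in ms
    return f"--resume-after={milliseconds}"


print(resume_after_option(5))     # --resume-after=5000
print(resume_after_option(0.25))  # --resume-after=250
```

For example, a 5-second delay entered in the GUI corresponds to `--resume-after=5000` on the command line.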

Sampling Interval selector

Set the wait time between CPU samples collected while your target application is running.

Increasing the wait time could decrease analysis overhead.

Collection data limit, MB selector

Limit the amount of raw data collected if exceeding a size threshold could cause issues. Not available for hardware event-based analyses.

Decreasing the limit could decrease analysis overhead.

Callstack unwinding mode drop-down list

Set to After collection if:

  • Survey analysis runtime overhead exceeds 1.1x.

  • A large quantity of data is allocated on the stack, which is a common case for Fortran applications or applications with a large number of small, parallel, OpenMP* regions.

Otherwise, set to During Collection. This mode improves stack accuracy but increases overhead.
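
The selection rule above can be sketched as a small decision function. This is an illustration only: the function name and its inputs are hypothetical, and you would measure the overhead ratio and judge the stack usage yourself.

```python
def choose_unwinding_mode(survey_overhead: float, large_stack_data: bool) -> str:
    """Pick a Callstack unwinding mode following the rule in this guide.

    survey_overhead  -- Survey runtime divided by native runtime (1.0 = no overhead)
    large_stack_data -- True if the application allocates a large quantity of data
                        on the stack (for example, Fortran code or many small
                        OpenMP regions)
    """
    if survey_overhead > 1.1 or large_stack_data:
        return "After collection"
    # Better stack accuracy, at the cost of higher overhead.
    return "During Collection"


print(choose_unwinding_mode(1.3, False))   # After collection
print(choose_unwinding_mode(1.05, False))  # During Collection
```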

Stitch stacks checkbox

Restore a logical call tree for Intel® oneAPI Threading Building Blocks (oneTBB) or OpenMP* applications by catching notifications from the runtime and attaching stacks to a point introducing a parallel workload (enable).

Disable if Survey analysis runtime overhead exceeds 1.1x.

Analyze MKL Loops and Functions checkbox

Show Intel® oneAPI Math Kernel Library (oneMKL) loops and functions in Intel Advisor reports (enable).

Enabling could increase analysis overhead.

Analyze Python loops and functions checkbox

Show Python* loops and functions in Intel Advisor reports (enable).

Enabling could increase analysis overhead.

Analyze loops that reside in non-executed code paths checkbox

Collect a variety of data during analysis for loops that reside in non-executed code paths, including loop assembly code, instruction set architecture (ISA), and vector length (enable).

Enabling could increase analysis overhead.

NOTE:

Analyzing non-executed code paths in binaries that target multiple ISAs (contain multiple code paths) is available only for binaries compiled using the -ax (Linux* OS) or /Qax (Windows* OS) option with an Intel compiler.

Enable register spill/fill analysis checkbox

Calculate the number of consecutive load/store operations in registers and related memory traffic (enable).

Enabling could increase analysis overhead.

Enable static instruction mix analysis checkbox

Statically calculate the number of specific instructions present in the binary (enable).

Enabling could increase analysis overhead.

Source caching drop-down list

  • Delete source code cache from a project with each analysis run (default; choose Clear cached files).

  • Keep source code cache within the project (choose Keep cached files).

Trip Counts and FLOP Analysis Properties

Use This

To Do This

Inherit settings from the Survey Hotspots Analysis Type checkbox

Copy similar settings from Survey analysis properties (enable).

When enabled, this option disables application parameters controls.

Automatically resume collection after (sec) checkbox and field

Start running your target application with collection paused, then resume collection after a specified number of seconds (enable and specify seconds).

Invoking this property could decrease analysis overhead.

TIP:

The corresponding CLI action option is --resume-after=<integer>, where the integer argument is in milliseconds, not seconds.

Trip Counts / Collect information about Loop Trip Counts checkbox

Measure loop invocation and execution (enable).

FLOP / Collect information about FLOP, L1 memory traffic, and AVX-512 mask usage checkbox

Measure floating-point operations, integer operations, and memory traffic (enable).

Callstacks / Collect callstacks checkbox

Collect call stack information when performing analysis (enable).

Enabling could increase analysis overhead.

Capture metrics for dynamic loops and functions checkbox

Collect metrics for dynamic Just-In-Time (JIT) generated code regions.

Capture metrics for stripped binaries checkbox

Collect metrics for stripped binaries.

Enabling could increase analysis overhead.

Cache Simulation / Enable Memory-Level Roofline with cache simulation checkbox

Model multiple levels of cache for data, such as counts of loaded or stored bytes for each loop, to plot the Roofline chart for all memory levels (enable).

Enabling could increase analysis overhead.

Cache simulator configuration field

Specify a cache hierarchy configuration to model (enable and specify hierarchy).

The hierarchy configuration template is:

[num_of_level1_caches]:[num_of_ways_level1_connected]:[level1_cache_size]:[level1_cacheline_size]/

[num_of_level2_caches]:[num_of_ways_level2_connected]:[level2_cache_size]:[level2_cacheline_size]/

[num_of_level3_caches]:[num_of_ways_level3_connected]:[level3_cache_size]:[level3_cacheline_size]

For example: 4:8w:32k:64l/4:4w:256k:64l/1:16w:6m:64l is the hierarchy configuration for:

  • Four eight-way 32-KB level 1 caches with line size of 64 bytes

  • Four four-way 256-KB level 2 caches with line size of 64 bytes

  • One sixteen-way 6-MB level 3 cache with line size of 64 bytes
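
To check a configuration string before entering it, you can split it along the template shown above. The following Python sketch is not part of Intel Advisor; the `CacheLevel` class and `parse_cache_config` function are hypothetical names, and the sketch assumes that the `k` and `m` suffixes denote binary kilobytes and megabytes.

```python
from dataclasses import dataclass

# Assumption: "k" and "m" are binary units (1024 and 1024**2 bytes).
UNITS = {"k": 1024, "m": 1024 ** 2}


@dataclass
class CacheLevel:
    level: int        # 1 for level 1, 2 for level 2, and so on
    count: int        # number of caches at this level
    ways: int         # associativity
    size_bytes: int   # size of one cache
    line_bytes: int   # cache line size


def parse_cache_config(config: str) -> list[CacheLevel]:
    """Parse 'count:Nw:SIZE:Nl' groups separated by '/' into CacheLevel objects."""
    levels = []
    for index, part in enumerate(config.split("/"), start=1):
        count, ways, size, line = part.strip().split(":")
        if not ways.endswith("w") or not line.endswith("l"):
            raise ValueError(f"unexpected field suffix in {part!r}")
        unit = size[-1].lower()
        if unit not in UNITS:
            raise ValueError(f"unknown size unit in {part!r}")
        levels.append(
            CacheLevel(
                level=index,
                count=int(count),
                ways=int(ways[:-1]),
                size_bytes=int(size[:-1]) * UNITS[unit],
                line_bytes=int(line[:-1]),
            )
        )
    return levels


for lvl in parse_cache_config("4:8w:32k:64l/4:4w:256k:64l/1:16w:6m:64l"):
    print(lvl)
```

Running the sketch on the example string yields three levels that match the list above: four 8-way 32-KB caches, four 4-way 256-KB caches, and one 16-way 6-MB cache, each with 64-byte lines.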

Data Transfer Simulation / Data transfer simulation mode drop-down

Select a level of detail for data transfer simulation:

  • Off - Disable data transfer simulation analysis.
  • Light - Model data transfers between host and device memory.
  • Medium - Model data transfers, attribute memory objects to loops that accessed the objects, and track accesses to stack memory.
  • Full - Model data transfers, attribute memory objects, track accesses to stack memory, and identify where data can be potentially reused if transferred between host and target.
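
Each mode adds capabilities to the one before it, so the lightest mode that covers your needs keeps overhead lowest. The Python sketch below restates the list above as data. The feature labels and the `lightest_mode_for` function are hypothetical shorthand for illustration, not Intel Advisor identifiers.

```python
# Capabilities of each mode, paraphrased from the list above.
MODE_ORDER = ["Off", "Light", "Medium", "Full"]
MODE_FEATURES = {
    "Off": set(),
    "Light": {"model data transfers"},
    "Medium": {
        "model data transfers",
        "attribute memory objects",
        "track stack accesses",
    },
    "Full": {
        "model data transfers",
        "attribute memory objects",
        "track stack accesses",
        "identify data reuse",
    },
}


def lightest_mode_for(required: set[str]) -> str:
    """Return the first (lightest) mode whose features cover `required`."""
    for mode in MODE_ORDER:
        if required <= MODE_FEATURES[mode]:
            return mode
    raise ValueError(f"no mode provides: {sorted(required)}")


print(lightest_mode_for({"track stack accesses"}))  # Medium
print(lightest_mode_for({"identify data reuse"}))   # Full
```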

Dependencies Analysis Properties

Use This

To Do This

Inherit settings from the Survey Hotspots Analysis Type checkbox

Copy similar settings from Survey analysis properties (enable).

When enabled, this option disables application parameters controls.

Suppression mode radio buttons

  • Report possible dependencies in system modules (choose the Show problems in system modules radio button).
  • Do not report possible dependencies in system modules (choose the Suppress problems in system modules radio button).

Loop call count limit selector

Choose the maximum number of instances of each marked loop to analyze. 0 = analyze all loop instances.

Supplying a non-zero value could decrease analysis overhead.

Instance of interest selector

Analyze the nth child process, where 1 = the first process of the specified name in the application process tree. 0 = analyze all processes.

Supplying a non-zero value could decrease analysis overhead.

Analyze stack variables checkbox

Analyze parallel data sharing for stack variables (enable).

Enabling could increase analysis overhead.

Filter stack variables by scope checkbox

Enable to report:

  • Variables initialized inside the loop as potential dependencies (warning)
  • Variables initialized outside the loop as dependencies (error)

Enabling could increase analysis overhead.

Reduction Detection / Filter reduction variables checkbox

Mark all potential reductions by a specific diagnostic (enable).

Enabling could increase analysis overhead.

Markup type checkbox

Select loops/functions by pre-defined markup algorithm. Supported algorithms are:

  • GPU generic - Select loops executed on a GPU.

  • OpenMP - Select OpenMP* loops.

  • SYCL - Select SYCL loops.

  • OpenCL - Select OpenCL™ loops.

  • DAAL - Select Intel® oneAPI Data Analytics Library loops.

  • TBB - Select Intel® oneAPI Threading Building Blocks loops.

NOTE:
This option is available only from the Analysis Workflow pane for the Offload Modeling perspective.

Performance Modeling Properties

Use This

To Do This

Assume Dependencies checkbox

Assume that loops have dependencies if their dependency type is unknown.

NOTE:
This option is available only from the Analysis Workflow pane.

Single Kernel Launch Tax checkbox

Assume the invocation tax is paid only for the first kernel launch when estimating invocation taxes.

NOTE:
This option is available only from the Analysis Workflow pane.

Data Reuse Analysis checkbox

Analyze potential data reuse between code regions for the data transferred between host and target platforms.

NOTE:
This option is available only from the Analysis Workflow pane.
NOTE:
To analyze data reuse, set Data Transfer Analysis to Full mode in the Characterization step.

Target Config drop-down

Select a pre-defined hardware configuration from the drop-down list to model application performance on.

Other parameters field

Enter a space-separated list of command-line parameters. See Command Option Reference for available parameters.

Baseline Device drop-down

Select the baseline device that your application runs on so that Intel® Advisor can collect performance data.

Custom Device Configuration field

Specify the absolute path or name for a custom TOML configuration file with additional modeling parameters.