Miscellaneous Techniques to Minimize Analysis Overhead
Issue
Running your target application with the Intel® Advisor can take substantially longer than running it without the Intel® Advisor. The amount of overhead added to your application execution time depends on the accuracy level and the analyses you choose for a perspective. For example:
Runtime Overhead / Analysis | Survey | Characterization | Dependencies | MAP |
---|---|---|---|---|
Target application runtime with Intel® Advisor compared to runtime without Intel® Advisor | 1.1x longer | 2 - 55x longer | 5 - 100x longer | 5 - 20x longer |
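To put these multipliers in concrete terms, the following sketch estimates how long each analysis might take for a given baseline runtime. The function name estimate_overhead is hypothetical, and the multipliers are simply the example figures from the table above, so treat the output as a rough range rather than a prediction.

```shell
#!/bin/sh
# Hypothetical helper: estimate analysis runtime ranges (in seconds) from the
# runtime of the application without Intel Advisor.
# Usage: estimate_overhead BASELINE_SECONDS
estimate_overhead() {
    awk -v t="$1" 'BEGIN {
        # Multipliers are the example figures from the overhead table.
        printf "Survey: about %.0f s\n", t * 1.1
        printf "Characterization: %.0f - %.0f s\n", t * 2, t * 55
        printf "Dependencies: %.0f - %.0f s\n", t * 5, t * 100
        printf "MAP: %.0f - %.0f s\n", t * 5, t * 20
    }'
}

# A 10-second baseline run.
estimate_overhead 10
```

For a 10-second baseline, the Dependencies estimate ranges from 50 to 1000 seconds, which shows why a short baseline run matters most for the heavier analyses.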
Solutions
The following techniques may help minimize overhead without limiting collection scope.
Disable Cache Simulation
Minimize collection overhead.
Applicable analyses:
Memory Access Patterns (base simulation functionality)
Characterization with Trip Counts and FLOP collection enabled (enhanced simulation functionality)
Implement these techniques when cache modeling information is not important to you:
The default setting for all the properties/options in the table below is disabled.
From the Analysis Workflow pane, disable Characterization > Enable CPU cache simulation for the Characterization analysis.
From the Project Properties:
Path: Project Properties > Analysis Target... | CLI Action Options | Description |
---|---|---|
Disable the Memory Access Patterns Analysis > Advanced > Enable Memory-Level Roofline with cache simulation checkbox. | --no-enable-cache-simulation | Do not model cache misses, cache misses and cache line utilization, or cache misses and loop footprint. |
Disable the Trip Counts and FLOP Analysis > Advanced > Enable CPU cache simulation checkbox. | --no-enable-cache-simulation | Do not: |
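On the command line, the option from the table can be passed to either applicable analysis, as in the sketch below. The project directory ./advi_results and the executable ./myApplication are placeholders. The run wrapper and its DRY_RUN variable are hypothetical helpers, not part of the Intel® Advisor: they print each command and execute it only when DRY_RUN=0 is set, so you can check the command line before paying the collection cost. Depending on your version, the Memory Access Patterns analysis may also need loops selected for analysis, which is omitted here.

```shell
#!/bin/sh
# Hypothetical helper: print the command; execute it only when DRY_RUN=0.
run() {
    echo "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Memory Access Patterns without cache simulation.
run advisor --collect=map --project-dir=./advi_results \
    --no-enable-cache-simulation -- ./myApplication

# Characterization (Trip Counts and FLOP) without cache simulation.
run advisor --collect=tripcounts --flop --project-dir=./advi_results \
    --no-enable-cache-simulation -- ./myApplication
```

Run the script with DRY_RUN=0 in the environment to start the collections for real.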
Limit Reported Data
Applicable analysis: Memory Access Patterns.
Implement these techniques when the additional data is not important to you.
The default setting for all the properties/options in the table below is enabled.
Path: Project Properties > Analysis Target > Memory Access Patterns Analysis > Advanced | CLI Action Options | Description |
---|---|---|
Disable the Report stack variables checkbox. | --no-record-stack-frame | Do not report stack variables for which memory access strides are detected. |
Disable the Report heap allocated variables checkbox. | --no-record-mem-allocations | Do not report heap-allocated variables for which memory access strides are detected. |
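As a sketch of how these options combine on the command line, the hypothetical helper map_command below prints a Memory Access Patterns command with stack and heap variable reporting switched off independently. It assumes that --no-record-stack-frame controls stack variables and --no-record-mem-allocations controls heap-allocated variables; confirm both against advisor --help collect for your version. The project directory and executable name are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print a Memory Access Patterns command line.
# Arguments: $1 = report stack variables (yes/no)
#            $2 = report heap-allocated variables (yes/no)
map_command() {
    opts=""
    if [ "$1" = "no" ]; then
        # Assumed option for stack variables.
        opts="$opts --no-record-stack-frame"
    fi
    if [ "$2" = "no" ]; then
        # Assumed option for heap-allocated variables.
        opts="$opts --no-record-mem-allocations"
    fi
    echo "advisor --collect=map --project-dir=./advi_results$opts -- ./myApplication"
}

# Turn off both kinds of variable reporting.
map_command no no
```

The helper only prints the command. Copy the output into your shell, or wrap the call in eval, once the options look right.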
Minimize Data Set
Minimize collection overhead.
Applicable analyses: All, but especially Dependencies, Memory Access Patterns.
When you run an analysis, the Intel® Advisor executes the target against the supplied data set. Data set size and workload have a direct impact on target application execution time and analysis speed.
For example, it takes longer to process a 1000x1000 pixel image than a 100x100 pixel image. A possible reason: You may have loops with an iteration space of 1...1000 for the larger image, but only 1...100 for the smaller image. The exact same code paths may be executed in both cases. The difference is the number of times these code paths are repeated.
You can control analysis cost without sacrificing completeness by removing this kind of unnecessary repetition from target application execution.
Instead of choosing large, repetitive data sets, choose small, representative data sets that minimize the number of instructions executed within a loop while thoroughly exercising target application control flow paths.
Your objective: In as short a runtime period as possible, execute as many paths as you can afford, while minimizing the repetitive computation within each task to the bare minimum needed for good code coverage.
Data sets that run in about ten seconds or less are ideal. You can always create additional data sets to ensure all your code is checked.
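One way to check a candidate data set against the ten-second guideline is to time the application without the Intel® Advisor first. The sketch below does this with a hypothetical helper, fits_budget. It relies on date +%s, which is widely available but not guaranteed on every system, and it measures whole seconds only. The sleep 1 command stands in for your application.

```shell
#!/bin/sh
# Hypothetical helper: run a command without Intel Advisor and report whether
# it finishes within a time budget given in whole seconds.
# Usage: fits_budget BUDGET_SECONDS command [args...]
fits_budget() {
    budget=$1
    shift
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    elapsed=$((end - start))
    if [ "$elapsed" -le "$budget" ]; then
        echo "ok: ${elapsed}s (budget ${budget}s): $*"
    else
        echo "too slow: ${elapsed}s (budget ${budget}s): $*"
        return 1
    fi
}

# Placeholder command; substitute your application and data set, for example:
#   fits_budget 10 ./myApplication small_input.dat
fits_budget 10 sleep 1
```

A data set that passes this check still needs to exercise the control flow paths you care about; the timing only confirms that it is small enough.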
Temporarily Disable Finalization
Minimize finalization overhead.
Applicable analyses: Survey, Characterization with Trip Counts and FLOP collection enabled.
Use when you plan to view collected analysis data on a different machine. Finalization automatically occurs when a result is opened in the GUI or a report is generated from the result.
Note: In the commands below, replace myApplication with your application executable path and name before executing a command. If your application requires additional command line options, add them after the executable name.
To implement, do one of the following while running an analysis:
When the analysis Finalizing data... phase begins, click the associated Cancel button.
Use the advisor CLI action option --no-auto-finalize when you run the desired analysis. For example:
advisor --collect=survey --project-dir=./advi_results --no-auto-finalize -- ./myApplication
The next time you open the result in the GUI, it is refinalized automatically.
To refinalize the result from the CLI, use the --refinalize-survey option with the --report action. For example:
advisor --report=survey --search-dir src:=./src bin:=./bin --refinalize-survey --project-dir=./advi_results -- ./myApplication