Loop Markup to Minimize Analysis Overhead
Issue
Running your target application with the Intel® Advisor can take substantially longer than running it without the Intel® Advisor. The overhead added to your application execution time depends on the accuracy level and the analyses you choose for a perspective. For example:
Runtime Overhead / Analysis | Survey | Characterization | Dependencies | MAP |
---|---|---|---|---|
Target application runtime with Intel® Advisor compared to runtime without Intel® Advisor | 1.1x longer | 2 - 55x longer | 5 - 100x longer | 5 - 20x longer |
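To get a rough sense of what these multipliers mean in wall-clock time, the sketch below scales a baseline runtime by factors taken from the table. The helper name estimate_runtime and the 60-second baseline are illustrative assumptions; actual overhead depends on your application and analysis settings.

```shell
# Hypothetical helper: scale a baseline runtime (seconds) by an overhead factor.
estimate_runtime() {
  awk -v base="$1" -v factor="$2" 'BEGIN { printf "%.0f\n", base * factor }'
}

baseline=60  # assumed runtime without Intel Advisor, in seconds

echo "Survey:       about $(estimate_runtime "$baseline" 1.1) s"
echo "Dependencies: $(estimate_runtime "$baseline" 5) to $(estimate_runtime "$baseline" 100) s"
```

The wide Dependencies range is the main reason to restrict that analysis to a few loops.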
Solutions
Use the following techniques to skip uninteresting loops and analyze only interesting loops.
Select Loops by ID
Goal: Minimize collection overhead.
Applicable analyses: Characterization with Trip Counts and FLOP collection enabled, Dependencies, Memory Access Patterns.
Use when...
You want to perform a deeper analysis on only a few loops.
CLI environment: You cannot identify source file/line numbers, such as when you are analyzing a target application for which you do not have access to source code.
Note: In the commands below, replace myApplication with your application executable path and name before executing a command. If your application requires additional command line options, add them after the executable name.
Prerequisites:
Run a Survey analysis.
advisor CLI environment: Identify the loop IDs for the loops of interest.
advisor --report=survey --project-dir=./advi_results -- ./myApplication
In the report, the first column contains the loop IDs.
Intel® Advisor reports tend to be very wide. Do one of the following to generate readable reports:
Set your console width appropriately to avoid line wrapping.
Pipe your report using the appropriate truncation command if you care only about the first few report columns.
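One possible truncation approach is to cut every report line to a fixed width with the standard cut utility. The helper name trim_report and the default width of 120 characters are assumptions made for illustration, not part of Intel® Advisor.

```shell
# Hypothetical helper: keep only the first N characters of each input line
# (N defaults to 120 when no argument is given).
trim_report() {
  cut -c "1-${1:-120}"
}

# Intended use (requires Intel Advisor and an existing Survey result):
#   advisor --report=survey --project-dir=./advi_results -- ./myApplication | trim_report 100

# Stand-in input so the helper can be tried without Intel Advisor installed.
printf '%s\n' 'abcdefghijklmnopqrstuvwxyz' | trim_report 10
```

Because the loop IDs are in the first column, a narrow width is usually enough when you only need the IDs.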
After performing the prerequisites, do one of the following:
For Vectorization and CPU Roofline: Mark the loop(s) of interest by enabling the associated checkbox on the Survey Report.
Then run a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis.
For Offload Modeling: Go to Project Properties > Performance Modeling and enter the CLI action option --select=<string> in the Other parameters field. For example, --select=5,10,12.
Mark the loop(s) of interest using the CLI action option --select=<string> (recommended) or --mark-up-list=<string> when running a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis. For example, with the --select option:
advisor --collect=tripcounts --flop --project-dir=./advi_results --select=5,10,12 -- ./myApplication
Then run a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis.
There are different ways to select loops in the CLI environment:
The advisor CLI action options --mark-up-list=<string> and --select=<string> merely simulate enabling a GUI checkbox when used with the --collect action. They are active only for the duration of the --collect command.
The same options used with the advisor CLI action --mark-up-loops actually enable a GUI checkbox. They remain active beyond the duration of the --mark-up-loops command and apply to all downstream analyses, such as Characterization with Trip Counts and FLOP collection enabled, Dependencies, and Memory Access Patterns.
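The difference between the two forms can be sketched as follows. The helper join_ids is hypothetical, and the loop IDs 5, 10, and 12 are the example values used earlier in this section. The commands are only printed here, so the sketch can be tried without Intel® Advisor installed.

```shell
# Hypothetical helper: join loop IDs with commas, the form --select expects.
# The body runs in a subshell so the IFS change does not leak to the caller.
join_ids() (
  IFS=,
  printf '%s\n' "$*"
)

ids=$(join_ids 5 10 12)  # example loop IDs from the first Survey report column

# Transient selection: applies only to this --collect run.
echo "advisor --collect=tripcounts --flop --project-dir=./advi_results --select=$ids -- ./myApplication"

# Persistent selection: stays in effect for analyses run afterwards.
echo "advisor --mark-up-loops --select=$ids --project-dir=./advi_results -- ./myApplication"
```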
Select Loops by Source File/Line Number
Goal: Minimize collection overhead.
Applicable analyses: Characterization with Trip Counts and FLOP collection enabled, Dependencies, Memory Access Patterns.
Use when...
You want to perform a deeper analysis on only a few loops.
CLI environment: You are analyzing a target application for which you have access to source code and can identify source file/line numbers.
Note: In the commands below, replace myApplication with your application executable path and name before executing a command. If your application requires additional command line options, add them after the executable name.
Prerequisites:
Run a Survey analysis.
advisor CLI environment: If necessary, identify the source file and line number for the loops of interest.
advisor --report=survey --project-dir=./advi_results -- ./myApplication
After performing the prerequisites, do one of the following:
For Vectorization and CPU Roofline: Mark the loop(s) of interest by enabling the associated checkbox on the Survey report.
Then run a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis.
For Offload Modeling: Go to Project Properties > Performance Modeling and enter the CLI action option --select=<string> in the Other parameters field. For example, --select=foo.cpp:34,bar.cpp:192.
Mark the loop(s) of interest using the CLI action option --select=<string> (recommended) or --mark-up-list=<string> for a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis. For example, with the --select option:
advisor --collect=tripcounts --flop --project-dir=./advi_results --select=foo.cpp:34,bar.cpp:192 -- ./bin/myApplication
Mark the loop(s) of interest using the advisor CLI action --mark-up-loops and action option --select=<string>. For example:
advisor --mark-up-loops --select=foo.cpp:34,bar.cpp:192 --project-dir=./advi_results -- ./myApplication
Then run a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis.
There is essentially no difference between selecting loops by ID and selecting loops by source file/line in the GUI environment. The difference is in the advisor CLI environment:
The advisor CLI action option --mark-up-list=<string> merely simulates enabling a GUI checkbox; therefore, it persists only for the duration of the --collect command.
The advisor CLI action --mark-up-loops with the action option --select=<string> actually enables a GUI checkbox; therefore, the selection persists beyond the duration of the --mark-up-loops command and applies to downstream analyses, such as Characterization with Trip Counts and FLOP collection enabled, Dependencies, and Memory Access Patterns.
If you use the --mark-up-loops CLI action to mark up loops, you can later add or remove source file/line numbers for subsequent analysis runs using the advisor CLI action options --append=<string> and --remove=<string>, respectively.
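Adjusting a persistent selection might look like the sketch below. It assumes that --append and --remove are passed together with the --mark-up-loops action in the same way as --select. The helper markup_cmd and the location baz.cpp:77 are hypothetical. The helper only prints the command so you can review it before running it.

```shell
# Hypothetical helper: print a --mark-up-loops command that adjusts the selection.
# $1 is "append" or "remove"; $2 is a comma-separated list of file:line locations.
markup_cmd() {
  case "$1" in
    append|remove) ;;
    *) echo "usage: markup_cmd append|remove <file:line,...>" >&2; return 1 ;;
  esac
  printf 'advisor --mark-up-loops --%s=%s --project-dir=./advi_results -- ./myApplication\n' "$1" "$2"
}

markup_cmd append baz.cpp:77    # add a loop to the existing selection
markup_cmd remove bar.cpp:192   # drop a loop that was selected earlier
```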
Select Loops by Criteria
Goal: Minimize collection overhead.
Applicable analyses: Dependencies, Memory Access Patterns.
Use when you want to perform a deeper analysis on loops chosen by criteria instead of by human input, such as when you are running the Intel® Advisor with a collection preset or using automated scripts.
To implement this in the advisor CLI environment, either run commands similar to the following examples one by one from the command line, or put them in a script and run it to execute the commands automatically. Use the --select (recommended) or --loops option to select loops by criteria.
Note: In the commands below, replace myApplication with your application executable path and name before executing a command. If your application requires additional command line options, add them after the executable name.
For example, to analyze loop-carried dependencies in loops/functions that have the Assumed dependency present issue, use one of the following:
Example 1:
advisor --collect=survey --project-dir=./advi_results -- ./bin/myApplication
advisor --collect=dependencies --project-dir=./advi_results -- ./bin/myApplication
Example 2:
advisor --collect=survey --project-dir=./advi_results -- ./bin/myApplication
advisor --collect=dependencies --select="scalar,has-issue" --project-dir=./advi_results -- ./bin/myApplication
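For automated runs, the two steps can be wrapped in a small script. The following is a minimal sketch, assuming the --select form of Example 2. The names run, analyze_by_criteria, and DRY_RUN are hypothetical. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
# Dry-run wrapper: prints each command; set DRY_RUN=0 to execute it instead.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper: Survey first, then Dependencies on loops matching the criteria.
# $1 = project directory, $2 = application executable, $3 = selection criteria
analyze_by_criteria() {
  run advisor --collect=survey --project-dir="$1" -- "$2" &&
  run advisor --collect=dependencies --select="$3" --project-dir="$1" -- "$2"
}

analyze_by_criteria ./advi_results ./myApplication "scalar,has-issue"
```

Chaining the steps with && keeps the Dependencies analysis from starting if the Survey step fails.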
Select Loops by Markup Algorithm
Goal: Minimize collection overhead.
Applicable analyses: Characterization with Trip Counts and FLOP collection enabled, Dependencies, Memory Access Patterns.
Use --select=r:markup=<algorithm> when you want to perform a deeper analysis on loops chosen by a predefined markup algorithm based on the programming model used and/or the estimated offload profitability.
If you analyze an application that runs on a CPU, use the gpu_generic algorithm. This algorithm selects all potentially profitable loops/functions for additional analyses to collect more data and make sure they can be safely offloaded.
If you analyze code regions that are already offloaded and use a specific programming model, use one of the following algorithms:
omp - Select OpenMP* loops.
icpx -fsycl - Select SYCL loops.
ocl - Select OpenCL™ loops.
daal - Select Intel® oneAPI Data Analytics Library loops.
tbb - Select Intel® oneAPI Threading Building Blocks loops.
Note: In the commands below, replace myApplication with your application executable path and name before executing a command. If your application requires additional command line options, add them after the executable name.
For example, to run Offload Modeling and analyze potentially profitable code regions in detail:
Example 1. Use the --select=r:markup=<algorithm> option with the --collect action to select loops only for the specific analysis.
advisor --collect=survey --project-dir=./advi_results --static-instruction-mix -- ./myApplication
advisor --collect=tripcounts --project-dir=./advi_results --flop --cache-simulation=single --target-device=xehpg_512xve --stacks --data-transfer=light -- ./myApplication
advisor --collect=dependencies --filter-reductions --loop-call-count-limit=16 --select markup=gpu_generic --project-dir=./advi_results -- ./myApplication
advisor --collect=projection --project-dir=./advi_results
Example 2. Use the --select=r:markup=<algorithm> option with the --mark-up-loops action in a separate step to select loops for all analyses executed after this command.
advisor --collect=survey --project-dir=./advi_results --static-instruction-mix -- ./myApplication
advisor --collect=tripcounts --project-dir=./advi_results --flop --cache-simulation=single --target-device=xehpg_512xve --stacks --data-transfer=light -- ./myApplication
advisor --mark-up-loops --project-dir=./advi_results --select markup=gpu_generic -- ./myApplication
advisor --collect=dependencies --filter-reductions --loop-call-count-limit=16 --project-dir=./advi_results -- ./myApplication
advisor --collect=projection --project-dir=./advi_results
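If you repeat this sequence with different markup algorithms, a small generator can help. The following is a sketch only: the function name offload_pipeline and its parameters are hypothetical, and the printed commands simply mirror Example 2 with the algorithm and project directory substituted.

```shell
# Hypothetical generator: prints the Example 2 command sequence.
# $1 = markup algorithm (default: gpu_generic), $2 = project directory.
# The body runs in a subshell so algo and proj do not leak to the caller.
offload_pipeline() (
  algo=${1:-gpu_generic}
  proj=${2:-./advi_results}
  cat <<EOF
advisor --collect=survey --project-dir=$proj --static-instruction-mix -- ./myApplication
advisor --collect=tripcounts --project-dir=$proj --flop --cache-simulation=single --target-device=xehpg_512xve --stacks --data-transfer=light -- ./myApplication
advisor --mark-up-loops --project-dir=$proj --select markup=$algo -- ./myApplication
advisor --collect=dependencies --filter-reductions --loop-call-count-limit=16 --project-dir=$proj -- ./myApplication
advisor --collect=projection --project-dir=$proj
EOF
)

offload_pipeline omp
```

Review the printed commands first. Piping the output to sh -e would then run the steps in order and stop at the first failure.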