Offload Modeling Accuracy Levels in Command Line

Intel® Advisor User Guide

ID 766448

Date 10/31/2024

For each perspective, Intel® Advisor has several levels of collection accuracy. Each accuracy level is a set of analyses and properties that control what data is collected and how detailed the collection is. The higher the accuracy level you choose, the higher the runtime overhead.

You can generate commands for a desired accuracy level from the Intel Advisor GUI. See Generate Command Lines from GUI for details.

NOTE:

Several techniques are available to minimize data collection, result size, and execution overhead. See Minimize Analysis Overhead.

CPU-to-GPU Modeling

For the CPU-to-GPU modeling, the following accuracy levels are available:

Comparison / Accuracy Level	Low	Medium	High
Overhead	5 - 10x	15 - 50x	50 - 80x
Goal	Model performance of an application that is mostly compute bound and does not have dependencies	Model application performance considering memory traffic for all cache and memory levels	Model application performance with all potential limitations for offload candidates
Analyses	Survey + Characterization (Trip Counts and FLOP) + Performance Modeling with no assumed dependencies	Survey + Characterization (Trip Counts and FLOP with cache simulation for the selected target device, callstacks, and light data transfer simulation) + Performance Modeling with no assumed dependencies	Survey + Characterization (Trip Counts and FLOP with cache simulation for the selected target device, callstacks, and medium data transfer simulation) + Dependencies + Performance Modeling with assumed dependencies
Result	Basic Offload Modeling report that shows potential speedup and performance metrics estimated on a target considering memory traffic from execution units to L1 cache only. The result might be inaccurate for memory-bound applications.	Offload Modeling report extended with data transfers estimated between host and device platforms considering memory traffic for all cache and memory levels	Offload Modeling report with detailed data transfer estimations and automated check for loop-carried dependencies for more accurate search for the most profitable regions to offload
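As a rough illustration of what the overhead row means in practice, the sketch below multiplies a baseline runtime by the CPU-to-GPU overhead ranges from the table. The function name estimate_overhead and the 60-second baseline are hypothetical, and the result is only a back-of-the-envelope range, not a prediction of actual collection time.

```shell
#!/bin/sh
# Estimate the collection time range for CPU-to-GPU modeling.
# The factors are the overhead ranges from the table above;
# estimate_overhead is a hypothetical helper, not part of Intel Advisor.
estimate_overhead() {
    baseline_s=$1   # runtime without Intel Advisor, in whole seconds
    level=$2        # low | medium | high
    case "$level" in
        low)    min=5;  max=10 ;;
        medium) min=15; max=50 ;;
        high)   min=50; max=80 ;;
        *) echo "unknown accuracy level: $level" >&2; return 1 ;;
    esac
    echo "$level: $((baseline_s * min))-$((baseline_s * max)) s"
}

# Example: an application that normally runs for 60 seconds.
estimate_overhead 60 low      # low: 300-600 s
estimate_overhead 60 medium   # medium: 900-3000 s
estimate_overhead 60 high     # high: 3000-4800 s
```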

NOTE: In the commands below, replace myApplication with the path and name of your application executable before running a command. If your application requires additional command line options, add them after the executable name.
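The placement of application options can be sketched as follows. The executable path ./build/my_solver and the options --size 1024 are hypothetical placeholders, and the helper build_offload_cmd only prints the command rather than running it, so the sketch works without Intel Advisor installed.

```shell
#!/bin/sh
# Hypothetical values for illustration: substitute your own executable.
APP=./build/my_solver
PROJECT_DIR=./advi_results

# Print an Offload Modeling command line. Everything passed to this
# helper is treated as an application option and is placed after the
# executable name, that is, after the "--" separator and the executable.
build_offload_cmd() {
    printf 'advisor --collect=offload --project-dir=%s -- %s' "$PROJECT_DIR" "$APP"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# "--size 1024" stands in for options that belong to the application.
build_offload_cmd --size 1024
```

Options placed before the -- separator are interpreted by advisor; options placed after the executable name are passed to the application.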

NOTE:

Families of Intel® Xe graphics products starting with Intel® Arc™ Alchemist (formerly DG2) and newer generations feature GPU architecture terminology that shifts from legacy terms. For more information on the terminology changes and to understand their mapping with legacy content, see GPU Architecture Terminology for Intel® Xe Graphics.

Low Accuracy

To model application performance with low accuracy for a default target device, run the following command:

advisor --collect=offload --accuracy=low --project-dir=./advi_results -- ./myApplication

This command runs the following analyses one by one:

Survey analysis:

advisor --collect=survey --auto-finalize --static-instruction-mix --project-dir=./advi_results -- ./myApplication

Characterization analysis to collect trip count and FLOP data:

advisor --collect=tripcounts --flop --auto-finalize --target-device=xehpg_512xve --project-dir=./advi_results -- ./myApplication

Performance modeling:

advisor --collect=projection --no-assume-dependencies --config=xehpg_512xve --project-dir=./advi_results
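If you prefer to run the three analyses separately, for example to rerun only the modeling step, the sequence above could be wrapped in a small script. This is a minimal sketch: the variable names APP, PROJECT_DIR, TARGET, and DRY_RUN and the helper run are assumptions of this example, not Intel Advisor features. With DRY_RUN=1 (the default here) the script only prints the commands.

```shell
#!/bin/sh
set -e  # stop at the first failing step

# Hypothetical settings; override them from the environment as needed.
APP=${APP:-./myApplication}
PROJECT_DIR=${PROJECT_DIR:-./advi_results}
TARGET=${TARGET:-xehpg_512xve}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# The three low-accuracy steps, in the order listed above.
low_accuracy_steps() {
    run advisor --collect=survey --auto-finalize --static-instruction-mix \
        --project-dir="$PROJECT_DIR" -- "$APP"
    run advisor --collect=tripcounts --flop --auto-finalize \
        --target-device="$TARGET" --project-dir="$PROJECT_DIR" -- "$APP"
    run advisor --collect=projection --no-assume-dependencies \
        --config="$TARGET" --project-dir="$PROJECT_DIR"
}

low_accuracy_steps
```

Set DRY_RUN=0 to execute the commands. All three steps must use the same project directory so that each analysis can read the data collected by the previous one.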

Medium Accuracy

This accuracy level is the default. To model application performance with medium accuracy for a default target device, run the following command:

advisor --collect=offload --project-dir=./advi_results -- ./myApplication

This command runs the following analyses one by one:

Survey analysis:

advisor --collect=survey --auto-finalize --static-instruction-mix --project-dir=./advi_results -- ./myApplication

Characterization analysis to collect trip count and FLOP data:

advisor --collect=tripcounts --flop --stacks --auto-finalize --cache-simulation=single --data-transfer=light --target-device=xehpg_512xve --project-dir=./advi_results -- ./myApplication

Performance modeling:

advisor --collect=projection --no-assume-dependencies --config=xehpg_512xve --project-dir=./advi_results

High Accuracy

To model application performance with high accuracy for a default target device, run the following command:

advisor --collect=offload --accuracy=high --project-dir=./advi_results -- ./myApplication

This command runs the following analyses one by one:

Survey analysis:

advisor --collect=survey --auto-finalize --static-instruction-mix --project-dir=./advi_results -- ./myApplication

Characterization analysis to collect trip count and FLOP data:

advisor --collect=tripcounts --flop --stacks --auto-finalize --cache-simulation=single --target-device=xehpg_512xve --data-transfer=medium --project-dir=./advi_results -- ./myApplication

Dependencies analysis:

advisor --collect=dependencies --filter-reductions --loop-call-count-limit=16 --select markup=gpu_generic --project-dir=./advi_results -- ./myApplication

Performance modeling:

advisor --collect=projection --config=xehpg_512xve --project-dir=./advi_results

See Check How Dependencies Affect Modeling for a recommended strategy to check for loop-carried dependencies.
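To make the differences between the three CPU-to-GPU levels easier to see, the sketch below prints the Characterization command for a given level using only the options listed in the sections above. The helper characterization_cmd is hypothetical and only prints text. For the high level it emits --data-transfer=medium before --target-device, which differs from the listing above only in option order; this sketch assumes the order is not significant.

```shell
#!/bin/sh
# Print the Characterization command for a CPU-to-GPU accuracy level.
# characterization_cmd is a hypothetical helper; it does not run advisor.
characterization_cmd() {
    level=$1   # low | medium | high
    case "$level" in
        low)
            stacks=""
            extra="" ;;
        medium)
            stacks=" --stacks"
            extra=" --cache-simulation=single --data-transfer=light" ;;
        high)
            stacks=" --stacks"
            extra=" --cache-simulation=single --data-transfer=medium" ;;
        *)
            echo "unknown accuracy level: $level" >&2
            return 1 ;;
    esac
    echo "advisor --collect=tripcounts --flop${stacks} --auto-finalize${extra} --target-device=xehpg_512xve --project-dir=./advi_results -- ./myApplication"
}

for level in low medium high; do
    characterization_cmd "$level"
done
```

Medium and high accuracy add call stacks and cache simulation to the Characterization step and differ from each other only in the data transfer simulation level. High accuracy also adds the separate Dependencies analysis and drops --no-assume-dependencies from the Performance Modeling step.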

GPU-to-GPU Modeling

For the GPU-to-GPU modeling, the following accuracy levels are available:

Comparison / Accuracy Level	Low	Medium	High
Overhead	5 - 10x	15 - 50x	15 - 50x
Goal	Model performance of an application that is mostly compute bound	Model application performance considering memory traffic for all cache and memory levels	Model application performance with all potential limitations for offload candidates
Analyses	Survey + Characterization (Trip Counts and FLOP) + Performance Modeling	Survey + Characterization (Trip Counts and FLOP with light data transfer simulation) + Performance Modeling	Survey + Characterization (Trip Counts and FLOP with medium data transfer simulation) + Performance Modeling
Result	Basic Offload Modeling report that shows potential speedup and performance metrics estimated on a target considering memory traffic from execution units to L1 cache only. The result might be inaccurate for memory-bound applications.	Offload Modeling report extended with data transfers estimated between host and device platforms	Offload Modeling report with detailed data transfer estimations for more accurate search for the most profitable regions to offload

Low Accuracy

To model application performance with low accuracy for a default target device, run the following command:

advisor --collect=offload --accuracy=low --gpu --project-dir=./advi_results -- ./myApplication

This command runs the following analyses one by one:

Survey analysis:

advisor --collect=survey --auto-finalize --static-instruction-mix --profile-gpu --project-dir=./advi_results -- ./myApplication

Characterization analysis to collect trip count and FLOP data:

advisor --collect=tripcounts --flop --auto-finalize --target-device=xehpg_512xve --profile-gpu --project-dir=./advi_results -- ./myApplication

Performance modeling:

advisor --collect=projection --no-assume-dependencies --config=xehpg_512xve --profile-gpu --project-dir=./advi_results

Medium Accuracy

This accuracy level is the default. To model application performance with medium accuracy for a default target device, run the following command:

advisor --collect=offload --gpu --project-dir=./advi_results -- ./myApplication

This command runs the following analyses one by one:

Survey analysis:

advisor --collect=survey --auto-finalize --static-instruction-mix --profile-gpu --project-dir=./advi_results -- ./myApplication

Characterization analysis to collect trip count and FLOP data:

advisor --collect=tripcounts --flop --auto-finalize --data-transfer=light --target-device=xehpg_512xve --profile-gpu --project-dir=./advi_results -- ./myApplication

Performance modeling:

advisor --collect=projection --no-assume-dependencies --config=xehpg_512xve --profile-gpu --project-dir=./advi_results

High Accuracy

To model application performance with high accuracy for a default target device, run the following command:

advisor --collect=offload --accuracy=high --gpu --project-dir=./advi_results -- ./myApplication

This command runs the following analyses one by one:

Survey analysis:

advisor --collect=survey --auto-finalize --static-instruction-mix --profile-gpu --project-dir=./advi_results -- ./myApplication

Characterization analysis to collect trip count and FLOP data:

advisor --collect=tripcounts --flop --auto-finalize --target-device=xehpg_512xve --profile-gpu --data-transfer=medium --project-dir=./advi_results -- ./myApplication

Performance modeling:

advisor --collect=projection --config=xehpg_512xve --profile-gpu --project-dir=./advi_results
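The single-command forms for both modeling types follow one pattern: --accuracy selects the level (omitted for the default medium level) and --gpu switches to GPU-to-GPU modeling. The sketch below prints the matching command for a given combination. The helper offload_cmd and its cpu/gpu argument are hypothetical names chosen for this example.

```shell
#!/bin/sh
# Print the one-line Offload Modeling command for a modeling type and
# accuracy level. offload_cmd is a hypothetical helper that only prints.
offload_cmd() {
    mode=$1    # cpu (CPU-to-GPU modeling) | gpu (GPU-to-GPU modeling)
    level=$2   # low | medium | high
    cmd="advisor --collect=offload"
    case "$level" in
        low|high) cmd="$cmd --accuracy=$level" ;;
        medium)   ;;  # medium is the default, so --accuracy is omitted
        *) echo "unknown accuracy level: $level" >&2; return 1 ;;
    esac
    case "$mode" in
        gpu) cmd="$cmd --gpu" ;;
        cpu) ;;
        *) echo "unknown modeling type: $mode" >&2; return 1 ;;
    esac
    echo "$cmd --project-dir=./advi_results -- ./myApplication"
}

offload_cmd cpu low
offload_cmd gpu medium
offload_cmd gpu high
```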

After you run the perspective, you can view the results in the Intel Advisor GUI, in the CLI, or in an interactive HTML report.

Parent topic: Run Offload Modeling Perspective from Command Line
