Run GPU Roofline Insights Perspective from Command Line
To plot a Roofline chart, Intel® Advisor runs two steps:
- Collect OpenCL™ kernel timings and memory data using the Survey analysis with GPU profiling.
- Measure the hardware limitations and collect floating-point and integer operations data using the Characterization analysis with GPU profiling.
Intel® Advisor calculates compute operations (FLOP and INTOP) as a weighted sum of the following groups of instructions: BASIC COMPUTE, FMA, BIT, DIV, POW, MATH.
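Written as a formula, the weighted sum has the general form below. The symbols are illustrative assumptions, not names used by Intel Advisor: G is the set of instruction groups listed above, N_g is the number of executed instructions in group g, and w_g is the weight applied to that group. The weight values are not stated here.

```latex
\mathrm{OPS} = \sum_{g \in G} w_g \, N_g,
\qquad
G = \{\text{BASIC COMPUTE},\ \text{FMA},\ \text{BIT},\ \text{DIV},\ \text{POW},\ \text{MATH}\}
```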
Intel Advisor automatically determines the data type of the collected operations from the destination (dst) register.
For convenience, Intel Advisor has the shortcut --collect=roofline command line action, which you can use to run both Survey and Characterization analyses with a single command. This shortcut command is recommended to run the GPU Roofline Insights perspective.
Prerequisites
- Configure your system to analyze GPU kernels.
- Set Intel Advisor environment variables with an automated script to enable the advisor command line interface (CLI).
Run the GPU Roofline Insights Perspective
There are two methods to run the GPU Roofline analysis. Use one of the following:
- Run the shortcut --collect=roofline command line action to execute the Survey and Characterization analyses for GPU kernels with a single command. This method is recommended to run the GPU Roofline Insights perspective, but it does not support MPI applications.
- Run the Survey and Characterization analyses for GPU kernels with the --collect=survey and --collect=tripcounts command actions separately one by one. This method is recommended if you want to analyze an MPI application.
Optionally, you can also run the Performance Modeling analysis as part of the GPU Roofline Insights perspective. If you select this analysis, it models your application performance on a baseline GPU device as a target to compare it with the actual application performance. This data is used to suggest more recommendations for performance optimization.
Note: In the commands below, replace myApplication with the path and name of your application executable before running a command. If your application requires additional command line options, add them after the executable name.
Method 1. Run the Shortcut Command
- Collect data for a GPU Roofline chart with a shortcut.
advisor --collect=roofline --profile-gpu --project-dir=./advi_results -- ./myApplication
This command collects data both for GPU kernels and CPU loops/functions in your application. For kernels running on GPU, it generates a Memory-Level Roofline.
- Run Performance Modeling for the GPU that the application runs on.
advisor --collect=projection --profile-gpu --model-baseline-gpu --project-dir=./advi_results
IMPORTANT: Make sure to use the --model-baseline-gpu option for Performance Modeling to work correctly. This command models your application's potential performance on a baseline GPU as a target to determine additional optimization recommendations.
Method 2. Run the Analyses Separately
Use this method if you want to analyze an MPI application.
- Run the Survey analysis.
advisor --collect=survey --profile-gpu --project-dir=./advi_results -- ./myApplication
- Run the Characterization analysis to collect trip counts and FLOP data:
advisor --collect=tripcounts --flop --profile-gpu --project-dir=./advi_results -- ./myApplication
These commands collect data both for GPU kernels and CPU loops/functions in your application. For kernels running on GPU, Intel Advisor generates a Memory-Level Roofline.
- Run Performance Modeling for the GPU that the application runs on.
advisor --collect=projection --profile-gpu --model-baseline-gpu --project-dir=./advi_results
IMPORTANT: Make sure to use the --model-baseline-gpu option for Performance Modeling to work correctly. This command models your application's potential performance on a baseline GPU as a target to determine additional optimization recommendations.
You can view the results in the Intel Advisor graphical user interface (GUI) or in CLI, or generate an interactive HTML report. See View the Results below for details.
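The three Method 2 commands can be chained in a small script so that they always share one project directory. This is a minimal sketch, not part of Intel Advisor: the run and gpu_roofline_separate functions and the ADVISOR_DRY_RUN, PROJECT_DIR, and APP variables are hypothetical names introduced for illustration. By default the script only prints the commands so you can review them first.

```shell
#!/bin/sh
# Sketch: run the Method 2 analyses in order against one project directory.
# All function and variable names here are hypothetical helpers.

PROJECT_DIR="${PROJECT_DIR:-./advi_results}"
APP="${APP:-./myApplication}"

# Print the command when ADVISOR_DRY_RUN is 1 (the default in this sketch);
# otherwise execute it and stop at the first failure.
run() {
    if [ "${ADVISOR_DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@" || exit 1
    fi
}

gpu_roofline_separate() {
    # Step 1: Survey analysis with GPU profiling.
    run advisor --collect=survey --profile-gpu --project-dir="$PROJECT_DIR" -- "$APP"
    # Step 2: Characterization analysis (trip counts and FLOP data).
    run advisor --collect=tripcounts --flop --profile-gpu --project-dir="$PROJECT_DIR" -- "$APP"
    # Step 3 (optional): Performance Modeling on the baseline GPU.
    run advisor --collect=projection --profile-gpu --model-baseline-gpu --project-dir="$PROJECT_DIR"
}

gpu_roofline_separate
```

Set ADVISOR_DRY_RUN=0 to execute the commands instead of printing them. Stopping at the first failure is a design choice of the sketch: the Characterization and Performance Modeling steps reuse the data that the earlier steps wrote to the project directory.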
Analysis Details
The GPU Roofline Insights workflow includes only the Roofline analysis, which sequentially runs the Survey and Characterization (trip counts and FLOP) analyses.
The analysis has a set of additional options that modify its behavior and collect additional performance data.
Consider the following options:
Roofline Options
To run the Roofline analysis, use the following command line action: --collect=roofline.
Recommended action options:
Options | Description |
---|---|
--profile-gpu | Analyze GPU kernels. This option is required for each command. |
--target-gpu | Select a target GPU adapter to collect profiling data. The adapter configuration should be in the following format: <domain>:<bus>:<device-number>.<function-number>. Only decimal numbers are accepted. Use this option if you have more than one GPU adapter on your system. The default is the latest GPU architecture version found on your system. TIP: To see a list of GPU adapters available on your system, run advisor --help target-gpu and see the option description. |
--gpu-sampling-interval=<double> | Set an interval (in milliseconds) between GPU samples. By default, it is set to 1. |
--enable-data-transfer-analysis | Model data transfer between host memory and device memory. Use this option if you want to run the Performance Modeling analysis. |
--track-memory-objects | Attribute memory objects to the analyzed loops that accessed the objects. Use this option if you want to run the Performance Modeling analysis. |
--data-transfer=<level> | Set the level of details for modeling data transfers during Characterization. Use this option if you want to run the Performance Modeling analysis. Use one of the following values: |
See advisor Command Option Reference for more options.
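As an illustration of the --target-gpu address format, the sketch below checks that an address has the <domain>:<bus>:<device-number>.<function-number> shape with decimal numbers only, then prints a roofline command that uses it. The is_valid_gpu_address helper and the GPU_ADDRESS variable are hypothetical, and both the 0:0:2.0 address and the sampling interval of 2 are placeholders. Take the real address from the output of advisor --help target-gpu.

```shell
#!/bin/sh
# Hypothetical helper (not part of Intel Advisor): succeeds only when the
# argument looks like <domain>:<bus>:<device-number>.<function-number>
# and every field is a decimal number.
is_valid_gpu_address() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+:[0-9]+:[0-9]+\.[0-9]+$'
}

# Placeholder address, assumed for illustration only.
GPU_ADDRESS="${GPU_ADDRESS:-0:0:2.0}"

if is_valid_gpu_address "$GPU_ADDRESS"; then
    # Printed rather than executed so the command can be reviewed first.
    echo advisor --collect=roofline --profile-gpu \
        --target-gpu="$GPU_ADDRESS" --gpu-sampling-interval=2 \
        --project-dir=./advi_results -- ./myApplication
else
    echo "unexpected adapter format: $GPU_ADDRESS" >&2
fi
```

The check rejects hexadecimal fields such as 0:0:a.0, in line with the decimal-only requirement in the option description.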
Performance Modeling Options
To run the Performance Modeling analysis, use the following command line action: --collect=projection.
The action options in the table below are required when you run the Performance Modeling analysis as part of the GPU Roofline Insights perspective:
Options | Description |
---|---|
--profile-gpu | Analyze GPU kernels. This option is required for each command. |
--enforce-baseline-decomposition | Use the same local size and SIMD width as measured on the baseline. This option is required. |
--model-baseline-gpu | Use the baseline GPU configuration as a target device for modeling. This option is required. This option automatically enables the --enforce-baseline-decomposition option, so you can use only --model-baseline-gpu. |
See advisor Command Option Reference for more options.
Next Steps
Continue to explore GPU Roofline results. For details about the metrics reported, see Accelerator Metrics.