Analyzing Applications
Prerequisites
(Optional) Use the following software to get an advanced metric set when running Application Performance Snapshot:
- Recommended compilers: Intel® C++ Compiler Classic, Intel® oneAPI DPC++/C++ Compiler, Intel® Fortran Compiler Classic, and Intel® Fortran Compiler. You can use other compilers, but information about OpenMP* imbalance is available only from the Intel OpenMP library.
Set up Application Performance Snapshot before you begin to analyze applications. In a terminal window, run this command to set the environment variables:
source /opt/intel/oneapi/vtune/latest/apsvars.sh
You can run apsvars.sh with the --help option to see a list of available setup options.
Analyzing Shared Memory Applications
Run a data collection on your application using Application Performance Snapshot:
$ aps <my app> [<app parameters>]
where <my app> is the path to your application and <app parameters> are your application parameters.
After the analysis completes, a report appears in the command window. You can also open an HTML report with the same information in a supported browser. The path to the HTML report is included in the command window. For example:
firefox ./aps_report_01012017_1234.html &
Analyze the data shown in the report. Hover over a metric in the HTML report for more information.
Determine appropriate next steps based on result analysis. Common next steps may include application tuning or using another performance analysis tool for more detailed information, such as Intel® VTune™ Profiler or Intel® Advisor.
Analyzing MPI Applications
Run the following command to collect data about your MPI application:
$ <mpi launcher> <mpi parameters> aps <my app> [<app parameters>]
where:
<mpi launcher> is an MPI job launcher, such as mpirun, srun, or aprun.
<mpi parameters> are the MPI launcher parameters.
NOTE: aps must be the last <mpi launcher> parameter.
<my app> is the path to your application.
<app parameters> are your application parameters.
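As an illustration of the ordering described above, the following sketch only prints a launch line; it does not run anything. The function name aps_mpi_cmd, the launcher options, and the application name are hypothetical and chosen for the example only.

```shell
# Hypothetical helper: print (do not run) an MPI launch line in which aps
# comes after all launcher parameters and directly before the application.
# Usage: aps_mpi_cmd "<launcher and its parameters>" <my app> [<app parameters>]
aps_mpi_cmd() {
    launcher=$1   # for example "mpirun -n 4"
    shift         # the remaining arguments are the application and its parameters
    printf '%s aps %s\n' "$launcher" "$*"
}

# Example with assumed values:
#   aps_mpi_cmd "mpirun -n 4" ./my_app --size 100
# prints:
#   mpirun -n 4 aps ./my_app --size 100
```

Printing the line first makes it easy to check that aps is the last launcher parameter before submitting the job.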
Application Performance Snapshot launches the application and runs the data collection. After the analysis completes, an aps_result_<date> directory is created.
Run this command to complete the analysis:
$ aps --report aps_result_<date>
After the analysis completes, a report appears in the command window. You can also open an HTML report with the same information in a web browser.
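Because the aps_result_<date> directory name changes with every run, a small helper can locate the newest one before you generate the report. This is a minimal sketch, not part of Application Performance Snapshot: the function name latest_aps_result is hypothetical, and it assumes result directory names contain no newline characters.

```shell
# Hypothetical helper: print the most recently modified aps_result_* directory
# found in the given directory (default: the current directory).
# Prints nothing if no such directory exists.
latest_aps_result() {
    # The trailing slash in the glob restricts matches to directories;
    # ls -t sorts by modification time, newest first.
    ls -1dt "${1:-.}"/aps_result_*/ 2>/dev/null | head -n 1 | sed 's:/*$::'
}

# Possible usage (assumes aps is already on PATH after sourcing apsvars.sh):
#   aps --report "$(latest_aps_result)"
```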
Analyze the data shown in the report. Hover over a metric in the HTML report for more information.
Use the analysis result to identify your next steps.
You can continue the analysis by tuning with one of these tools:
- The mpitune utility
- Intel® Trace Analyzer and Collector
- Intel® VTune™ Profiler
To learn more about additional profiling capabilities, see Detailed MPI Analysis.
When you use Application Performance Snapshot to analyze MPI applications, you can save results in two formats:
- Simple Format
- Compact Format
| | Simple Format | Compact Format |
|---|---|---|
| Purpose | Use this format when the number of application ranks is small (<1000) or the application runs on only one node. | Use this format when the number of nodes and ranks per node are large enough to slow down report generation in Application Performance Snapshot significantly. |
| Usage | Default storage format for MPI analyses. To enable this format explicitly, do one of the following: | Must be enabled explicitly for MPI analyses. To enable this format, do one of the following: |
| Requirements | None | If these requirements are not fully met, the storage format defaults to the simple format. |
| Action | Saving trace data in this format creates a single trace file in the result directory for each MPI rank / process. | Each MPI rank / process writes its trace to a temporary folder that is local to the node. When all ranks on a node are finalized for the MPI application, one of the ranks merges the trace files from the temporary file system into a single file. This single file is then written to the specified result directory. |
| Analysis of Results | To analyze performance traces, use aps-report or aps --report switches. You can use these switches to load traces collected using older versions of Application Performance Snapshot, if those traces were collected in the simple format. | To analyze performance traces, use aps-report or aps --report switches. You can use these switches to load traces collected using older versions of Application Performance Snapshot, if those traces were collected in the compact format. Versions of Application Performance Snapshot older than 2025.0 cannot load traces collected in the compact format. |
Analyzing Python Applications with MPI
You can use Application Performance Snapshot to run data collections on Python applications that use MPI, such as PyTorch or other AI applications. To do this:
- Get the path to your MPI library:
LD_DEBUG=libs mpirun -np 2 aps -c mpi python3 application.py 2> >(grep /libmpi)
62663: trying file=/lib/x86_64-linux-gnu/libmpich.so.12
62664: trying file=/lib/x86_64-linux-gnu/libmpich.so.12
62663: calling init: /lib/x86_64-linux-gnu/libmpich.so.12
62664: calling init: /lib/x86_64-linux-gnu/libmpich.so.12
- Specify the path to the MPI library in the LD_PRELOAD environment variable. For example:
LD_PRELOAD=/lib/x86_64-linux-gnu/libmpich.so.12 mpirun -np 2 aps -c mpi python3 application.py
where application.py is the Python application you want to profile.
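The two steps above can be combined by extracting the library path from the LD_DEBUG output automatically. The following is a minimal sketch under stated assumptions: the function name extract_mpi_lib is hypothetical, and it assumes every process loads the same MPI library, so only the first unique path is kept.

```shell
# Hypothetical helper: read LD_DEBUG=libs output on standard input and print
# the first unique path that contains "libmpi".
extract_mpi_lib() {
    grep -o '/[^[:space:]]*libmpi[^[:space:]]*' | sort -u | head -n 1
}

# Possible usage (LD_DEBUG writes to stderr, so only stderr is piped):
#   MPI_LIB=$(LD_DEBUG=libs mpirun -np 2 aps -c mpi python3 application.py 2>&1 >/dev/null | extract_mpi_lib)
#   LD_PRELOAD="$MPI_LIB" mpirun -np 2 aps -c mpi python3 application.py
```

Check the extracted path before reusing it; if the output lists more than one MPI library, pick the correct one manually.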