Visible to Intel only — GUID: GUID-AC9D3674-DC91-42DE-A825-1D64CE2E2A68
OpenSHMEM* Code Analysis with Fabric Profiler
Fabric Profiler (preview feature) is a performance tool that you can use to identify detailed characteristics of the runtime behavior for an OpenSHMEM application.
This is a PREVIEW FEATURE. A preview feature may or may not appear in a future production release. It is available for your use in the hopes that you will provide feedback on its usefulness and help determine its future. Data collected with a preview feature is not guaranteed to be backward compatible with future releases.
Fabric Profiler consists of two parts:
- The data collector monitors application and network behavior while the OpenSHMEM application is running.
- The analyzer is a collection of tools that runs on a Linux* or Windows* workstation after the application has completed. These tools display profiling results with interactive features that allow you to explore a multitude of communication-centric behaviors.
The Fabric Profiler tool is distributed as part of Intel® VTune™ Profiler. Full documentation of the tool, examples, and pre-collected trace files are available in the Fabric Profiler package.
Set Up the Data Collector
The Fabric Profiler data collector is implemented as a library that intercepts the OpenSHMEM calls of the application and monitors network activity. It populates binary trace files with this information.
Prerequisites: Load the esp module by running: module load esp. The data collector package is installed in the directory named by the ESP_ROOT environment variable.
The data collector requires two third-party libraries:
- PAPI is used to gather system metrics at runtime. To add PAPI to your environment you may need to run module load papi, or download it from https://icl.utk.edu/papi/ and build it.
- OTF2 is used to generate trace files. You can obtain OTF2 at score-p.org.
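As a quick sanity check after loading the module, a small shell function can confirm that ESP_ROOT is set and names an existing directory. This is a minimal sketch; the function name check_esp_root is hypothetical and is not part of Fabric Profiler.

```shell
# Hypothetical helper (not part of Fabric Profiler): confirm that the
# esp module has defined ESP_ROOT and that it names a real directory.
check_esp_root() {
    if [ -z "${ESP_ROOT:-}" ]; then
        echo "ESP_ROOT is not set; run: module load esp" >&2
        return 1
    fi
    if [ ! -d "$ESP_ROOT" ]; then
        echo "ESP_ROOT does not name a directory: $ESP_ROOT" >&2
        return 1
    fi
    printf 'data collector package: %s\n' "$ESP_ROOT"
}
```

Calling check_esp_root before a build or run fails early with a hint when the module has not been loaded.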
Set Up the Analyzer
The analyzer is a collection of MATLAB* programs that run in the MATLAB runtime environment. They read the trace files and display results.
Prerequisites: You must have the MATLAB Runtime Environment to install the analyzer. This is a free download available at https://www.mathworks.com/products/compiler/mcr.html. Select version R2018a (9.4) or newer.
The analyzer is located in the release directory in esp/bin/analyzer. It is a MATLAB program named fabric_profiler_v100.
To start the analyzer, run the fpro script.
Fabric Profiler Workflow
In the Fabric Profiler workflow, you perform these steps:
- Build and run an application using the data collector.
- Generate trace files.
- View trace files using the analyzer.
Build and Run an Application
Once you have installed Fabric Profiler on a Linux or Windows machine, complete these steps to build and run an application.
Define Fabric Profiler regions in the source code.
A named region is highlighted in analyzer displays and improves analysis.
- Include the header file esp.h.
- Mark regions of interest:
esp_enter("<region_name>");
esp_exit("<region_name>");
- Rebuild the application.
NOTE: You cannot nest or interleave regions.
Build a statically-linked application with Fabric Profiler instrumentation.
When you load the Fabric Profiler module (esp), environment variables define important flags for you. Use these variables to link the Fabric Profiler data collector library into your code before the SHMEM library.
For example, to build the fixed-round example (from the examples directory) using Cray SHMEM, type:
CC -static -o fixed-round $ESP_CFLAGS fixed-round.c $ESP_LDFLAGS $ESP_LDADD
Make these changes to your normal build:
- Use the C++ compiler, even if the C-language application does not require it. The data collector library uses C++ and will not link without it.
- Use $ESP_CFLAGS to add the path to esp.h. It also adds -g, which improves the quality of the trace files.
- Use $ESP_LDFLAGS to add the path to the data collector library.
- Use $ESP_LDADD to add the data collector library.
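The static build line can be assembled by a small function that first checks that the three module variables are populated. This is a sketch only: esp_static_build_cmd is a hypothetical name, and the function prints the command rather than running it so that you can inspect the flag order.

```shell
# Hypothetical helper: print the static build command for one C source file.
# The variable order mirrors the documented example, so the data collector
# library is named on the link line ahead of the SHMEM library.
esp_static_build_cmd() {
    src=$1
    out=$2
    for name in ESP_CFLAGS ESP_LDFLAGS ESP_LDADD; do
        # Indirect lookup of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is empty; run: module load esp" >&2
            return 1
        fi
    done
    printf 'CC -static -o %s %s %s %s %s\n' \
        "$out" "$ESP_CFLAGS" "$src" "$ESP_LDFLAGS" "$ESP_LDADD"
}
```

For example, esp_static_build_cmd fixed-round.c fixed-round prints the same command shown above with the variables expanded.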
Build a dynamically-linked application with Fabric Profiler instrumentation.
Fabric Profiler uses LD_PRELOAD at run-time to link in the data collector library before the SHMEM library. Therefore, you do not need to rebuild your application unless you added Fabric Profiler regions to your source code.
For example, the fixed-round.c application (in the examples directory) is written in C. Unlike the case of static linking above, you do not need to use the C++ compiler to build this C-language application for use with Fabric Profiler instrumentation.
cc -o fixed-round $ESP_CFLAGS fixed-round.c -dynamic
$ESP_CFLAGS sets the path to esp.h and adds -g.
Run an application with Fabric Profiler instrumentation.
The data collector library uses the PAPI library and the OTF2 library. If you are using the shared library, you may need to run module load papi, or add PAPI to your library paths. You can download OTF2 at score-p.org.
Load the Fabric Profiler module:
module load esp
There are many Fabric Profiler configuration parameters. The module sets them to default values which are sufficient when you run your application for the first time. The configuration parameters are described in a separate section.
For a dynamic application, add the data collector library to the LD_PRELOAD variable.
For example:
export LD_PRELOAD=$ESP_ROOT/lib/libesp.so:$LD_PRELOAD
srun --export=LD_PRELOAD,ALL <rest of srun command>
If you have loaded the esp module, the environment variable ESP_LIB contains the path to libesp.so. See the sample job scripts *.slurm and *.lsf in the examples directory.
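When LD_PRELOAD is empty or unset, the export shown earlier leaves a trailing colon in the value. The sketch below builds the value so that the separator is added only when there is an existing entry to keep. The function name esp_preload_value is hypothetical.

```shell
# Hypothetical helper: build a new LD_PRELOAD value with the data collector
# library first. The second argument is the current LD_PRELOAD value (may be
# empty); the ":existing" suffix is appended only when it is non-empty.
esp_preload_value() {
    lib=$1
    current=${2:-}
    printf '%s\n' "${lib}${current:+:$current}"
}

# Intended use in a job script (left as a comment so that sourcing this
# sketch does not change your environment):
#   export LD_PRELOAD="$(esp_preload_value "$ESP_ROOT/lib/libesp.so" "${LD_PRELOAD:-}")"
```

Passing the current value as an argument keeps the function free of side effects, which makes it easy to check before you export the result.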
Generate Trace Files
Once you run the data collector, it monitors the execution of your application as well as network activity. It writes trace files when the application has finished executing. Add 10% to your wall time for writing output to the trace files.
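The 10% allowance is simple to compute when you prepare a job script. The sketch below assumes the wall-time budget is expressed in whole minutes and rounds up; pad_walltime is a hypothetical name.

```shell
# Hypothetical helper: add 10% to a wall-time budget given in whole minutes,
# rounding up so that the padded value is never short.
pad_walltime() {
    minutes=$1
    echo $(( ($minutes * 110 + 99) / 100 ))
}
```

For example, pad_walltime 60 prints 66, and pad_walltime 25 prints 28 because 27.5 is rounded up.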
Check the application output to verify that the data collector instrumented the code successfully:
- Ensure that the ESP_VERBOSITY_LEVEL environment variable is set to 1 and not 0.
- When the application calls shmem_init, the Fabric Profiler start banner displays.
- When the application calls shmem_finalize, the Fabric Profiler stop banner displays.
If the ESP_VERBOSITY_LEVEL environment variable is set correctly and the banners do not display when these functions are called, contact esp-support@intel.com for further assistance.
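Before looking for the banners, you can confirm the verbosity setting from the shell. This sketch checks only the value 1 that is documented above; how other values behave is not described here, so the function reports them as unverified. The name esp_banners_expected is hypothetical.

```shell
# Hypothetical helper: report whether ESP_VERBOSITY_LEVEL has the value (1)
# that the verification steps above call for.
esp_banners_expected() {
    case "${ESP_VERBOSITY_LEVEL:-unset}" in
        1)
            echo "ESP_VERBOSITY_LEVEL=1: banners expected"
            ;;
        0)
            echo "ESP_VERBOSITY_LEVEL=0: set it to 1 before checking for banners"
            return 1
            ;;
        *)
            # Any other value, or unset: behavior is an unknown in this sketch.
            echo "ESP_VERBOSITY_LEVEL=${ESP_VERBOSITY_LEVEL:-unset}: unverified"
            return 1
            ;;
    esac
}
```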
Merge the trace files.
The Fabric Profiler banner lists the path to the trace files. To merge traces, run the esp_merge_traces.sh script:
$ESP_ROOT/bin/esp_merge_traces.sh \
    <path to application executable> <path to trace directory> <number of PEs>
- Copy the trace files in the root level of the traces directory to the machine where you have installed the analyzer.
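The merge command takes three arguments, and a mistake in any of them is easier to catch before the script runs. The sketch below validates the arguments and then prints the command as a dry run. The wrapper name esp_merge_cmd is hypothetical; only the esp_merge_traces.sh path and argument order come from the command above.

```shell
# Hypothetical wrapper: validate the three arguments, then print the merge
# command. Printing instead of running keeps this a dry run.
esp_merge_cmd() {
    exe=${1:-}
    trace_dir=${2:-}
    npes=${3:-}
    if [ -z "${ESP_ROOT:-}" ]; then
        echo "ESP_ROOT is not set; run: module load esp" >&2
        return 1
    fi
    if [ ! -f "$exe" ]; then
        echo "application executable not found: $exe" >&2
        return 1
    fi
    if [ ! -d "$trace_dir" ]; then
        echo "trace directory not found: $trace_dir" >&2
        return 1
    fi
    case "$npes" in
        ''|*[!0-9]*|0)
            echo "number of PEs must be a positive integer: $npes" >&2
            return 1
            ;;
    esac
    printf '%s %s %s %s\n' \
        "$ESP_ROOT/bin/esp_merge_traces.sh" "$exe" "$trace_dir" "$npes"
}
```

Once the printed command looks right, run it directly from the shell.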
View Trace Files using the Analyzer
Five analyzers read the trace files. All of them are located in esp/bin/analyzer in the Fabric Profiler package. The analyzers are:
- espba - Barrier analyzer
- espfbla - Fabric backlog analyzer
- espla - Function latency analyzer
- espmsa - Message straggler analyzer
- espr - A report that contains a summary of results
You can use the traces generated in the previous step or open pre-collected sample traces from esp/examples/samples/trace. Each of these traces corresponds to a SHMEM application in the esp/examples directory.
espr is a general report that summarizes all of the trace data in HTML format. Each sample application in the examples directory includes this report, so you can view it without running the SHMEM application or the MATLAB runtime. The esp/examples/samples/html directory contains files named {app name}_{number of PEs}.html and associated directories named {app name}_{number of PEs}_html_files. Open the HTML file in a browser to view the report generated by the analyzer from the corresponding trace files in esp/examples/output/samples/trace.
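The report naming pattern can be expressed as a short function that prints the HTML file and its companion directory for a given sample. This is a sketch: esp_report_paths is a hypothetical name, and the application name and PE count in the example are placeholders rather than a statement about which samples ship in the package.

```shell
# Hypothetical helper: print the report file and its companion directory,
# following the {app name}_{number of PEs} naming described above.
esp_report_paths() {
    html_dir=$1
    app=$2
    npes=$3
    printf '%s\n' "$html_dir/${app}_${npes}.html"
    printf '%s\n' "$html_dir/${app}_${npes}_html_files"
}
```

For example, esp_report_paths esp/examples/samples/html fixed-round 8 prints the two paths to look for, assuming a fixed-round report for 8 PEs exists.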
During the operation of Fabric Profiler, when your application calls shmem_finalize, the data collector writes five trace files that contain information about application behavior.
Trace File | Format | Contents |
---|---|---|
{trace-file-prefix}.uc1.func | Binary | Information about every profiled SHMEM function call. Each process writes out a separate function trace file. After job completion, the individual function trace files are merged into a single file with the esp/bin/collector/esp_merge_traces.sh script. The merged file is required by the analyzers. |
{trace-file-prefix}.uc1.hfi | Binary | When the SHMEM application is running, Fabric Profiler monitors send and receive counters on the host fabric interface card. The HFI file contains these time-stamped counter values. |
{trace-file-prefix}.uc1.profile | Binary | When the SHMEM application is running, Fabric Profiler monitors system performance counters and gathers system information. This data is written to the profile file. Each process writes out a separate profile file. When the job completes, the individual profile trace files are merged into a single file with the esp/bin/collector/esp_merge_traces.sh script. The merged file is required by the analyzers. |
{trace-file-prefix}.uc1.put | Binary | Fabric Profiler monitors the amount of data injected into the network with each shmem_put call and the destination node for each put operation. The put file contains these values. |
{trace-file-prefix}.uc1.ev.txt | Text | The environment file is a list of all environment variables defined at SHMEM application run-time. |
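Before you copy traces to the analyzer machine, you can check that all five files are present for a given prefix. The sketch below lists any that are missing. The function name esp_missing_traces is hypothetical; the suffixes are taken from the table above.

```shell
# Hypothetical helper: list expected trace files that are absent from a
# directory, using the five suffixes from the table above.
esp_missing_traces() {
    dir=$1
    prefix=$2
    for suffix in uc1.func uc1.hfi uc1.profile uc1.put uc1.ev.txt; do
        if [ ! -f "$dir/$prefix.$suffix" ]; then
            printf '%s\n' "$prefix.$suffix"
        fi
    done
}
```

An empty result means every expected file exists. Otherwise, each missing file name is printed on its own line.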
This table describes each analyzer in the Fabric Profiler package, along with associated operations that you can perform.
Analyzer Type | Name | Purpose | Suggested Operations |
---|---|---|---|
espba | Barrier Trace Analyzer | Reads the function trace file and displays barrier wait times for each barrier call in the source code for each PE. | |
espfbla | Fabric Backlog Analyzer | Reads the put trace file and correlates that with the HFI trace file to visualize fabric backlog at any point in time. | |
espla | Function (latency) Trace Analyzer | Reads the function trace file and displays function latency for all instrumented SHMEM calls. Trace files that contain ~100,000s of function calls can take several minutes to complete. The default display shows composite PE wait time for all calls at each point in time. | |
espmsa | Message Straggler Analyzer | Reads the function trace file and correlates the activity in the trace file with network activity in the HFI trace file. | |
espr | Analyzer Report | A non-interactive report that gathers information about a SHMEM application run and displays it in HTML format. The report can take several minutes to be completed. When completed, the HTML report is saved in the same location as the profile trace file, with a matching file name. | Use the File menu to select the profile trace file for a particular application run. |