Collecting Lightweight Statistics
Intel® Trace Collector can gather and store statistics about the function calls and their communication. These statistics are gathered even if no trace data is collected, so it is a good starting point for trying to understand an unknown application that might produce an unmanageable trace.
Usage Instructions
To collect lightweight statistics for your application, set the following environment variables before tracing:
$ export VT_STATISTICS=ON
$ export VT_PROCESS=OFF
Alternatively, specify the equivalent settings in a configuration file and point the VT_CONFIG environment variable at it:
# Enable statistics gathering
STATISTICS ON
# Do not gather trace data
PROCESS 0:N OFF
$ export VT_CONFIG=<configuration_file_path>/config.conf
The statistics are written into the *.stf file. Use the stftool utility with the --print-statistics option to convert the data to ASCII text. For example:
$ stftool tracefile.stf --print-statistics
The resulting output has an easy-to-process format, so you can post-process it with text processing tools and scripts such as awk*, Perl*, or Microsoft Excel* for better readability. A Perl script with this capability, convert-stats, is provided in the bin folder.
Output Format
Each line contains the following information:
Thread or process
Function ID
Receiver (if applicable)
Message size (if applicable)
Number of involved processes (if applicable)
And the following statistics:
Count – number of communications or number of calls as applicable
Minimum execution time excluding callee times
Maximum execution time excluding callee times
Total execution time excluding callee times
Minimum execution time including callee times
Maximum execution time including callee times
Total execution time including callee times
Within each line the fields are separated by colons.
The receiver is set to 0xffffffff for file operations and to 0xfffffffe for collective operations. If the message size equals 0xffffffff, the only defined receiver value is 0xfffffffe, which marks the entry as a collective operation.
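As a sketch of how one such line might be parsed, the snippet below follows the field order listed above (process/thread, function ID, receiver, message size, number of involved processes, then the seven statistics values). The sample input line is hypothetical, invented for illustration only; check the actual stftool output of your version before relying on exact field positions.

```python
# Sketch of a parser for one colon-separated statistics line, assuming the
# field order given above. Sentinel values follow the documentation text.
FILE_OP = 0xFFFFFFFF     # receiver sentinel for file operations
COLLECTIVE = 0xFFFFFFFE  # receiver sentinel for collective operations
NO_MESSAGE = 0xFFFFFFFF  # message-size sentinel: no message was sent

def parse_stat_line(line):
    fields = line.strip().split(":")
    proc, func, receiver, size, nprocs = (int(f) for f in fields[:5])
    count, *times = (int(f) for f in fields[5:12])
    kind = ("collective" if receiver == COLLECTIVE
            else "file" if receiver == FILE_OP
            else "message")
    return {"process": proc, "function": func, "kind": kind,
            "size": None if size == NO_MESSAGE else size,
            "count": count,
            # min/max/total excluding callees, then min/max/total including
            "times": times}

# Hypothetical input line, not real stftool output:
print(parse_stat_line("0:12:4294967294:1024:4:10:5:9:70:6:11:85"))
```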
The message size is the number of bytes sent or received per single message. For collective operations, the following values (message-size buckets) are used for individual instances:
Collective operation | Process-local bucket | Same value on all processes?
---|---|---
MPI_Barrier | 0 | Yes
MPI_Bcast | Broadcast bytes | Yes
MPI_Gather | Bytes sent | Yes
MPI_Gatherv | Bytes sent | No
MPI_Scatter | Bytes received | Yes
MPI_Scatterv | Bytes received | No
MPI_Allgather | Bytes sent + received | Yes
MPI_Allgatherv | Bytes sent + received | No
MPI_Alltoall | Bytes sent + received | Yes
MPI_Alltoallv | Bytes sent + received | No
MPI_Reduce | Bytes sent | Yes
MPI_Allreduce | Bytes sent + received | Yes
MPI_Reduce_scatter | Bytes sent + received | Yes
MPI_Scan | Bytes sent + received | Yes
The message size is set to 0xffffffff if no message was sent, for example, for non-MPI functions or for functions like MPI_Comm_rank.
If more than one communication event (message or collective operation) occurs in the same function call (for example, in MPI_Waitall, MPI_Waitany, MPI_Testsome, or MPI_Sendrecv), the time spent in that function is distributed evenly over all communications and counted once for each message or collective operation. Therefore, a correct traditional function profile cannot be computed from data referring to such function instances (that is, instances involved in more than one message per actual function call). Only the Total execution time including callee times and the Total execution time excluding callee times can be interpreted like a traditional function profile in all cases.
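The effect of this even distribution can be sketched numerically; the figures below are hypothetical, not from a real trace:

```python
# Sketch of the even time distribution described above (hypothetical
# numbers). One MPI_Waitall call taking 12 us completes 3 messages: each
# message is charged 12/3 = 4 us, and the call is counted once per message.
call_time_us = 12.0
n_messages = 3
per_message_time = call_time_us / n_messages  # 4.0 us charged per message

# Summing the per-message totals recovers the true time spent in the call,
# which is why the two "Total execution time" columns stay meaningful.
total_from_stats = per_message_time * n_messages
assert total_from_stats == call_time_us

# But the recorded count is 3 even though MPI_Waitall was entered once, so
# a true call count or average time per call cannot be reconstructed.
print(per_message_time, total_from_stats)
```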
The number of involved processes is negative for received messages; it is -2 if messages were received from a different process or thread.
Statistics are gathered on the thread level for all MPI functions, and for all functions instrumented through the API or compiler instrumentation.