Visible to Intel only — GUID: GUID-9EB08214-D029-4EA5-A416-AA90E48AB834
MPI-2 Benchmark Modes
MPI-2 benchmarks can run in the following modes:
Blocking/nonblocking mode. These modes apply to the IMB-IO benchmarks only. For details, see sections IMB-IO Blocking Benchmarks and IMB-IO Nonblocking Benchmarks.
Aggregate/non-aggregate mode. Non-aggregate mode is not available for nonblocking flavors of IMB-IO benchmarks.
The following example illustrates aggregation of M transfers for IMB-EXT and blocking Write benchmarks:
Select a repetition count M
time = MPI_Wtime();
issue M disjoint transfers
assure completion of all transfers
time = (MPI_Wtime() - time) / M
In this example:
M is a repetition count:
M = 1 in the non-aggregate mode
M = n_sample in the aggregate mode. For the exact definition of n_sample, see the Actual Benchmarking section.
A transfer is issued by the corresponding one-sided communication call (for IMB-EXT) and by an MPI-IO write call (for IMB-IO).
Disjoint means that multiple transfers (if M > 1) are to/from disjoint sections of the window or file. This prevents misleading optimizations that an implementation could apply if the same locations were used for multiple transfers.
Varying M provides important information about the system and the MPI implementation, which is crucial for optimizing application code. For example, the following internal strategies of an implementation could influence the timing outcome of the above pattern:
Accumulative strategy. Several successive transfers (up to M in the example above) are accumulated without immediate completion. At certain stages, the accumulated transfers are completed as a whole. This approach may save the time spent on expensive synchronizations. This strategy is expected to produce better results in the aggregate case than in the non-aggregate one.
Non-accumulative strategy. Every transfer is completed before the corresponding function returns, so the time of the expensive synchronizations is included in every transfer. This strategy is expected to produce equal results for the aggregate and non-aggregate cases.
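The expected difference between the two strategies can be made concrete with a toy cost model. The sketch below is an illustration only: t_xfer (cost of issuing one transfer) and t_sync (cost of one synchronization) are hypothetical parameters, and a real implementation need not follow either formula exactly.

```c
#include <assert.h>

/* Toy model, not a measurement. t_xfer and t_sync are hypothetical
 * per-transfer and per-synchronization costs in arbitrary time units. */

/* Accumulative strategy: M transfers share one synchronization. */
double accumulative_time(int M, double t_xfer, double t_sync)
{
    assert(M > 0);
    return (M * t_xfer + t_sync) / M;
}

/* Non-accumulative strategy: every transfer pays for its own synchronization. */
double non_accumulative_time(int M, double t_xfer, double t_sync)
{
    assert(M > 0);
    return (M * (t_xfer + t_sync)) / M;
}
```

With t_xfer = 1 and t_sync = 4, both functions return 5 for M = 1. For M = 4, the accumulative model drops to 2 while the non-accumulative model stays at 5, which matches the qualitative expectation stated above.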
Assured Completion of Transfers
Following the MPI standard, assured completion of transfers is the minimum sequence of operations after which all processes of the file communicator have a consistent view after a write.
The aggregate and non-aggregate modes differ in when the assured completion of data transfers takes place:
after each transfer (non-aggregate mode)
after a batch of multiple transfers (aggregate mode)
For Intel(R) MPI Benchmarks, assured completion means the following:
For IMB-EXT benchmarks, MPI_Win_fence
For IMB-IO Write benchmarks, a triplet MPI_File_sync/MPI_Barrier(file_communicator)/MPI_File_sync. This fixes the insufficient definition used in Intel(R) MPI Benchmarks 3.0.