Getting Credible Performance Numbers

OpenCL™ Developer Guide for Intel® Core™ and Intel® Xeon® Processors

ID 773005

Date 10/30/2018

Version 2018

Public

Measure performance over a large number of invocations of the same routine. Because the first iteration is almost always significantly slower than the subsequent ones, the minimum (or the average, geometric mean, and so on) of the execution times is usually used for final projections.

An alternative to calling the kernel several times is to perform a single “warm-up” run before the timed run.

The warm-up run can be helpful for kernels with a small amount of computation, because it keeps the following potential one-time costs out of the measurement:

Bringing data to the cache
Lazy object creation
Delayed initializations
Other costs incurred by the OpenCL™ runtime
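
A warm-up run followed by a timed loop might look like the sketch below. It is an assumption-laden illustration rather than the SDK sample: run_once_fn, wall_seconds, time_after_warmup, demo_ctx, and demo_run_once are hypothetical names, and the stand-in workload only sums numbers. In real host code the callback would enqueue the kernel and block until it completes (for example, clEnqueueNDRangeKernel followed by clFinish).

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: runs the kernel once and waits for completion. */
typedef void (*run_once_fn)(void *ctx);

/* Wall-clock time in seconds via C11 timespec_get. This clock is not
   monotonic; a monotonic clock or OpenCL event profiling is preferable
   in production measurements. */
double wall_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Run once untimed to absorb one-time costs, then return the mean time
   per invocation, in seconds, over `iters` timed runs. */
double time_after_warmup(run_once_fn run_once, void *ctx, size_t iters)
{
    assert(run_once != NULL && iters > 0);

    run_once(ctx);  /* warm-up: excluded from the measurement */

    double start = wall_seconds();
    for (size_t i = 0; i < iters; ++i)
        run_once(ctx);
    double stop = wall_seconds();

    return (stop - start) / (double)iters;
}

/* Stand-in workload for illustration only: counts calls and sums integers. */
typedef struct {
    size_t calls;
    volatile double sink;  /* volatile keeps the loop from being optimized away */
} demo_ctx;

void demo_run_once(void *ctx)
{
    demo_ctx *d = ctx;
    double acc = 0.0;
    for (int i = 0; i < 100000; ++i)
        acc += (double)i;
    d->sink = acc;
    d->calls++;
}
```

Calling time_after_warmup(demo_run_once, &d, 10) invokes the workload eleven times in total: one untimed warm-up plus ten timed runs.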

NOTE: Base your performance conclusions on reproducible data. If a warm-up run does not help or the execution time still varies, run a large number of iterations and average the results. For time values that vary widely, consider using the geometric mean.

Consider the following:

For bandwidth-limited kernels that operate on data that does not fit in the last-level cache, the warm-up run does not significantly improve the stability of the measurement.
For a kernel with a small number of instructions executed over a small data set, make sure there is a sufficient number of iterations so that the kernel run time is at least 20 milliseconds on the CPU device.
Kernels with shorter run times might provide unreliable data, so artificially increasing the amount of computation can give you important insights into the hotspots. For example, you can add a loop in the kernel or replicate some of its pieces.
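
One way to apply the 20-millisecond guideline is to time a short pilot batch and derive the iteration count from it. The sketch below assumes a blocking callback that runs the kernel once. All names (MIN_RUN_NS, run_once_fn, iterations_needed, wall_ns, choose_iterations) are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* 20 milliseconds in nanoseconds, the CPU-device guideline from the text. */
#define MIN_RUN_NS 20000000ULL

/* Hypothetical callback type: runs the kernel once and waits for completion. */
typedef void (*run_once_fn)(void *ctx);

/* Smallest iteration count whose total run time reaches target_ns,
   given the measured time of one iteration. Always returns at least 1. */
uint64_t iterations_needed(uint64_t per_iter_ns, uint64_t target_ns)
{
    assert(per_iter_ns > 0);
    uint64_t n = (target_ns + per_iter_ns - 1) / per_iter_ns;  /* ceiling division */
    return n > 0 ? n : 1;
}

/* Wall-clock time in nanoseconds via C11 timespec_get (not monotonic). */
uint64_t wall_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Warm up, time a pilot batch of pilot_iters runs, and return how many
   iterations are needed for the measured loop to last at least MIN_RUN_NS. */
uint64_t choose_iterations(run_once_fn run_once, void *ctx, uint64_t pilot_iters)
{
    assert(run_once != NULL && pilot_iters > 0);

    run_once(ctx);  /* warm-up: excluded from the pilot measurement */

    uint64_t start = wall_ns();
    for (uint64_t i = 0; i < pilot_iters; ++i)
        run_once(ctx);
    uint64_t elapsed = wall_ns() - start;

    uint64_t per_iter = elapsed / pilot_iters;
    if (per_iter == 0)
        per_iter = 1;  /* run was below the timer resolution */

    return iterations_needed(per_iter, MIN_RUN_NS);
}
```

With this arithmetic, a kernel that takes 0.5 milliseconds per run needs 40 iterations, and one that takes 3 milliseconds needs 7. A kernel that already runs longer than 20 milliseconds needs only one.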

Refer to the “OpenCL™ Optimizations Tutorial” SDK sample for code examples of performing a warm-up run before starting performance measurements.
