Roofline Use Case
This topic is part of a tutorial that shows how to use the automated Roofline chart to make prioritized optimization decisions.
Roofline is an optional analysis that plots an application's achieved performance and arithmetic intensity against the machine's maximum achievable performance.
Use the Roofline chart to answer the following questions:
- What is the maximum achievable performance with your current hardware resources?
- Does your application work optimally on current hardware resources?
- If not, what are the best candidates for optimization?
- Is memory bandwidth or compute capacity limiting performance for each optimization candidate?
Roofline analysis is cache-aware; it measures all memory subsystem traffic, not just DDR memory traffic. It works on both single-threaded and multithreaded code.
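The relationship between arithmetic intensity and the two kinds of limits can be sketched in a few lines of C++. This is a minimal illustration of the classic roofline bound, not the tool's implementation: it assumes a single compute roof and a single bandwidth roof, whereas the cache-aware chart draws a separate roof for each memory level. The `Machine` struct, the function names, and any numbers passed in are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <string>

// Hypothetical machine description. The caller supplies both values;
// nothing here is measured or taken from a real tool.
struct Machine {
    double peak_gflops;    // compute roof, in GFLOP/s
    double bandwidth_gbs;  // one memory roof, in GB/s
};

// Upper bound on performance (GFLOP/s) for a loop with the given
// arithmetic intensity (FLOP per byte of memory traffic):
// the lower of the compute roof and bandwidth * intensity.
double attainable_gflops(const Machine& m, double intensity) {
    return std::min(m.peak_gflops, m.bandwidth_gbs * intensity);
}

// Arithmetic intensity at which the two roofs meet (the "ridge point").
double ridge_intensity(const Machine& m) {
    return m.peak_gflops / m.bandwidth_gbs;
}

// Which roof bounds a loop at this intensity under the simplified model.
std::string limiting_roof(const Machine& m, double intensity) {
    return intensity < ridge_intensity(m) ? "memory bandwidth"
                                          : "compute capacity";
}
```

For example, with an assumed machine of 100 GFLOP/s and 50 GB/s, a loop at 0.5 FLOP/byte is bounded at 25 GFLOP/s by memory bandwidth, while a loop at 4 FLOP/byte is bounded at 100 GFLOP/s by compute capacity. A loop whose measured performance sits well below its bound is a candidate for optimization.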
Follow these steps to use the Vectorization Advisor and the roofline_demo_samples C++ sample application to:
- Run a Roofline analysis.
- Focus on the Roofline chart data of most interest.
- Interpret Roofline chart data.
- Use Roofline chart data interpretations to make optimization decisions.
Step | Step Detail |
---|---|
Step 1: Prepare for tutorial. | Do one of the following: |
Step 2: Run a Roofline analysis. | |
Step 3: Address memory bandwidth bottlenecks. | |
Step 4: Address compute capacity bottlenecks. | |
Step 5: Identify the real bottlenecks. | |