Identify the Real Bottlenecks
This topic is part of a tutorial that shows how to use the automated Roofline chart to make prioritized optimization decisions.
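Perform the following steps:
Open a Result Snapshot
Focus the Roofline Chart on the Data of Most Interest
Interpret Roofline Chart Data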
Key take-aways from these steps:
The first roofline above a dot position isn't always the bottleneck; any roofline above a dot position could be the culprit.
Even a roofline below a dot position can be a bottleneck; however, the farther a dot is positioned above a roofline, the less likely that roofline is causing the bottleneck.
If the first roofline above a dot position does not make logical sense, investigate the next roofline up. Keep working your way up the Roofline chart, using common sense, other Intel Advisor features, and your familiarity with your application to inform your investigation.
The Roofline chart is not a Data-In-Answers-Out utility; however, it puts you in the ballpark and guides you in the right direction to optimize your code.
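The take-aways above can be sketched with the basic roofline model, in which each roof caps performance at a given arithmetic intensity: a memory roof contributes bandwidth times intensity, and a compute roof contributes a flat peak. The following is a minimal sketch, not Intel Advisor's implementation. The roof values, the dot position, and the names Roof and roofs_above are hypothetical and chosen only for illustration. It lists every roof above a dot, nearest first, which is the order in which the tutorial suggests investigating them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Roof:
    name: str
    kind: str     # "memory" (sloped roof) or "compute" (flat roof)
    value: float  # GB/s for memory roofs, GFLOPS for compute roofs

    def ceiling(self, intensity: float) -> float:
        """Performance limit (GFLOPS) this roof imposes at a given FLOP/byte."""
        if self.kind == "memory":
            return self.value * intensity
        return self.value


def roofs_above(intensity, gflops, roofs):
    """Return (roof name, headroom factor) for each roof above the dot, nearest first."""
    above = [(r.name, r.ceiling(intensity) / gflops)
             for r in roofs if r.ceiling(intensity) > gflops]
    return sorted(above, key=lambda item: item[1])


# Hypothetical roofs (illustrative numbers, not taken from the tutorial's result).
ROOFS = [
    Roof("L1 Bandwidth", "memory", 300.0),
    Roof("L2 Bandwidth", "memory", 48.0),
    Roof("L3 Bandwidth", "memory", 32.0),
    Roof("Scalar Add Peak", "compute", 7.0),
    Roof("DP Vector Add Peak", "compute", 28.0),
]

# Hypothetical dot at 0.125 FLOP/byte and 2.0 GFLOPS.
for name, headroom in roofs_above(0.125, 2.0, ROOFS):
    print(f"{name}: {headroom:.2f}x above the dot")
```

The sketch only ranks candidates. Deciding which candidate is the real bottleneck still depends on what you know about the application, and the sketch does not cover the case of a roofline slightly below the dot.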
Open a Result Snapshot
Do one of the following:
If you prefer to work in the standalone GUI, from the File menu, choose Open > Result and choose the Result3.advixeexpz result.
If you prefer to work in the Visual Studio* IDE, from the File menu, choose Open > File and choose the Result3.advixeexpz result.
Focus the Roofline Chart on the Data of Most Interest
Use the display toggles to show the Roofline chart and Survey Report side by side.
On the Intel Advisor toolbar, click the Loops And Functions filter drop-down and choose Loops.
In the Roofline chart:
Select the Use Single-Threaded Loops checkbox.
Click the control, then deselect the Visibility checkbox for all SP... roofs. (All variables in this sample code are double-precision, so there is no need to clutter the chart with single-precision rooflines.)
In the Point Colorization section, choose Colors of Point Weight Ranges to differentiate dot colors by runtime (red, yellow, and green).
Click to save your changes.
Click the control. In the x-axis fields, backspace over the existing values and enter 0.05 and 0.7. In the y-axis fields, backspace over the existing values and enter 1.0 and 14.8. Click the button to save your changes.
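The axis limits entered in the last step are positions in the chart's coordinate system: the x-axis is arithmetic intensity (FLOP/byte) and the y-axis is performance (GFLOPS). The following minimal sketch shows how a dot's coordinates follow from a loop's measurements and whether the dot falls inside the zoomed window. The loop measurements and the function names dot_position and in_window are hypothetical.

```python
def dot_position(flop, bytes_moved, seconds):
    """Return (arithmetic intensity in FLOP/byte, performance in GFLOPS)."""
    intensity = flop / bytes_moved
    gflops = flop / seconds / 1e9
    return intensity, gflops


def in_window(intensity, gflops, x_range=(0.05, 0.7), y_range=(1.0, 14.8)):
    """True if the dot is visible with the axis limits entered in this step."""
    return (x_range[0] <= intensity <= x_range[1]
            and y_range[0] <= gflops <= y_range[1])


# Hypothetical loop: 4e9 FLOP, 32e9 bytes accessed, 2 seconds of self time.
x, y = dot_position(4e9, 32e9, 2.0)
print(x, y, in_window(x, y))
```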
Interpret Roofline Chart Data
Notice the position of the dot representing the loop in main at roofline.cpp:138 (the red dot).
One possible reason for the dot position: The loop is suffering from a memory bandwidth bottleneck, based on the dot position below the L3 Bandwidth roofline.
However, based on our familiarity with the sample code, we know the dataset definitely fits in L1 cache, so the L3 Bandwidth roofline is unlikely to be the culprit. For the same reason, the next roofline up, L2 Bandwidth, does not seem to be the likely culprit either.
Another possible reason for the dot position: The next roofline up is the Scalar Add Peak roofline, so perhaps the loop is suffering from a compute capacity bottleneck.
Using the Survey Report, we can quickly verify the loop is scalar (blue loop icon).
What happens if we vectorize the loop but add no memory optimizations? This is exactly what we did. The outcome is the loop in main at roofline.cpp:151 (the yellow dot).
Notice the dot representing this loop is positioned above the Scalar Add Peak roofline and closer to the L1 Bandwidth roofline.
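This confirms that the bottleneck was compute capacity, not memory bandwidth: vectorization alone, with no memory optimizations, lifted the loop above the Scalar Add Peak roofline.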
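The process of elimination in this section can be sketched as two checks. The first asks whether the working set fits in a cache level, which makes the slower memory roofs implausible. The second asks whether vectorizing moved the dot from below the scalar compute roof to above it. This is a minimal sketch under stated assumptions: the cache size, element counts, and GFLOPS values are hypothetical and are not read from the Result3 snapshot, and the function names are invented for illustration.

```python
def fits_in_cache(n_elements, n_arrays, cache_bytes, element_bytes=8):
    """True if the loop's working set (double precision by default) fits in the cache."""
    return n_elements * n_arrays * element_bytes <= cache_bytes


def scalar_roof_was_binding(before_gflops, after_gflops, scalar_peak_gflops):
    """True if the dot sat below the scalar compute roof and vectorizing moved it above."""
    return before_gflops < scalar_peak_gflops < after_gflops


# Hypothetical values: a 32 KiB L1 data cache and three arrays of 1024 doubles.
L1_BYTES = 32 * 1024
print(fits_in_cache(n_elements=1024, n_arrays=3, cache_bytes=L1_BYTES))

# Hypothetical dot heights before and after vectorization, and a scalar peak of 7 GFLOPS.
print(scalar_roof_was_binding(before_gflops=2.0, after_gflops=9.0, scalar_peak_gflops=7.0))
```

If the second check fails, the scalar roof was probably not the limiting factor, and the next roofline up is the one to investigate.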