Intel® VTune™ Profiler

User Guide

ID 766319
Date 12/20/2024
Public

Top-down Report

Similar to the Top-down window available in the GUI, the top-down report represents call sequences (stacks) detected during the collection phase, starting from the application root. Use the top-down report to explore the call sequence flow of the application and to analyze the time spent in each program unit and in its callees.

NOTE:

Intel® VTune™ Profiler collects information about program unit callees only during User-Mode Sampling and Tracing Collection or Hardware Event-based Sampling Collection with stack collection enabled.

Examples

Example 1: Hotspots Top-down Report

This example displays the top-down report for the specified Hotspots analysis result, collected in user-mode sampling mode, with function stacks limited to 5 elements.

vtune -report top-down -r r001hs -limit 5 
        
Function Stack          CPU Time:Total  CPU Time:Effective Time:Total  CPU Time:Spin Time:Total  CPU Time:Overhead Time:Total
----------------------  --------------  -----------------------------  ------------------------  ----------------------------
Total                         100.000%                       100.000%                  100.000%                      100.000%
 func@0x6b2daccf               99.853%                        99.835%                  100.000%                      100.000%
  func@0x6b2dacf0              99.853%                        99.835%                  100.000%                      100.000%
   BaseThreadInitThunk         99.853%                        99.835%                  100.000%                      100.000%
    thread_video               95.614%                        97.876%                   78.195%                          0.0%
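
In the output shown above, the indentation of the Function Stack column encodes call depth: each extra leading space is one level further from the root. The following Python snippet is a minimal sketch, not part of VTune Profiler, of how that text output could be post-processed. The helper name `parse_top_down` and the abbreviated sample (only the first metric column of Example 1) are assumptions for illustration. The sketch also assumes that columns are separated by two or more spaces.

```python
import re

# Abbreviated copy of the Example 1 output (first metric column only).
SAMPLE = """\
Function Stack          CPU Time:Total
----------------------  --------------
Total                         100.000%
 func@0x6b2daccf               99.853%
  func@0x6b2dacf0              99.853%
   BaseThreadInitThunk         99.853%
    thread_video               95.614%
"""


def parse_top_down(text):
    """Return a list of (depth, name, percent) tuples, one per data row.

    Hypothetical helper: depth is the number of leading spaces in the
    Function Stack column, and percent is the first metric column.
    """
    rows = []
    for line in text.splitlines()[2:]:  # skip the header and separator rows
        if not line.strip():
            continue
        depth = len(line) - len(line.lstrip(" "))
        # Assumption: columns are separated by at least two spaces.
        fields = re.split(r"\s{2,}", line.strip())
        rows.append((depth, fields[0], float(fields[1].rstrip("%"))))
    return rows


if __name__ == "__main__":
    for depth, name, percent in parse_top_down(SAMPLE):
        print(depth, name, percent)
```

A depth of 0 corresponds to the Total row, and each deeper row is a callee of the nearest preceding row with a smaller depth.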

Example 2: Hotspots Report with Enabled Call Stack Collection (Linux*)

This command runs the Hotspots analysis in hardware event-based sampling mode with call stack collection enabled.

vtune -collect hotspots -knob sampling-mode=hw -knob enable-stack-collection=true -- /home/tachyon

The following command generates the top-down report for the previously collected result and shows only the columns with the time:total string in the title.

vtune -report top-down -r r001hs -column=time:total

Function Stack          CPU Time:  CPU Time:             CPU Time:        Context Switch Time:  Context Switch Time:  Context Switch Time:
                        Total      Effective Time:Total  Spin Time:Total  Total                 Wait Time:Total       Inactive Time:Total
----------------------  ---------  --------------------  ---------------  --------------------  --------------------  --------------------
Total                    100.000%              100.000%         100.000%              100.000%              100.000%              100.000%
 func@0x6b2daccf          97.595%               97.704%          89.202%               65.777%               90.121%               62.893%
  func@0x6b2dacf0         97.595%               97.704%          89.202%               65.777%               90.121%               62.893%
   BaseThreadInitThunk    97.595%               97.704%          89.202%               65.777%               90.121%               62.893%
    threadstartex         67.091%               67.855%           8.335%               29.825%                9.027%               32.289%
...

Example 3: Hotspots Report with Disabled Stack Collection (Windows*)

This command runs the Hotspots analysis in hardware event-based sampling mode with call stack collection disabled.

vtune -collect hotspots -knob sampling-mode=hw -knob enable-stack-collection=false -- C:\tachyon\tachyon.exe

The following command generates the top-down report for the previously collected result and shows only the columns with the time:total string in the title. The report does not include information about program unit callees because call stacks were not collected during the analysis.

vtune -report top-down -r r001hs -column=time:total

Function Stack          CPU Time:Total  CPU Time:Effective Time:Total  CPU Time:Spin Time:Total
----------------------  --------------  -----------------------------  ------------------------
Total                         100.000%                       100.000%                  100.000%
 grid_intersect                50.172%                        50.213%                      0.0%
 sphere_intersect              31.740%                        31.766%                      0.0%
 grid_bounds_intersect          3.766%                         3.769%                      0.0%
 pos2grid                       0.778%                         0.778%                      0.0%
...
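
The difference between the reports in Examples 2 and 3 shows up in the indentation: with stack collection disabled, every function sits one level below Total. The following Python snippet is a minimal sketch of a check for whether a text report appears to contain callee information. The helper name `max_stack_depth` and the abbreviated samples (first metric column only) are assumptions for illustration, and the sketch assumes one leading space per stack level, as in the outputs on this page.

```python
# Abbreviated copy of the Example 3 output (stack collection disabled).
FLAT = """\
Function Stack          CPU Time:Total
----------------------  --------------
Total                         100.000%
 grid_intersect                50.172%
 sphere_intersect              31.740%
 grid_bounds_intersect          3.766%
 pos2grid                       0.778%
"""

# Abbreviated copy of the Example 2 output (stack collection enabled).
NESTED = """\
Function Stack          CPU Time:Total
----------------------  ---------
Total                    100.000%
 func@0x6b2daccf          97.595%
  func@0x6b2dacf0         97.595%
   BaseThreadInitThunk    97.595%
    threadstartex         67.091%
"""


def max_stack_depth(report_text):
    """Return the deepest indentation level found in the data rows.

    Hypothetical helper: skips the header and separator rows, blank
    lines, and the "..." truncation marker.
    """
    depths = []
    for line in report_text.splitlines()[2:]:
        stripped = line.strip()
        if not stripped or stripped == "...":
            continue
        depths.append(len(line) - len(line.lstrip(" ")))
    return max(depths, default=0)


if __name__ == "__main__":
    for label, text in (("flat", FLAT), ("nested", NESTED)):
        depth = max_stack_depth(text)
        # A depth greater than 1 suggests that callee rows are present.
        print(label, depth, depth > 1)
```

A maximum depth of 1 suggests a flat report such as the one in Example 3. A larger value suggests that call stacks were collected.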