Optimize Memory & Cache Use with Intel® VTune™ Profiler
This video demonstrates how to use Intel® VTune™ Profiler with other Intel® oneAPI Base Toolkit components, such as the Intel® oneAPI DPC++/C++ Compiler. The demo gives a hands-on example of how Intel VTune Profiler helps you optimize application performance.
This scenario examines memory and cache accesses. A Hotspots analysis identifies a DRAM access that takes too long. Further microarchitecture analysis points to cache misses and L3 cache accesses. The fix uses the compiler's built-in prefetch support to improve data access times. Intel VTune Profiler provides several views to aid your analysis:
- The platform diagram shows DRAM bus utilization.
- The bandwidth utilization histogram shows how much time the application spent at each level of memory bandwidth utilization.
- The latency histogram shows memory accesses grouped by latency and lets you identify where they originate.
Together, this data helps you identify a remedy and reduce high-latency cache accesses.
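To make the prefetch-based fix more concrete, here is a minimal sketch of software prefetching in C++. It is not the code from the video. The function name `gather_sum`, the indirect access pattern, and the prefetch distance `kPrefetchDistance` are all assumptions chosen for illustration. It uses `__builtin_prefetch`, a GCC/Clang builtin that LLVM-based compilers such as the Intel oneAPI DPC++/C++ Compiler should also accept. The video may use a different mechanism.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed tuning value: how many iterations ahead to prefetch.
// The best distance depends on the workload and should be found by profiling.
constexpr std::size_t kPrefetchDistance = 16;

// Hypothetical kernel: sums data[index[i]] over an index array. This indirect
// access pattern is hard for hardware prefetchers to predict, so loads can
// miss in cache and go all the way to DRAM.
double gather_sum(const std::vector<double>& data,
                  const std::vector<std::size_t>& index) {
    double sum = 0.0;
    const std::size_t n = index.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (i + kPrefetchDistance < n) {
            // Request the element needed kPrefetchDistance iterations from
            // now. Arguments: address, 0 = read access, 1 = low temporal
            // locality. This is only a hint and does not change the result.
            __builtin_prefetch(data.data() + index[i + kPrefetchDistance], 0, 1);
        }
        assert(index[i] < data.size());  // precondition: indices are in range
        sum += data[index[i]];
    }
    return sum;
}
```

The prefetch is a performance hint only, so the function returns the same sum with or without it. Whether it helps depends on the access pattern and the chosen distance, which is why rerunning the profiler after a change like this is the way to confirm that latency actually dropped.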
Product and Performance Information
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.