Profile Heterogeneous Computing Performance with Intel® VTune™ Profiler
Overview
Programming heterogeneous platforms requires a deep understanding of system architecture at every level, so that the application design can use the best decomposition of data and work between CPUs and accelerators such as GPUs. In many cases, however, applications are not designed this way from the start: they are converted from a conventional CPU programming language (like C++) or from an accelerator-friendly but still low-level language (like OpenCL™ code). The main problem is to determine which parts of the application benefit from being offloaded to a GPU. A second problem is to estimate how much performance a particular GPU device can actually add. Each platform has its own limitations that affect the performance of offloaded computing tasks, for example: data transfer cost, task initialization overhead, memory latency, and bandwidth limitations. To account for these constraints, software developers need tools that collect the right information and produce recommendations for making the best design and optimization decisions.
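As a rough illustration of why these constraints matter, the following minimal sketch estimates the net gain from offloading a single region. It is a back-of-the-envelope model, not the method used by Intel VTune Profiler: the class name, field names, and the simple additive cost formula (launch overhead plus transfer time plus accelerated compute time) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class OffloadEstimate:
    """Hypothetical additive cost model for offloading one code region."""

    cpu_time_s: float             # measured time of the region on the CPU
    kernel_speedup: float         # assumed raw GPU speedup of the compute part
    bytes_transferred: float      # data moved between host and device
    bandwidth_bytes_per_s: float  # assumed effective transfer bandwidth
    launch_overhead_s: float      # assumed task initialization overhead

    def gpu_time_s(self) -> float:
        # Offloaded time = initialization + data transfer + accelerated compute.
        transfer = self.bytes_transferred / self.bandwidth_bytes_per_s
        compute = self.cpu_time_s / self.kernel_speedup
        return self.launch_overhead_s + transfer + compute

    def net_speedup(self) -> float:
        # Values below 1.0 mean the offload makes the region slower.
        return self.cpu_time_s / self.gpu_time_s()


# A long-running region: the 10x kernel speedup shrinks to a modest net gain.
large = OffloadEstimate(1.0, 10.0, 4e9, 8e9, 0.1)

# A short region: transfer and launch costs outweigh the faster kernel.
small = OffloadEstimate(0.01, 10.0, 8e7, 8e9, 0.001)

if __name__ == "__main__":
    print(f"large region net speedup: {large.net_speedup():.2f}")
    print(f"small region net speedup: {small.net_speedup():.2f}")
```

In this model the same tenfold kernel speedup yields a net gain for the long-running region and a net loss for the short one, which is why measured transfer and overhead data are needed before deciding what to offload.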
This presentation introduces two new GPU performance analysis types in Intel® VTune™ Profiler, along with a methodology for profiling heterogeneous applications that these analyses support. Intel VTune Profiler is an established tool for performance characterization on CPUs. It includes GPU offload analysis and GPU hot spot analysis for applications written with most offloading models, including OpenCL code, SYCL* (Data Parallel C++), and OpenMP* Offload.
Vladimir Tsymbal
Senior technical consulting engineer, Intel Corporation
Vladimir specializes in teaching customers how to use various Intel® Software Development Tools to develop, tune, and optimize their parallel applications on Intel® architecture. In particular, his focus is on the Intel® Parallel Studio XE product suite and the analysis tools it contains, including Intel VTune Profiler (which he helped develop), Intel® Advisor, and Intel® Inspector.
Prior to joining Intel in 2005, Vladimir worked as a research assistant, and developed hardware graphics accelerators and software and hardware systems for medical diagnostics. He holds a PhD in mathematics and computer science from Taganrog State University of Radio Engineering, Russia.
Find and fix performance bottlenecks and optimize application and system performance and system configuration for HPC, cloud, IoT, media, storage, and more.