Methodologies
Start cooking your performance analysis. Understand tuning techniques, performance metrics and hardware solutions to collect statistics. Next, drill down to particular tuning or configuration recipes that feature Intel® VTune™ Profiler or its predecessor, Intel® VTune™ Amplifier.
- Top-down Microarchitecture Analysis Method
Use this recipe to learn how an application uses the available hardware resources and how to make it take advantage of CPU microarchitectures. One way to obtain this knowledge is by using on-chip Performance Monitoring Units (PMUs).
- OpenMP* Code Analysis Method
This recipe introduces a flow to analyze the CPU utilization of your OpenMP* or hybrid OpenMP-MPI application and identify the causes of possible inefficiencies.
- Custom Data Collection for Performance Analysis (NEW)
Learn how to configure a data collector to inject custom data into an analysis by Intel® VTune™ Profiler. Get additional context on the collected data and insights for an enhanced analysis.
- Software Optimization for Intel® GPUs (NEW)
Use Intel® VTune™ Profiler to estimate overhead when offloading onto an Intel GPU. Analyze the performance of computing tasks offloaded onto the GPU.
- Core Utilization in DPDK Apps
Explore metrics that characterize core utilization in terms of packet receiving in Data Plane Development Kit* (DPDK)-based applications.
- PCIe Traffic in DPDK Apps
This recipe introduces the PCIe Bandwidth metrics that Intel® VTune™ Profiler uses to explore PCIe traffic for a packet-forwarding DPDK-based workload.
- DPDK Event Device Profiling
Use Intel® VTune™ Profiler to analyze the efficiency of DPDK Event Device pipeline utilization in your DPDK-based application and identify issues such as uneven load distribution and worker core underutilization.
- Effective Utilization of Intel® Data Direct I/O Technology
This recipe demonstrates how Intel® VTune™ Profiler reveals the utilization efficiency of Intel® Data Direct I/O Technology, a hardware feature of Intel® Xeon® processors.
- Compile a Portable Optimized Binary with the Latest Instruction Set
Learn the different methods for compiling a binary with the latest instruction set while maintaining portability.
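
To make the PMU-based idea behind the Top-down Microarchitecture Analysis Method more concrete, here is a minimal sketch of how the four top-level categories can be derived from raw event counts. It is an illustration under stated assumptions, not what Intel® VTune™ Profiler does internally: it assumes a 4-wide pipeline and the event names found on some older Intel® Core™ microarchitectures, the `topdown_level1` helper is hypothetical, and the sample counts are made up.

```python
# Level-1 Top-down breakdown, sketched from raw PMU event counts.
# Assumptions (not from this page): a 4-wide pipeline and the event names
# used on some older Intel Core microarchitectures. The sample counts
# below are invented for illustration only.

PIPELINE_WIDTH = 4  # issue slots per cycle; assumed, varies by microarchitecture


def topdown_level1(counts):
    """Return the four level-1 Top-down categories as fractions of all slots."""
    # Total pipeline slots available during the measured interval.
    slots = PIPELINE_WIDTH * counts["CPU_CLK_UNHALTED.THREAD"]

    # Slots in which the front end delivered no uop to a ready back end.
    frontend_bound = counts["IDQ_UOPS_NOT_DELIVERED.CORE"] / slots

    # Slots that held uops which eventually retired (useful work).
    retiring = counts["UOPS_RETIRED.RETIRE_SLOTS"] / slots

    # Slots wasted on uops that never retired, plus recovery bubbles.
    bad_speculation = (
        counts["UOPS_ISSUED.ANY"]
        - counts["UOPS_RETIRED.RETIRE_SLOTS"]
        + PIPELINE_WIDTH * counts["INT_MISC.RECOVERY_CYCLES"]
    ) / slots

    # Whatever remains is attributed to back-end stalls.
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)

    return {
        "Frontend Bound": frontend_bound,
        "Bad Speculation": bad_speculation,
        "Retiring": retiring,
        "Backend Bound": backend_bound,
    }


# Hypothetical counter values for a single measurement interval.
sample = {
    "CPU_CLK_UNHALTED.THREAD": 1_000_000,
    "IDQ_UOPS_NOT_DELIVERED.CORE": 600_000,
    "UOPS_ISSUED.ANY": 1_700_000,
    "UOPS_RETIRED.RETIRE_SLOTS": 1_600_000,
    "INT_MISC.RECOVERY_CYCLES": 25_000,
}

for name, fraction in topdown_level1(sample).items():
    print(f"{name:16s} {fraction:6.1%}")
```

The four fractions sum to 1 by construction, so the largest one points to the category worth drilling into first. On real hardware the pipeline width and event names differ between microarchitectures, so treat both as placeholders to be checked against the documentation for your processor.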