Intel® VTune™ Profiler

Cookbook

ID 766316
Date 6/24/2024
Public

Profiling an FPGA-driven SYCL* Application

Use this recipe to profile an FPGA-driven SYCL application. The recipe features the AOCL Profiler integrated in the CPU/FPGA Interaction (preview) analysis type in Intel® VTune™ Profiler.

Ingredients

Here are the minimum hardware and software requirements for this performance recipe.

Install and Configure the Toolkit

  1. Plug the Intel PAC card into the PCIe slot on the machine.

  2. Download and install Intel oneAPI Base Toolkit for Linux. You can use either the online or the offline installer. Accept all default options.

  3. Download Intel FPGA Add-on for oneAPI Base Toolkit.

  4. Unzip the FPGA add-on package and run setup.sh. Select all default options.

  5. Set up the oneAPI environment.

    source <oneAPI-install-dir>/setvars.sh

  6. Install the FPGA board.

    aocl install

  7. Run the diagnose command to ensure that all diagnostics pass.

    aocl diagnose
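
Steps 5 through 7 can be wrapped in a small script that stops when the board diagnostics fail. The sketch below assumes that aocl diagnose prints the string DIAGNOSTIC_PASSED on success; confirm this against the output of your runtime version. The function name diagnostics_passed is illustrative and is not part of the toolkit.

```shell
# Return success only if the text on standard input reports a passing diagnostic.
# Assumption: a successful `aocl diagnose` run prints DIAGNOSTIC_PASSED.
diagnostics_passed() {
    grep -q 'DIAGNOSTIC_PASSED'
}

# Example use, after sourcing setvars.sh and running `aocl install`:
#   aocl diagnose | diagnostics_passed || echo "FPGA diagnostics did not pass" >&2
```

Checking the output text rather than only the exit code guards against runtime versions that exit with status 0 even when a diagnostic fails.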

Build the Sample Application

  1. Download code samples from the repository for Intel oneAPI DPC++ Compiler samples.

    git clone https://github.com/intel/BaseKit-code-samples.git

  2. Open the crr sample folder.

    cd BaseKit-code-samples/FPGAExampleDesigns/crr

  3. Open the src/CMakeLists.txt file.

  4. Locate the line of code that lists hardware flags. It should start with set(HARDWARE_LINK_FLAGS.

  5. Add -Xsprofile to the set of flags.

  6. Go back to the main directory for the sample. Create a new folder called build and open it.

    mkdir build
    cd build

  7. Compile the sample.

    cmake ..
    make fpga

    This process can take several hours. When it finishes, you should have an executable file called crr.fpga.

You can now run crr.fpga on FPGA hardware.
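
Steps 3 through 5 of the build procedure can also be scripted, which is convenient when the sample is rebuilt on several machines. The sketch below assumes that the flags are written as a double-quoted string on the same line as set(HARDWARE_LINK_FLAGS; if your copy of src/CMakeLists.txt is laid out differently, edit the file by hand instead. The function name add_profile_flag is illustrative.

```shell
# Add -Xsprofile to the HARDWARE_LINK_FLAGS line of a CMakeLists.txt file.
# $1: path to the CMakeLists.txt file to edit
# Assumption: the line looks like  set(HARDWARE_LINK_FLAGS "<flags>")
add_profile_flag() {
    cmake_file="$1"
    # Leave the file alone if the flag is already on that line.
    if grep -q 'set(HARDWARE_LINK_FLAGS.*-Xsprofile' "$cmake_file"; then
        return 0
    fi
    # Insert the flag right after the opening quote; keep a .bak copy of the original.
    sed -i.bak 's/\(set(HARDWARE_LINK_FLAGS[[:space:]]*"\)/\1-Xsprofile /' "$cmake_file"
}

# Example use from the crr sample folder, before running cmake:
#   add_profile_flag src/CMakeLists.txt
```

The function is safe to run more than once, because it skips the edit when -Xsprofile is already present.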

Run CPU/FPGA Interaction Analysis

  1. Open Intel® VTune™ Profiler and click New Project on the Welcome screen.

    The Create a Project dialog box opens.

  2. Specify a project name and a location for your project, then click Create Project.

    The Configure Analysis window opens.

  3. In the WHERE pane, select Local Host.

  4. In the WHAT pane, select Launch Application as the target.

    • In the Application field, specify the path to the crr.fpga executable.

    • In the Application parameters field, enter ordered_inputs.csv.

  5. In the HOW pane, select CPU/FPGA Interaction (preview) from the Platform Analysis group.

  6. In the analysis settings, select AOCL Profiler for the FPGA profiling data source.

  7. Click Start at the bottom to run the analysis.
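
The same collection can likely be started from the command line, which helps when the FPGA host has no display. This sketch assumes that the vtune command-line tool is on PATH after you source setvars.sh, and that the analysis is available under the fpga-interaction analysis type. The function name and the result directory name are illustrative. The knob that corresponds to the AOCL Profiler data source is deliberately not shown, because this recipe does not name it; list the available knobs with vtune -help collect fpga-interaction.

```shell
# Run CPU/FPGA Interaction analysis on a target and store the result in a chosen directory.
# $1: result directory (hypothetical name, for example crr_fpga_result)
# remaining arguments: the application and its parameters
collect_fpga_interaction() {
    result_dir="$1"
    shift
    vtune -collect fpga-interaction -result-dir "$result_dir" -- "$@"
}

# Example use from the build directory:
#   collect_fpga_interaction crr_fpga_result ./crr.fpga ordered_inputs.csv
```

You can open the resulting directory in the VTune Profiler GUI to continue with the analysis steps in this recipe.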

Analyze Results

Once data collection completes, you can see the finalized results in the CPU/FPGA Interaction viewpoint. Start with the Summary window to view these details:

  • FPGA top compute tasks

  • Top tasks and hotspots for the CPU

Result summary for CPU/FPGA Interaction
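
If you work on a remote host, a text version of the summary can be printed with the report action of the vtune command-line tool. This is a minimal sketch that assumes a result directory already exists from an earlier collection. The function name print_summary is illustrative, and the text report may not show every metric that appears in the Summary window.

```shell
# Print the text summary report for an existing VTune result directory.
# $1: result directory produced by an earlier collection
print_summary() {
    vtune -report summary -result-dir "$1"
}

# Example use:
#   print_summary crr_fpga_result
```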

Switch to the Bottom-up window to see detailed kernel-level information, including:

  • Stalls
  • Occupancy
  • Data transfer size
  • Average bandwidth for transferred data

Bottom-up window

Use the timeline view to see these details about kernel instances:

  • Start/end times
  • Stalls over time
  • Occupancy
  • Bandwidth metrics

In the Bottom-up window, right-click a kernel and select View Source from the context menu.

This opens the Source View, where you can see metrics for specific kernel source lines.