Hardware Profile Guided Optimization Use Guide for ChromeOS* Libraries

Authors:

  • Jaishankar Rajendran

  • Parshuram Sangle

Introduction

Hardware profile guided optimization (HWPGO) is a set of compiler techniques that use hardware performance metrics, sampled while an application runs on the target device, to guide optimization decisions. The goal is to improve running speed, reduce resource consumption, and raise the overall efficiency of applications on a specific hardware configuration. Because the profile reflects real hardware and a real workload, the optimizations can be tailored to different hardware configurations and to changes in runtime behavior. This guide describes how to apply HWPGO to ChromeOS* libraries to maximize the performance of workloads that run on ChromeOS. The following figure illustrates the process.

Figure 1. HWPGO Flow

Set Up the ChromeOS Image and Repository

  1. Set up the ChromeOS repository and build a ChromeOS image that boots and runs correctly on the target device.
  2. Download the operating system image file (matching the ChromeOS repository build) to a USB drive with sufficient storage space, and ensure that the image is compatible with the hardware of your Chromebook* notebook computer.
  3. Insert the USB drive.
  4. Choose the downloaded operating system image file and follow the on-screen instructions to complete the flashing process.
  5. Identify the module that contributes most to CPU use; this module is the candidate for HWPGO.
    • To determine which library contributes the most to CPU use, record a profile with branch sampling:
perf record -b <command line>

For UI-based workloads, collect system-wide by adding the -a option. Then generate the report:

perf report --sort=dso

The --sort=dso option groups the samples by module (shared object) and reports each module's share of the CPU overhead.
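The ranking can be scripted when the report is saved to a text file. The following is a minimal sketch, assuming the report was written with perf report --stdio --sort=dso and that each data row starts with a percentage followed by the module name. The function name top_dso and the report file name are hypothetical.

```shell
# Print the N modules with the highest overhead from a saved
# `perf report --sort=dso` text report (hypothetical helper).
# Assumption: data rows look like "  61.20%  libvulkan_intel.so" and
# header or comment rows start with '#'.
top_dso() (
  report="$1"
  count="${2:-5}"
  # Keep only rows whose first field is a percentage, strip the '%',
  # then sort numerically in descending order.
  awk '!/^#/ && $1 ~ /%$/ { sub(/%/, "", $1); print $1, $2 }' "$report" |
    LC_ALL=C sort -rn |
    head -n "$count"
)
```

For example, after perf report --stdio --sort=dso > dso_report.txt, running top_dso dso_report.txt 3 would list the three modules with the largest share of samples.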

Set Up Build Configuration, Compilation, and Deployment

The following steps detail the process undertaken to enable HWPGO for the libvulkan module on the Rex Chromebook. This methodology is applicable to other modules and Chromebook models as well.

1. Add the following flags in the <xx>.ebuild file of the module of interest from the ChromeOS repository.

For example, for the mesa-iris module, add the following flags to mesa-iris-9999.ebuild:

append-cppflags -O2
append-flags -gdwarf -gline-tables-only -fdebug-info-for-profiling -funique-internal-linkage-names

2. Enter the ChromeOS developer shell using cros_sdk in your ChromeOS repository and run the following commands:

export BOARD=rex
cros-workon-rex start media-libs/mesa-iris

3. To compile the base libvulkan with debug information, use the following command:

USE="cros-debug" CFLAGS="-g" FEATURES="nostrip noclean -splitdebug" emerge-${BOARD} media-libs/mesa-iris | tee -a build-log-libvulkan.txt

4. Navigate to the library directory and securely copy libvulkan_intel.so to the remote machine:

cd /build/rex/usr/lib64
scp libvulkan_intel.so root@<DUT IP>:/usr/lib64/libvulkan_intel.so

5. Restart the device under test (DUT) and verify that the library sizes match between the repository and the DUT after the restart.
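Matching file sizes are a quick check, but two different builds can have the same size. A checksum comparison is stricter. The following is a minimal sketch, assuming sha256sum is available on both the build host and the DUT; the helper name sha256_matches is hypothetical.

```shell
# Succeed when the SHA-256 of a local file equals an expected hash
# (hypothetical helper; assumes sha256sum is installed).
sha256_matches() (
  file="$1"
  expected="$2"
  actual=$(sha256sum "$file" | awk '{ print $1 }')
  [ "$actual" = "$expected" ]
)
```

A possible use is to read the hash from the DUT with ssh root@<DUT IP> sha256sum /usr/lib64/libvulkan_intel.so, store its first field in a variable such as dut_hash, and then run sha256_matches /build/rex/usr/lib64/libvulkan_intel.so "$dut_hash" in the developer shell.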

6. Collect the base perf.data using the appropriate command for your platform (12th or 13th generation Intel® Core™ i7 processors or Intel Core i3 N Processor).

For 12th and 13th generation Intel Core i7 processors:

perf record -b -a \
  -e cpu_atom/event=0xc4,umask=0x20,name=br_inst_retired.near_taken,period=100003/pu \
  -e cpu_core/event=0xc4,umask=0xC0,name=br_inst_retired.near_taken,period=100003/pu

For Intel Core i3 N processors:

perf record -b -a -e cpu/event=0xc4,umask=0xC0,name=br_inst_retired.near_taken,period=100003/pu

For more details on the events and umask values, refer to Intel Perfmon Events and Intel Perfmon on GitHub*.
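The choice between the two commands can be made at run time. The following is a minimal sketch, assuming that hybrid processors expose cpu_core and cpu_atom directories under /sys/bus/event_source/devices and that other processors expose a single cpu PMU. The event encodings are copied unchanged from the commands above; the helper name hwpgo_event_args is hypothetical.

```shell
# Print the -e arguments for perf record, depending on whether the
# system exposes hybrid PMUs (hypothetical helper).
# The optional first argument overrides the sysfs directory, which is
# useful for testing.
hwpgo_event_args() (
  devices="${1:-/sys/bus/event_source/devices}"
  tail="name=br_inst_retired.near_taken,period=100003/pu"
  if [ -d "$devices/cpu_core" ] && [ -d "$devices/cpu_atom" ]; then
    # Hybrid processor: one event per core type, as in the guide.
    printf '%s\n' "-e cpu_atom/event=0xc4,umask=0x20,$tail -e cpu_core/event=0xc4,umask=0xC0,$tail"
  else
    # Single PMU, as in the Intel Core i3 N command.
    printf '%s\n' "-e cpu/event=0xc4,umask=0xC0,$tail"
  fi
)
```

The output is meant to be expanded unquoted so that the shell splits it into separate arguments, for example perf record -b -a $(hwpgo_event_args).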

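7. Rename the collected perf.data (the later steps use the name rpl_borealis_libvulkan_base_perf.data) and generate a performance report from the renamed file using the following command:

perf report -i rpl_borealis_libvulkan_base_perf.data --sort=dso | tee perf_report_hwpgo_gtav.txt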

8. Analyze the performance report to identify which libraries are used the most.

9. Return to the ChromeOS developer shell (cros_sdk) and generate a perf-script-out using the following command:

perf script -F ip,brstack --show-mmap-events -i rpl_borealis_libvulkan_base_perf.data > perf-script-out

10. Create a profile file using llvm-profgen with the following command, ensuring the base unstripped libvulkan library with debug information is in the current directory:

llvm-profgen --format=text --perfscript=./perf-script-out --binary=./libvulkan_intel.so --output=vulkan-games.prof
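When the same conversion is repeated for several libraries or workloads, steps 9 and 10 can be parameterized. The following is a dry-run sketch that only prints the two commands for the given inputs so they can be reviewed before running. The helper name hwpgo_profgen_cmds is hypothetical, and the flags are the ones used in the commands above.

```shell
# Print the perf script and llvm-profgen commands from steps 9 and 10
# for the given inputs, without running them (hypothetical helper).
# Assumption: the paths contain no spaces or shell metacharacters.
hwpgo_profgen_cmds() (
  perf_data="$1"
  binary="$2"
  profile="$3"
  script_out="${4:-perf-script-out}"
  printf 'perf script -F ip,brstack --show-mmap-events -i %s > %s\n' \
    "$perf_data" "$script_out"
  printf 'llvm-profgen --format=text --perfscript=./%s --binary=%s --output=%s\n' \
    "$script_out" "$binary" "$profile"
)
```

For example, hwpgo_profgen_cmds rpl_borealis_libvulkan_base_perf.data ./libvulkan_intel.so vulkan-games.prof prints the two commands shown in steps 9 and 10.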

11. Once the profile file is created, check its validity by ensuring that the majority of the symbols toward the beginning of the file have non-zero counts. For profile file format details, see the clang documentation.
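The validity check can be approximated with a short script. The following is a minimal sketch, assuming the text profile follows the layout in the clang documentation, where each function entry starts in the first column as name:total_samples:total_head_samples and its body lines are indented. The helper name profile_nonzero_ratio and the default of 20 entries are hypothetical choices.

```shell
# Report how many of the first N function entries in a text sample
# profile have a non-zero total sample count (hypothetical helper).
# Output format: "<nonzero>/<checked>".
profile_nonzero_ratio() (
  prof="$1"
  limit="${2:-20}"
  awk -v limit="$limit" '
    # Function entries are not indented; body lines are.
    /^[^ \t]/ {
      n = split($0, f, ":")
      if (n < 3) next
      checked++
      # The second-to-last field is total_samples.
      if (f[n-1] + 0 > 0) nonzero++
      if (checked >= limit) exit
    }
    END { printf "%d/%d\n", nonzero, checked }
  ' "$prof"
)
```

Running profile_nonzero_ratio vulkan-games.prof prints a ratio; a result in which most of the checked entries are non-zero is consistent with the check described in step 11.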

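12. Copy the profile file into the files directory of the package in the source tree, which is the directory that ${FILESDIR} refers to in the ebuild. The following path is the source tree as mounted inside the ChromeOS developer shell (cros_sdk); if you have exited the shell, use the corresponding path under your ChromeOS checkout.

cp vulkan-games.prof /mnt/host/source/src/third_party/chromiumos-overlay/media-libs/mesa-iris/files/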

13. Open the mesa-iris-9999.ebuild file with a text editor (such as vi) and add the -fprofile-sample-use flag, with the path to the profile file, to the append-flags line from step 1, as shown in the following code snippet:

append-flags -gdwarf -gline-tables-only -fdebug-info-for-profiling -funique-internal-linkage-names -fprofile-sample-use=${FILESDIR}/vulkan-games.prof

14. Run git diff to confirm that the profile file has been correctly added to the build configuration.

15. Re-enter the ChromeOS developer shell (cros_sdk) and start the libvulkan compilation with the appropriate command for your platform (12th or 13th generation Intel Core i7 processors).

For 12th generation Intel Core i7 processors:

USE="cros-debug" CFLAGS="-g" FEATURES="nostrip noclean -splitdebug" emerge-brya media-libs/mesa-iris | tee -a rpl-build-log-borealis-libvulkan.log

For 13th generation Intel Core i7 processors:

USE="cros-debug" CFLAGS="-g" FEATURES="nostrip noclean -splitdebug" emerge-rex media-libs/mesa-iris | tee -a mtl-build-log-borealis-libvulkan.log

16. Navigate to the library directory and verify the presence of the newly compiled libvulkan_intel.so:

cd /build/rex/usr/lib64
ls -lrt libvulkan_intel.so    # the newly built library should be listed

17. Deploy the newly generated library with HWPGO in the DUT with the following command:

scp libvulkan_intel.so root@<DUT IP>:/usr/lib64/libvulkan_intel.so

Note: If you encounter an error (for example, because the root file system on the DUT is read-only), remove rootfs verification on the DUT and run the previous copy command again.

18. Once the library is copied, restart the DUT, navigate to the library directory, and verify the presence of the newly compiled libvulkan_intel.so.

19. To confirm the successful deployment of the libvulkan_intel module, check the file sizes of libvulkan_intel.so in the DUT and repository. They should be the same.

20. The libvulkan_intel.so module is now optimized with HWPGO. The same procedure can be applied to other modules.

Summary

The preceding steps describe how to apply HWPGO to the libvulkan_intel.so library in the ChromeOS development environment: set up the development environment, build the library with the profiling flags, collect a hardware profile on the DUT and convert it, rebuild the library with the profile, and deploy the optimized library to the DUT.

The process is iterative and requires familiarity with the ChromeOS build tools and with performance analysis. Applied carefully, the same steps can improve the performance of libvulkan_intel.so or any other library, which in turn improves the user experience on devices running ChromeOS.