Optimize Geekbench* Performance in ChromeOS*

Authors:

  • Jaishankar Rajendran

  • Parshuram Sangle

An In-Depth Analysis and Impact on Benchmark Scores

Geekbench [1] is widely used for CPU benchmarking across operating systems and platforms such as Windows*, Linux*, Android*, and macOS*. In ChromeOS*, Geekbench can run in three different environments:

  • Android version in ARCVM (Android Runtime Compatibility for Virtual Machines)
  • Linux version in Crostini* VM (virtual machine) [2]
  • Linux version in the ChromeOS shell (referred to as Chrome in this article)

This article explores the key differences between the Android application (APK) version and the Linux version in the Chrome operating environment, and provides a detailed analysis of software stack variances, subtests, and operational flow. Benchmark scores are derived using Chromebook* notebooks featuring Intel® Core™ processors.

It also examines the technical details of the Android memory allocator and explores potential optimizations to enhance performance. Additionally, this article highlights the successful implementation of an optimization for memory sharing between the Android virtual machine (VM) guest and the ChromeOS host. This optimization significantly improves the performance of Geekbench and various other workloads.

Factors Affecting CPU and Memory Evaluation

Geekbench is a widely used CPU benchmark for evaluating and optimizing CPU and memory performance using workloads that include data compression, image processing, machine learning, and physics simulation. Performance on these workloads is important for a wide variety of applications including web browsers, image editors, and developer tools.

Although the same subtests are run as part of the Geekbench benchmark, critical characteristics of the operating environments determine the subtest levels and final scores. These characteristics include but are not limited to CPU runtime, instructions per clock, branch miss rate, memory latency, and cache miss rates. Understanding these operating environment factors is vital to understanding the Geekbench scoring mechanism for making meaningful comparisons.

Differences in Operating Environments

Chrome and Android operating environments differ in several key ways, primarily in their memory allocation mechanisms and software stacks. Figure 1 shows a high-level overview of the differences in memory allocation flow between Android, which uses Bionic libc, and Chrome, which uses the GNU C Library.

Memory allocation comparison

Figure 1. Chrome and Linux GNU C Library compared to Android Bionic libc (memory allocation).

GNU C Library for Linux

  • The GNU C Library [3] provides the core libraries for the GNU system, GNU/Linux systems, and many other systems that use Linux as the kernel, including Chrome.
  • The GNU C Library implements the interfaces described in the ANSI C, C99, and C11 standards. It includes macros, symbols, and function implementations, such as printf() and malloc().
  • The GNU C Library is also a POSIX standard library and serves as the userland glue for system calls, such as open() and read(). The library does not implement system calls itself; the kernel does that. Instead, it provides the userland interface to services offered by the kernel so that user applications can use a system call just like an ordinary function.
  • The GNU C Library uses its own malloc implementation for dynamic memory allocation.

Bionic C Library (libc) for Android

  • Bionic C library (libc) [4] is a C library created specifically for Android mobile and embedded environments, providing essential C standard library functions. It prioritizes performance and memory efficiency for mobile applications.
  • Bionic features a custom implementation that is optimized for embedded systems.
  • Known for its small size and fast code paths, it provides very fast, custom pthread implementations.
  • For dynamic memory allocation, Android uses jemalloc.
  • Bionic C library is not compatible with the GNU C Library.

 

Comparative Analysis: Memory Allocation in Chrome and Android Environments

Geekbench Subtest Scores

Table 1 compares the benchmark scores obtained from the Linux version of Geekbench run in the ChromeOS shell and in the Crostini VM with those from the Android version of Geekbench run in ARCVM, on a Chromebook* platform featuring Intel® Core™ Ultra processors.

Table 1. Geekbench Score Comparison between Different Operating Environments

OS: R121:15678.0.0
RAM: 16 GB
CPU(s): 14
On-line CPU(s) list: 0-13
CPU max MHz: 4800
CPU min MHz: 400

 

All scores are from Geekbench 5.4.5.

Tests                     ChromeOS Shell  Crostini VM     ARCVM           Delta Shell vs  Delta Shell vs
                          (Linux)         (Linux)         (Android APK)   Crostini VM     ARCVM
Single-Core Score         1688            1622            1456            -4%             -14%
  Crypto Score            3330            3219            3129            -3%             -6%
  Integer Score           1478            1452            1260            -2%             -15%
  Floating Point Score    1869            1723            1602            -8%             -14%
Multi-Core Score          7478            7243            6056            -3%             -19%
  Crypto Score            10677           10226           11101           -4%             4%
  Integer Score           7184            6955            5549            -3%             -23%
  Floating Point Score    7582            7369            6313            -3%             -17%

Memory used by Geekbench  1.35 GB         1.58 GB         0.45 GB
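
The delta columns in Table 1 can be reproduced with a small calculation. The sketch below assumes that each delta is the score difference relative to the ChromeOS shell score, rounded to a whole percent; the article does not state the formula, but this convention matches the published values. The function name percent_delta is illustrative.

```python
def percent_delta(score: float, baseline: float) -> int:
    """Change of `score` relative to `baseline`, rounded to a whole percent."""
    return round((score - baseline) / baseline * 100)


# Top-level scores from Table 1: (ChromeOS shell, Crostini VM, ARCVM).
table1 = {
    "Single-Core Score": (1688, 1622, 1456),
    "Multi-Core Score": (7478, 7243, 6056),
}

for name, (shell, crostini, arcvm) in table1.items():
    # The ChromeOS shell run is the baseline for both delta columns.
    print(
        f"{name}: Crostini {percent_delta(crostini, shell):+d}%, "
        f"ARCVM {percent_delta(arcvm, shell):+d}%"
    )
```

A negative value means the environment scored lower than the ChromeOS shell run.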


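Further analysis identifies a notable disparity in the subtests that are most influenced by memory allocation and access (HTML5, SQLite*, Clang*). In ARCVM, these subtests score 17% to 30% lower than in the ChromeOS shell in single-core runs and 27% to 35% lower in multi-core runs. Table 2 shows a detailed comparison of subtest scores.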

Table 2. Geekbench Subtest Score Comparison between Different Operating Environments

All scores are from Geekbench 5.4.5.

Tests                     ChromeOS Shell  Crostini VM     ARCVM           Delta Shell vs  Delta Shell vs
                          (Linux)         (Linux)         (Android APK)   Crostini VM     ARCVM
Single-Core
  AES-XTS                 3330            3219            3129            -3%             -6%
  Text Compression        1226            1208            1169            -1%             -5%
  Image Compression       1604            1670            1340            4%              -16%
  Navigation              975             1008            933             3%              -4%
  HTML5                   1797            1732            1256            -4%             -30%
  SQLite                  1653            1613            1265            -2%             -23%
  PDF Rendering           1532            1581            1328            3%              -13%
  Text Rendering          1469            1372            1325            -7%             -10%
  Clang                   1735            1640            1446            -5%             -17%
  Camera                  1513            1421            1357            -6%             -10%
  N-Body Physics          1865            1727            1786            -7%             -4%
  Rigid Body Physics      1906            1801            1535            -6%             -19%
  Gaussian Blur           1391            1296            1411            -7%             1%
  Face Detection          2024            1732            1686            -14%            -17%
  Horizon Detection       1526            1374            982             -10%            -36%
  Image Inpainting        3035            2853            2691            -6%             -11%
  HDR                     3138            3026            2490            -4%             -21%
  Ray Tracing             2485            2044            2158            -18%            -13%
  Structure from Motion   1585            1589            1606            0%              1%
  Speech Recognition      1345            1280            1137            -5%             -15%
  Machine Learning        1263            1157            1055            -8%             -16%
Multi-Core
  AES-XTS                 10677           10226           11101           -4%             4%
  Text Compression        5168            5552            4600            7%              -11%
  Image Compression       7436            7418            6234            0%              -16%
  Navigation              4729            4482            4173            -5%             -12%
  HTML5                   10343           9608            6958            -7%             -33%
  SQLite                  9362            8743            6055            -7%             -35%
  PDF Rendering           7458            8110            5236            9%              -30%
  Text Rendering          8352            6502            7709            -22%            -8%
  Clang                   7533            7298            5531            -3%             -27%
  Camera                  6174            6385            4430            3%              -28%

 

As shown in figure 2, the memory logs confirm that the Linux version of Geekbench receives the full amount of memory it requests, whereas the Android version does not.
 

Figure 2. Memory use

The Android version of Geekbench uses significantly less memory: only 0.45 GB compared to the 1.35 GB used by the Linux version in the ChromeOS shell, which indicates a lower memory allocation for Android. The Linux version employs the GNU C library (glibc/libm) and standard malloc for dynamic memory allocation, while the Android version relies on Bionic C library (libc) and jemalloc.

The jemalloc Memory Allocator

The jemalloc [5] memory allocator is a general-purpose malloc implementation that emphasizes fragmentation avoidance and scalable concurrency support. In this environment, using jemalloc lessens memory fragmentation at some cost in performance, and Google must decide how much performance degradation is acceptable.

The jemalloc memory allocator offers several adjustable parameters, which are often conservatively set by default and may not be ideal for numerous typical workloads. Properly calibrating jemalloc for a particular application or workload can commonly enhance system-level performance by a small percentage or allow for beneficial trade-offs. To achieve optimal tuning, it is necessary to acquire jeprof files within the Android operating system. However, ChromeOS currently lacks the infrastructure for deploying Android OS images.

Memory Allocation Dynamics

  • Android operates within a virtual machine environment, with memory management handled by crosvm, which uses a ballooning policy.
  • The ballooning policy is informed by the Low Memory Killer Daemon (LMKD) within Android to guide memory allocation.
  • In contrast, the Linux version of Geekbench runs directly, adhering to a standard memory allocation policy that grants requested memory without modification.
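
The effect of ballooning on a guest can be pictured with a toy model. The sketch below is purely conceptual: the class BalloonedGuest, its methods, and all sizes are hypothetical, and it does not reproduce the policy that crosvm or LMKD actually applies. It only shows the general idea that inflating a balloon hands guest memory back to the host, so a request that a directly running process would get in full may not fit inside the guest.

```python
class BalloonedGuest:
    """Toy model of a VM guest whose usable memory is limited by a balloon."""

    def __init__(self, total_mb: int) -> None:
        self.total_mb = total_mb
        self.balloon_mb = 0  # memory currently handed back to the host

    @property
    def usable_mb(self) -> int:
        # Whatever the balloon holds is unavailable to guest applications.
        return self.total_mb - self.balloon_mb

    def inflate(self, mb: int) -> None:
        """Host reclaims memory: the balloon grows and the guest shrinks."""
        self.balloon_mb = min(self.total_mb, self.balloon_mb + mb)

    def deflate(self, mb: int) -> None:
        """Host returns memory: the balloon shrinks and the guest grows."""
        self.balloon_mb = max(0, self.balloon_mb - mb)

    def fits(self, request_mb: int) -> bool:
        """Stateless check: would a request of this size fit right now?"""
        return request_mb <= self.usable_mb


guest = BalloonedGuest(total_mb=4096)   # hypothetical guest size
guest.inflate(3000)                     # host is under memory pressure
print(guest.usable_mb, guest.fits(1350))
guest.deflate(2000)                     # pressure eases, memory is returned
print(guest.usable_mb, guest.fits(1350))
```

The 1350 MB request mirrors the roughly 1.35 GB that the Linux version uses in Table 1; every other number is made up for illustration.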

User Interface and Command-Line Application Resource Use

  • The command-line interface (CLI) version of Geekbench for Linux is more memory-efficient compared to the Android APK, which includes a graphical user interface (GUI).
  • The CLI version does not engage graphics modules, avoiding the additional overhead associated with the Android graphics pipeline.
  • The Java UI threads in the APK introduce further overhead. See figure 3.

Benchmark Runtime Comparison

Geekbench running on Linux via the CLI completes in 175 seconds, whereas the Geekbench APK version takes 200 seconds, about 14% longer.

Java Native Interface (JNI) Calls

Calls that cross the Java Native Interface (JNI) boundary are generally slower than direct native function calls. See figure 3.

Figure 3. JNI calls

Given these factors behind the gap in Geekbench scores, the following strategies focus on narrowing the difference between the built-in Linux version and the Android version running in a virtual machine (VM).

Optimization Strategy 1: Fine-Tune jemalloc Parameters

Continued investigation shows that fine-tuning jemalloc parameters yields substantial performance improvements when Geekbench workloads run with modified settings. The jemalloc dirty_decay_ms configuration parameter, in particular, has a pronounced impact on the performance of the malloc-intensive Geekbench multicore (MC) workload.

Table 3 shows how the Geekbench MC score changes under various dirty_decay_ms settings in a virtual machine environment. At the 30-second setting, the score is about 10% higher than with the default.

Table 3. Geekbench Score Comparison with dirty_decay_ms Parameter Settings

Platform: Intel® Core™ i3 N processor-based Chromebook

dirty_decay_ms value    Geekbench MC Score
Base (default)          2748
5 sec                   2843
15 sec                  2890
20 sec                  2923
30 sec                  3019


The following elements contribute to the observed performance improvements:

  • An approximate 28% decrease in CPU cycles per second
  • A decrease in the number of vm_exits caused by EPT_VIOLATION
  • A reduction in the overhead associated with memory ballooning
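
One way to see why a longer decay time can reduce these costs is a toy model of page reuse. The sketch below assumes a hard deadline: a freed page is returned to the operating system once it has been unused for dirty_decay_ms, and reusing it afterward costs a fresh page fault, which inside a VM can surface as an EPT violation and a VM exit. Real jemalloc purges pages gradually rather than at a hard deadline, and the reuse gaps below are hypothetical, so this is an illustration and not a measurement.

```python
def count_refaults(reuse_gaps_ms, dirty_decay_ms):
    """Count reuses that arrive after the freed pages were already purged.

    Toy model: pages unused for at least `dirty_decay_ms` have been given
    back to the OS, so touching them again requires a new page fault.
    """
    return sum(1 for gap in reuse_gaps_ms if gap >= dirty_decay_ms)


# Hypothetical gaps (ms) between freeing a buffer and allocating it again.
gaps = [200, 800, 1_500, 4_000, 9_000, 12_000, 25_000, 40_000]

for decay_ms in (1_000, 5_000, 30_000):
    refaults = count_refaults(gaps, decay_ms)
    print(f"dirty_decay_ms={decay_ms}: {refaults} of {len(gaps)} reuses refault")
```

In this model, raising the decay time lets more freed memory be recycled inside the allocator, at the price of holding on to unused pages for longer.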

Moreover, configuring the narenas parameter to match the number of available CPUs (rather than using the fixed value of 12) resulted in further gains: Geekbench MC scores on Chromebook platforms based on Intel® Core™ i3 N processors increased by an additional 2% to 3%.
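
For reference, upstream jemalloc accepts options as a comma-separated list of name:value pairs, for example through the MALLOC_CONF environment variable. The sketch below builds such a string for the two parameters discussed here. The helper name build_malloc_conf is illustrative, and the article does not say how the settings were applied in ARCVM, so whether the Android build of jemalloc honors MALLOC_CONF at run time is an assumption; treat this as the upstream syntax only.

```python
import os


def build_malloc_conf(dirty_decay_s: int, narenas: int) -> str:
    """Build a jemalloc option string in upstream `name:value,...` form.

    `dirty_decay_s` is given in seconds and converted to milliseconds,
    because the jemalloc option itself (dirty_decay_ms) is in milliseconds.
    """
    if dirty_decay_s < 0:
        raise ValueError("dirty_decay_s must be non-negative")
    if narenas < 1:
        raise ValueError("narenas must be at least 1")
    return f"dirty_decay_ms:{dirty_decay_s * 1000},narenas:{narenas}"


# Match the arena count to the CPUs visible to this process's OS.
cpus = os.cpu_count() or 1
print(build_malloc_conf(30, cpus))
```

Matching narenas to the CPU count follows the approach described above; the 30-second decay value is the best-performing setting in Table 3.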

Optimization Strategy 2: Responsive Ballooning (VM Memory Management Service)

Implementing the responsive ballooning feature successfully eliminates the overhead traditionally associated with limited cache memory ballooning. This leads to performance gains of approximately 8% to 10%.

This significant enhancement refines the existing cache ballooning policy of ARCVM by restructuring it to incorporate a broader range of metrics. The new design focuses on managing memory distribution more effectively between the guest (Android VM) and the host (ChromeOS). This ensures a more dynamic and efficient allocation of resources, responding adeptly to the needs of both environments.

Results from Fine-Tuning and Responsive Ballooning

Optimization Strategy 1: Fine-Tune jemalloc Parameters

Modifying the jemalloc dirty_decay_ms parameter to a 30-second interval resulted in performance gains across a diverse set of workloads on multiple ChromeOS platforms. The following table details the observed performance gains.

Table 4. Performance Results with Optimal (30x) dirty_decay_ms Value

Platform: Workloads and Gains

Intel® Core™ i3 N processor-based Chromebook
  • CPU single-core: 5%
  • CPU multicore: 11%
  • GPU/memory workloads: no gains
  • Productivity: 5%
  • Concurrent workloads and Android app launch time: no regression

12th Generation Intel® Core™ i7 processor-based Chromebook
  • CPU single-core: 2% to 3%
  • CPU multicore: 13%
  • GPU workloads: 3%

13th Generation Intel® Core™ i7 processor-based Chromebook
  • CPU single-core: 4%
  • CPU multicore: 14%

Optimization Strategy 2: Responsive Ballooning (VM Memory Management Service)

Tables 5 through 8 compare Geekbench 5.4.5 scores for the Android versions (run within a VM environment) and their Linux counterparts across different platforms. With the introduction of the responsive ballooning feature, the performance disparity between the Android CLI and Linux versions of Geekbench narrows to roughly 10% or less on most platforms, ranging from 6% to 13% across the tables.

Table 5. Geekbench Scores Using Chromebooks Based on 11th Generation Intel® Core™ i7 Processors

Platform: 11th Generation Intel® Core™ i7 processor-based Chromebook

Benchmark                     Android APK Version   Android CLI Version   Linux CLI Version in ChromeOS   Android APK vs CLI    Android CLI vs Linux CLI
Geekbench 5.4.5 Single-Core   1385                  1473                  1658                            -6%                   -13%
Geekbench 5.4.5 Multi-Core    5161                  5246                  5872                            -2%                   -12%

 

Table 6. Geekbench Scores Using Chromebooks Based on Intel® Core™ i3 N Processors

Platform: Intel® Core™ i3 N processor-based Chromebook

Benchmark                     Android APK Version   Android CLI Version   Linux CLI Version in ChromeOS   Android APK vs CLI    Android CLI vs Linux CLI
Geekbench 5.4.5 Single-Core   1048                  1087                  1174                            -4%                   -8%
Geekbench 5.4.5 Multi-Core    4926                  5087                  5411                            -3%                   -6%

 

Table 7. Geekbench Scores Using Chromebooks Based on 12th Generation Intel® Core™ i7 Processors

Platform: 12th Generation Intel® Core™ i7 processor-based Chromebook

Benchmark                     Android APK   Android CLI   Chrome Native CLI   Android APK vs CLI    Android CLI vs Native CLI
Geekbench 5.4.5 Single-Core   1299          1399          1534                -8%                   -10%
Geekbench 5.4.5 Multi-Core    6375          6471          7000                -2%                   -8%

 

Table 8. Geekbench Scores Using Chromebooks Based on 13th Generation Intel® Core™ i7 Processors

Platform: 13th Generation Intel® Core™ i7 processor-based Chromebook

Benchmark                     Android APK   Android CLI   Chrome Native CLI   Android APK vs CLI    Android CLI vs Native CLI
Geekbench 5.4.5 Single-Core   1783          1896          2069                -6%                   -9%
Geekbench 5.4.5 Multi-Core    6617          6684          7325                -1%                   -10%
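
When comparing the deltas in Tables 5 through 8 with those in Table 1, note that the baseline appears to differ. The article does not state the formula, but the published values in Tables 5 through 8 are reproduced when the gap is divided by the lower (Android) score, whereas Table 1 is consistent with dividing by the ChromeOS shell score. The sketch below checks both conventions against one row; the function names are illustrative.

```python
def delta_vs_first(first: int, second: int) -> int:
    """Gap between two scores, relative to the first score, in whole percent."""
    return round((first - second) / first * 100)


def delta_vs_second(first: int, second: int) -> int:
    """Gap between two scores, relative to the second score, in whole percent."""
    return round((first - second) / second * 100)


# Table 5, single-core: Android CLI (1473) vs Linux CLI in ChromeOS (1658).
android_cli, linux_cli = 1473, 1658

print(delta_vs_first(android_cli, linux_cli))   # relative to the Android score
print(delta_vs_second(android_cli, linux_cli))  # relative to the Linux score
```

Dividing by the Android score gives the -13% shown in Table 5, while dividing by the Linux score gives -11%, so the two sets of tables should not be compared percentage for percentage without recomputing on a common baseline.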

 

Conclusion and Next Steps

The analysis of Geekbench workloads and their respective operating environments reveals distinct differences between the built-in Linux version and the Android version running within a VM environment. These differences are responsible for the variations in benchmark scores. Although the optimizations are effective in reducing this gap, they do not close it completely: certain constraints inherent in the VM software stack cannot be removed, which prevents the two versions from reaching equal performance. These insights nonetheless point to areas for future development, where ongoing research may yield further performance improvements in virtualized environments.

Test Configuration

Platform: Nissa Chromebook notebook

Software: Google ChromeOS CPFE R124-15810.0.0

Hardware: Intel Core i3 N305 processor, 8 GB RAM

References

  1. Geekbench Website
  2. Run Custom Containers under ChromeOS
  3. The GNU C Library
  4. Bionic: Android's C Library, Math Library, and Dynamic Linker
  5. Jemalloc: Memory Allocator