An In-Depth Analysis and Impact on Benchmark Scores
Geekbench [1] is widely used for CPU benchmarking across various operating systems and platforms such as Windows*, Linux*, Android*, and macOS*. In ChromeOS*, Geekbench can run in three different environments:
- Android version in ARCVM (Android Runtime for Chrome, running in a virtual machine)
- Linux version in Crostini* VM (virtual machine) [2]
- Linux version in Chrome
This article explores the key differences between the Android application package (APK) version and the Linux version in Chrome operating environments, and provides a detailed analysis of software stack variances, subtests, and operational flow. Benchmark scores are measured on Chromebook* notebooks featuring Intel® Core™ processors.
It also examines the technical details of the Android memory allocator and explores potential optimizations to improve performance. In addition, it describes an optimization, now implemented, for sharing memory between the Android virtual machine (the guest) and the ChromeOS host. This optimization significantly improves the performance of Geekbench and other workloads.
Factors Affecting CPU and Memory Evaluation
Geekbench is a widely used CPU benchmark for evaluating and optimizing CPU and memory performance using workloads that include data compression, image processing, machine learning, and physics simulation. Performance on these workloads is important for a wide variety of applications including web browsers, image editors, and developer tools.
Although the same subtests run in every environment, critical characteristics of the operating environment determine the subtest scores and the final scores. These characteristics include, but are not limited to, CPU runtime, instructions per clock, branch miss rate, memory latency, and cache miss rates. Understanding these factors, together with the Geekbench scoring mechanism, is vital for making meaningful comparisons.
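As a rough illustration of the scoring mechanism, the sketch below assumes the section weights published for the Geekbench 5 CPU benchmark (5% Crypto, 65% Integer, 30% Floating Point, combined as a weighted arithmetic mean). These weights are not stated in this article; they are an assumption, although they do reproduce the overall scores reported in table 1 from the section scores listed there.

```python
# Assumed Geekbench 5 section weights (not stated in this article).
SECTION_WEIGHTS = {"crypto": 0.05, "integer": 0.65, "floating_point": 0.30}


def composite_score(crypto: float, integer: float, floating_point: float) -> int:
    """Weighted arithmetic mean of the three section scores, rounded."""
    total = (
        SECTION_WEIGHTS["crypto"] * crypto
        + SECTION_WEIGHTS["integer"] * integer
        + SECTION_WEIGHTS["floating_point"] * floating_point
    )
    return round(total)


# Single-core section scores taken from table 1.
print("ChromeOS shell:", composite_score(3330, 1478, 1869))
print("Android APK in ARCVM:", composite_score(3129, 1260, 1602))
```

Under this assumption, the Integer section carries most of the weight, which is consistent with the 15% single-core Integer deficit in ARCVM dominating the 14% overall single-core gap in table 1.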
Differences in Operating Environments
Chrome and Android operating environments differ in several key ways, with the primary distinction being memory allocation mechanisms and software stacks. Figure 1 shows a high-level overview of the differences in memory allocation flow between Android, which uses Bionic libc, and Chrome, which uses the GNU C library. This contrast is one of the key differences in these operating environments.
Figure 1. Chrome and Linux GNU C Library compared to Android Bionic libc (memory allocation).
GNU C Library for Linux
- The GNU C Library [3] provides the core libraries for the GNU system, GNU/Linux systems, and many other systems that use Linux as the kernel, including Chrome.
- The GNU C Library implements the C library described in the ANSI C, C99, and C11 standards. It includes macros, symbols, and function implementations, such as printf() and malloc().
- The GNU C Library also implements the POSIX library interfaces and serves as the userland glue for system calls such as open() and read(). The library does not implement the system calls themselves; the kernel does. Instead, it provides the userland interface to kernel services so that applications can invoke a system call like an ordinary function.
- The GNU C Library provides its own malloc implementation for dynamic memory allocation.
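To illustrate the "userland glue" role described above, the following sketch calls C library functions directly from Python through `ctypes`. It assumes a POSIX system such as Linux, where `ctypes.CDLL(None)` exposes the C library already loaded into the process. It is only an illustration and is not part of Geekbench.

```python
import ctypes
import os

# Handle to the symbols already loaded in this process, including the C library.
# Assumption: a POSIX system such as Linux; this does not work on Windows.
libc = ctypes.CDLL(None)

# Declare the C signatures so that pointers are not truncated to int.
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]


def libc_getpid() -> int:
    """Call the C library wrapper for the getpid system call like a normal function."""
    return libc.getpid()


def malloc_roundtrip(size: int = 64) -> bytes:
    """Allocate a buffer with malloc(), fill it, read it back, and free it."""
    ptr = libc.malloc(size)
    if not ptr:
        raise MemoryError("malloc returned NULL")
    try:
        ctypes.memset(ptr, 0x5A, size)  # fill every byte with 0x5A
        return ctypes.string_at(ptr, size)
    finally:
        libc.free(ptr)


print(libc_getpid() == os.getpid())
print(len(malloc_roundtrip(64)))
```

The same calls resolve to whichever C library the process is linked against, so the allocator behind `malloc()` depends on the environment rather than on the calling code.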
Bionic C Library (libc) for Android
- Bionic C library (libc) [4] is a C library specifically created for Android mobile and embedded environments, providing essential C standard library functions. It prioritizes performance and memory efficiency for mobile applications.
- Bionic features a custom implementation that is optimized for embedded systems.
- Known for its small size and fast code paths, it provides very fast, custom pthread implementations.
- For dynamic memory allocation, Android uses jemalloc.
- The Bionic C library is not binary compatible with the GNU C Library.
Comparative Analysis: Memory Allocation in Chrome and Android Environments
Geekbench Subtest Scores
See table 1 for a detailed comparison of benchmark scores obtained from the Linux version of Geekbench run in the ChromeOS shell, Crostini shell, and the Android version of Geekbench run in ARCVM on the Chromebook* platform featuring Intel® Core™ Ultra Processors.
Table 1. Geekbench Score Comparison between Different Operating Environments
OS | R121:15678.0.0 |
---|---|
RAM | 16GB |
CPU(S): | 14 |
On-line CPU(s) list: | 0-13 |
CPU max MHz | 4800 |
CPU min MHz | 400 |
Tests | | Geekbench 5.4.5 Linux in ChromeOS Shell | Geekbench 5.4.5 Linux in Crostini VM | Geekbench 5.4.5 Android APK in ARCVM | Delta ChromeOS Shell vs Crostini VM | Delta ChromeOS Shell vs ARCVM |
---|---|---|---|---|---|---|
Single-Core | Score | 1688 | 1622 | 1456 | -4% | -14% |
Crypto | Score | 3330 | 3219 | 3129 | -3% | -6% |
Integer | Score | 1478 | 1452 | 1260 | -2% | -15% |
Floating Point | Score | 1869 | 1723 | 1602 | -8% | -14% |
Multi-Core | Score | 7478 | 7243 | 6056 | -3% | -19% |
Crypto | Score | 10677 | 10226 | 11101 | -4% | 4% |
Integer | Score | 7184 | 6955 | 5549 | -3% | -23% |
Floating Point | Score | 7582 | 7369 | 6313 | -3% | -17% |
Memory Usage by Geekbench 5.4.5 | | 1.35 GB | 1.58 GB | 0.45 GB | | |
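The delta columns in table 1 appear to be the signed percentage change relative to the ChromeOS shell score. This is inferred from the numbers rather than stated in the source, so the helper below is a minimal sketch under that assumption.

```python
def percent_delta(baseline: float, other: float) -> int:
    """Signed change of `other` relative to `baseline`, in whole percent."""
    return round(100 * (other - baseline) / baseline)


# Overall scores from table 1.
chromeos_shell = {"single_core": 1688, "multi_core": 7478}
crostini_vm = {"single_core": 1622, "multi_core": 7243}
arcvm = {"single_core": 1456, "multi_core": 6056}

for test, baseline in chromeos_shell.items():
    print(
        test,
        percent_delta(baseline, crostini_vm[test]),  # ChromeOS shell vs Crostini VM
        percent_delta(baseline, arcvm[test]),        # ChromeOS shell vs ARCVM
    )
```

A negative value means the environment scores lower than the ChromeOS shell; the only positive ARCVM delta in table 1 is the multi-core Crypto section.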
Further analysis identifies a notable disparity in the performance of subtests (HTML5, SQLite*, Clang*) that are influenced by memory allocation and access. Table 2 shows a detailed comparison of subtest scores.
Table 2. Geekbench Subtest Score Comparison between Different Operating Environments
Tests | | Geekbench 5.4.5 Linux in ChromeOS Shell | Geekbench 5.4.5 Linux in Crostini VM | Geekbench 5.4.5 Android APK in ARCVM | Delta ChromeOS Shell vs Crostini VM | Delta ChromeOS Shell vs ARCVM |
---|---|---|---|---|---|---|
Single-Core | | | | | | |
AES-XTS | Score | 3330 | 3219 | 3129 | -3% | -6% |
Text Compression | Score | 1226 | 1208 | 1169 | -1% | -5% |
Image Compression | Score | 1604 | 1670 | 1340 | 4% | -16% |
Navigation | Score | 975 | 1008 | 933 | 3% | -4% |
HTML5 | Score | 1797 | 1732 | 1256 | -4% | -30% |
SQLite | Score | 1653 | 1613 | 1265 | -2% | -23% |
PDF Rendering | Score | 1532 | 1581 | 1328 | 3% | -13% |
Text Rendering | Score | 1469 | 1372 | 1325 | -7% | -10% |
Clang | Score | 1735 | 1640 | 1446 | -5% | -17% |
Camera | Score | 1513 | 1421 | 1357 | -6% | -10% |
N-Body Physics | Score | 1865 | 1727 | 1786 | -7% | -4% |
Rigid Body Physics | Score | 1906 | 1801 | 1535 | -6% | -19% |
Gaussian Blur | Score | 1391 | 1296 | 1411 | -7% | 1% |
Face Detection | Score | 2024 | 1732 | 1686 | -14% | -17% |
Horizon Detection | Score | 1526 | 1374 | 982 | -10% | -36% |
Image Inpainting | Score | 3035 | 2853 | 2691 | -6% | -11% |
HDR | Score | 3138 | 3026 | 2490 | -4% | -21% |
Ray Tracing | Score | 2485 | 2044 | 2158 | -18% | -13% |
Structure from Motion | Score | 1585 | 1589 | 1606 | 0% | 1% |
Speech Recognition | Score | 1345 | 1280 | 1137 | -5% | -15% |
Machine Learning | Score | 1263 | 1157 | 1055 | -8% | -16% |
Multi-Core | | | | | | |
AES-XTS | Score | 10677 | 10226 | 11101 | -4% | 4% |
Text Compression | Score | 5168 | 5552 | 4600 | 7% | -11% |
Image Compression | Score | 7436 | 7418 | 6234 | 0% | -16% |
Navigation | Score | 4729 | 4482 | 4173 | -5% | -12% |
HTML5 | Score | 10343 | 9608 | 6958 | -7% | -33% |
SQLite | Score | 9362 | 8743 | 6055 | -7% | -35% |
PDF Rendering | Score | 7458 | 8110 | 5236 | 9% | -30% |
Text Rendering | Score | 8352 | 6502 | 7709 | -22% | -8% |
Clang | Score | 7533 | 7298 | 5531 | -3% | -27% |
Camera | Score | 6174 | 6385 | 4430 | 3% | -28% |
As shown in figure 2, the memory logs confirm that the Linux version of Geekbench receives the full memory allocation it requests, whereas the Android version does not.
Figure 2. Memory use
The Android version of Geekbench uses significantly less memory: only 0.45 GB, compared with the 1.35 GB used by the Linux version in the ChromeOS shell, indicating a lower memory allocation for Android. The Linux version employs the GNU C library (glibc/libm) and its standard malloc for dynamic memory allocation, while the Android version relies on the Bionic C library (libc) and jemalloc.
The jemalloc Memory Allocator
The jemalloc [5] memory allocator is a general-purpose malloc implementation that emphasizes fragmentation avoidance and scalable concurrency support. In this environment, that emphasis comes at a cost: jemalloc lessens memory fragmentation but also reduces performance. How much performance degradation is acceptable is a decision for Google.
The jemalloc memory allocator exposes several tunable parameters whose defaults are conservative and may not suit many typical workloads. Calibrating jemalloc for a particular application or workload can often improve system-level performance by a few percent or enable beneficial trade-offs. Optimal tuning requires collecting jeprof heap profiles inside the Android operating system. However, ChromeOS currently lacks the infrastructure for deploying Android OS images.
Memory Allocation Dynamics
- Android operates within a virtual machine environment, with memory management handled by crosvm, which uses a ballooning policy.
- The ballooning policy is informed by the Low Memory Killer Daemon (LMKD) within Android to guide memory allocation.
- In contrast, the Linux version of Geekbench runs directly, adhering to a standard memory allocation policy that grants requested memory without modification.
User Interface and Command-Line Application Resource Use
- The command-line interface (CLI) version of Geekbench for Linux is more memory-efficient than the Android APK, which includes a graphical user interface (GUI).
- The CLI version does not engage graphics modules, avoiding the additional overhead associated with the Android graphics pipeline.
- The Java UI threads in the APK introduce further overhead. See figure 3.
Benchmark Runtime Comparison
Geekbench for Linux, run from the CLI, completes in 175 seconds, whereas the Geekbench APK takes 200 seconds, about 14% longer.
Java Native Interface (JNI) Calls
JNI calls between Java code and native code are generally slower than direct native function calls. See figure 3.
Figure 3. JNI calls
Given the factors that limit Geekbench scores, the optimization strategies focus on narrowing the gap between the Linux version running directly in ChromeOS and the Android version running in a virtual machine (VM).
Optimization Strategy 1: Fine-Tune jemalloc Parameters
Continued investigation into fine-tuning jemalloc parameters yielded substantial performance improvements in our experiments, which ran Geekbench workloads with modified jemalloc settings. The jemalloc dirty_decay_ms configuration parameter, in particular, has a pronounced impact on the performance of the malloc-intensive Geekbench multicore (MC) workload.
Table 3 shows how the Geekbench MC score changes with the dirty_decay_ms setting in a virtual machine environment: it rises steadily from 2748 at the default setting to 3019 at 30 seconds, a gain of about 10%.
Table 3. Geekbench Score Comparison with dirty_decay_ms Parameter Settings
dirty_decay_ms_value | Base (default) | 5 Sec | 15 Sec | 20 Sec | 30 Sec |
---|---|---|---|---|---|
Intel® Core™ i3 N Processor based Chromebook - GeekBench MC Score | 2748 | 2843 | 2890 | 2923 | 3019 |
The following elements contribute to the observed performance improvements:
- An approximate 28% decrease in CPU cycles per second
- A decrease in the number of vm_exits caused by EPT_VIOLATION
- A reduction in the overhead associated with memory ballooning
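One way to see why a longer dirty_decay_ms could reduce these costs is a toy model of page reuse. The sketch below is a deliberate simplification and not jemalloc's actual algorithm (jemalloc purges dirty pages gradually along a decay curve, not at a hard cutoff). The burst interval, page counts, and the 1-second comparison value are hypothetical numbers chosen only for illustration. Each burst allocates a fixed number of pages and frees them immediately; freed pages stay available for reuse until the decay time expires, after which new pages must be requested from the system again.

```python
def count_fresh_page_requests(
    decay_ms: int, burst_interval_ms: int, pages_per_burst: int, bursts: int
) -> int:
    """Count pages that must be newly obtained from the system in a toy workload."""
    dirty_expiry = []  # expiry timestamps of freed pages still held by the allocator
    fresh = 0
    now = 0
    for _ in range(bursts):
        # Drop freed pages whose decay time has elapsed (they were returned to the system).
        dirty_expiry = [t for t in dirty_expiry if t > now]
        # Reuse retained pages first; request the remainder from the system.
        reused = min(len(dirty_expiry), pages_per_burst)
        fresh += pages_per_burst - reused
        # The burst frees all of its pages right away; they become dirty pages.
        dirty_expiry = [now + decay_ms] * pages_per_burst
        now += burst_interval_ms
    return fresh


# Hypothetical workload: 10 bursts of 100 pages, 5 seconds apart.
for decay_seconds in (1, 30):
    requests = count_fresh_page_requests(decay_seconds * 1000, 5000, 100, 10)
    print(f"dirty_decay_ms={decay_seconds * 1000}: {requests} fresh page requests")
```

In this model, a decay time shorter than the gap between bursts forces every burst to request fresh pages, while a longer decay time lets later bursts reuse pages that are already mapped. In a VM, fewer fresh pages plausibly means fewer first-touch faults that exit to the host, which is consistent with the reduced EPT_VIOLATION exits listed above.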
Moreover, configuring the narenas parameter to align with the available number of CPUs (rather than using the fixed value of 12) has resulted in further gains. Geekbench MC scores on Chromebook platforms based on Intel® Core™ i3 N processors experienced an additional increase of 2% to 3%.
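For reference, upstream jemalloc accepts options as comma-separated `name:value` pairs, for example through the `MALLOC_CONF` environment variable. The source does not describe how these settings are applied to the Android jemalloc build inside ARCVM, so the helper below (a hypothetical name) is only a sketch of the two values discussed here.

```python
import os
from typing import Optional


def build_malloc_conf(dirty_decay_seconds: int, cpu_count: Optional[int] = None) -> str:
    """Build a jemalloc option string that sets dirty_decay_ms and narenas.

    narenas defaults to the number of CPUs visible to this process, mirroring
    the idea of matching arenas to available CPUs instead of a fixed value.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    return f"dirty_decay_ms:{dirty_decay_seconds * 1000},narenas:{cpu_count}"


# Example values only: a 30-second decay time and an 8-CPU system.
print(build_malloc_conf(30, 8))
```

Deriving narenas from the CPU count at run time keeps the setting appropriate across platforms with different core counts, which a fixed value cannot do.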
Optimization Strategy 2: Responsive Ballooning (VM Memory Management Service)
Implementing the responsive ballooning feature successfully eliminates the overhead traditionally associated with limited cache memory ballooning. This leads to performance gains of approximately 8% to 10%.
This significant enhancement refines the existing cache ballooning policy of ARCVM by restructuring it to incorporate a broader range of metrics. The new design focuses on managing memory distribution more effectively between the guest (Android VM) and the host (ChromeOS). This ensures a more dynamic and efficient allocation of resources, responding adeptly to the needs of both environments.
Results from Fine-Tuning and Responsive Ballooning
Optimization Strategy 1: Fine-Tune jemalloc Parameters
Modifying the jemalloc dirty_decay_ms parameter to a 30-second interval resulted in performance gains across a diverse set of workloads on multiple ChromeOS platforms. The following table details the observed performance gains.
Table 4. Performance results with optimal (30x) dirty_decay_ms value
Platform | Workloads and Gains |
---|---|
Intel® Core™ i3 N Processor based Chromebook | |
12th Generation Intel® Core™ i7 Processors based Chromebook | |
13th Generation Intel® Core™ i7 Processors based Chromebook | |
Optimization Strategy 2: Responsive Ballooning (VM Memory Management Service)
Tables 5 through 8 compare Geekbench 5.4.5 scores for the Android versions (run in a VM environment) with their Linux counterparts on different platforms. With responsive ballooning enabled, the gap between the Android CLI and Linux CLI versions of Geekbench narrows to roughly 10% (between 6% and 13%, depending on the platform).
Table 5. Geekbench Scores Using Chromebooks Based on 11th Generation Intel® Core™ i7 Processors
Benchmark | 11th Generation Intel® Core™ i7 Processors based Chromebook | | | | |
---|---|---|---|---|---|
| Android APK Version | Android CLI Version | Linux CLI Version in ChromeOS | Android APK vs CLI | Android CLI vs Linux CLI |
Geekbench 5.4.5 | 1385 | 1473 | 1658 | -6% | -13% |
Geekbench 5.4.5 | 5161 | 5246 | 5872 | -2% | -12% |
Table 6. Geekbench Scores Using Chromebooks Based on Intel Core i3 N Processors
Benchmark | Intel® Core™ i3 N Processor based Chromebook | | | | |
---|---|---|---|---|---|
| Android APK Version | Android CLI Version | Linux CLI Version in ChromeOS | Android APK vs CLI | Android CLI vs Linux CLI |
Geekbench 5.4.5 | 1048 | 1087 | 1174 | -4% | -8% |
Geekbench 5.4.5 | 4926 | 5087 | 5411 | -3% | -6% |
Table 7. Geekbench Scores Using Chromebooks based on 12th Generation Intel Core i7 Processors
Benchmark | 12th Generation Intel® Core™ i7 Processor based Chromebook | | | | |
---|---|---|---|---|---|
| Android APK | Android CLI | Chrome Native CLI | Android APK vs CLI | Android CLI vs Native CLI |
Geekbench 5.4.5 | 1299 | 1399 | 1534 | -8% | -10% |
Geekbench 5.4.5 | 6375 | 6471 | 7000 | -2% | -8% |
Table 8. GeekBench Scores Using Chromebooks based on 13th Generation Intel Core i7 Processors
Benchmark | 13th Generation Intel® Core™ i7 Processor based Chromebook | | | | |
---|---|---|---|---|---|
| Android APK | Android CLI | Chrome Native CLI | Android APK vs CLI | Android CLI vs Native CLI |
Geekbench 5.4.5 | 1783 | 1896 | 2069 | -6% | -9% |
Geekbench 5.4.5 | 6617 | 6684 | 7325 | -1% | -10% |
Conclusion and Next Steps
The analysis of Geekbench workloads and their operating environments reveals distinct differences between the built-in Linux version and the Android version running in a VM environment. These differences account for the variations in benchmark scores. The optimizations described here reduce the gap but do not close it: certain constraints inherent in the VM software stack cannot be removed, which prevents the two versions from reaching equal performance. These insights nevertheless point to future work, and ongoing research may deliver further performance improvements in virtualized environments.
Test Configuration
Platform: Nissa Chromebook notebook
Software: Google ChromeOS CPFE R124-15810.0.0
Hardware: Intel Core i3 N305 processor, 8 GB RAM