Agilex™ 5 ES Device Errata and User Guidelines

ID 825514
Date 12/09/2024
Public

3.1.9. 1401736: Non-cacheable loads of mismatched size might not be single-copy atomic

Description

The Cortex*-A55 core supports single-copy atomic load and store accesses as described in the Arm architecture. However, in some unusual code sequences, a core that executes a store and later a load to the same address with a different access size might, because of this erratum, load data that does not meet the requirements of a single-copy atomic load.

Conditions

On one core, the following condition must occur:

  • A store instruction is executed. This could be any halfword, word, or doubleword store instruction that is not a store release, and the address must be aligned to the access size. The store must be to Normal Non-cacheable, Normal Write-Through, or Device-GRE memory.

On a second core, which can be within the same cluster or in a different cluster, the following conditions must occur:

  1. A store instruction is executed. This store must be a smaller access size than the store on the first core, and must be to an address within the bytes accessed by the first core. The address must also be aligned to the access size.
  2. A load instruction is executed. The load must be a larger access size than the store from the same core, and at least some bytes of the load must be to the same address as the store. The address must also be aligned to the access size.

The Arm architecture requires that the load is single-copy atomic. Under the conditions described, however, the load might observe a combination of the two stores, indicating that the store on the first core was serialized first. If the load is then repeated, the second load might observe only the data from the store on the first core, indicating that the store on the first core was serialized second.
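
The access pattern in the conditions above can be sketched in C as follows. This is only an illustration under assumptions: the type `shared_cell_t`, the variable `cell`, and both function names are hypothetical, and the sketch assumes the compiler emits one store or load instruction of the matching size for each aligned volatile access, which is typical on AArch64 but not guaranteed by the C standard. The sketch does not set up the memory attributes, so it shows the shape of the sequence rather than a reproducer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared location. For the erratum to apply, it would have to be
 * mapped as Normal Non-cacheable, Normal Write-Through, or Device-GRE memory,
 * which this sketch does not configure. */
typedef union {
    volatile uint32_t word;      /* 32-bit view, naturally aligned */
    volatile uint8_t  bytes[4];  /* 8-bit view of the same four bytes */
} shared_cell_t;

static shared_cell_t cell;

/* First core: an aligned word store that is not a store release. */
void core1_store_word(uint32_t value)
{
    cell.word = value;
}

/* Second core: a smaller store inside the word written by the first core,
 * followed by a larger load that covers the byte just stored. */
uint32_t core2_store_byte_then_load_word(size_t index, uint8_t value)
{
    assert(index < 4u);          /* the byte must lie within the word */
    cell.bytes[index] = value;   /* condition 1: byte store within the word */
    return cell.word;            /* condition 2: overlapping word load */
}
```

When the two functions run concurrently on different cores against affected memory, the word returned on the second core is the load that might not be single-copy atomic.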

Impact

Concurrent, unordered stores are not common in multi-threaded code. In the C11 standard, they are restricted to the family of "relaxed" atomics. In addition, using different size load and store instructions to access the same data is unusual. Therefore, the majority of multi-threaded software is not expected to meet the conditions for this erratum.

Workaround

Most multi-threaded software is not expected to meet the conditions for this erratum and therefore does not require a workaround. If a workaround is required, then the store on the second core should be replaced with a store release instruction.
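
As a rough sketch of the workaround, the smaller store on the second core can be written as a release store. The example below uses the GCC and Clang `__atomic` builtins, which is an assumption about the toolchain; the names `wa_cell_t`, `wa_cell`, and both functions are hypothetical. With these compilers, a release store to a byte on AArch64 is typically emitted as a store release instruction such as STLRB, but the generated code should be checked in the disassembly rather than assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared location with word and byte views of the same data.
 * The memory attributes are not configured by this sketch. */
typedef union {
    uint32_t word;
    uint8_t  bytes[4];
} wa_cell_t;

static wa_cell_t wa_cell;

/* First core: unchanged, an aligned word store with no release semantics. */
void wa_core1_store_word(uint32_t value)
{
    __atomic_store_n(&wa_cell.word, value, __ATOMIC_RELAXED);
}

/* Second core: the smaller store is now a release store, followed by the
 * same overlapping word load as before. */
uint32_t wa_core2_store_byte_then_load_word(size_t index, uint8_t value)
{
    assert(index < 4u);  /* the byte must lie within the word */
    __atomic_store_n(&wa_cell.bytes[index], value, __ATOMIC_RELEASE);
    return __atomic_load_n(&wa_cell.word, __ATOMIC_RELAXED);
}
```

Only the store on the second core changes; the store on the first core and the larger load keep their original form.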

Category

Category C