High Bandwidth Memory (HBM2) Interface FPGA IP User Guide

ID 683189
Date 3/29/2024
Public

6.5.7. ECC Error Correction and Detection

The HBM2 controller supports single-bit error correction and double-bit error detection. It does not support correction or detection of more than two error bits.

The ECC encoder and decoder blocks reside inside the UIB subsystem and perform ECC encoding and decoding without additional latency. The ECC logic performs the following operations:

  • When a single-bit error is detected, the error is corrected and passed to the AXI Interface.
  • When a double-bit error is detected, the error is reported on the axi_ruser_err_dbe signal of the AXI Read Data Channel Interface.
  • Error information is available through the APB interface.
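
The single-error-correct, double-error-detect behavior described above can be illustrated with a small software model. This guide does not state which ECC code the controller implements, so the sketch below assumes a textbook extended Hamming (72,64) code; the function names `encode` and `decode` and the status strings are hypothetical and do not correspond to any IP signal or register.

```python
# Illustrative SECDED model. ASSUMPTION: extended Hamming (72,64); the actual
# code used by the HBM2 controller is not specified in this guide.
K = 64      # data bits per ECC word
R = 7       # Hamming parity bits: 2**7 >= 64 + 7 + 1
N = K + R   # positions 1..71; position 0 holds the overall parity bit


def _is_pow2(x):
    return x & (x - 1) == 0


def encode(data):
    """Return a 72-entry list of bits for a 64-bit data word."""
    code = [0] * (N + 1)
    bit = 0
    for pos in range(1, N + 1):          # data goes in non-power-of-two slots
        if not _is_pow2(pos):
            code[pos] = (data >> bit) & 1
            bit += 1
    for i in range(R):                   # parity bit at position 2**i
        p = 1 << i
        for pos in range(1, N + 1):
            if pos & p and pos != p:
                code[p] ^= code[pos]
    for pos in range(1, N + 1):          # overall parity over positions 1..71
        code[0] ^= code[pos]
    return code


def decode(code):
    """Return (data, status) where status is 'ok', 'sbe', or 'dbe'."""
    code = list(code)
    syndrome = 0
    overall = code[0]
    for pos in range(1, N + 1):
        if code[pos]:
            syndrome ^= pos
            overall ^= 1
    if overall == 0 and syndrome == 0:
        status = "ok"
    elif overall == 1 and syndrome <= N:
        status = "sbe"                   # single-bit error: correct it
        if syndrome:                     # syndrome 0 means the overall parity bit flipped
            code[syndrome] ^= 1
    else:
        status = "dbe"                   # detected but not correctable
    data = 0
    bit = 0
    for pos in range(1, N + 1):
        if not _is_pow2(pos):
            data |= code[pos] << bit
            bit += 1
    return data, status
```

In this model, flipping any one bit of an encoded word still decodes to the original data with status `sbe`, while flipping any two bits yields status `dbe` and data that should not be trusted. Three or more flipped bits fall outside what the code can reliably classify, which matches the limitation stated at the start of this section.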

You can use the HBM2 controller's Read-Modify-Write feature to correct a single-bit error detected in the HBM2 DRAM; a double-bit error is detected but cannot be corrected by the ECC logic. Single-bit and double-bit errors are detected separately on each 64-bit HBM2 DQ word; consequently, multiple errors can be detected for the same read data. They are counted as follows:

  • Multiple single-bit errors are treated as a single single-bit error, and the single-bit error count increases by 1.
  • Multiple double-bit errors are treated as a single double-bit error, and the double-bit error count increases by 1.
  • A single-bit error and a double-bit error are treated as a double-bit error, because the double-bit error has higher priority. The double-bit error count increases by 1 and the single-bit error count does not change.
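
The counting rules above amount to a priority scheme, sketched below. The function name `update_error_counts`, the status strings, and the idea of passing one status per 64-bit word are assumptions made for illustration; they are not part of the IP's interface.

```python
def update_error_counts(word_statuses, sbe_count, dbe_count):
    """Apply the counting rules to the per-word ECC results of one read.

    word_statuses: iterable of 'ok', 'sbe', or 'dbe', one entry per 64-bit
    word (hypothetical representation).
    Returns the updated (sbe_count, dbe_count).
    """
    statuses = list(word_statuses)
    if "dbe" in statuses:
        # Double-bit errors have priority: count one DBE, leave SBE unchanged.
        dbe_count += 1
    elif "sbe" in statuses:
        # Any number of single-bit errors counts as one SBE.
        sbe_count += 1
    return sbe_count, dbe_count
```

For example, a read whose words report two single-bit errors increments only the single-bit count by 1, and a read that reports both a single-bit and a double-bit error increments only the double-bit count by 1.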

Read-Modify-Write

The Read-Modify-Write feature reads from the HBM2 DRAM, modifies the data, and writes back to the HBM2 memory. The HBM2 controller supports the following functions as part of the Read-Modify-Write process:

  • Dummy Writes – Corrects HBM2 DRAM data detected to have a single-bit error.
  • Partial Writes – Issues partial writes to HBM2 DRAM where not all bytes are enabled.

Dummy Writes

When the ECC decoder logic detects and corrects a single-bit error on the Read data, user logic may correct the corresponding bit in the HBM2 DRAM, using the Dummy Write process. A Dummy Write issues a Read from the memory location and writes the corrected Read data back to the corresponding memory location.

To request a Dummy Write, issue a regular AXI Write transaction with all byte enables deasserted, to the corresponding address.

The HBM2 controller handles the Read-Modify-Write operation internally and corrects the DRAM data without additional user intervention. The Read-Modify-Write operation follows this process:

  • The AXI Adaptor within the UIB subsystem detects that all byte enables are deasserted and identifies the transaction as a Read-Modify-Write request.
  • The HBM2 controller then issues a Read to the corresponding address in the HBM2 DRAM.
  • The ECC Decoded Read data is used as Write Data – this is the corrected Read data if a single-bit error was detected.
  • The HBM2 controller then issues an HBM2 Write transaction that writes the decoded Read data back to the same memory location.

Partial Writes

The HBM2 controller's Partial Write capability allows user logic to write only selected bytes of the HBM2 DRAM by issuing a Write in which not all byte enables are asserted. Partial writes are also supported for Pseudo Channel data widths of 256 and 288 bits when the controller tab parameters Write Data Mask Enable and Memory Channel ECC Generation and Checking/Correction are not selected. The Partial Write feature first issues a Read from the memory location, and then merges the ECC-decoded Read data with the new Write data before writing the result back to the memory location. (This process is necessary because Data Mask signals are not available to user logic when ECC is enabled.) The HBM2 controller performs this sequence internally in response to a standard AXI4 Write transaction.

To request a Partial Write, issue a regular AXI Write transaction in which some but not all byte enables are asserted (that is, not a full Write), together with the Write data to be written to the HBM2 DRAM.
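
How a write request is interpreted depends only on its byte enables, which can be summarized in a short sketch. The function name `classify_write` and its return strings are hypothetical, and the default of 32 byte enables assumes a 256-bit AXI data bus; adjust `n_bytes` for other widths.

```python
def classify_write(wstrb, n_bytes=32):
    """Classify an AXI write by its byte-enable (WSTRB) mask.

    wstrb:   integer bit mask, bit i set means byte lane i is enabled.
    n_bytes: number of byte lanes (ASSUMPTION: 32, for a 256-bit data bus).
    """
    full_mask = (1 << n_bytes) - 1
    if wstrb < 0 or wstrb & ~full_mask:
        raise ValueError("wstrb has bits set outside the data bus width")
    if wstrb == 0:
        return "dummy"      # all byte enables deasserted: Dummy Write
    if wstrb == full_mask:
        return "full"       # every byte enabled: ordinary full Write
    return "partial"        # some but not all bytes enabled: Partial Write
```

Both the `dummy` and `partial` cases lead to a Read-Modify-Write sequence inside the controller; only the `full` case can be written directly.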

The HBM2 controller handles the issuance of the corresponding memory transactions necessary to complete the Partial Write:

  • The AXI Adaptor within the UIB Subsystem decodes the byte enables and identifies the Partial Write request.
  • The HBM2 controller then issues a Read to the corresponding address in the HBM2 DRAM.
  • The ECC Decoded Read data is merged with the requested Write data based on the byte enables.
  • The HBM2 controller then issues an HBM2 Write transaction that writes the merged data, together with the updated ECC code, to the memory.
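
The merge step in the sequence above can be sketched as a byte-wise selection between the decoded Read data and the requested Write data. This is a behavioral illustration only; the function name `merge_partial_write` and the use of Python `bytes` for data beats are assumptions, and the ECC re-encoding of the merged result is not modeled.

```python
def merge_partial_write(read_data: bytes, write_data: bytes, wstrb: int) -> bytes:
    """Merge write data into read data according to the byte enables.

    Byte i of the result comes from write_data when bit i of wstrb is set,
    otherwise from read_data (the ECC-decoded contents already in memory).
    """
    if len(read_data) != len(write_data):
        raise ValueError("read and write data must have the same length")
    return bytes(
        w if (wstrb >> i) & 1 else r
        for i, (r, w) in enumerate(zip(read_data, write_data))
    )
```

With `wstrb` equal to 0, the function returns the Read data unchanged, which corresponds to the Dummy Write case: the decoded (and, if necessary, corrected) data is simply written back.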