AN 802: Intel® Stratix® 10 SoC Device Design Guidelines

ID 683117
Date 8/05/2021
Public

3.1.5. Interface Bandwidths

To choose the right interface for moving data between the HPS and FPGA fabric, you need to understand the bandwidth of each interface. The figure below illustrates the peak throughput available between the HPS and FPGA fabric, as well as the internal bandwidths within the HPS. The example shown assumes that the FPGA fabric operates at 400 MHz, the MPU operates at 1200 MHz, and the 64-bit external SDRAM operates at 1067 MHz.
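As a rough illustration of how peak figures like these are derived, the sketch below multiplies bus width by clock rate. Only the 400 MHz, 64-bit, and 1067 MHz values come from the text above. The 128-bit FPGA-facing port width and the double-data-rate factor for the SDRAM are assumptions made for the example, and the function name is hypothetical.

```python
def peak_bandwidth_gbps(width_bits: int, clock_mhz: float,
                        transfers_per_clock: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s (10^9 bytes per second).

    This ignores protocol overhead, arbitration, and refresh, so real
    throughput is lower than the value returned here.
    """
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 * transfers_per_clock / 1e9


# FPGA-facing port: 400 MHz is from the text; the 128-bit width is assumed.
fpga_port = peak_bandwidth_gbps(128, 400)

# External SDRAM: 64-bit at 1067 MHz is from the text; two transfers per
# clock (double data rate) is assumed.
sdram = peak_bandwidth_gbps(64, 1067, transfers_per_clock=2)

print(f"FPGA-facing port peak: {fpga_port:.1f} GB/s")
print(f"External SDRAM peak:   {sdram:.1f} GB/s")
```

Under these assumptions a single FPGA-facing port peaks at 6.4 GB/s and the external SDRAM at roughly 17.1 GB/s, which suggests why more than one port may be needed to approach the SDRAM limit.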

Figure 12. Stratix 10 HPS Memory Mapped Bandwidth

For abbreviations, refer to the figure in Overview of HPS Memory-Mapped Interfaces.

Relative Latencies and Throughputs for Each HPS Interface

| Interface | Transaction Use Case | Latency | Throughput |
|---|---|---|---|
| HPS-to-FPGA | MPU accessing memory in FPGA | Medium | Medium |
| HPS-to-FPGA | MPU accessing peripheral in FPGA | Medium | Very Low |
| Lightweight HPS-to-FPGA | MPU accessing register in FPGA | Low | Low |
| Lightweight HPS-to-FPGA | MPU accessing memory in FPGA | Low | Very Low |
| FPGA-to-HPS | FPGA master accessing non-cache coherent SDRAM | High | Medium |
| FPGA-to-HPS | FPGA master accessing HPS on-chip RAM | Low | High |
| FPGA-to-HPS | FPGA master accessing HPS peripheral | Low | Low |
| FPGA-to-HPS | FPGA master accessing coherent memory resulting in cache miss | High | Medium |
| FPGA-to-HPS | FPGA master accessing coherent memory resulting in cache hit | Low | Medium-High |
| FPGA-to-SDRAM | FPGA master accessing SDRAM through single FPGA-to-SDRAM port | Medium | High |
| FPGA-to-SDRAM | FPGA masters accessing SDRAM through multiple FPGA-to-SDRAM ports | Medium | Very High |
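The table lends itself to a simple lookup when comparing candidate interfaces for a use case. The following is a minimal sketch that encodes the rows as data and ranks matches by relative throughput. The names `Row`, `TABLE`, `LEVELS`, and `best_throughput` are hypothetical, and the ordering of the qualitative levels (with "Medium-High" between "Medium" and "High") is an assumption.

```python
from typing import List, NamedTuple

# Assumed ordering of the qualitative levels, lowest to highest.
LEVELS = ["Very Low", "Low", "Medium", "Medium-High", "High", "Very High"]


class Row(NamedTuple):
    interface: str
    use_case: str
    latency: str
    throughput: str


TABLE = [
    Row("HPS-to-FPGA", "MPU accessing memory in FPGA", "Medium", "Medium"),
    Row("HPS-to-FPGA", "MPU accessing peripheral in FPGA", "Medium", "Very Low"),
    Row("Lightweight HPS-to-FPGA", "MPU accessing register in FPGA", "Low", "Low"),
    Row("Lightweight HPS-to-FPGA", "MPU accessing memory in FPGA", "Low", "Very Low"),
    Row("FPGA-to-HPS", "FPGA master accessing non-cache coherent SDRAM", "High", "Medium"),
    Row("FPGA-to-HPS", "FPGA master accessing HPS on-chip RAM", "Low", "High"),
    Row("FPGA-to-HPS", "FPGA master accessing HPS peripheral", "Low", "Low"),
    Row("FPGA-to-HPS", "FPGA master accessing coherent memory resulting in cache miss", "High", "Medium"),
    Row("FPGA-to-HPS", "FPGA master accessing coherent memory resulting in cache hit", "Low", "Medium-High"),
    Row("FPGA-to-SDRAM", "FPGA master accessing SDRAM through single FPGA-to-SDRAM port", "Medium", "High"),
    Row("FPGA-to-SDRAM", "FPGA masters accessing SDRAM through multiple FPGA-to-SDRAM ports", "Medium", "Very High"),
]


def best_throughput(keyword: str) -> List[Row]:
    """Return rows whose use case mentions keyword, highest throughput first."""
    matches = [r for r in TABLE if keyword.lower() in r.use_case.lower()]
    return sorted(matches, key=lambda r: LEVELS.index(r.throughput), reverse=True)


for row in best_throughput("SDRAM"):
    print(f"{row.interface:15} {row.throughput:10} {row.use_case}")
```

For the keyword "SDRAM", the FPGA-to-SDRAM rows sort ahead of the FPGA-to-HPS row, which matches the ranking in the table.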

Note: For the interfaces with no configuration recommended, refer to the corresponding interface sections: "HPS-to-FPGA Bridge", "Lightweight HPS-to-FPGA Bridge", and "FPGA-to-HPS Bridge".

GUIDELINE: Avoid using the HPS-to-FPGA bridge to access peripheral registers in the FPGA from the MPU.

The HPS-to-FPGA bridge is optimized for bursting traffic, whereas peripheral accesses are typically short, word-sized accesses of only one beat. As a result, if peripherals are accessed through the HPS-to-FPGA bridge, the transaction can be stalled by other bursting traffic that is already in flight.

GUIDELINE: Avoid using the lightweight HPS-to-FPGA bridge to access memory in the FPGA from the MPU.

The lightweight HPS-to-FPGA bridge is optimized for non-bursting traffic, whereas memory accesses are typically performed as bursts (often 32 bytes due to cache operations). As a result, if memory is accessed through the lightweight HPS-to-FPGA bridge, the throughput is limited.
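The difference between the two access patterns can be sketched with simple arithmetic on data beats. The 32-byte transfer size comes from the text above. The bus widths used here (32 bits for the lightweight bridge, 128 bits for the HPS-to-FPGA bridge) are assumptions for illustration, and the helper name is hypothetical. The sketch counts data beats only and does not model arbitration or bridge internals.

```python
import math


def beats_per_transfer(payload_bytes: int, bus_width_bits: int) -> int:
    """Number of data beats needed to move payload_bytes over a bus."""
    bytes_per_beat = bus_width_bits // 8
    return math.ceil(payload_bytes / bytes_per_beat)


CACHE_LINE_BYTES = 32   # typical burst size mentioned in the text
WORD_BYTES = 4          # a single word-sized register access

# Assumed widths: 32-bit lightweight bridge, 128-bit HPS-to-FPGA bridge.
print("cache line over 32-bit bus: ", beats_per_transfer(CACHE_LINE_BYTES, 32))
print("cache line over 128-bit bus:", beats_per_transfer(CACHE_LINE_BYTES, 128))
print("register word over 128-bit bus:", beats_per_transfer(WORD_BYTES, 128))
```

Under these assumptions, a 32-byte memory access takes four times as many beats on the narrow bus, while a register access is a single beat on either bus and gains nothing from the wider, burst-oriented bridge.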

GUIDELINE: Avoid using the FPGA-to-HPS bridge to access non-cache coherent SDRAM from masters in the FPGA.

The FPGA-to-HPS bridge is optimized for non-SDRAM accesses (peripherals, on-chip RAM). As a result, accessing SDRAM directly through this bridge with non-coherent accesses increases the latency and limits the throughput compared to accesses through the FPGA-to-SDRAM ports.

GUIDELINE: Use soft logic in the FPGA (for example, a DMA controller) to move shared data between the HPS and FPGA. Avoid using the MPU and the HPS DMA controller for this use case.

When moving shared data between the HPS and FPGA, Intel® recommends doing so from the FPGA rather than using the MPU or the HPS DMA controller. If the FPGA must access cache-coherent data, it must go through the FPGA-to-HPS bridge and use the appropriate ACE-Lite cache extension signaling to issue a cacheable transaction. If non-cache-coherent data must be moved to the FPGA or HPS, a DMA engine implemented in FPGA logic can move the data through one of the FPGA-to-SDRAM bridge ports, achieving the highest possible throughput. Although the HPS includes an internal DMA engine that can move data between the HPS and FPGA, its purpose is to assist peripherals that do not master memory, or to provide memory-to-memory data movement on behalf of the MPU.
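The four guidelines in this section can be summarized as a small selection helper. This is only a sketch of the decision logic as described here. The function names, the `PathChoice` type, and the target strings are hypothetical and are not part of any Intel tool or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathChoice:
    bridge: str
    cacheable_signaling: bool  # True if ACE-Lite cacheable attributes are needed


def choose_mpu_path(target: str) -> PathChoice:
    """Pick a bridge for MPU-initiated accesses into the FPGA."""
    if target == "register":      # short, single-beat peripheral accesses
        return PathChoice("Lightweight HPS-to-FPGA", False)
    if target == "memory":        # bursting accesses
        return PathChoice("HPS-to-FPGA", False)
    raise ValueError(f"unknown MPU target: {target!r}")


def choose_fpga_master_path(target: str, coherent: bool = False) -> PathChoice:
    """Pick a path for an FPGA master (for example, a soft DMA controller)."""
    if target == "sdram":
        if coherent:
            # Coherent data goes through the FPGA-to-HPS bridge as a
            # cacheable transaction.
            return PathChoice("FPGA-to-HPS", True)
        # Non-coherent SDRAM traffic uses the FPGA-to-SDRAM ports.
        return PathChoice("FPGA-to-SDRAM", False)
    if target in ("on-chip-ram", "peripheral"):
        return PathChoice("FPGA-to-HPS", False)
    raise ValueError(f"unknown FPGA master target: {target!r}")


print(choose_mpu_path("register"))
print(choose_fpga_master_path("sdram"))
print(choose_fpga_master_path("sdram", coherent=True))
```

The helper deliberately raises an error for unknown targets rather than guessing, since the guidelines only cover the use cases listed in the table.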