Variable Precision DSP Blocks User Guide: Agilex™ 5 FPGAs and SoCs

ID 813968
Date 9/20/2024
Public

3.3.5. Tensor Accumulation Mode

In tensor accumulation mode, two 32-bit floating-point values (one for each column) are fed in through ports fp32_a{1..2} and are added to or subtracted with the 32-bit floating-point accumulator operand. For each column, that operand is either cascade_data_in_col_{1..2} or the previous cycle’s accumulation value, as selected by the dynamic inputs acc_en and zero_en. When zero_en is asserted, the input value is passed to the output unchanged.

Whether the accumulator adds or subtracts is an IP configuration option.

The two 32-bit floating-point results are sent out through fp32_col_{1..2}[31:0] and can be cascaded to the next DSP block through cascade_data_out_col_{1..2}[31:0].

The DOT engine is bypassed in this mode.

Table 26.  Tensor Accumulation Mode Equations
| zero_en | acc_en | fp32_col_1[31:0] | fp32_col_2[31:0] |
|---|---|---|---|
| 0 | 0 | fp32_a1[31:0] +/- cascade_data_in_col_1[31:0] | fp32_a2[31:0] +/- cascade_data_in_col_2[31:0] |
| 0 | 1 | fp32_a1[31:0] +/- fp32_col_1[31:0] | fp32_a2[31:0] +/- fp32_col_2[31:0] |
| 1 | NA | fp32_a1[31:0] | fp32_a2[31:0] |
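
The equations in Table 26 can be illustrated with a small behavioral sketch of one column. This is not Intel-provided code or an RTL model. The function names (to_fp32, tensor_accumulate_step) and the subtract parameter, which stands in for the IP add/subtract configuration option, are hypothetical. The sketch assumes round-to-nearest-even, assumes that subtraction means fp32_a minus the selected operand, and ignores pipeline latency, subnormal handling, and exception flags.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value.

    Note: struct raises OverflowError for finite values outside the
    binary32 range; this sketch does not model overflow to infinity.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


def tensor_accumulate_step(fp32_a, cascade_data_in, prev_fp32_col,
                           zero_en, acc_en, subtract=False):
    """Return the next fp32_col value for one column (hypothetical model).

    zero_en = 1          -> fp32_a passes through (acc_en is ignored)
    zero_en = 0, acc_en=0 -> fp32_a +/- cascade_data_in
    zero_en = 0, acc_en=1 -> fp32_a +/- previous fp32_col value
    """
    a = to_fp32(fp32_a)
    if zero_en:
        return a
    # Select the second operand according to acc_en.
    operand = to_fp32(prev_fp32_col if acc_en else cascade_data_in)
    # 'subtract' models the static IP configuration option (assumption:
    # the result is fp32_a - operand when subtraction is selected).
    result = a - operand if subtract else a + operand
    return to_fp32(result)


# Usage: load, accumulate, then add the cascade input instead.
col = 0.0
col = tensor_accumulate_step(1.5, 0.0, col, zero_en=1, acc_en=0)    # load 1.5
col = tensor_accumulate_step(2.25, 0.0, col, zero_en=0, acc_en=1)   # 2.25 + 1.5
col = tensor_accumulate_step(0.5, 10.0, col, zero_en=0, acc_en=0)   # 0.5 + 10.0
print(col)
```

Asserting zero_en on the first cycle loads a known starting value, so the following acc_en = 1 cycles accumulate from that value rather than from stale accumulator contents.
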
Figure 60. Tensor Accumulation Mode One Column Datapath