Variable Precision DSP Blocks User Guide: Agilex™ 5 FPGAs and SoCs

ID 813968
Date 9/20/2024
Public

3.3.5. Tensor Accumulation Mode

In tensor accumulation mode, two 32-bit floating-point values (one per column) are fed in through ports fp32_a{1..2} and combined, by addition or subtraction, with a second operand in the 32-bit floating-point accumulator. The second operand is either cascade_data_in_col_{1..2} or the previous cycle’s accumulation value, as selected by the dynamic inputs acc_en and zero_en. When zero_en is asserted, the input value passes through unchanged (see Table 26).

Whether the accumulator adds or subtracts is an IP configuration option.

The two 32-bit floating-point results are sent out through fp32_col_{1..2}[31:0] and can be cascaded to the next DSP block through cascade_data_out_col_{1..2}[31:0].

The DOT engine is bypassed in this mode.

Table 26.  Tensor Accumulation Mode Equations
zero_en | acc_en | fp32_col_1[31:0]                               | fp32_col_2[31:0]
0       | 0      | fp32_a1[31:0] +/- cascade_data_in_col_1[31:0]  | fp32_a2[31:0] +/- cascade_data_in_col_2[31:0]
0       | 1      | fp32_a1[31:0] +/- fp32_col_1[31:0]             | fp32_a2[31:0] +/- fp32_col_2[31:0]
1       | NA     | fp32_a1[31:0]                                  | fp32_a2[31:0]
Figure 60. Tensor Accumulation Mode One Column Datapath
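
The equations in Table 26 can be sketched as a small behavioral model of one column. This is an illustrative software model only, not the IP's implementation: the function names, the subtract flag (standing in for the add/subtract IP configuration option), and the operand order for subtraction (fp32_a minus the second operand, as the table's "+/-" is written) are assumptions. It rounds to nearest single precision and makes no claim about the rounding or exception behavior of the hardware.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def tensor_accumulate_step(fp32_a, cascade_in, prev_out, zero_en, acc_en,
                           subtract=False):
    """Hypothetical one-column model of Table 26 (not the vendor's code).

    fp32_a     : value on fp32_a1 or fp32_a2 for this cycle
    cascade_in : value on cascade_data_in_col_1 or _2
    prev_out   : previous cycle's fp32_col_1 or _2 result
    subtract   : assumed stand-in for the add/subtract IP option
    """
    a = to_fp32(fp32_a)
    if zero_en:
        # zero_en = 1: acc_en is ignored and the input passes through.
        return a
    # zero_en = 0: acc_en selects the previous result or the cascade input.
    operand = to_fp32(prev_out if acc_en else cascade_in)
    # Assumed operand order for subtraction: fp32_a - operand.
    return to_fp32(a - operand if subtract else a + operand)


def accumulate(values, subtract=False):
    """Load the first value with zero_en = 1, then accumulate with acc_en = 1."""
    acc = 0.0
    for i, v in enumerate(values):
        acc = tensor_accumulate_step(v, 0.0, acc, zero_en=(i == 0), acc_en=1,
                                     subtract=subtract)
    return acc
```

In this model, asserting zero_en on the first cycle loads the accumulator without needing a separate clear, and holding acc_en high afterward produces a running sum; for example, accumulate([1.0, 2.0, 3.0]) returns 6.0. With acc_en low and zero_en low, the column instead combines its input with the value cascaded in from the previous DSP block.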