Variable Precision DSP Blocks User Guide: Agilex™ 5 FPGAs and SoCs

ID 813968
Date 9/20/2024
Public

3.3.4. Tensor Fixed-point Mode

In tensor fixed-point mode, you can preload two columns of 80-bit weights (ten 8-bit weights per column) into the ping-pong buffers using one of the following methods:
  • Data input feed
  • Side input feed

For each column, a signed 20-bit fixed-point DOT product is calculated from the preloaded weights and the data_in_{1..10} inputs. Each DOT product is the sum of 10 signed 8x8 multiplications.
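The arithmetic behind the 20-bit result can be checked with a small behavioral sketch. This is not vendor code: the function name dot10_signed8 is hypothetical, and the model covers only the arithmetic, not the timing or the hardware datapath. It shows that the worst-case sum of ten signed 8x8 products stays within the signed 20-bit range.

```python
def dot10_signed8(data, weights):
    """Sum of ten signed 8-bit by signed 8-bit products (behavioral model only)."""
    if len(data) != 10 or len(weights) != 10:
        raise ValueError("expected exactly 10 data values and 10 weights")
    for v in list(data) + list(weights):
        if not -128 <= v <= 127:
            raise ValueError(f"{v} is outside the signed 8-bit range")
    return sum(d * w for d, w in zip(data, weights))


# Largest magnitude: every operand is -128, so each product is 16384.
worst = dot10_signed8([-128] * 10, [-128] * 10)
print(worst)            # 163840
print(worst < 2 ** 19)  # True: fits in a signed 20-bit value
```

Because 10 * 16384 = 163840 is well below 2^19 = 524288, the DOT product cannot overflow a signed 20-bit result.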

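Next, the CPA adder adds either cascade_data_in_col_{1..2} or the previous cycle's accumulation value to the DOT product, depending on the dynamic inputs acc_en and zero_en. When zero_en is 1, neither value is added and the output is the DOT product alone.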
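The operand selection described above, and listed in Table 25, can be sketched as a small behavioral model. The function names cpa_addend and cpa_sum are hypothetical, and the model ignores pipeline latency and output width. It is an illustration of the three cases only, not a description of the hardware implementation.

```python
def cpa_addend(zero_en, acc_en, cascade_in, prev_acc):
    """Select the value added to the DOT product, following Table 25."""
    if zero_en:
        return 0            # zero_en = 1: nothing is added, acc_en is ignored
    if acc_en:
        return prev_acc     # zero_en = 0, acc_en = 1: previous cycle's output
    return cascade_in       # zero_en = 0, acc_en = 0: cascade input


def cpa_sum(dot, zero_en, acc_en, cascade_in=0, prev_acc=0):
    """DOT product plus the selected addend (one column, one cycle)."""
    return dot + cpa_addend(zero_en, acc_en, cascade_in, prev_acc)


# Accumulate three DOT products: clear on the first cycle, then accumulate.
acc = 0
history = []
for cycle, dot in enumerate([5, -3, 10]):
    acc = cpa_sum(dot, zero_en=(cycle == 0), acc_en=True, prev_acc=acc)
    history.append(acc)
print(history)  # [5, 2, 12]
```

Asserting zero_en on the first cycle starts a new accumulation without needing a separate reset of the previous value.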

The CPA adder outputs the result in 32-bit fixed-point format to the core fabric as fxp32_col_{1..2}[31:0] and to the next DSP block in the chain through the cascade_data_out_col_{1..2}[31:0] buses.
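As an illustration of how the cascade bus lets several blocks share one long dot product, the sketch below chains two modeled blocks to compute a 20-element result for one column. The names wrap32, block_output, and chained_dot are hypothetical. Wrapping to 32-bit two's complement on overflow is an assumption made for the model, since this section does not state the overflow behavior, and pipeline latency between blocks is ignored.

```python
def wrap32(x):
    """Reduce x to a signed 32-bit value (ASSUMED wrap-around on overflow)."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def block_output(data, weights, cascade_in=0):
    """One block, one column: DOT product plus cascade input (zero_en = acc_en = 0)."""
    dot = sum(d * w for d, w in zip(data, weights))
    return wrap32(dot + cascade_in)


def chained_dot(data, weights):
    """Split a long dot product into groups of 10 and pass partial sums down the chain."""
    cascade = 0  # the first block in the chain has nothing to add
    for i in range(0, len(data), 10):
        cascade = block_output(data[i:i + 10], weights[i:i + 10], cascade)
    return cascade


data = list(range(1, 21))          # 1..20
weights = [1] * 10 + [-1] * 10     # first block adds, second block subtracts
print(chained_dot(data, weights))  # -100, that is 55 - 155
```

In this model, the value each block passes on corresponds to cascade_data_out_col_{1..2}, and the value from the last block corresponds to the fxp32_col_{1..2} output read by the core fabric.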

Table 25.  Tensor Fixed-point Mode Equations
zero_en = 0, acc_en = 0:
  fxp32_col_1[31:0] = data_in_1[7:0]*B1C1 + data_in_2[7:0]*B2C1 + data_in_3[7:0]*B3C1 + data_in_4[7:0]*B4C1 + data_in_5[7:0]*B5C1 + data_in_6[7:0]*B6C1 + data_in_7[7:0]*B7C1 + data_in_8[7:0]*B8C1 + data_in_9[7:0]*B9C1 + data_in_10[7:0]*B10C1 + cascade_data_in_col_1[31:0]
  fxp32_col_2[31:0] = data_in_1[7:0]*B1C2 + data_in_2[7:0]*B2C2 + data_in_3[7:0]*B3C2 + data_in_4[7:0]*B4C2 + data_in_5[7:0]*B5C2 + data_in_6[7:0]*B6C2 + data_in_7[7:0]*B7C2 + data_in_8[7:0]*B8C2 + data_in_9[7:0]*B9C2 + data_in_10[7:0]*B10C2 + cascade_data_in_col_2[31:0]

zero_en = 0, acc_en = 1:
  fxp32_col_1[31:0] = data_in_1[7:0]*B1C1 + data_in_2[7:0]*B2C1 + data_in_3[7:0]*B3C1 + data_in_4[7:0]*B4C1 + data_in_5[7:0]*B5C1 + data_in_6[7:0]*B6C1 + data_in_7[7:0]*B7C1 + data_in_8[7:0]*B8C1 + data_in_9[7:0]*B9C1 + data_in_10[7:0]*B10C1 + fxp32_col_1[31:0]
  fxp32_col_2[31:0] = data_in_1[7:0]*B1C2 + data_in_2[7:0]*B2C2 + data_in_3[7:0]*B3C2 + data_in_4[7:0]*B4C2 + data_in_5[7:0]*B5C2 + data_in_6[7:0]*B6C2 + data_in_7[7:0]*B7C2 + data_in_8[7:0]*B8C2 + data_in_9[7:0]*B9C2 + data_in_10[7:0]*B10C2 + fxp32_col_2[31:0]

zero_en = 1, acc_en = NA:
  fxp32_col_1[31:0] = data_in_1[7:0]*B1C1 + data_in_2[7:0]*B2C1 + data_in_3[7:0]*B3C1 + data_in_4[7:0]*B4C1 + data_in_5[7:0]*B5C1 + data_in_6[7:0]*B6C1 + data_in_7[7:0]*B7C1 + data_in_8[7:0]*B8C1 + data_in_9[7:0]*B9C1 + data_in_10[7:0]*B10C1
  fxp32_col_2[31:0] = data_in_1[7:0]*B1C2 + data_in_2[7:0]*B2C2 + data_in_3[7:0]*B3C2 + data_in_4[7:0]*B4C2 + data_in_5[7:0]*B5C2 + data_in_6[7:0]*B6C2 + data_in_7[7:0]*B7C2 + data_in_8[7:0]*B8C2 + data_in_9[7:0]*B9C2 + data_in_10[7:0]*B10C2

Figure 58. Tensor Fixed-point Mode One Column Datapath