Variable Precision DSP Blocks User Guide: Agilex™ 5 FPGAs and SoCs

ID 813968
Date 9/20/2024
Public

3.3.3. Tensor Floating-point Mode

In tensor floating-point mode, two columns of ten 8-bit weights and 8-bit shared exponents are preloaded into the ping-pong buffers by using one of the following methods:
  • Data input feed
  • Side input feed

A signed 20-bit fixed-point DOT product vector is calculated using the preloaded weights and data_in_{1..10} inputs.

Each DOT product performs ten signed 8x8 multiplications and sums the products.
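The arithmetic can be checked with a small software model. The following Python sketch is illustrative only and is not part of the IP: the function name dot10 is hypothetical, and the model assumes the weights and data are plain two's-complement 8-bit integers. It shows that the worst-case sum of ten signed 8x8 products fits in a signed 20-bit value.

```python
def dot10(data, weights):
    """Sum of ten signed 8x8 products (software model, not the hardware)."""
    assert len(data) == len(weights) == 10
    for value in (*data, *weights):
        # Assumption: operands are two's-complement 8-bit integers.
        assert -128 <= value <= 127
    return sum(d * w for d, w in zip(data, weights))


# Largest possible magnitude: ten products of (-128) * (-128).
worst_case = dot10([-128] * 10, [-128] * 10)

# A signed 20-bit value covers -(2**19) .. 2**19 - 1.
fits_in_20_bits = -(1 << 19) <= worst_case <= (1 << 19) - 1

print(worst_case, fits_in_20_bits)
```

The worst case is 10 x 16384 = 163840, which is below 2^19 = 524288, so the 20-bit result cannot overflow.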

Next, the fixed-point to 32-bit floating-point converter converts the output of each DOT product into a 32-bit floating-point value that is adjusted by shared_exponent_data[7:0] and the shared exponent values preloaded into the buffer.
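As a rough illustration of this step, the Python sketch below scales an integer DOT product by two shared exponents and rounds the result to single precision. This section does not state how the two exponents are combined or whether they are biased, so the additive power-of-two scaling and the bias parameter are assumptions made for the example, and the names to_fp32 and convert are hypothetical.

```python
import struct


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def convert(dot, data_exp, weight_exp, bias=0):
    """Scale an integer DOT product by two shared exponents (illustrative).

    ASSUMPTION: the exponents add as powers of two and share a hypothetical
    bias. The real hardware adjustment may differ.
    """
    scale = 2.0 ** (data_exp + weight_exp - bias)
    # struct.pack raises OverflowError for values outside the FP32 range,
    # so this model is only meant for small example values.
    return to_fp32(dot * scale)


print(convert(3, 2, 1))
```

With these assumptions, a DOT product of 3 with exponents 2 and 1 gives 3 x 2^3.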

Then, the accumulator adds or subtracts either cascade_data_in_col_{1..2} or the previous cycle’s accumulation value, depending on the dynamic inputs acc_en and zero_en.

Whether the accumulator adds or subtracts is an IP configuration option.

The accumulator outputs the data in FP32 format to the core fabric as fp32_col_{1..2}[31:0], to the next DSP block in the chain through the cascade_data_out_col_{1..2}[31:0] buses, or both.

Table 23.  Tensor Floating-point Mode Equations
zero_en = 0, acc_en = 0

  fp32_col_1[31:0]:
    32-bit floating point conversion of (
      data_in_1[7:0]*B1C1 + data_in_2[7:0]*B2C1 + data_in_3[7:0]*B3C1 +
      data_in_4[7:0]*B4C1 + data_in_5[7:0]*B5C1 + data_in_6[7:0]*B6C1 +
      data_in_7[7:0]*B7C1 + data_in_8[7:0]*B8C1 + data_in_9[7:0]*B9C1 +
      data_in_10[7:0]*B10C1)
    +/- cascade_data_in_col_1[31:0]

  fp32_col_2[31:0]:
    32-bit floating point conversion of (
      data_in_1[7:0]*B1C2 + data_in_2[7:0]*B2C2 + data_in_3[7:0]*B3C2 +
      data_in_4[7:0]*B4C2 + data_in_5[7:0]*B5C2 + data_in_6[7:0]*B6C2 +
      data_in_7[7:0]*B7C2 + data_in_8[7:0]*B8C2 + data_in_9[7:0]*B9C2 +
      data_in_10[7:0]*B10C2)
    +/- cascade_data_in_col_2[31:0]

zero_en = 0, acc_en = 1

  fp32_col_1[31:0]:
    32-bit floating point conversion of (
      data_in_1[7:0]*B1C1 + data_in_2[7:0]*B2C1 + data_in_3[7:0]*B3C1 +
      data_in_4[7:0]*B4C1 + data_in_5[7:0]*B5C1 + data_in_6[7:0]*B6C1 +
      data_in_7[7:0]*B7C1 + data_in_8[7:0]*B8C1 + data_in_9[7:0]*B9C1 +
      data_in_10[7:0]*B10C1)
    +/- fp32_col_1[31:0]

  fp32_col_2[31:0]:
    32-bit floating point conversion of (
      data_in_1[7:0]*B1C2 + data_in_2[7:0]*B2C2 + data_in_3[7:0]*B3C2 +
      data_in_4[7:0]*B4C2 + data_in_5[7:0]*B5C2 + data_in_6[7:0]*B6C2 +
      data_in_7[7:0]*B7C2 + data_in_8[7:0]*B8C2 + data_in_9[7:0]*B9C2 +
      data_in_10[7:0]*B10C2)
    +/- fp32_col_2[31:0]

zero_en = 1, acc_en = NA

  fp32_col_1[31:0]:
    32-bit floating point conversion of (
      data_in_1[7:0]*B1C1 + data_in_2[7:0]*B2C1 + data_in_3[7:0]*B3C1 +
      data_in_4[7:0]*B4C1 + data_in_5[7:0]*B5C1 + data_in_6[7:0]*B6C1 +
      data_in_7[7:0]*B7C1 + data_in_8[7:0]*B8C1 + data_in_9[7:0]*B9C1 +
      data_in_10[7:0]*B10C1)

  fp32_col_2[31:0]:
    32-bit floating point conversion of (
      data_in_1[7:0]*B1C2 + data_in_2[7:0]*B2C2 + data_in_3[7:0]*B3C2 +
      data_in_4[7:0]*B4C2 + data_in_5[7:0]*B5C2 + data_in_6[7:0]*B6C2 +
      data_in_7[7:0]*B7C2 + data_in_8[7:0]*B8C2 + data_in_9[7:0]*B9C2 +
      data_in_10[7:0]*B10C2)
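The three rows of the table can be summarized as a small behavioral model. The Python sketch below is illustrative only: the function name accumulate and its parameters are hypothetical, Python floats stand in for FP32 values (so rounding differs from the hardware), and the subtract argument models the add-or-subtract choice that is fixed by the IP configuration.

```python
def accumulate(converted, cascade_in, previous, zero_en, acc_en, subtract=False):
    """Model one column's output for the three rows of the equations table.

    converted  : FP32-converted DOT product for this cycle
    cascade_in : value on cascade_data_in_col_N
    previous   : previous cycle's fp32_col_N value
    subtract   : stands in for the IP configuration option (assumption:
                 the operand is subtracted from the converted DOT product)
    """
    if zero_en:
        # zero_en = 1: output the converted DOT product alone; acc_en is ignored.
        return converted
    # zero_en = 0: acc_en selects the previous result (1) or the cascade input (0).
    operand = previous if acc_en else cascade_in
    return converted - operand if subtract else converted + operand


print(accumulate(2.0, 10.0, 100.0, zero_en=0, acc_en=0))  # cascade input row
print(accumulate(2.0, 10.0, 100.0, zero_en=0, acc_en=1))  # accumulation row
print(accumulate(2.0, 10.0, 100.0, zero_en=1, acc_en=0))  # zero_en row
```

In this model, asserting zero_en restarts an accumulation, while acc_en chooses between chaining DSP blocks through the cascade input and accumulating across cycles within one block.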

The output signals fp32_col_{1..2}_flag[3:0] accompany the floating-point outputs and indicate the exception type. The following table shows the encoding of these signals.

Table 24.  Exception Handling Results
Bit    Exception
3      Overflow
2      Underflow
1      Inexact
0      Invalid (NaN)
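Software that reads the flag bus back, for example in a testbench, could decode it as sketched below. The helper name decode_flags is hypothetical, and the sketch assumes that each flag bit is active high and that several bits can be set at once. Neither assumption is stated in this section.

```python
# Bit positions taken from the exception table; names are lowercase labels.
FLAG_BITS = {3: "overflow", 2: "underflow", 1: "inexact", 0: "invalid"}


def decode_flags(flag):
    """Return the names of the exception bits set in a 4-bit flag value.

    ASSUMPTION: flags are active high and may be set in combination.
    """
    if not 0 <= flag <= 0xF:
        raise ValueError("flag must be a 4-bit value")
    return [name for bit, name in FLAG_BITS.items() if flag & (1 << bit)]


print(decode_flags(0b1010))
```

For the value 0b1010, the sketch reports the overflow and inexact bits.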
Figure 56. Tensor Floating-point Mode One Column Datapath