3.3.4. Tensor Fixed-point Mode
- Data input feed
- Side input feed
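A signed 20-bit fixed-point DOT product vector is calculated from the preloaded weights and the data_in_{1..10} inputs. Each DOT product performs 10 signed 8x8 multiplications and sums the results. Each signed 8x8 product fits in 16 bits, and summing ten of them adds at most 4 bits of growth, so 20 bits are sufficient to hold the sum without overflow.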
Next, the CPA adder adds either cascade_data_in_col_{1..2} or the previous cycle’s accumulation value to the DOT product, as selected by the dynamic inputs acc_en and zero_en. When zero_en is 1, neither value is added and the DOT product passes through unchanged.
The CPA adder outputs the data in 32-bit fixed-point format to the core fabric as fxp32_col_{1..2}[31:0] and to the next DSP block in the chain through the cascade_data_out_col_{1..2}[31:0] buses.
zero_en | acc_en | fxp32_col_1[31:0] | fxp32_col_2[31:0] |
---|---|---|---|
0 | 0 | data_in_1[7:0]*B1C1 + data_in_2[7:0]*B2C1 + data_in_3[7:0]*B3C1 + data_in_4[7:0]*B4C1 + data_in_5[7:0]*B5C1 + data_in_6[7:0]*B6C1 + data_in_7[7:0]*B7C1 + data_in_8[7:0]*B8C1 + data_in_9[7:0]*B9C1 + data_in_10[7:0]*B10C1 + cascade_data_in_col_1[31:0] | data_in_1[7:0]*B1C2 + data_in_2[7:0]*B2C2 + data_in_3[7:0]*B3C2 + data_in_4[7:0]*B4C2 + data_in_5[7:0]*B5C2 + data_in_6[7:0]*B6C2 + data_in_7[7:0]*B7C2 + data_in_8[7:0]*B8C2 + data_in_9[7:0]*B9C2 + data_in_10[7:0]*B10C2 + cascade_data_in_col_2[31:0] |
0 | 1 | data_in_1[7:0]*B1C1 + data_in_2[7:0]*B2C1 + data_in_3[7:0]*B3C1 + data_in_4[7:0]*B4C1 + data_in_5[7:0]*B5C1 + data_in_6[7:0]*B6C1 + data_in_7[7:0]*B7C1 + data_in_8[7:0]*B8C1 + data_in_9[7:0]*B9C1 + data_in_10[7:0]*B10C1 + fxp32_col_1[31:0] | data_in_1[7:0]*B1C2 + data_in_2[7:0]*B2C2 + data_in_3[7:0]*B3C2 + data_in_4[7:0]*B4C2 + data_in_5[7:0]*B5C2 + data_in_6[7:0]*B6C2 + data_in_7[7:0]*B7C2 + data_in_8[7:0]*B8C2 + data_in_9[7:0]*B9C2 + data_in_10[7:0]*B10C2 + fxp32_col_2[31:0] |
1 | NA | data_in_1[7:0]*B1C1 + data_in_2[7:0]*B2C1 + data_in_3[7:0]*B3C1 + data_in_4[7:0]*B4C1 + data_in_5[7:0]*B5C1 + data_in_6[7:0]*B6C1 + data_in_7[7:0]*B7C1 + data_in_8[7:0]*B8C1 + data_in_9[7:0]*B9C1 + data_in_10[7:0]*B10C1 | data_in_1[7:0]*B1C2 + data_in_2[7:0]*B2C2 + data_in_3[7:0]*B3C2 + data_in_4[7:0]*B4C2 + data_in_5[7:0]*B5C2 + data_in_6[7:0]*B6C2 + data_in_7[7:0]*B7C2 + data_in_8[7:0]*B8C2 + data_in_9[7:0]*B9C2 + data_in_10[7:0]*B10C2 |
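The table can be read as a small behavioral model. The following Python sketch is illustrative only and is not taken from any Intel tool or library: the function names (`wrap_signed`, `dot_column`, `tensor_fxp_step`) are hypothetical, pipeline register latency is ignored, and wrap-around on 32-bit overflow is an assumption, since the overflow behavior is not stated here. The `fxp32_col_N` term on the right-hand side of the acc_en = 1 row is modeled as the previous cycle's output.

```python
from typing import Sequence, Tuple

NUM_INPUTS = 10  # data_in_1 .. data_in_10
NUM_COLS = 2     # weight columns C1 and C2


def wrap_signed(value: int, bits: int) -> int:
    """Wrap an integer into a signed two's-complement range of `bits` bits.

    Assumption: the adder wraps on overflow; the source does not specify this.
    """
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def dot_column(data_in: Sequence[int], weights: Sequence[int]) -> int:
    """Sum of ten signed 8x8 products for one weight column (BnC1 or BnC2)."""
    assert len(data_in) == len(weights) == NUM_INPUTS
    for v in (*data_in, *weights):
        assert -128 <= v <= 127, "operands must be signed 8-bit values"
    return sum(d * w for d, w in zip(data_in, weights))


def tensor_fxp_step(
    data_in: Sequence[int],
    weights: Sequence[Sequence[int]],  # weights[col][n] models B(n+1)C(col+1)
    cascade_in: Tuple[int, int],       # cascade_data_in_col_{1..2}
    prev_out: Tuple[int, int],         # previous cycle's fxp32_col_{1..2}
    zero_en: int,
    acc_en: int,
) -> Tuple[int, ...]:
    """Return (fxp32_col_1, fxp32_col_2) for one cycle, following the table."""
    outputs = []
    for col in range(NUM_COLS):
        dot = dot_column(data_in, weights[col])
        if zero_en:
            addend = 0                # zero_en = 1: acc_en is ignored
        elif acc_en:
            addend = prev_out[col]    # accumulate the previous output
        else:
            addend = cascade_in[col]  # add the cascade input
        outputs.append(wrap_signed(dot + addend, 32))
    return tuple(outputs)


if __name__ == "__main__":
    data = list(range(1, 11))         # 1 + 2 + ... + 10 = 55
    w = [[1] * 10, [2] * 10]          # column dot products: 55 and 110
    first = tensor_fxp_step(data, w, (100, 200), (0, 0), zero_en=0, acc_en=0)
    print(first)                      # (155, 310)
    print(tensor_fxp_step(data, w, (100, 200), first, zero_en=0, acc_en=1))
    # (210, 420)
    print(tensor_fxp_step(data, w, (100, 200), first, zero_en=1, acc_en=1))
    # (55, 110)
```

In this model the worst-case column dot product is ten products of -128 by -128, or 163840, which is well inside the signed 20-bit range.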