4.4.1.4. Avalon® Streaming TX Interface
The Application Layer transfers data to the Transaction Layer of the PCI Express IP core over the Avalon®-ST TX interface. The Transaction Layer must assert tx_st_ready_o before transmission begins. Transmission of a packet must be uninterrupted while tx_st_ready_o is asserted.
There are four segments with a 256-bit data width. The interface supports multiple TLPs per cycle.
This interface supports one tx_st_sop_i signal and one tx_st_eop_i signal per cycle for each segment when the R-Tile IP is operating in a x16 configuration. This means there are four tx_st_sop_i signals and four tx_st_eop_i signals for the x16 IP core. This interface also does not follow a fixed latency between the tx_st_ready_o and tx_st_valid_i signals as specified by the Avalon Interface Specifications. Data can be received any time within the maximum latency between the deassertion of tx_st_ready_o and tx_st_valid_i, which is 16 coreclkout_hip cycles.
The x16 core provides four segments, each with 256 bits of data (tx_st_data_i[255:0]), 128 bits of header (tx_st_hdr_i[127:0]), and 32 bits of TLP prefix (tx_st_tlp_prfx_i[31:0]). If this core is configured in the 1x16 mode, all four segments are used, so the data bus becomes a 1024-bit bus altogether, consisting of tx_st0_data_i[255:0], tx_st1_data_i[255:0], tx_st2_data_i[255:0], and tx_st3_data_i[255:0]. The start of packet can appear in any of the segments, as indicated by the tx_stN_sop_i signals.
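As an illustration of the segmented bus, the following minimal Python sketch splits a 1024-bit value into four 256-bit segment values and joins them again. The function names are hypothetical, and the placement of segment 0 in the least-significant 256 bits is an assumption made for this sketch; the text above does not state the ordering.

```python
SEG_WIDTH = 256      # data bits per segment
NUM_SEGMENTS = 4     # segments used in the 1x16 mode


def split_segments(bus_value):
    """Split a 1024-bit integer into four 256-bit segment values.

    Assumption: index 0 of the result holds the least-significant 256 bits.
    """
    if bus_value < 0 or bus_value >> (SEG_WIDTH * NUM_SEGMENTS):
        raise ValueError("bus_value does not fit in 1024 bits")
    mask = (1 << SEG_WIDTH) - 1
    return [(bus_value >> (SEG_WIDTH * n)) & mask for n in range(NUM_SEGMENTS)]


def join_segments(segments):
    """Rebuild the 1024-bit value from four 256-bit segment values."""
    value = 0
    for n, seg in enumerate(segments):
        value |= seg << (SEG_WIDTH * n)
    return value
```

A round trip through `split_segments` and `join_segments` returns the original value, which is a quick sanity check for any testbench helper built along these lines.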
Parity generation is done via a 32:1 XOR (i.e. there is one parity bit for every 32 data, header or prefix bits).
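The 32:1 XOR scheme can be modeled in a few lines of Python. This is only a sketch for checking testbench data: `parity_bits` is a hypothetical helper, and it assumes that each parity bit is the plain XOR of its 32-bit group, with parity bit 0 covering bits [31:0], bit 1 covering bits [63:32], and so on.

```python
def parity_bits(value, width):
    """Return one parity bit per 32-bit group of `value`.

    Parity bit g is the XOR of bits [32*g + 31 : 32*g] (assumed polarity).
    The result is packed with group 0 in bit 0.
    """
    if width % 32 != 0:
        raise ValueError("width must be a multiple of 32")
    if value < 0 or value >> width:
        raise ValueError("value does not fit in the given width")
    result = 0
    for group in range(width // 32):
        dword = (value >> (32 * group)) & 0xFFFFFFFF
        bit = bin(dword).count("1") & 1   # XOR of the 32 bits
        result |= bit << group
    return result
```

Under these assumptions a 256-bit data segment yields 8 parity bits, a 128-bit header yields 4, and a 32-bit prefix yields 1.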
Signal Name | Direction | Description | EP/RP/BP | Clock Domain |
---|---|---|---|---|
pX_tx_stN_data_i[255:0] where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Application Layer data for transmission. The data bus is organized in multiple 256-bit segments. In x16 mode, all four segments are used to effectively form a 1024-bit data bus. In x8 mode, two segments are used to form a 512-bit data bus. In x4 mode, each 256-bit segment is an independent data bus. The Application Layer must provide a properly formatted TLP on the TX interface. The data is valid when the corresponding tx_stN_valid_i signal is asserted. The mapping of message TLPs is the same as the mapping of Transaction Layer TLPs with 4-dword headers. The number of data cycles must be correct for the length and address fields in the header. Issuing a packet with an incorrect number of data cycles results in the TX interface hanging and becoming unable to accept further requests. Note: There must be no Idle cycle between the tx_stN_sop_i and tx_stN_eop_i cycles unless there is backpressure with the deassertion of tx_st_ready_o. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_hdr_i[127:0] where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | This is the header to be transmitted, which follows the TLP header format of the PCIe specifications except for the requester ID/completer ID fields (tx_stN_hdr_i[95:80]): These signals are valid when the corresponding tx_stN_sop_i signal is asserted. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_prefix_i[31:0] where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | This is the TLP prefix to be transmitted, which follows the TLP prefix format of the PCIe specifications. PASID is supported. These signals are valid when the corresponding tx_stN_sop_i signal is asserted. The TLP prefix uses a Big Endian implementation (i.e. the Fmt field is in bits [31:29] and the Type field is in bits [28:24]). If no prefix is present for a given TLP, that dword, including the Fmt field, is all zeros. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_sop_i where X = 0,1,2,3 (IP core number) and N = 0,2 (segment number) | Input | Indicate the first cycle of a TLP when asserted in conjunction with the corresponding bit of tx_stN_valid_i. For the x16 configuration: These signals are asserted for one clock cycle per each TLP. They also qualify the corresponding tx_stN_hdr_i and tx_stN_tlp_prfx_i signals. Note: pX_tx_stN_sop_i pulses can only be sent on segments 0 and/or 2 (st0 and/or st2). | EP/RP/BP | coreclkout_hip |
pX_tx_stN_eop_i where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Indicate the last cycle of a TLP when asserted in conjunction with the corresponding bit of tx_stN_valid_i. For the x16 configuration: These signals are asserted for one clock cycle per each TLP. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_dvalid_i where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Qualify the data of the corresponding segment of tx_stN_data_i into the IP core on ready cycles. To facilitate timing closure, Intel recommends that you register both the tx_st_ready_o and tx_stN_dvalid_i signals. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_hvalid_i where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Qualify the header of the corresponding segment of tx_stN_hdr_i into the IP core on ready cycles. To facilitate timing closure, Intel recommends that you register both the tx_st_ready_o and tx_stN_hvalid_i signals. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_pvalid_i where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Qualify the prefix of the corresponding segment of tx_stN_prefix_i into the IP core on ready cycles. To facilitate timing closure, Intel recommends that you register both the tx_st_ready_o and tx_stN_pvalid_i signals. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_data_par_i[Z:0] where X = 0,1,2,3 (IP core number), N = 0,1,2,3 (segment number), and Z varies based on the core | Input | Parity for tx_stN_data_i. Bit [0] corresponds to tx_stN_data_i[31:0], bit [1] corresponds to tx_stN_data_i[63:32], and so on. By default, the PCIe Hard IP generates the parity for the TX data. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_hdr_par_i[3:0] where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Parity for tx_stN_hdr_i. By default, the PCIe Hard IP generates the parity for the TX header. | EP/RP/BP | coreclkout_hip |
pX_tx_stN_prefix_par_i where X = 0,1,2,3 (IP core number) and N = 0,1,2,3 (segment number) | Input | Parity for tx_stN_tlp_prfx_i. By default, the PCIe Hard IP generates the parity for the TX TLP prefix. | EP/RP/BP | coreclkout_hip |
pX_tx_st_ready_o where X = 0,1,2,3 (IP core number) | Output | Indicates that the PCIe Hard IP is ready to accept data. The readyLatency maximum is 16 cycles. If tx_st_ready_o is asserted by the Transaction Layer in the PCIe Hard IP on cycle <n>, then <n> + readyLatency is a ready cycle, during which the Application may assert tx_stN_valid_i and transfer data. If tx_st_ready_o is deasserted by the Transaction Layer on cycle <n>, then the Application must deassert tx_stN_valid_i within the readyLatency number of cycles after cycle <n>. tx_st_ready_o can be deasserted in the following conditions: | EP/RP/BP | coreclkout_hip |
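The backpressure rule for tx_st_ready_o can be expressed as a small trace checker. The Python sketch below is one possible interpretation, not a definitive model of the Hard IP: it flags any cycle in which valid is high although ready has been low for the entire window of `max_latency` preceding cycles plus the current one. The function name and the per-cycle list representation are assumptions made for this sketch.

```python
MAX_READY_LATENCY = 16   # maximum readyLatency in coreclkout_hip cycles


def find_violations(ready, valid, max_latency=MAX_READY_LATENCY):
    """Return the cycles where valid is asserted too long after ready fell.

    `ready` and `valid` are equal-length per-cycle sequences of 0/1 values.
    Cycle c is flagged when valid[c] is high and ready was low on every
    cycle in [c - max_latency, c] (assumed reading of the rule).
    """
    if len(ready) != len(valid):
        raise ValueError("traces must have the same length")
    violations = []
    for c, v in enumerate(valid):
        if not v:
            continue
        window = ready[max(0, c - max_latency): c + 1]
        if not any(window):
            violations.append(c)
    return violations
```

For example, with `max_latency=2`, a ready trace of `[1, 1, 0, 0, 0, 0]` and a valid trace of `[0, 1, 1, 1, 1, 0]` flag only cycle 4: ready fell on cycle 2, so valid may stay high through cycle 3 but not beyond it.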
As an example, Figure 26 below shows the behavior of the Avalon Streaming TX interface in a back-to-back TLPs scenario with data spanning across multiple segments. The following text describes the waveforms per clock cycle:
- Clock cycle 1: The R-tile Intel FPGA IP for PCI Express asserts p0_tx_st_ready_o signal, indicating the Hard IP is ready to accept TLPs from the Application logic.
- Clock cycle 2:
  - The start of the first TLP (T0) is in segment 0, indicated by the assertion of p0_tx_st0_sop_i.
  - The signal p0_tx_st0_hvalid_i is asserted to validate the header of this first TLP (T0H0) in the p0_tx_st0_hdr_i bus.
  - The signal p0_tx_st0_dvalid_i is asserted to validate the data of this first TLP (T0D0) in the p0_tx_st0_data_i bus.
  - The signal p0_tx_st1_dvalid_i is asserted to validate the next portion of the data of this first TLP (T0D1) in the p0_tx_st1_data_i bus.
  - The signal p0_tx_st2_dvalid_i is asserted to validate the next portion of the data of this first TLP (T0D2) in the p0_tx_st2_data_i bus.
  - The signal p0_tx_st3_dvalid_i is asserted to validate the final portion of the data of this first TLP (T0D3) in the p0_tx_st3_data_i bus.
  - The end of this first TLP (T0) is in segment 3, denoted by the assertion of p0_tx_st3_eop_i.
- Clock cycle 3:
  - The next TLP (T1) arrives in segment 0, as denoted by p0_tx_st0_sop_i staying high.
  - The signal p0_tx_st0_hvalid_i is asserted to validate the header of this TLP (T1H0) in the p0_tx_st0_hdr_i bus.
  - The signal p0_tx_st0_dvalid_i is asserted to validate the data of this TLP (T1D0) in the p0_tx_st0_data_i bus.
  - The signal p0_tx_st1_dvalid_i is asserted to validate the next portion of the data of this TLP (T1D1) in the p0_tx_st1_data_i bus.
  - The signal p0_tx_st2_dvalid_i is asserted to validate the next portion of the data of this TLP (T1D2) in the p0_tx_st2_data_i bus.
  - The signal p0_tx_st3_dvalid_i is asserted to validate the final portion of the data of this TLP (T1D3) in the p0_tx_st3_data_i bus.
  - The end of this TLP (T1) is in segment 3, denoted by p0_tx_st3_eop_i staying high.
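
The per-cycle behavior described above can be reproduced with a short Python sketch that derives the sop, hvalid, dvalid, and eop flags of each segment from a list of TLP payload lengths. It is a simplified model under stated assumptions: `schedule_tlps` is a hypothetical helper, every TLP is assumed to start in segment 0 of a fresh cycle and to carry data, and backpressure from tx_st_ready_o is ignored.

```python
SEGMENTS = 4      # segments per cycle in the x16 configuration
SEG_BYTES = 32    # 256 data bits per segment


def schedule_tlps(payload_lengths):
    """Return one dict of per-segment flags for every clock cycle.

    Assumption: each TLP starts in segment 0 of a new cycle and has a
    payload of at least one byte.
    """
    cycles = []
    for length in payload_lengths:
        if length <= 0:
            raise ValueError("this sketch only handles TLPs with data")
        n_segs = -(-length // SEG_BYTES)          # ceiling division
        for first in range(0, n_segs, SEGMENTS):
            used = min(SEGMENTS, n_segs - first)  # segments filled this cycle
            cycle = {
                "sop": [False] * SEGMENTS,
                "hvalid": [False] * SEGMENTS,
                "dvalid": [i < used for i in range(SEGMENTS)],
                "eop": [False] * SEGMENTS,
            }
            if first == 0:                        # first cycle of the TLP
                cycle["sop"][0] = True
                cycle["hvalid"][0] = True
            if first + used == n_segs:            # last cycle of the TLP
                cycle["eop"][used - 1] = True
            cycles.append(cycle)
    return cycles
```

Calling `schedule_tlps([128, 128])` yields two cycles that match clock cycles 2 and 3 of the walkthrough: sop and hvalid set on segment 0, dvalid set on all four segments, and eop set on segment 3.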