25G Ethernet Arria® 10 FPGA IP User Guide

ID 683639
Date 7/25/2024
Public
8. Debugging the Link

Begin debugging your link at the most basic level, with word lock. Then consider higher-level issues.

The following steps should help you identify and resolve common problems that occur when bringing up a 25G Ethernet Intel FPGA IP core link:

  1. Establish word lock—The RX lanes should be able to achieve word lock even in the presence of extreme bit error rates. If the IP core is unable to achieve word lock, check the transceiver clocking and data rate configuration. Check for cabling errors such as the reversal of the TX and RX lanes. Check the clock frequency monitors (KHZ_REF, KHZ_TX, KHZ_RX PHY registers) in the Control and Status registers.

    To check for word lock: Clear the FRM_ERR register by writing the value 1, followed by the value 0, to the SCLR_FRM_ERR register at offset 0x324. Then read the FRM_ERR register at offset 0x323. If the value is zero, the core has word lock. If the value is non-zero, the status is indeterminate.

  2. If you have problems with word lock, check the EIO_FREQ_LOCK register at address 0x321. The values in this register indicate the status of the recovered clock. In normal operation, all the bits should be asserted. A deasserted (value 0) or toggling bit for any lane indicates a clock recovery problem. Clock recovery difficulties are typically caused by the following problems:
    • Bit errors
    • Failure to establish the link
    • Incorrect clock inputs to the IP core
  3. Check the PMA FIFO levels by selecting the appropriate bits in the EIO_FLAG_SEL register and reading the values in the EIO_FLAGS register. During normal operation, the TX and RX FIFOs should be nominally filled. A TX FIFO that is either empty or full typically indicates a problem with clock frequencies. The RX FIFO should never be full, although an empty RX FIFO can be tolerated.
  4. Establish lane integrity—When operating properly, the lanes should not experience bit errors at a rate greater than roughly one per hour per day. Bit errors within data packets are identified as FCS errors. Bit errors in control information, including IDLE frames, generally cause errors in XL/CGMII decoding.
  5. Verify packet traffic—The Ethernet protocol includes automatic lane reordering so the higher levels should follow the PCS. If the PCS is locked, but higher level traffic is corrupted, there may be a problem with the remote transmitter virtual lane tags.
  6. Tuning—You can adjust transceiver analog parameters to improve the bit error rate. If you turn on ADME in the IP core parameter editor, you can use the Transceiver Toolkit for this purpose.
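
The register checks in steps 1 and 2 can be scripted. The following Python sketch is illustrative only: the register offsets come from the steps above, but the read32 and write32 callables, the FakeRegs stand-in, the lane_mask default, and the sample count are assumptions and are not part of the IP core or any Intel tool. Replace the fake register access with whatever mechanism your system uses to reach the Control and Status registers.

```python
# Offsets as given in the steps above.
EIO_FREQ_LOCK = 0x321
FRM_ERR = 0x323
SCLR_FRM_ERR = 0x324


def check_word_lock(read32, write32):
    """Clear FRM_ERR (write 1, then 0, to SCLR_FRM_ERR) and read it back.

    Returns "locked" if FRM_ERR reads zero, otherwise "indeterminate".
    """
    write32(SCLR_FRM_ERR, 1)  # assert the clear
    write32(SCLR_FRM_ERR, 0)  # release the clear
    return "locked" if read32(FRM_ERR) == 0 else "indeterminate"


def freq_lock_stable(read32, lane_mask=0x1, samples=8):
    """Return True only if every sample shows all lane_mask bits asserted.

    Repeated sampling is a simple way to catch a toggling bit.
    lane_mask=0x1 assumes a single lane on bit 0 (an assumption).
    """
    return all(
        (read32(EIO_FREQ_LOCK) & lane_mask) == lane_mask
        for _ in range(samples)
    )


class FakeRegs:
    """Hypothetical in-memory register file, used only to run the sketch."""

    def __init__(self, initial=None):
        self.mem = dict(initial or {})
        self.writes = []

    def read32(self, offset):
        return self.mem.get(offset, 0)

    def write32(self, offset, value):
        self.writes.append((offset, value))
        self.mem[offset] = value


if __name__ == "__main__":
    regs = FakeRegs({FRM_ERR: 0, EIO_FREQ_LOCK: 0x1})
    print(check_word_lock(regs.read32, regs.write32))
    print(freq_lock_stable(regs.read32))
```

A non-zero FRM_ERR is reported as "indeterminate" rather than as a failure, matching the wording of step 1.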

In addition, your IP core can experience loss of signal on the Ethernet link after it is established. In this case, the TX functionality is unaffected, but the RX functionality is disrupted. The following symptoms indicate a loss of signal on the Ethernet link:

  • The IP core deasserts the rx_pcs_ready signal, indicating the IP core has lost alignment marker lock.
  • The IP core deasserts the RX PCS fully aligned status bit (bit [0]) of the RX_PCS_FULLY_ALIGNED_S register at offset 0x326. This change is linked to the change in value of the rx_pcs_ready signal.
  • If Enable link fault generation is turned on, the IP core sets local_fault_status to the value of 1.
  • The IP core asserts the Local Fault Status bit (bit [0]) of the Link_Fault register at offset 0x508. This change is linked to the change in value of the local_fault_status signal.
  • The IP core triggers the RX digital reset process by asserting soft_rxp_rst.
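
The two register-visible symptoms listed above can be polled from software. The following Python sketch is a minimal illustration: the offsets and bit positions are taken from the list above, while the read32 callable, the function name, and the symptom strings are hypothetical. The local fault bit is only meaningful when Enable link fault generation is turned on.

```python
RX_PCS_FULLY_ALIGNED_S = 0x326  # bit [0]: RX PCS fully aligned
LINK_FAULT = 0x508              # bit [0]: Local Fault Status


def loss_of_signal_symptoms(read32):
    """Return a list of loss-of-signal symptoms seen in the registers.

    read32 is a caller-supplied function (hypothetical) that takes a
    register offset and returns its 32-bit value.
    """
    symptoms = []
    if (read32(RX_PCS_FULLY_ALIGNED_S) & 0x1) == 0:
        symptoms.append("rx_pcs_not_aligned")
    if (read32(LINK_FAULT) & 0x1) == 1:
        symptoms.append("local_fault")
    return symptoms


if __name__ == "__main__":
    # Simulated register contents for a link that has lost signal.
    fake = {RX_PCS_FULLY_ALIGNED_S: 0x0, LINK_FAULT: 0x1}
    print(loss_of_signal_symptoms(lambda offset: fake.get(offset, 0)))
```

An empty list means neither symptom is currently visible in the registers. It does not by itself prove the link is healthy.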