Summary
Troubleshooting an internal error (IERR) on the Intel® Server Board S2600WFR
Description
The following hardware events were found in the System Event Log (SEL) after the server was powered back on, and some of the dual in-line memory modules (DIMMs) are not detected:
- EventID:0391 Time:Sun Feb 20 21:34:08 2022 Controller:BMC SensorType:Processor SensorName:P1 Status Description:IERR - Asserted
- Capability SensorName:SPS FW Health Description:Restore factory presets using 'Force ME Recovery' IPMI command or by doing AC power cycle with Recovery jumper asserted. If this does not clear the issue, reflash the SPI flash. If the issue persists, provide the content of Event Data 3 to Intel support team for interpretation. - OEM - Asserted
- EventID:0400 Time:Sun Feb 20 21:34:44 2022 Controller:BMC SensorType:Processor SensorName:P1 Status Description:IERR - Deasserted
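When a SEL export contains many such entries, it can help to split each line into its labelled fields before looking for IERR events. The following is a minimal sketch in Python, assuming the entries are available as plain text in the `Label:value` layout quoted above. The function names `parse_sel_line` and `ierr_state` are hypothetical helpers written for illustration and are not part of any Intel tool.

```python
import re

# Field labels as they appear in the SEL lines quoted above.
_FIELD_RE = re.compile(
    r"\b(EventID|Time|Controller|SensorType|SensorName|Description):"
)


def parse_sel_line(line):
    """Split one SEL text line into a dict of label -> value.

    Hypothetical helper: assumes the 'Label:value' layout shown in this article.
    """
    line = line.lstrip("- ").strip()
    matches = list(_FIELD_RE.finditer(line))
    fields = {}
    for i, match in enumerate(matches):
        # A value runs from the end of its label to the start of the next label.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(line)
        fields[match.group(1)] = line[match.end():end].strip()
    return fields


def ierr_state(fields):
    """Return 'Asserted' or 'Deasserted' for an IERR event, otherwise None."""
    description = fields.get("Description", "")
    # Accept a hyphen or an en dash between 'IERR' and the state.
    match = re.fullmatch(r"IERR\s*[-\u2013]\s*(Asserted|Deasserted)", description)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        "- EventID:0391 Time:Sun Feb 20 21:34:08 2022 Controller:BMC "
        "SensorType:Processor SensorName:P1 Status Description:IERR - Asserted"
    )
    event = parse_sel_line(sample)
    print(event["SensorName"], ierr_state(event))
```

Lines that lack some labels, such as the SPS FW Health entry above, simply yield a dict with fewer keys, so callers should read fields with `.get()`.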
Resolution
- Update the Basic Input/Output System (BIOS) to the latest firmware version, making sure the prerequisites for the update are met. Refer to BIOS Update for Intel® Server Boards for instructions on how to apply the update.
- Configure the memory modules according to the population rules in the Technical Product Specification. Refer to Supported Memory and Memory Population Rules for the Intel® Server Board S2600WF Family.
- Reseat the same DIMMs in the same slots and check whether all the DIMMs are detected.
- Swap each affected DIMM with a known working one and check whether memory is detected correctly in those slots after the swap.
- Swap the CPUs between the sockets and check whether memory is detected correctly in the BMC and the OS.
- If memory is still not detected, collect a new set of debug logs and submit a ticket to Intel Customer Support for further analysis.
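After each reseat or swap step, the set of detected DIMMs can be compared against the slots that should be populated. The sketch below assumes a Linux OS where the output of `dmidecode -t memory` has been captured as text; it relies only on the `Memory Device`, `Size:` and `Locator:` lines of that output. The function name `missing_dimms` is a hypothetical helper, and the locator names used in the example are illustrative assumptions, so use the names your own system reports.

```python
def missing_dimms(dmidecode_text, expected_locators):
    """Return the expected DIMM locators that are absent or report no module.

    Hypothetical helper: expects text captured from `dmidecode -t memory`.
    """
    populated = set()
    size = None
    for raw in dmidecode_text.splitlines():
        line = raw.strip()
        if line == "Memory Device":
            size = None  # start of a new memory device record
        elif line.startswith("Size:"):
            size = line.split(":", 1)[1].strip()
        elif line.startswith("Locator:"):
            # "Bank Locator:" lines do not match because they start with "Bank".
            locator = line.split(":", 1)[1].strip()
            if size and size != "No Module Installed":
                populated.add(locator)
    return [loc for loc in expected_locators if loc not in populated]


if __name__ == "__main__":
    # Abbreviated, made-up sample for illustration only.
    sample = (
        "Memory Device\n"
        "\tSize: 32 GB\n"
        "\tLocator: CPU1_DIMM_A1\n"
        "Memory Device\n"
        "\tSize: No Module Installed\n"
        "\tLocator: CPU1_DIMM_B1\n"
    )
    print(missing_dimms(sample, ["CPU1_DIMM_A1", "CPU1_DIMM_B1"]))
```

Running the same check before and after a swap shows whether the fault follows the DIMM, the slot, or the CPU.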