Provides troubleshooting steps to recover from an IERR (also reported as "Processor CPU Machine Chk", "CPU Internal Error", "CPU IErr", or "CPU Machine Check error").
The server crashes and reports a Processor CPU Machine Chk error.
Usually, the System Event Log (SEL) viewer shows the following:
3 | 12/18/2017 | 03:17:30 | Processor CPU Machine Chk | Transition to Non-recoverable | Asserted
4 | 12/18/2017 | 03:17:30 | Unknown MSR Info Log | | Asserted
5 | 12/18/2017 | 03:17:30 | Unknown MSR Info Log | | Asserted
etc
An IERR (also known as CPU Internal Error, CPU IErr, or CPU Machine Check error) can signal an unrecoverable processor fault, but it usually indicates that the Central Processing Unit (CPU) has detected an error in the system or received an erroneous instruction from a system component. The following troubleshooting steps apply in either case:
- Restart the system.
- Check the System Event Log to find out which processor is generating the error. How the processor is identified depends on the error logged.
- Ensure the BIOS/firmware is the latest.
- Try with one compatible processor at a time (minimal configuration).
- Test with another compatible processor, if possible. If the board is an Intel® Server Board, refer to the Product Specifications page for processor-board compatibility information.
- Remove and reinstall the memory.
- Check all the drives, and ensure the cables are connected properly on both ends.
- Check all the risers and the PCIe cards.
- Clear the system event log and monitor for 24 hours.
- If further errors are found, contact Intel Customer Support with a copy of these logs, plus the logs from the System Information Retrieval Utility.
This error can have several causes, and the CPU is not necessarily at fault. An interruption on the system bus or in memory can also trigger it.
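The log-checking step can be sketched as a small script. This is a minimal illustration, not an Intel tool: it assumes the SEL has been exported as pipe-delimited text in the same six-column layout as the excerpt above (record ID, date as MM/DD/YYYY, time, sensor, event, state). The names `SelEntry`, `parse_sel_line`, and `machine_check_entries` are hypothetical.

```python
from datetime import datetime
from typing import Iterable, List, NamedTuple, Optional


class SelEntry(NamedTuple):
    record_id: str
    timestamp: datetime
    sensor: str
    event: str
    state: str


def parse_sel_line(line: str) -> Optional[SelEntry]:
    """Parse one pipe-delimited SEL line; return None if it does not fit.

    Assumed layout (taken from the excerpt, not from a specification):
    id | MM/DD/YYYY | HH:MM:SS | sensor | event | state
    """
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 6:
        return None
    record_id, date, time, sensor, event, state = fields
    try:
        timestamp = datetime.strptime(f"{date} {time}", "%m/%d/%Y %H:%M:%S")
    except ValueError:
        return None
    return SelEntry(record_id, timestamp, sensor, event, state)


def machine_check_entries(lines: Iterable[str]) -> List[SelEntry]:
    """Return only the entries whose sensor name mentions a CPU machine check."""
    parsed = (parse_sel_line(line) for line in lines)
    return [entry for entry in parsed if entry and "CPU Machine Chk" in entry.sensor]


if __name__ == "__main__":
    sample = [
        "3 | 12/18/2017 | 03:17:30 | Processor CPU Machine Chk | Transition to Non-recoverable | Asserted",
        "4 | 12/18/2017 | 03:17:30 | Unknown MSR Info Log | | Asserted",
    ]
    for entry in machine_check_entries(sample):
        print(entry.record_id, entry.timestamp.isoformat(), entry.event, entry.state)
```

Running the same filter on a log exported after the 24-hour monitoring period shows whether the machine check has recurred since the log was cleared. Lines that do not match the assumed layout are skipped rather than raising an error.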