Decoding Machine Check Exception (MCE) output after a purple screen error (1005184)
- An ESX/ESXi host halts with a purple diagnostic screen.
- The purple diagnostic screen shows a message similar to:
- Machine Check Exception: Unable to continue
- Hardware (Machine) Error
- PCPU: 1 hardware errors seen since boot (1 corrected by hardware)
- When you extract the logs from the core dump, you see an error similar to:
- ALERT: MCE: 171: Machine Check Exception: Bank x, Status nnnnnnnnnnn
- MC:PCPUn B:x S:nnnnnnnnnnn M:mmmmmmmmmmmm: A:aaaaaaaaaaa
- On AMD systems, a message indicates a hardware issue even though an MCE does not occur. You see an error similar to:
vmkernel: 72:03:47:16.847 cpu4:14403)MCE: 978: MCE not recoverable but did not generate an exception.
The Machine Check Architecture is a mechanism within a CPU that detects and reports hardware issues. When the CPU detects a problem, it raises a Machine Check Exception (MCE). If an MCE is raised and a purple diagnostic screen is displayed, a hardware problem has caused it. There is no other way to generate an MCE.
When the system has faulted with a purple screen, capture the screen output, reboot the server, and contact your hardware vendor. In the meantime, you can decode the information about the fault to get a better idea of what may be happening.
For example, you may see log entries similar to:
MCE: 215: CMCI on cpu1 bank8: Status:0xd000008000310080 Misc:0x0 Addr:0x0: Valid.Overflow.Err enabled.
MCE: 220: Status bits: "Memory Controller Error on Channel 0.
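To illustrate how a Status value such as the one in the log entry above maps to readable flags, the following is a minimal Python sketch. It is not a VMware tool. The function names (extract_status, decode_status) are hypothetical, and the bit positions and the memory controller code layout (0000 0000 1MMM CCCC) are taken from the Intel machine check documentation as assumptions. Only a subset of the architectural fields is decoded, and AMD and model-specific encodings are not covered.

```python
import re

# Flag bits of the machine check status register (Intel layout, assumed).
FLAGS = {
    63: "VAL",    # register contents are valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # Misc value is valid
    58: "ADDRV",  # Addr value is valid
    57: "PCC",    # processor context may be corrupt
}

# MMM request field of a memory controller error code (assumed values).
MEM_REQUEST = {
    0: "generic undefined request",
    1: "memory read error",
    2: "memory write error",
    3: "address/command error",
    4: "memory scrubbing error",
}


def extract_status(line):
    """Return the Status value from a log line as an int, or None."""
    match = re.search(r"Status:\s*(0x[0-9a-fA-F]+)", line)
    return int(match.group(1), 16) if match else None


def decode_status(status):
    """Decode the flag bits and error codes of a 64-bit status value."""
    flags = [name for bit, name in sorted(FLAGS.items(), reverse=True)
             if (status >> bit) & 1]
    mca_code = status & 0xFFFF             # bits 15:0
    model_code = (status >> 16) & 0xFFFF   # bits 31:16
    result = {
        "flags": flags,
        "mca_error_code": mca_code,
        "model_specific_code": model_code,
    }
    # Memory controller errors have the form 0000 0000 1MMM CCCC.
    if (mca_code & 0xFF80) == 0x0080:
        channel = mca_code & 0xF
        result["memory_controller"] = {
            "request": MEM_REQUEST.get((mca_code >> 4) & 0x7, "reserved"),
            # 0xF means the channel is not specified.
            "channel": None if channel == 0xF else channel,
        }
    return result


if __name__ == "__main__":
    line = ("MCE: 215: CMCI on cpu1 bank8: Status:0xd000008000310080 "
            "Misc:0x0 Addr:0x0: Valid.Overflow.Err enabled.")
    print(decode_status(extract_status(line)))
```

For the sample value 0xd000008000310080, the sketch reports the VAL, OVER, and EN flags and a memory controller error on channel 0, which agrees with the "Valid.Overflow.Err enabled" and "Memory Controller Error on Channel 0" text in the log entries. Treat the result as a hint only and confirm the diagnosis with your hardware vendor.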
For more information on interpreting MCE output, see the processor vendor documentation:
- Intel: Chapters 15 and 16 of the Intel 64 and IA-32 Architectures Software Developer's Manual
- AMD: Chapter 9 of the AMD64 Architecture Programmer's Manual Volume 2: System Programming
Note: The preceding links were correct as of September 8, 2014. If you find the links are broken, provide feedback and a VMware employee will update the link.
- Collecting diagnostic information from an ESX or ESXi host that experiences a purple diagnostic screen (1004128)
- Spanish version of this article: Decodificación del error: Machine check Exception (MCE), después de una pantalla de error color púrpura (1036039)