"LINT1 motherboard interrupt" error in an ESX/ESXi host (1804)
- ESXi/ESX hosts are unstable and may fail with a purple diagnostic screen citing an NMI, Non-Maskable, or LINT1 Interrupt.
- The console displays an entry similar to:
LINT1 motherboard interrupt. This is a hardware problem: please contact your hardware vendor.
Note: For additional symptoms and log entries, see the Additional Information section.
An NMI is a physical hardware event. It is typically the result of a non-recoverable condition (in the context of continued operation during that specific boot cycle) that the system BIOS and/or management chipset encounters.
NMI events are routed by the CPU through the Advanced Programmable Interrupt Controller (APIC) to the operating system (in this case, the ESXi host) through the operating system kernel (in this case, the VMkernel). NMI data is transmitted through port 0x61 (ISA-Compatible Register Address hex-61), which is an 8-bit register reserved for NMI data.
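As a rough illustration, the sketch below decodes an 8-bit port 0x61 value, such as the 0x5 and 0xb1 values in the sample log entries later in this article. The bit meanings are an assumption based on the conventional PC/AT "NMI status and control" register layout, not something this article defines, and a given chipset may differ. The names PORT_61_BITS, decode_port_61, and nmi_source_reported are hypothetical.

```python
from __future__ import annotations

# ASSUMPTION: conventional PC/AT layout of I/O port 0x61.
# Confirm the layout against the chipset documentation for your platform.
PORT_61_BITS = {
    7: "SERR# / memory parity error reported (NMI source)",
    6: "IOCHK# / I/O channel check reported (NMI source)",
    5: "timer counter 2 output status",
    4: "refresh cycle toggle",
    3: "IOCHK# NMI reporting disabled",
    2: "SERR# NMI reporting disabled",
    1: "speaker data enabled",
    0: "timer counter 2 gate enabled",
}


def decode_port_61(value: int) -> list[str]:
    """Return descriptions of the bits set in value, highest bit first."""
    if not 0 <= value <= 0xFF:
        raise ValueError("port 0x61 is an 8-bit register")
    return [PORT_61_BITS[bit] for bit in range(7, -1, -1) if value & (1 << bit)]


def nmi_source_reported(value: int) -> bool:
    """True if either NMI source status bit (bit 7 or bit 6) is set."""
    return bool(value & 0xC0)


if __name__ == "__main__":
    for sample in (0x05, 0xB1):  # values taken from the sample log entries
        print(f"0x{sample:02x}: {decode_port_61(sample)}")
```

A decode like this only narrows down what the chipset reported. It does not replace the hardware vendor's diagnostics.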
An NMI event occurs due to hardware issues such as:
- A bad memory module or processor.
- Severe thermal cycling of a critical component, usually after an extended downtime or a cooling component failure.
- Components running out-of-specification, such as an over-voltage or under-voltage condition due to a hardware fault involving a voltage regulator module.
- Unapproved or incompatible components, such as an active memory backplane whose design revision is too early for the chassis.
- A firmware, BIOS, or other component mismatch. For example, an option card of revision X may require a minimum option-card firmware revision Y and a minimum chassis BIOS revision Z.
- The CPU IOMMU feature, which maps DMA memory for a device from the host operating system to the guest operating system, encounters an error and cannot proceed. You can identify the PCI ID of the device from the VMkernel core dump (for example, Device 007:00.0), then run lspci from the ESXi shell and match the PCI ID to the device. Note that the PCI device may not be the cause, but only the trigger of an issue with another hardware component.
- The ESXi kernel may use an IPI NMI in an attempt to unstick a CPU due to either a hardware or a software condition.
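For the IOMMU case in the list above, pulling the device address out of the warning line can be scripted. The following is a minimal sketch that assumes the line looks like the sample IOMMU warning quoted later in this article. Other ESXi builds may format the line differently, and the names IOMMU_FAULT and parse_iommu_fault are hypothetical.

```python
import re

# Pattern modeled on the article's sample IOMMU warning line (an assumption
# about the format; adjust it if your build logs the fault differently).
IOMMU_FAULT = re.compile(
    r"IOMMU Unit #\s*(?P<unit>\d+):.*?"
    r"Device (?P<bus>[0-9a-fA-F]+):(?P<dev>[0-9a-fA-F]+)\.(?P<fn>[0-7])"
    r".*?Faulting PA = (?P<pa>0x[0-9a-fA-F]+)"
    r".*?Fault Reason = (?P<reason>\d+)"
)


def parse_iommu_fault(line):
    """Return the fields of an IOMMU fault line, or None if it does not match."""
    m = IOMMU_FAULT.search(line)
    if m is None:
        return None
    return {
        "unit": int(m["unit"]),
        "device": f'{m["bus"]}:{m["dev"]}.{m["fn"]}',  # bus:device.function
        "faulting_pa": int(m["pa"], 16),
        "fault_reason": int(m["reason"]),
    }


if __name__ == "__main__":
    sample = (
        "WARNING: IOMMUIntel: 2211: IOMMU Unit # 0: R/W = 1, "
        "Device 007:00.0 Faulting PA = 0xdf63e000 Fault Reason = 6"
    )
    print(parse_iommu_fault(sample))
```

The extracted device value can then be compared by eye with the addresses that lspci lists. The two tools may pad the bus number differently, so compare the numeric values rather than the raw strings.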
If you experience an NMI event:
- Identify the virtual machines (if any) that were powered on at the time of the NMI event.
- Check if powering on a specific virtual machine triggers an NMI event.
- Verify if moving a suspect memory module to a new slot (and therefore higher or lower in memory address space) alters the behavior.
Note: Replacing or relocating hardware components does not necessarily help you determine the root cause of the NMI event and may result in unplanned downtime.
To resolve the NMI event, contact the hardware vendor and provide these data:
- The timeframe in which the event occurred.
- At least 10 minutes of logs leading up to the event.
- Chassis diagnostics log output and management chipset log output.
- Chassis vital product data.
- A copy of the
- The relevant VMware Service Request number, if opened.
- Chassis management chipsets often function as an intelligent handler for chassis faults and can capture significant amounts of information during an NMI event.
- The IBM xSeries chassis includes a BIOS option of Reboot on System NMI. When enabled, this results in an immediate chassis-reboot rather than a chassis-halt. In this event the ESXi host logs do not mention the NMI. Other enterprise hardware vendors may offer a similar BIOS option.
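For the item above that asks for at least 10 minutes of logs leading up to the event, the window can be cut from a log file with a short script. This is only a sketch: it assumes each relevant line begins with a UTC timestamp in the form shown in the purple diagnostic screen example in this article (YYYY-MM-DDThh:mm:ss.mmmZ), and it skips lines without such a timestamp. The function names and the sample lines are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(line):
    """Parse a leading timestamp such as 2020-01-01T03:12:21.546Z, else None."""
    token = line.split(" ", 1)[0]
    try:
        parsed = datetime.strptime(token, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        return None
    return parsed.replace(tzinfo=timezone.utc)


def lines_before_event(lines, event_time, minutes=10):
    """Keep lines stamped within `minutes` before event_time (inclusive)."""
    start = event_time - timedelta(minutes=minutes)
    selected = []
    for line in lines:
        ts = parse_ts(line)
        # Lines without a parseable timestamp are skipped in this sketch.
        if ts is not None and start <= ts <= event_time:
            selected.append(line)
    return selected


if __name__ == "__main__":
    # Hypothetical log lines, invented for illustration only.
    log = [
        "2020-01-01T03:00:00.000Z cpu0:1)earlier entry",
        "2020-01-01T03:05:00.000Z cpu0:1)recent entry",
        "2020-01-01T03:12:21.546Z cpu0:1)@BlueScreen: LINT1/NMI",
    ]
    event = parse_ts(log[-1])
    for kept in lines_before_event(log, event):
        print(kept)
```

Passing a larger value for minutes captures more context if the vendor asks for it.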
Note: Depending on the version of ESXi and the configuration, NMI log entries may appear in the /var/log/vmkernel or /var/log/messages log files, on the console, or in the VMkernel core dump file if the condition triggers a VMkernel purple diagnostic screen.
- The VMkernel log file at /var/log/messages contains entries similar to one of these:
ALERT: APIC: 1143: Lint1 interrupt on pcpu 0 (port x61 contains 0x5)
ALERT: APIC: 1150: Lint1 interrupt on pcpu 0 (port x61 contains 0xb1)
WARNING: NMI: 2550: Forwarding LINT1 motherboard interrupt to host (75188 forwarded so far)
Fatal NMI: IO Parity Error (0xNN)
Fatal NMI: RAM Parity Error (0xNN)
- The VMkernel log entries indicate that a Non-Maskable Interrupt (NMI) event occurred.
- The messages log at /var/log/messages contains entries similar to:
kernel: [1831046.301319] Uhhuh. NMI received for unknown reason 11.
kernel: [1831046.323134] Dazed and confused, but trying to continue
- You may experience a purple diagnostic screen when passing a device through to a virtual machine. When reviewing the VMkernel core dump, you see events similar to:
WARNING: IOMMUIntel: 2211: IOMMU Unit # 0: R/W = 1, Device 007:00.0 Faulting PA = 0xdf63e000 Fault Reason = 6
- In the purple diagnostic screen on a non-responsive ESXi 5.5 host, you see an entry similar to:
YYYY-MM-DDT03:12:21.546Z cpu0:16969934)@BlueScreen: LINT1/NMI (motherboard nonmaskable interrupt), undiagnosed. This may be a hardware problem; please contact your hardware vendor.
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
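The log signatures quoted in this section can be searched for in a collected log file. The sketch below counts lines that match patterns derived from the sample entries above. Because those entries are only examples, the patterns are assumptions and may need adjusting for a particular ESXi version. The signature labels and function names are hypothetical.

```python
import re

# Patterns derived from the sample entries in this article (assumed formats).
SIGNATURES = [
    ("apic_lint1", re.compile(
        r"Lint1 interrupt on pcpu \d+ \(port x61 contains 0x[0-9a-fA-F]+\)")),
    ("nmi_forwarded", re.compile(r"Forwarding LINT1 motherboard interrupt to host")),
    ("fatal_parity", re.compile(r"Fatal NMI: (?:IO|RAM) Parity Error")),
    ("unknown_reason", re.compile(r"NMI received for unknown reason")),
    ("iommu_fault", re.compile(r"IOMMU Unit #")),
    ("purple_screen", re.compile(r"@BlueScreen: LINT1/NMI")),
]


def classify(line):
    """Return the label of the first matching signature, or None."""
    for name, pattern in SIGNATURES:
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count how many lines match each signature."""
    counts = {}
    for line in lines:
        name = classify(line)
        if name is not None:
            counts[name] = counts.get(name, 0) + 1
    return counts


if __name__ == "__main__":
    sample = [
        "ALERT: APIC: 1143: Lint1 interrupt on pcpu 0 (port x61 contains 0x5)",
        "Fatal NMI: RAM Parity Error (0xNN)",
        "kernel: [1831046.301319] Uhhuh. NMI received for unknown reason 11.",
    ]
    print(summarize(sample))
```

A count of zero does not rule out an NMI. As noted above, some chassis reboot on NMI before the host can log anything.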
- Español: Identificar y abordar eventos de interrupciones no enmascarables en un host ESX/ESXi (2073165)
- Português: Identificando e resolvendo eventos de interrupção não mascarável em um host ESX/ESXi (2130004)
- 日本語: ESX/ESXi ホスト上でマスク不可能割り込みイベントを特定および処理する (2076746)
- 简体中文: 确定并解决 ESX/ESXi 主机上的不可屏蔽中断事件 (2077552)