ESXi host with a CD-ROM drive model DU-8A5LH can fail with a PSOD and a vmw_ahci driver backtrace
Article ID: 318727
Products
VMware vSphere ESXi
Issue/Introduction
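This article describes a purple diagnostic screen (PSOD) on ESXi hosts that use a CD-ROM drive model DU-8A5LH and provides a workaround to avoid it.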
Symptoms:
An ESXi host using a CD-ROM drive model DU-8A5LH fails with a PSOD whose backtraces reference vmw_ahci, with lines such as:
PanicvPanicInt@vmkernel#nover+0x545 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Panic_WithBacktrace@vmkernel#nover+0x56 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Heartbeat_DetectCPULockups@vmkernel#nover+0x4be xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Timer_BHHandler@vmkernel#nover+0xdc stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
BH_DrainAndDisableInterrupts@vmkernel#nover+0x7b stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
IntrCookie_VmkernelInterrupt@vmkernel#nover+0xc6 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
IDT_IntrHandler@vmkernel#nover+0x9d stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
gate_entry_@vmkernel#nover+0x0 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Power_ArchSetCState@vmkernel#nover+0x10a stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
CpuSchedIdleLoopInt@vmkernel#nover+0x39b stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
CpuSchedDispatch@vmkernel#nover+0x114a stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
CpuSchedWait@vmkernel#nover+0x27a stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
CpuSchedTimedWaitInt@vmkernel#nover+0xa8 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
CpuSched_EventQueueWaitShared@vmkernel#nover+0x2c stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
UserThread_QueueWait@(user)#<None>+0x34 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
LinuxThread_Futex@(user)#<None>+0x273 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
User_LinuxSyscallHandler@(user)#<None>+0x113 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
User_LinuxSyscallHandler@vmkernel#nover+0x1d stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
gate_entry_@vmkernel#nover+0x0 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
and
Util_FormatTimestampUTC@vmkernel#nover+0x1e stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
LogFormatStringV@vmkernel#nover+0x9c stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
LogWarningWithPcpu@vmkernel#nover+0x40f stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
vmk_vLogNoLevel@vmkernel#nover+0x63 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
vmk_LogNoLevel@vmkernel#nover+0x3e stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
CompletionBottomHalf@(vmw_ahci)#<None>+0x69b stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
HBAIntrHandler@(vmw_ahci)#<None>+0x84 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
AHCI_EdgeIntrHandler@(vmw_ahci)#<None>+0x20 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
IntrCookieBH@vmkernel#nover+0x1e0 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
BH_Check@vmkernel#nover+0xfe stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
CpuSchedDispatch@vmkernel#nover+0xed4 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
CpuSchedWait@vmkernel#nover+0x27a stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
CpuSched_NoEvqWait@vmkernel#nover+0x19 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
vmk_WorldWait@vmkernel#nover+0x65 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
ExceptionHandlerWorld@(vmw_ahci)#<None>+0x9b stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
vmkWorldFunc@vmkernel#nover+0x4f stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
CpuSched_StartWorld@vmkernel#nover+0x99 stack: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
vmkernel.log entries such as:
2018-12-14T13:51:11.701Z cpu11:66006)vmw_ahci[00000017]: CompletionBottomHalf:PORT_IRQ_UNK_FIS exception.
2018-12-14T13:51:11.701Z cpu11:66006)vmw_ahci[00000017]: LogExceptionSignal:Port 7, Signal: --|--|--|--|--|--|IR|--|--|--|--|-- (0x0040) Curr: --|--|--|--|--|--|IR|--|--|--|--|-- (0x0040)
2018-12-14T13:51:11.701Z cpu11:66006)vmw_ahci[00000017]: CompletionBottomHalf:strange irq(s), 0x10
2018-12-14T13:51:11.701Z cpu11:66006)vmw_ahci[00000017]: CompletionBottomHalf:PORT_IRQ_UNK_FIS exception.
2018-12-14T13:51:11.701Z cpu11:66006)vmw_ahci[00000017]: LogExceptionSignal:Port 7, Signal: --|--|--|--|--|--|IR|--|--|--|--|-- (0x0040) Curr: --|--|--|--|--|--|IR|--|--|--|--|-- (0x0040)
2018-12-14T13:51:11.701Z cpu11:66006)vmw_ahci[00000017]: CompletionBottomHalf:strange irq(s), 0x10
2018-12-14T13:51:11.701Z cpu11:66006)vmw_ahci[00000017]: CompletionBottomHalf:PORT_IRQ_UNK_FIS exception.
2018-12-14T13:51:11.701Z cpu11:66006)vmw_ahci[00000017]: LogExceptionSignal:Port 7, Signal: --|--|--|--|--|--|IR|--|--|--|--|-- (0x0040) Curr: --|--|--|--|--|--|IR|--|--|--|--|-- (0x0040)
2018-12-14T13:51:11.701Z cpu11:66006)vmw_ahci[00000017]: HBAIntrHandler:new interrupts coming, PxIS = 0x10, no repeat
2018-12-14T13:51:11.701Z cpu11:66006)vmw_ahci[00000017]: CompletionBottomHalf:strange irq(s), 0x10
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
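To check whether a host is logging this pattern, one option is to count the matching entries in vmkernel.log. The following is a minimal sketch, not an official diagnostic: the function name count_unk_fis is hypothetical, and the default path /var/log/vmkernel.log is an assumption based on the usual ESXi log location.

```shell
# Hypothetical helper: count PORT_IRQ_UNK_FIS entries logged by vmw_ahci.
# The log path is an optional argument; the default location is an assumption.
count_unk_fis() {
    log_file="${1:-/var/log/vmkernel.log}"
    # grep -c prints the number of matching lines (and exits non-zero when it is 0).
    grep -c 'vmw_ahci.*PORT_IRQ_UNK_FIS' "$log_file"
}
```

Running count_unk_fis with no argument reads the default log. A count that grows quickly between two runs would be consistent with the repeated exception logging shown above.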
Environment
VMware vSphere 7.0.x
VMware vSphere ESXi 6.7
VMware vSphere ESXi 6.5
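To confirm that a host in one of these environments has the affected drive, the storage device list can be searched for the model string. This is a sketch under assumptions: the function name has_du8a5lh is hypothetical, and it assumes the drive reports DU-8A5LH somewhere in the output of esxcli storage core device list (for example in the Model field).

```shell
# Hypothetical helper: exit 0 if a device reporting the DU-8A5LH model is present.
# Assumes the model string appears in the esxcli storage device listing.
has_du8a5lh() {
    esxcli storage core device list | grep -qi 'DU-8A5LH'
}
```

For example, has_du8a5lh && echo "DU-8A5LH drive detected" prints a message only when a matching device is listed.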
Cause
On an ESXi host with a CD-ROM drive model DU-8A5LH, the CD-ROM drive might report an unknown Frame Information Structure (FIS) exception. The vmw_ahci driver does not handle the exception properly and creates repeated PORT_IRQ_UNK_FIS exception logs in the kernel. The repeated logging causes a lack of physical CPU heartbeat, and the host fails with a PSOD.
Resolution
VMware Engineering is aware of this issue.
Workaround:
Disable the vmw_ahci driver by running the following command from the ESXi command line:
esxcli system module set -e false -m vmw_ahci
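After running the command, the module state can be checked. The sketch below is illustrative only: the function name ahci_module_status is hypothetical, and it assumes that esxcli system module list prints three columns (Name, Is Loaded, Is Enabled) and that awk is available in the ESXi shell.

```shell
# Hypothetical helper: report whether vmw_ahci is loaded and enabled.
# Assumes the column order Name, Is Loaded, Is Enabled in the module listing.
ahci_module_status() {
    esxcli system module list |
        awk '$1 == "vmw_ahci" { print "loaded=" $2 " enabled=" $3 }'
}
```

A result of enabled=false indicates that the setting was recorded. A module that is already loaded is generally expected to stay loaded until the host is rebooted, so loaded=true immediately after the change would not by itself mean that the workaround failed.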
Additional Information
Impact/Risks: No Impact