ESXi host fails with PSOD after upgrading to 6.5 (2151749)

Symptoms

  • After upgrading to ESXi 6.5 Update 1, the host fails with a purple diagnostic screen (PSOD).
  • The issue is observed on hosts equipped with 10 Gb NICs.
  • The purple diagnostic screen contains a backtrace similar to:

    2017-09-16T15:34:30.908Z cpu6:65645)@BlueScreen: #PF Exception 14 in world 65645:HELPER_UPLIN IP 0x41802c496258 addr 0x0
    PTEs:0x292379a027;0x2efe54c027;0xbfffffffff001;
    2017-09-16T15:34:30.908Z cpu6:65645)Code start: 0x41802c200000 VMK uptime: 4:02:26:10.151
    2017-09-16T15:34:30.908Z cpu6:65645)0x4390c369bd00:[0x41802c496258]UplinkTreePackQueueFilters@vmkernel#nover+0x188 stack: 0xe15427000
    2017-09-16T15:34:30.909Z cpu6:65645)0x4390c369bd90:[0x41802c49e142]UplinkLB_LoadBalanceCB@vmkernel#nover+0x1e42 stack: 0x1
    2017-09-16T15:34:30.909Z cpu6:65645)0x4390c369bf20:[0x41802c4916f2]UplinkAsyncProcessCallsHelperCB@vmkernel#nover+0x116 stack: 0x43048761eac0
    2017-09-16T15:34:30.910Z cpu6:65645)0x4390c369bf50:[0x41802c2c9e0d]helpFunc@vmkernel#nover+0x3c5 stack: 0x4300b9b2a050
    2017-09-16T15:34:30.910Z cpu6:65645)0x4390c369bfe0:[0x41802c4c91b5]CpuSched_StartWorld@vmkernel#nover+0x99 stack: 0x0
    2017-09-16T15:34:30.913Z cpu6:65645)base fs=0x0 gs=0x418041800000 Kgs=0x0

  • Debugger (gdb) output shows a similar stack trace:

    (gdb) bt
    #0 UplinkTreePackQueueFilters (dev=dev@entry=0x430671a711c0, fromQueue=0x430671ba3c88, toQueue=0x430671ba3d90, fromQCpuLoad=4, toQCpuLoad=<optimized out>, numFiltersToMove=<optimized out>) at bora/vmkernel/net/uplink/lb.c:7121
    #1 0x000041801649dffe in UplinkTreeLB_DoOpportunisticPacking (dev=0x430671a711c0) at bora/vmkernel/net/uplink/lb.c:10631
    #2 UplinkRxQueuesLoadBalance (dev=0x430671a711c0) at bora/vmkernel/net/uplink/lb.c:11545
    #3 UplinkLB_LoadBalanceCB (dev=0x430671a711c0, arg=<optimized out>) at bora/vmkernel/net/uplink/lb.c:12633
    #4 0x000041801648fe12 in UplinkAsyncProcessCallsHelperCB (arg=0x430671a711c0) at bora/vmkernel/net/uplink/async.c:442
    #5 0x00004180162c92fd in helpFunc (data=<optimized out>) at bora/vmkernel/main/helper.c:3279
    #6 0x00004180164c8c95 in CpuSched_StartWorld (destWorld=<optimized out>, previous=<optimized out>) at bora/vmkernel/sched/cpusched.c:10780
    #7 0x0000000000000000 in ?? () from /bldmnt/storage61/release/bora-5310538/build/linux64/bora/build/esx/release/vmkmod-vmkernel64/chardevs
    (gdb) info locals
    filterCpuLoad = <optimized out>
    toQNewCpuLoad = 17
    rxf = 0x430671b589c8
    i = 1
    movedFilters = <optimized out>
    idx = <optimized out>
    newIdx = {qidx = 4, fidx = 2}
    fromQueueLoad = 4



    Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
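
To compare a saved purple-screen capture or log excerpt against the symptoms above, a check along the following lines may help. This is a minimal sketch, not a VMware tool: the function name `matches_psod_signature` is hypothetical, and the patterns are taken only from the example backtrace in this article, so a real screen may differ in details.

```python
import re

# Patterns copied from the example backtrace in this article (assumption:
# a matching host shows the same exception line and the same two frames).
PAGE_FAULT = re.compile(r"@BlueScreen: #PF Exception 14 in world \d+:")
FAULT_FRAME = "UplinkTreePackQueueFilters@vmkernel"
CALLER_FRAME = "UplinkLB_LoadBalanceCB@vmkernel"


def matches_psod_signature(text: str) -> bool:
    """Return True if text resembles the backtrace shown in this article.

    Hypothetical helper: it only looks for the page-fault line plus the
    two uplink load-balancer frames, and ignores addresses and timestamps.
    """
    if PAGE_FAULT.search(text) is None:
        return False
    return FAULT_FRAME in text and CALLER_FRAME in text
```

Addresses, timestamps, and world IDs are deliberately ignored because they vary between hosts. A match only suggests the same symptom; it does not confirm the cause.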

Cause

This issue occurs because the NetQueue commit phase stops abruptly when hardware activation of an Rx queue fails. As a result, the internal data structures of the NetQueue layer can go out of sync with the device and cause a PSOD.
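
The failure mode described above can be illustrated with a toy model. The sketch below is not VMkernel code: `ToyDevice`, `commit`, and the order in which bookkeeping and activation happen are all assumptions made for illustration. It only shows how a commit loop that stops at the first hardware failure, without rolling back, leaves software state that the device does not share.

```python
class ToyDevice:
    """Hypothetical stand-in for a NIC that fails to activate one Rx queue."""

    def __init__(self, failing_queue):
        self.failing_queue = failing_queue
        self.active = set()  # queues the "hardware" actually activated

    def activate(self, qid):
        if qid == self.failing_queue:
            return False  # simulated hardware activation failure
        self.active.add(qid)
        return True


def commit(device, planned_queues):
    """Toy commit phase: record each queue in software, then activate it.

    Assumption for illustration: bookkeeping is updated before activation,
    and the loop stops at the first failure without undoing anything.
    """
    software_view = set()
    for qid in planned_queues:
        software_view.add(qid)
        if not device.activate(qid):
            break  # abrupt stop, no rollback
    return software_view


device = ToyDevice(failing_queue=2)
software_view = commit(device, [1, 2, 3])

# Queues the software believes are active but the device never activated.
out_of_sync = software_view - device.active
```

In this model `out_of_sync` contains queue 2: later code that trusts the software view would act on a queue the device does not have, which is the kind of inconsistency the cause statement describes.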


Resolution

This is a known issue affecting ESXi 6.5.

Currently, there is no resolution.

To work around this issue, downgrade ESXi to 6.0 U2.

For more information, see Reverting to a previous version of ESXi (1033604).

Impact/Risks

The host crashes and requires a hard reboot.
