Connection to the /bootbank partition intermittently breaks when you use USB or SD devices

Article ID: 318611

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
  • The ESXi host boots from a USB device or SD card.
  • Logs indicate that the boot device is corrupt.
  • The problem may cause purple diagnostic screen (PSOD) errors on ESXi hosts.
    • If the PSOD is related to this problem, the diagnostic message indicates or implies a boot disk failure.
  • The corrupt partition is the VMFS-L locker partition.
  • vmkwarning.log may show entries similar to:
2021-02-02T19:26:34 cpu23:3614392)ALERT: Bootbank cannot be found at path '/bootbank'
  • vobd.log may show entries similar to:
2021-02-02T19:26:34.577Z: [vmfsCorrelator] 2938471510366us: [esx.problem.vmfs.resource.corruptondisk] 5fecb7ff-8662d6b5-7732-24b6fde3ff81 LOCKER-5fecb7ff-8662d6b5-7732-24b6fde3ff81
2021-02-02T19:31:38.981Z: [vmfsCorrelator] 2938700315575us: [vob.vmfs.resource.corruptondisk] Volume 5fecb7ff-8662d6b5-7732-24b6fde3ff81 ("LOCKER-5fecb7ff-8662d6b5-7732-24b6fde3ff81") might be damaged on the disk. Resource cluster metadata corruption has been detected.
  • vmkernel.log may show entries similar to:
2021-02-02T19:57:01.265Z cpu4:2098636)WARNING: [type 6] Invalid clusterNum 2251799813685250.
2021-02-02T19:57:01.265Z cpu4:2098636)WARNING: Res3: 7148: Volume 5fecb7ff-8662d6b5-7732-24b6fde3ff81 ("LOCKER-5fecb7ff-8662d6b5-7732-24b6fde3ff81") might be damaged on the disk. Resource cluster metadata corruption has been detected.
2021-02-02T19:57:01.265Z cpu4:2098636)WARNING: FS3: 633: VMFS volume LOCKER-5fecb7ff-8662d6b5-7732-24b6fde3ff81/5fecb7ff-8662d6b5-7732-24b6fde3ff81 on mpx.vmhba32:C0:T0:L0:7 has been detected corrupted
2021-02-02T19:38:02.798Z cpu7:2097218)ScsiDeviceIO: 4062: Cmd(0x45ba922e3e80) 0x1a, CmdSN 0x1fea2a0 from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x7 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0.
2021-02-02T19:38:02.935Z cpu14:2097225)ScsiDeviceIO: 4062: Cmd(0x45ba92263280) 0x1a, CmdSN 0x1fea2a8 from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x7 D:0x0 P:0x0 Invalid sense data: 0xc7 0xf 0x43.


NOTE: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
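
The signatures quoted above can be searched for in one pass. The following is a minimal sketch, not a VMware tool: the function name scan_bootbank_symptoms is illustrative, and the log locations in the usage line (/var/log/vmkwarning.log, /var/log/vobd.log, /var/log/vmkernel.log) are assumptions that may differ if the host logs elsewhere.

```shell
# Print every line in the given log files that matches one of the
# symptom signatures quoted in this article. Each match is prefixed
# with the name of the file it came from. Unreadable files are skipped.
scan_bootbank_symptoms() {
    grep -H -E \
        -e 'Bootbank cannot be found at path' \
        -e 'vmfs\.resource\.corruptondisk' \
        -e 'has been detected corrupted' \
        "$@" 2>/dev/null
}
```

Example usage on a host: scan_bootbank_symptoms /var/log/vmkwarning.log /var/log/vobd.log /var/log/vmkernel.log. No output means none of the three signatures was found in the files that could be read; it does not by itself rule out the issue.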


Environment

VMware vSphere ESXi 7.0.0
VMware vSphere ESXi 6.7

Cause

A device disconnection occurs on the USB hardware of the ESXi host. The issue is seen with xHCI controllers: when commands fail and a USB bus reset happens after one retry, all USB devices (including a USB SD card) are reconnected. When the USB boot device is reconnected, the ESXi host may not be able to release the path resource; it then treats the device as newly plugged in and assigns it a new path. From the ESXi host's point of view, the boot device is lost.
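
Repeated command failures against the USB boot device in vmkernel.log are one visible sign of the disconnection described here. The following is a minimal sketch that tallies ScsiDeviceIO failures per device; the function name count_scsi_failures is illustrative, and it assumes log lines shaped like the vmkernel.log excerpts in this article, where the device name is the first double-quoted string on the line.

```shell
# Read vmkernel log text on stdin and print "<count> <device>" for each
# device that has ScsiDeviceIO lines reporting a failed command.
# Splitting on double quotes makes field 2 the quoted device name.
count_scsi_failures() {
    awk -F'"' '/ScsiDeviceIO:/ && / failed / { n[$2]++ }
               END { for (d in n) print n[d], d }'
}
```

Example usage on a host: count_scsi_failures < /var/log/vmkernel.log (the path is an assumption). A high count for the boot device is a hint to investigate further, not proof of this specific issue.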

Resolution

This issue is resolved in VMware vSphere ESXi 7.0 U2c. For more information, refer to the ESXi 7.0 U2c release notes. To download, go to the Customer Connect Patch Downloads page.
This issue is resolved in VMware vSphere ESXi 6.7 U3p. For more information, refer to the ESXi 6.7 U3p release notes. To download, go to the Customer Connect Patch Downloads page.
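
To check whether a host already runs a fixed release, its build number can be compared with the build number listed in the relevant release notes. The following is a minimal sketch under stated assumptions: the function name build_at_least is illustrative, the minimum build number is not given in this article and must be taken from the release notes, and the output of esxcli system version get is assumed to contain a line of the form "Build: Releasebuild-<number>".

```shell
# Read `esxcli system version get` output on stdin and succeed (exit 0)
# if the build number is at least the minimum passed as $1.
# Fails if no build line is found. Compare only within the same release
# line (for example 7.0 against 7.0), since build numbers are not
# comparable across release lines.
build_at_least() {
    min_build=$1
    build=$(sed -n 's/^ *Build: *[A-Za-z]*-\([0-9][0-9]*\).*/\1/p' | head -n 1)
    [ -n "$build" ] && [ "$build" -ge "$min_build" ]
}
```

Example usage on a host: pipe the output of esxcli system version get into build_at_least followed by the build number from the release notes, and act on the exit status.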

NOTE: As a best practice, do not place the dump partition on a USB storage device, and do not put USB devices under a heavy workload. For more information, see the VMware knowledge base articles Configuring ESXi coredump to file instead of partition and High frequency of read operations on VMware Tools image may cause SD card corruption.

Workaround:
If the SD card is a low-endurance device, you can reduce heavy access to it by following the steps below.
The ToolsRamdisk advanced option tells ESXi to copy the VMware Tools image, which resides on the SD card, to a RAM disk. Heavy access to the Tools image mainly occurs in vSAN or VDI environments. The first command below creates the advanced option, which accepts a value of 0 (disable) or 1 (enable).
  • To do this, run the following commands:
# esxcfg-advcfg -A ToolsRamdisk --add-desc "Use VMware Tools repository from /tools ramdisk" --add-default "0" --add-type 'int' --add-min "0" --add-max "1"
# esxcli system settings advanced set -o /UserVars/ToolsRamdisk -i 1
# reboot

This sets the ToolsRamdisk option to enabled. On boot, ESXi checks the value of ToolsRamdisk and copies the files to the /tools ramdisk. To verify this, run the command:
# esxcli system visorfs ramdisk list
This command lists the ramdisks that have been created. In particular, the /tools ramdisk appears in this output with a Used value of 200 MB, which shows that the tools have been copied there successfully.
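
The verification can also be scripted. The following is a minimal sketch; the function name has_tools_ramdisk is illustrative, and it assumes that the mount point is the last column of each row in the ramdisk listing.

```shell
# Read the output of `esxcli system visorfs ramdisk list` on stdin and
# succeed (exit 0) if any row has /tools as its last field, which is
# assumed here to be the mount point column.
has_tools_ramdisk() {
    awk '$NF == "/tools" { found = 1 } END { exit !found }'
}
```

Example usage on a host: esxcli system visorfs ramdisk list | has_tools_ramdisk && echo "tools ramdisk present". The sketch checks only that the ramdisk exists; the Used value still has to be read from the listing itself.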

For more information, refer to the VMware KB article High frequency of read operations on VMware Tools image may cause SD card corruption.

Future ESXi 7.0.x versions will have this advanced option set automatically. Refer to the VMware KB article ToolsRamdisk option is not available with ESXi 7.0.x releases.

This issue may occur on several releases of ESXi; however, the likelihood of experiencing the behaviour is higher on ESXi 7.0 because of product changes that require better performance and endurance from the boot device, as noted below:

Starting in ESXi 7.0, the boot partition is formatted as VMFS-L instead of FAT (previous releases) to improve I/O performance.


Additional Information



Impact/Risks:
Requires ESXi host reboot.