Bootbank loads in /tmp/ after reboot of ESXi 7.0 Update 1 host

Article ID: 318029

Products

VMware vSphere ESXi

Issue/Introduction

  • Configuration changes do not persist across host reboots
  • The ESXi host may roll back to a previous configuration
  • In vSAN environments, the health status may report warnings that the EPD status is abnormal
  • From the ESXi shell, the bootbank, altbootbank and scratch partitions point to /tmp/

  • Note: A similar issue can occur after an upgrade to vSphere 7.0 Update 1. During boot, ESXi attempts to update the Software FCoE configuration for a NIC, and the resulting configstore write failures eventually lead to incorrect or missing bootbank/altbootbank symlinks. That issue produces the same symptoms as those described in this article. If FCoE is involved in your environment, also refer to the following article for its workaround: https://kb.vmware.com/s/article/81722


Environment

VMware vSphere ESXi 7.0.0

Resolution

Cause

The storage-path-claim service claims the storage device ESXi is installed on during startup. This causes the bootbank/altbootbank image to become unavailable and the ESXi host reverts to ramdisk.

Resolution

This issue has been fixed in VMware ESXi 7.0 Update 1c.

Workaround

To work around this issue, delay the storage-path-claim service so that ESXi can retrieve the correct bootbank image from its storage device.

Step 1 - Determine the required delay (in seconds)
  1. In the ESXi shell or over SSH, run the following command:

grep storage-path-claim /var/log/sysboot.log

  2. Note the end time (hh:mm:ss) of the storage-path-claim entry, which is the second timestamp in the example log output below (22:51:33 here; log timestamps will vary):

[2020-10-23 22:51:03.218874 - 2020-10-23 22:51:33.436044] sysboot: storage-path-claim

  3. Run the following command to determine when the volume was mounted:

grep 'mounted.*rw' /var/run/log/vobd.log|tail -1

  4. Note the timestamp (hh:mm:ss) in the log output, as in the example below:

2020-10-23T22:52:02.413Z: [vmfsCorrelator] 66257428us: [esx.audit.vmfs.volume.mounted] File system [LOCKER-12345678-12345678-abcd-1234567890abcd, ....] on volume 12345678-12345678-abcd-1234567890abcd has been mounted in 'rw' mode. The datastore is now accessible on this host.

  5. Determine the delay (in seconds) required for the next set of instructions by subtracting the time noted in step 2 from the time noted in step 4:

22:52:02 - 22:51:33 = 29 seconds
 

Note: Use a value of 30 seconds or higher for the next steps. In rare cases, the required value has been observed to reach 240 seconds (4 minutes).
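
The arithmetic in Step 1 can be sketched as a small shell script. This is an illustration only: the function names are hypothetical, the two log lines are the examples from this article (the vobd.log line is truncated), and the script assumes that both timestamps fall on the same day.

```shell
#!/bin/sh
# Example log lines from this article. On a host, these would come from the
# two grep commands in Step 1.
sysboot_line='[2020-10-23 22:51:03.218874 - 2020-10-23 22:51:33.436044] sysboot: storage-path-claim'
vobd_line='2020-10-23T22:52:02.413Z: [vmfsCorrelator] 66257428us: [esx.audit.vmfs.volume.mounted]'

# Extract hh:mm:ss from the second (end) timestamp of a sysboot.log entry.
claim_end_time() {
    printf '%s\n' "$1" | sed 's/.* - [0-9-]* \([0-9:]*\)\..*/\1/'
}

# Extract hh:mm:ss from the leading timestamp of a vobd.log entry.
mount_time() {
    printf '%s\n' "$1" | sed 's/^[0-9-]*T\([0-9:]*\)\..*/\1/'
}

# Convert hh:mm:ss to seconds since midnight (awk reads "08" as decimal 8).
to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

claim_end=$(claim_end_time "$sysboot_line")
mounted=$(mount_time "$vobd_line")
gap=$(( $(to_seconds "$mounted") - $(to_seconds "$claim_end") ))

# Apply the 30-second minimum recommended in the note above.
delay=$gap
if [ "$delay" -lt 30 ]; then
    delay=30
fi

echo "Measured gap: $gap seconds"
echo "devListStabilityCount=$delay"
```

With the example timestamps, the measured gap is 29 seconds, so the script falls back to the recommended minimum of 30.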

Perform one of the following steps to apply the workaround.

Step 2a - Applying the workaround (when console access to ESXi is available)
 
Before you begin: If you do not have console access to the ESXi host, follow the Step 2b instructions instead.
  1. Reboot the ESXi host.
  2. During the pre-boot splash screen, press SHIFT+O to modify the boot option line.
  3. In the resulting screen, move to the end of the boot line.
  4. Add the following to the same line:

devListStabilityCount=30

  5. Press the Enter key to resume boot.
  6. Log in to the ESXi shell or over SSH after the system has rebooted.
  7. Run the following command:

ls -al /

The output lists the links for bootbank and altbootbank. Confirm that the path shows as /vmfs/volumes/UUID for both.

Note: If the path remains as /tmp/, you must reboot the ESXi host and increase the devListStabilityCount value in the boot option line (see earlier steps). Do not proceed further with the instructions until this has been corrected.

  8. Once the bootbank and altbootbank links point to /vmfs/volumes/UUID, navigate to the bootbank directory:

cd /bootbank/

  9. Make a backup of boot.cfg:

cp boot.cfg boot.cfg.bak

  10. Edit boot.cfg and add the following setting to the line beginning with kernelopt, using the value you set in the boot option line:

devListStabilityCount=<value>

  11. Save the changes, and reboot the ESXi host to apply the workaround.
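
The boot.cfg edit described above can also be scripted. The following is a minimal sketch, not a supported procedure: the helper name set_stability_count is hypothetical, the sample file contents are illustrative rather than copied from a real host, and it assumes that kernelopt options are separated by spaces and that the host's sed supports in-place editing with -i.

```shell
#!/bin/sh
# Hypothetical helper: add or update devListStabilityCount on the kernelopt
# line of the given boot.cfg, after saving a copy as boot.cfg.bak.
# Note: cp overwrites any earlier boot.cfg.bak in the same directory.
set_stability_count() {
    cfg="$1"
    value="$2"
    cp "$cfg" "$cfg.bak"
    if grep -q '^kernelopt=.*devListStabilityCount=' "$cfg"; then
        # Option already present: replace only its numeric value.
        sed -i "/^kernelopt=/ s/devListStabilityCount=[0-9]*/devListStabilityCount=$value/" "$cfg"
    else
        # Option absent: append it to the end of the kernelopt line.
        sed -i "/^kernelopt=/ s/[[:space:]]*\$/ devListStabilityCount=$value/" "$cfg"
    fi
}

# Demonstration on a scratch file with illustrative contents (not a real boot.cfg).
demo_dir=$(mktemp -d)
printf 'title=Example\nkernelopt=autoPartition=FALSE\n' > "$demo_dir/boot.cfg"

echo "Before: $(grep '^kernelopt=' "$demo_dir/boot.cfg")"
set_stability_count "$demo_dir/boot.cfg" 30
echo "After:  $(grep '^kernelopt=' "$demo_dir/boot.cfg")"

rm -rf "$demo_dir"
```

On a host, the call would take the form set_stability_count /bootbank/boot.cfg 30, run only after the bootbank link has been confirmed to point to /vmfs/volumes/UUID.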

Step 2b - Applying the workaround (without direct console access)
  1. Navigate to the bootbank directory:

cd /vmfs/volumes/BOOTBANK*/

  2. Make a backup of boot.cfg:

cp boot.cfg boot.cfg.bak

  3. Edit boot.cfg and add the following setting to the line beginning with kernelopt, using the value you determined in Step 1:

devListStabilityCount=<value>

  4. Save the changes, and reboot the ESXi host to apply the workaround.
  5. Log in via SSH after the system has rebooted.
  6. Run the following command:

ls -al /

The output lists the links for bootbank and altbootbank. Confirm that the path shows as /vmfs/volumes/UUID for both.

Note: If the path remains as /tmp/, go back to the start of Step 2b and increase the devListStabilityCount value in the boot.cfg file.
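
The symlink check after reboot can be sketched as a short script instead of reading the ls -al / output by eye. The function name check_bank_link is hypothetical, and the sketch assumes that readlink is available in the shell. It treats any target under /vmfs/volumes/ as healthy and anything else, such as /tmp/, as a sign that the workaround has not yet taken effect.

```shell
#!/bin/sh
# Hypothetical helper: report where a symlink points and whether the target
# is under /vmfs/volumes/.
# Returns 0 if healthy, 1 if the target is elsewhere, 2 if not a symlink.
check_bank_link() {
    link="$1"
    target=$(readlink "$link") || { echo "$link: not a symlink"; return 2; }
    case "$target" in
        /vmfs/volumes/*) echo "$link -> $target (OK)"; return 0 ;;
        *)               echo "$link -> $target (unexpected)"; return 1 ;;
    esac
}

status=0
for link in /bootbank /altbootbank; do
    check_bank_link "$link" || status=1
done

if [ "$status" -ne 0 ]; then
    echo "At least one link is not under /vmfs/volumes/; increase devListStabilityCount and reboot."
fi
```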