Virtual machines running on VMware vSAN 7.0 U1 / 7.0 U1 P02 report in-guest data inconsistency following a concurrent maintenance activity and storage policy change.

Article ID: 326648

Updated On:

Products

VMware vSAN

Issue/Introduction

This article describes a potential data inconsistency that can occur, under very specific conditions, on VMDKs residing on VMware vSAN 7.0 U1 or vSAN 7.0 U1 P02.

Symptoms:
After host(s) exit maintenance mode, applications such as databases report in-guest data inconsistency on a VMDK.
The issue occurs when all of the following conditions are met:

- A host entered maintenance mode with the Ensure Accessibility option.
- While the host(s) exited maintenance mode, a storage policy change was made concurrently on one or more VMs.
- The storage policy change matches one of the following transitions:
RAID-0→RAID-5, RAID-0→RAID-6, RAID-1→RAID-5, RAID-1→RAID-6, RAID-5→RAID-6, RAID-6→RAID-5
* This KB does NOT apply to storage policy changes that are not listed above.
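
As a planning aid, the list of affected transitions above can be expressed as a small lookup. The following is a minimal sketch, not a VMware tool: the function names and the input normalization are assumptions made for illustration, and it only checks the RAID transition, not the maintenance-mode conditions.

```python
# Transitions listed in this article as affected (current layout -> target layout).
AFFECTED_TRANSITIONS = {
    ("RAID-0", "RAID-5"),
    ("RAID-0", "RAID-6"),
    ("RAID-1", "RAID-5"),
    ("RAID-1", "RAID-6"),
    ("RAID-5", "RAID-6"),
    ("RAID-6", "RAID-5"),
}


def normalize(raid_level: str) -> str:
    """Normalize inputs such as 'raid5', 'RAID 5' or 'RAID-5' to 'RAID-5'."""
    digits = "".join(ch for ch in raid_level if ch.isdigit())
    return f"RAID-{digits}"


def is_affected_transition(current: str, target: str) -> bool:
    """Return True if the policy change is one of the transitions listed in this KB."""
    return (normalize(current), normalize(target)) in AFFECTED_TRANSITIONS


if __name__ == "__main__":
    # Hypothetical planned changes, checked before scheduling maintenance.
    for change in [("RAID-1", "RAID-5"), ("RAID-5", "RAID-1")]:
        print(change, is_affected_transition(*change))
```

A transition that returns False here is outside the scope of this article, as stated in the note above.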

 


Environment

VMware vSAN 7.0.x

Cause

This issue occurs in VMware vSAN 7.0 U1 and later when virtual machine disks residing on a vSAN datastore undergo a storage policy change (as described in Symptoms) while, concurrently, an exit maintenance mode task is performed on a host that was previously placed in maintenance mode with Ensure Accessibility. This is a rare occurrence that requires specific conditions to be met.

Resolution

This is a known issue affecting ONLY vSAN 7.0 U1 and vSAN 7.0 U1 Patch 02 with On-Disk-Format version 13, and only under very specific conditions.

If an upgrade is not possible, VMware recommends proactively applying the configuration change described in the "Workaround" section to mitigate the risk of encountering this issue on vSAN 7.0 U1 / vSAN 7.0 U1 Patch 02 releases with On-Disk-Format version 13.

The patch that prevents future impact, and addresses existing VMs that may be exposed to the issue, is available in vSAN 7.0 U1d and higher.

The fix is also available in vSAN 7.0 U2 GA and higher.
If you have already hit this issue, VMware recommends applying the workaround and then upgrading to a fixed release.


Note: Once the cluster has been upgraded to a fixed version, revert the workaround by running the command esxcfg-advcfg -s 1 /VSAN/DeltaComponent on all hosts in the cluster.
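
For clusters with many hosts, the revert command above could be issued from an administrative workstation over SSH. The following is a minimal sketch under stated assumptions: SSH is enabled on each host, root login is permitted, and the host names shown are hypothetical placeholders. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess

# The revert command exactly as given in this article.
REVERT_COMMAND = ["esxcfg-advcfg", "-s", "1", "/VSAN/DeltaComponent"]


def build_ssh_argv(host: str, user: str = "root") -> list:
    """Build the argv for running the revert command on one host over SSH."""
    return ["ssh", f"{user}@{host}", shlex.join(REVERT_COMMAND)]


def revert_on_hosts(hosts, dry_run: bool = True) -> dict:
    """Run (or, with dry_run=True, only print) the revert command for every host.

    Returns a mapping of host -> exit code (None for a dry run).
    """
    results = {}
    for host in hosts:
        argv = build_ssh_argv(host)
        if dry_run:
            print(shlex.join(argv))
            results[host] = None
            continue
        completed = subprocess.run(argv, capture_output=True, text=True)
        results[host] = completed.returncode
    return results


if __name__ == "__main__":
    # Hypothetical host names; replace with every host in the cluster,
    # including any witness host.
    revert_on_hosts(["esx01.example.com", "esx02.example.com"])
```

Review the printed commands first, then call revert_on_hosts with dry_run=False; a non-zero exit code for any host indicates that host should be checked manually.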

Note: If your application reports data inconsistency errors, contact VMware Support and note this Knowledge Base article in the problem description. To contact VMware support, see Filing a Support Request in Customer Connect (2006985) or How to Submit a Support Request.

 


Workaround:

To work around this issue and avoid future occurrences, set the DeltaComponent advanced parameter to "0" on each host. This must be done on all vSAN nodes, including any vSAN Witness Hosts in 2-node or stretched clusters, and must also be performed on every new host added to the cluster in the future.

  1. Download the attached Python file "setConfigOption.py".
  2. Transfer the file to the /tmp/ directory of every ESXi host in the vSAN cluster, including the witness host.
  3. Log in to the ESXi host through SSH as root.
  4. Run the command "python /tmp/setConfigOption.py".
  5. Repeat steps 3 and 4 on all ESXi hosts in the vSAN cluster.

  6. If the script fails with any of the following errors, contact VMware Support for further assistance.

    "Error while disabling delta comps feature. Exiting..."
    "Found delta components before disabling config. Exiting..."
    "Found deltas after disabling feature. Re enabling"
    
    Sample Output:
    [root@localhost:/tmp] python setConfigOption.py
    cmmds-tool find -t DOM_OBJECT
    Disabling Delta Comps feature
    esxcfg-advcfg -s 0 /VSAN/DeltaComponent
    Sleeping 70s for config option to take effect. Zzz..
    cmmds-tool find -t DOM_OBJECT
    esxcfg-advcfg -g /VSAN/DeltaComponent
    b'Value of DeltaComponent is 0\n'
    [root@localhost:/tmp] esxcfg-advcfg -g /VSAN/DeltaComponent
    Value of DeltaComponent is 0
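
When the setting is verified on many hosts, the output of esxcfg-advcfg -g /VSAN/DeltaComponent can be checked programmatically. The following is a minimal sketch that assumes the output format shown in the sample above ("Value of DeltaComponent is N"); the function names are illustrative and are not part of the attached script.

```python
import re

# Matches the line printed by `esxcfg-advcfg -g /VSAN/DeltaComponent`,
# as shown in the sample output above.
_VALUE_RE = re.compile(r"Value of DeltaComponent is (\d+)")


def parse_delta_component(output) -> int:
    """Extract the integer value from the command output (str or bytes)."""
    if isinstance(output, bytes):
        output = output.decode("utf-8", errors="replace")
    match = _VALUE_RE.search(output)
    if match is None:
        raise ValueError(f"unexpected output: {output!r}")
    return int(match.group(1))


def workaround_applied(output) -> bool:
    """Return True if DeltaComponent is 0, i.e. the workaround is in place."""
    return parse_delta_component(output) == 0


if __name__ == "__main__":
    print(workaround_applied(b"Value of DeltaComponent is 0\n"))
```

Raising an error on unrecognized output, rather than returning False, is a deliberate choice in this sketch so that a failed or mistyped command is not mistaken for a host where the workaround is missing.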

Note:
No reboot or service restart is required; the change takes effect within 60 seconds. Once the workaround is applied, maintenance tasks and policy changes can be performed concurrently.
This change persists across host reboots and can be made at any time with no impact on running production workloads.


Additional Information

Impact/Risks:
Applications such as databases may report in-guest data inconsistency errors on VMDKs residing on VMware vSAN under specific conditions.

Attachments

setConfigOption