VMware vSAN 6.2 on disk upgrade fails due to CBT enabled virtual disks (2144882)

Symptoms

  • Attempting to upgrade vSAN to version 6.2 fails during the on-disk upgrade process with errors relating to Changed Block Tracking (CBT) enabled virtual machine disks.
  • The vSAN Disk Format Conversion task fails at 10%.
  • The Convert disk format for vSAN task fails with a General vSAN error status.

Note: For additional symptoms and log entries, see the Additional Information section.

Purpose

To resolve the General vSAN error status when upgrading vSAN to version 6.2, install the patch ESXi600-201605001.

Cause

When upgrading the on-disk format, during the phase between 10% and 15%, vSAN realigns objects to prepare them for new features. The process is performed in two steps:
  • During the first step, vSAN realigns objects and their components to have a 1 MB address space.

    The process will fail during this step if the cluster is unstable or if there is not enough disk space.

  • During the second step, vSAN realigns vsanSparse objects (typically snapshot objects on the vSAN datastore) to be 4 KB aligned.

    The process fails if objects exist that cannot be upgraded to version 2.5 (an interim on-disk format version that is used only during the upgrade from version 2 to version 3).

    An object will fail to upgrade under these conditions:

    • The object is left behind and no longer referenced by anything.
    • The disk chain is not complete or is corrupted.

Resolution

This issue is resolved in VMware ESXi 6.0, Patch Release ESXi600-201605001. For more information, see VMware Downloads.
To download the patch file, use Release Date May 12, 2016 and Build number 3825889.
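
To confirm whether a host is already at the patched build, you can compare its build number with 3825889. The following is a minimal sketch, not a VMware-supplied tool. It assumes the version string has the form printed by the vmware -v command (for example, VMware ESXi 6.0.0 build-3825889) and that a higher build number within the ESXi 6.0 line also contains this patch. The names parse_build and has_fix are illustrative only.

```python
import re

# Build number of Patch Release ESXi600-201605001, as stated in this article.
FIXED_BUILD = 3825889


def parse_build(version_line):
    """Return the build number from a line such as 'VMware ESXi 6.0.0 build-3825889'."""
    match = re.search(r"build-(\d+)", version_line)
    if match is None:
        raise ValueError("no build number found in: %r" % version_line)
    return int(match.group(1))


def has_fix(version_line, fixed_build=FIXED_BUILD):
    """True if the reported build is at or above the patched build.

    Assumption: only meaningful when comparing builds within the ESXi 6.0 line.
    """
    return parse_build(version_line) >= fixed_build


if __name__ == "__main__":
    # Sample strings only; on a host, use the output of `vmware -v` instead.
    for line in ("VMware ESXi 6.0.0 build-3620759",
                 "VMware ESXi 6.0.0 build-3825889"):
        print(line, "->", "patched" if has_fix(line) else "not patched")
```

If the host reports a lower build number, either install the patch or follow the workaround.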
 
To work around this issue if you do not want to install the patch, identify the affected objects and take appropriate action to satisfy the upgrade conditions.

  1. Ensure the initial realignment is complete:

    • To rule out failures during the first step of the on-disk upgrade, review the vmware-vsan-health-service.log file.

      Note: The vmware-vsan-health-service.log file is located in these directories:

      • Windows vCenter Server: %Programdata%\VMware\vCenterServer\logs\vsan-health\
      • vCenter Server Appliance: /storage/log/vmware/vsan-health/

      During step one of object realignment, you see lines similar to:

        2016-01-26T21:54:43.650Z INFO vsan-health[Thread-19] [VsanRealignClusterLib::QueryUnalignedStatus] Found 6 objects which aren't MB aligned
        2016-01-26T21:54:43.650Z INFO vsan-health[Thread-19] [VsanRealignClusterLib::CheckForUnalignedObjects] Fixing MB alignment on 2deaa756-0d63-0f53-690e-020003c607e5...

    • The first step of object realignment is complete when you see this entry:

      [VsanRealignClusterLib::CheckForUnalignedObjects] All Objects now MB aligned

      This means there are no known issues at this point if the cluster is stable and there is space available.
      If you do not see this line in the vmware-vsan-health-service.log file, review the errors returned after the alignment output.

  2. Review the list of UUIDs in the disk upgrade failure Error stack message in the vSphere Client and cross-reference it with the precheck output produced by the VsanRealign.py script. See step 7 for more information.

  3. Download and save the attached zip file 2144882_VsanRealign.zip to the Windows machine you use to access the vSphere Client.

  4. Unzip 2144882_VsanRealign.zip and then use the Datastore browser to copy the VsanRealign.py script to a shared datastore on your ESXi host.

  5. VMware recommends that you first copy the script to a datastore and then copy it to /var/tmp from the command line.

    For example:

    cp /vmfs/volumes/VMFS1/VsanRealign.py /var/tmp/VsanRealign.py


    Notes:

    • Any ESXi host in the cluster can be used to run the VsanRealign.py script.
    • If there is no VMFS or NFS based datastore mounted to the ESXi host, copy the VsanRealign.py script to the /var/tmp/ directory using scp or WinSCP.

  6. In the ESXi Shell session, change to the directory where you saved the VsanRealign.py script.

    For example:

    cd /var/tmp/

  7. Run this command to start the precheck process of the script:
         
    python VsanRealign.py precheck

    You see output similar to:

    -------------------------------------------------------------------
    The following objects are in use by Change Block Tracking and may encounter issues during upgrade.
    Rerun this script with the 'fixcbt' option if upgrade fails.
    -------------------------------------------------------------------
    Object UUID: f491f356-7d55-3bc9-184f-02001782eb4d
    Recorded Path: /vmfs/volumes/vsan:523e23728ba31d24-84ab9fc2821d6bdf/9991f356-91a3-6444-2264-02001782eb4d/linux-vm07.vmdk
    Recorded VM: linux-vm07
    .
    .
    .

    Note: Even if the precheck finds no issues, proceed to step 8 because CBT issues may not be discovered initially.

  8. Run this command to start the fixcbt process of the script:

    python VsanRealign.py fixcbt

    Note: The namespace scan can take a long time.

    The fixcbt option runs an initial namespace scan, applies the CBT fix to eligible disks, and then runs a further namespace scan. You see output similar to:

    python VsanRealign.py fixcbt
    Starting namespace scan

    Finished scanning, compiling results
    -------------------------------------------------------------------
    Script will attempt to fix all disks stuck with Change Block Tracking failures
    -------------------------------------------------------------------
    Marked /vmfs/volumes/vsan:523e23728ba31d24-84ab9fc2821d6bdf/9991f356-91a3-6444-2264-02001782eb4d/linux-vm07.vmdk (f491f356-7d55-3bc9-184f-02001782eb4d) as upgraded
    Marked /vmfs/volumes/vsan:523e23728ba31d24-84ab9fc2821d6bdf/0e91f356-d09c-6ef7-cef1-02001760ad31/linux-vm03.vmdk (3391f356-25d3-1d42-3792-02001760ad31) as upgraded
    Marked /vmfs/volumes/vsan:523e23728ba31d24-84ab9fc2821d6bdf/a295f356-6c2f-c35a-7a78-020017c59620/linux-vm04.vmdk (a795f356-5265-cae5-d176-020017c59620) as upgraded
    .
    .
    .
    Marked /vmfs/volumes/vsan:523e23728ba31d24-84ab9fc2821d6bdf/9991f356-91a3-6444-2264-02001782eb4d/linux-vm07_1.vmdk (fe93f356-bada-6b26-be31-02001782eb4d) as upgraded
    [root@sc-rdops-vm02-dhcp-38-170:/vmfs/volumes/56e9bcac-87e4e61a-1785-005056a7ebe5/var/tmp] python VsanRealign.py fixcbt
    Starting namespace scan

    The script does a final scan to ensure that no further virtual disks are found to be affected by CBT issues. The second scan will also identify any virtual machines that may have been migrated, which would prevent modification due to the descriptor file being locked by another ESXi host.

    Note: If you see output from the second namespace scan, run step 8 again to resolve the outstanding issues.

  9. Run the Convert disk format for vSAN process again.
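
The cross-referencing described in step 2 can be done by eye, but with long UUID lists a short script is less error prone. The following is a minimal sketch under stated assumptions: it assumes the Error stack message and the precheck output have been pasted in as plain text, and that the precheck output uses the "Object UUID:" and "Recorded Path:" labels shown in step 7. The function names are illustrative and are not part of VsanRealign.py. The sample text combines UUIDs from the examples in this article purely for demonstration.

```python
import re

# vSAN object UUIDs as they appear in this article (8-4-4-4-12 hex digits).
UUID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b")


def failed_uuids(error_stack):
    """Collect every UUID mentioned in the Error stack message."""
    return set(UUID_RE.findall(error_stack))


def precheck_objects(precheck_output):
    """Map each 'Object UUID' to its 'Recorded Path' from the precheck output."""
    objects = {}
    current = None
    for line in precheck_output.splitlines():
        line = line.strip()
        if line.startswith("Object UUID:"):
            current = line.split(":", 1)[1].strip()
            objects[current] = None
        elif line.startswith("Recorded Path:") and current is not None:
            objects[current] = line.split(":", 1)[1].strip()
    return objects


def cross_reference(error_stack, precheck_output):
    """Return (matched, unmatched): failed UUIDs with and without a precheck entry."""
    failed = failed_uuids(error_stack)
    known = precheck_objects(precheck_output)
    matched = {uuid: known[uuid] for uuid in failed if uuid in known}
    unmatched = sorted(failed - set(known))
    return matched, unmatched


# Illustrative sample text only, assembled from the examples in this article.
ERROR_STACK = (
    "Failed to realign following vSAN objects: "
    "f491f356-7d55-3bc9-184f-02001782eb4d, 1e58f256-44f6-0201-3f17-02001823f4e0, "
    "due to being locked or lack of vmdk descriptor file, which requires manual fix."
)
PRECHECK = """
Object UUID: f491f356-7d55-3bc9-184f-02001782eb4d
Recorded Path: /vmfs/volumes/vsan:523e23728ba31d24-84ab9fc2821d6bdf/9991f356-91a3-6444-2264-02001782eb4d/linux-vm07.vmdk
Recorded VM: linux-vm07
"""

if __name__ == "__main__":
    matched, unmatched = cross_reference(ERROR_STACK, PRECHECK)
    for uuid, path in matched.items():
        print("CBT candidate:", uuid, path)
    for uuid in unmatched:
        print("Not in precheck output:", uuid)
```

UUIDs that appear in the Error stack but not in the precheck output may fall under one of the other failure conditions listed in the Cause section.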

Additional Information

Additional symptoms and log entries
  • In the Error stack, you see entries similar to:

    Failed to realign following vSAN objects: 1e58f256-44f6-0201-3f17-02001823f4e0, f44ff256-11e1-1c0a-a06a-020018428a28, 8959f256-7ab7-f01b-913e-02001823f4e0, db50f256-29a3-2f2f-2404-02001823f4e0, 2358f256-9d3c-8b66-af4b-02001823f4e0, 1c58f256-bf10-b769-fb61-020018428a28, 7559f256-4da3-e787-ecbb-02001823f4e0, e850f256-10e0-eb8d-cac9-020018c78c91, 5858f256-4251-d5aa-d008-02001823f4e0, 0a51f256-f9de-3db4-c75b-020018428a28, 1e58f256-34fc-1cb7-f8aa-020018428a28, dd50f256-fb57-80c0-8894-020018d2303b, ec50f256-9792-1ddb-d757-020018428a28, 534ff256-4b1c-05e5-1092-020018428a28, de57f256-0c14-8bf0-635b-020018c78c91, 1658f256-3ae8-fcf7-a9a3-020018d2303b, due to being locked or lack of vmdk descriptor file, which requires manual fix.

  • The vmware-vsan-health-service.log file is located in these directories:

    • vCenter Server on Windows: %Programdata%\VMware\vCenterServer\logs\vsan-health\vmware-vsan-health-service.log
    • vCenter Server Appliance: /storage/log/vmware/vsan-health/vmware-vsan-health-service.log

    In this file, you see entries similar to:

    2016-03-24T11:03:00.148Z DEBUG vsan-health[Thread-1223] [VsanRealignClusterLib::RealignClusterV3] Finished namespaces
    2016-03-24T11:03:00.620Z INFO vsan-health[Thread-1223] [VsanRealignClusterLib::RealignClusterV3] After namespace realign 22 objects need realign, previously 22
    2016-03-24T11:03:00.621Z INFO vsan-health[Thread-1223] [VsanRealignClusterLib::RealignClusterV3] Made no progress. 22 objects still need realigning
    2016-03-24T11:03:00.621Z INFO vsan-health[Thread-1223] [VsanRealignClusterLib::RealignClusterV3] (str) [
    '6e94f356-4d25-f804-5792-02001760ad31',
    'fa90f356-e0bf-430b-ccc0-02001760ad31',
    ....
    'e78ff356-bc5d-56ee-aa62-0200170d326c',
    '818df356-3c44-d5f5-cf77-0200170d326c'
    ]
    2016-03-24T11:03:00.623Z ERROR vsan-health[Thread-1223] [VsanVcDiskFormatConverterImpl::_Run] Failed to migrate vsanSparse objects.
    2016-03-24T11:03:00.623Z ERROR vsan-health[Thread-1223] [VsanVcDiskFormatConverterImpl::_Run] Made no progress
    Traceback (most recent call last):
      File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanVcDiskFormatConverterImpl.py", line 1633, in _Run
        self._HandleUserCancellation)
      File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanRealignClusterLib.py", line 335, in RealignClusterV3
        uuidRemaining=objectsNeedingRealign)
    RealignFailed: Made no progress


Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
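
When the health service log is large, it can help to scan it for the markers quoted in this article instead of reading it line by line. The following is a minimal sketch, assuming that the marker strings appear exactly as in the excerpts shown in this article. The function name summarize_realign and the structure of its result are illustrative and are not part of any VMware tool.

```python
# Marker strings taken from the log excerpts in this article.
FIXING_MARKER = "Fixing MB alignment on"
SUCCESS_MARKER = "All Objects now MB aligned"
FAILURE_MARKERS = (
    "Failed to migrate vsanSparse objects",
    "RealignFailed: Made no progress",
)


def summarize_realign(log_lines):
    """Count realignment attempts and record success and failure markers."""
    summary = {"fixing": 0, "aligned": False, "failures": []}
    for line in log_lines:
        if FIXING_MARKER in line:
            summary["fixing"] += 1
        if SUCCESS_MARKER in line:
            summary["aligned"] = True
        if any(marker in line for marker in FAILURE_MARKERS):
            summary["failures"].append(line.strip())
    return summary


# Sample lines copied from the excerpts in this article.
SAMPLE = [
    "2016-01-26T21:54:43.650Z INFO vsan-health[Thread-19] "
    "[VsanRealignClusterLib::CheckForUnalignedObjects] "
    "Fixing MB alignment on 2deaa756-0d63-0f53-690e-020003c607e5...",
    "[VsanRealignClusterLib::CheckForUnalignedObjects] All Objects now MB aligned",
    "2016-03-24T11:03:00.623Z ERROR vsan-health[Thread-1223] "
    "[VsanVcDiskFormatConverterImpl::_Run] Failed to migrate vsanSparse objects.",
    "RealignFailed: Made no progress",
]

if __name__ == "__main__":
    print(summarize_realign(SAMPLE))
```

To run the check against a real log, pass an open file object for vmware-vsan-health-service.log to summarize_realign in place of SAMPLE. A result with aligned set and one or more failures corresponds to the case in this article where the first realignment step completes but the vsanSparse step fails.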

Monitor the upgrade status

You can monitor the upgrade status at any point in time. For more information, see Monitoring the upgrade status in vSAN (2146218).

Tags

CBT enabled disks, disk upgrade fails, General vSAN error

Update History

05/12/2016 - Added fixed information.
