vSAN on disk upgrade fails at 10% (2144881)

Symptoms

  • When upgrading vSAN, the On Disk Format Conversion task fails at 10%.
  • In the vSphere Web Client, you see an error similar to:

    A general system error occurred: Failed to realign following Virtual SAN objects: <uuid list>, due to being locked or lack of vmdk descriptor file which requires manual fix

  • The Convert disk format for vSAN task fails with a General Virtual SAN error status.

Note: For additional symptoms and log entries, see the Additional Information section.

Purpose

To troubleshoot the General Virtual SAN error status when upgrading vSAN, identify the affected objects and perform the corrective action suggested in this article.

Cause

When upgrading the on-disk format, during the 10% - 15% phase, vSAN realigns objects to prepare them for new features. The process is performed in two steps:
  • In the first step, vSAN realigns objects and their components to have a 1 MB address space. The process fails in this step if the cluster is unstable or if there is not enough disk space.
  • In the second step, vSAN realigns vsanSparse objects to be 4k aligned. The process fails if there are objects which cannot be upgraded to version 2.5.

    An object will fail to upgrade under these conditions:

    • The object is left behind and no longer referenced by anything.
    • The disk chain is not complete or is corrupted.

      Note: For an example scenario when the objects fail, see the Additional Information section.
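
In arithmetic terms, a value is aligned to a boundary when it is an exact multiple of that boundary. The sketch below only illustrates that idea for the 1 MB and 4k boundaries named above. It is not how vSAN inspects objects: the helper names are hypothetical, and reading "1 MB" as 2**20 bytes and "4k" as 4096 bytes is an assumption.

```python
MB = 1024 * 1024   # assumption: "1 MB" means 2**20 bytes
FOUR_K = 4096      # assumption: "4k" means 4096 bytes


def is_aligned(value: int, boundary: int) -> bool:
    """Return True when value is an exact multiple of boundary."""
    return value % boundary == 0


def round_up(value: int, boundary: int) -> int:
    """Return the smallest multiple of boundary that is >= value."""
    return -(-value // boundary) * boundary
```

For instance, is_aligned(536870912, MB) is true because 536870912 bytes is exactly 512 multiples of 2**20.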

Resolution

Note: The video that accompanies this article is specific to vSAN 6.2, but it also applies to vSAN 6.0/6.1 to 6.5 upgrades.

Caution: Removing disks and objects is risky because the objects might still be in use. Always double-check before you delete any object. If you are unsure about any of the steps in this article, contact VMware Support.
 
To resolve this issue, identify the orphaned objects and take appropriate action to satisfy the upgrade conditions.
 
  1. Ensure the initial realignment is complete:

    • To rule out failures during the first step of the on-disk upgrade, review the vmware-vsan-health-service.log file.

      Note: The vmware-vsan-health-service.log file is located in these directories:

      • Windows vCenter Server: %Programdata%\VMware\vCenterServer\logs\vsan-health\
      • vCenter Server Appliance: /storage/log/vmware/vsan-health/

        During object realignment, you see entries similar to:

        2016-01-26T21:54:43.650Z INFO vsan-health[Thread-19] [VsanRealignClusterLib::QueryUnalignedStatus] Found 6 objects which aren't MB aligned
        2016-01-26T21:54:43.650Z INFO vsan-health[Thread-19] [VsanRealignClusterLib::CheckForUnalignedObjects] Fixing MB alignment on 2deaa756-0d63-0f53-690e-020003c607e5
        .
        .
        .
    • The first step of object realignment is complete when you see this entry:

      [VsanRealignClusterLib::CheckForUnalignedObjects] All Objects now MB aligned

      This means there are no known issues at this point if the cluster is stable and there is space available.
      If you do not see this line in the vmware-vsan-health-service.log file, review the errors returned after the alignment output.
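
The log check in step 1 can be sketched in Python as follows. This is a minimal sketch that relies only on the three message texts quoted above. The function name and the returned keys are hypothetical, and it assumes you have already read the log file into a string on a workstation with Python 3.

```python
import re

# Message texts copied from the log excerpts in this article.
FOUND_RE = re.compile(r"Found (\d+) objects which aren't MB aligned")
FIXING_RE = re.compile(
    r"Fixing MB alignment on "
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)
DONE_TEXT = "All Objects now MB aligned"


def summarize_mb_alignment(log_text: str) -> dict:
    """Summarize the first-step realignment messages found in log_text."""
    last_found = log_text.rfind("objects which aren't MB aligned")
    last_done = log_text.rfind(DONE_TEXT)
    return {
        # Every "Found N objects" count, in log order.
        "unaligned_counts": [int(n) for n in FOUND_RE.findall(log_text)],
        # UUIDs the log reports as being fixed.
        "fixing": FIXING_RE.findall(log_text),
        # True only if the completion line appears after the last "Found" line.
        "complete": last_done != -1 and last_done > last_found,
    }
```

If "complete" is False, review the entries that follow the alignment output, as described above.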

  2. Review the Error stack of the failed on-disk upgrade and record the affected UUIDs in a text file. You will cross-reference this list with the script output to confirm that the issue is resolved.

    The Error stack reports the error:

    Failed to realign following virtual SAN objects:<UUID> due to being locked or lack of vmdk descriptor file, which requires manual fix.

    Copy the UUID(s) following the Failed to realign following Virtual SAN objects: string and save them to a text file.
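
If the UUID list is long, copying it by hand is error-prone. The following Python sketch pulls the UUIDs out of a pasted error message and writes them to a text file, one per line. The function names and the output file name are hypothetical, and the sketch assumes the UUIDs use the 8-4-4-4-12 hexadecimal layout shown in this article.

```python
import re

# Assumption: UUIDs follow the 8-4-4-4-12 hexadecimal layout shown in the article.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def extract_uuids(error_text: str) -> list:
    """Return the UUIDs found in error_text, in order, without duplicates."""
    uuids = []
    for uuid in UUID_RE.findall(error_text.lower()):
        if uuid not in uuids:
            uuids.append(uuid)
    return uuids


def save_uuids(uuids: list, path: str) -> None:
    """Write one UUID per line to the text file at path."""
    with open(path, "w") as handle:
        for uuid in uuids:
            handle.write(uuid + "\n")


# Example usage (file name is illustrative):
# save_uuids(extract_uuids(pasted_error_message), "failed_uuids.txt")
```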


  3. Download and save the attached zip file 2144881_VsanRealign.zip to the Windows machine you use to access the vSphere Client.

  4. Unzip 2144881_VsanRealign.zip and then use the Datastore browser to copy the VsanRealign.py script to a shared datastore on your ESXi host.

  5. VMware recommends that you copy the script to a datastore first and then copy it to /var/tmp from the command line.

    For example:

    cp /vmfs/volumes/VMFS1/VsanRealign.py /var/tmp/VsanRealign.py

  6. In the ESXi shell session, change to the directory where you saved the VsanRealign.py script.

    For example:

    cd /var/tmp/

  7. Run this command to start the script:

    python VsanRealign.py precheck

    Note: The namespace scan can take a long time.

    The script returns a list of vSAN objects with a problem and the recommended actions to allow the on-disk format upgrade to complete.

    You see output similar to:

    Finished scanning, compiling results
    -------------------------------------------------------------------
    The following objects were missing descriptor files.
    The recorded path doesn't exist and no other reference to the object was found.
    -------------------------------------------------------------------
    This will create descriptor files automatically for all disks under
    lostAndFound in the VSAN datastore.

    Other objects that aren't disks missing a descriptor will be removed permanently.

    NOTE: Recovered disks will not have any snapshot chain information in them.
    Any snapshots deltas will not be correctly recovered.

    39aef356-e4b9-c5bc-7c08-0200170d326c
    Object UUID: 4698f356-7865-aaae-fc20-0200170d326c
    Recorded Path: /vmfs/volumes/vsan:523e23728ba31d24-84ab9fc2821d6bdf/c695f356-3e2e-a40e-2ede-0200170d326c/linux-vm08-25632575.vswp
    Recorded VM: linux-vm08
    Object Class: vmswap
    Object Size: 536870912
    Parent directory exists
    AutoFix: Will remove object

    Object UUID: 8efdf456-c408-a163-2f8e-02001645d63b
    Recorded Path: /vmfs/volumes/vsan:5228981f7117d6eb-94106da8ad4a6377/88fdf456-85f5-bfac-51ff-02001645d63b/linux-vm-a.vmdk
    Recorded VM: None
    Object Class: vdisk
    Object Size: 2147483648
    Parent directory exists
    AutoFix: Will create new descriptor

    Descriptors will be created in /vmfs/volumes/vsan:5228981f7117d6eb-94106da8ad4a6377/lostAndFound

    NOTE: Recovered disks will not have any snapshot chain information in them.
    Any snapshots deltas will not be correctly recovered.

    Create 'linux-vm-a-8efdf456-c408-a163-2f8e-02001645d63b.vmdk' for 8efdf456-c408-a163-2f8e-02001645d63b
    Remove 4698f356-7865-aaae-fc20-0200170d326c type: vmswap vm name: linux-vm08


    When prompted for a decision to proceed with the AutoFix suggestions, enter yes to apply the AutoFix actions. In this case, the descriptor files for vdisk objects are recreated. The vmswap objects, and all other objects that are not virtual disks and are missing a descriptor, are permanently removed because they contain no useful data.

    Note: If you enter no, the AutoFix actions are not applied and you will need to take manual action.
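
To cross-reference the UUIDs saved in step 2 with the precheck report, you can capture the script output as text and parse it. The sketch below is based only on the sample output shown in this article, so the actual layout printed by VsanRealign.py may differ. The function names are hypothetical.

```python
def parse_precheck_objects(output: str) -> list:
    """Parse 'Object UUID:' blocks from the precheck output into dictionaries."""
    records = []
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Object UUID:"):
            # A new object block starts here.
            current = {"Object UUID": line.split(":", 1)[1].strip()}
            records.append(current)
        elif not line:
            # A blank line ends the current block.
            current = None
        elif current is not None and ": " in line:
            # Split on the first ": " so paths such as "vsan:5228..." stay intact.
            key, value = line.split(": ", 1)
            current[key] = value.strip()
    return records


def uuids_not_reported(failed_uuids: list, records: list) -> list:
    """Return the failed UUIDs that the precheck report does not mention."""
    reported = {record["Object UUID"] for record in records}
    return [uuid for uuid in failed_uuids if uuid not in reported]
```

Any UUID returned by uuids_not_reported appears in the upgrade error but not in the report, so it needs separate investigation.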

  8. Review the report provided by the script for objects that are in use by Changed Block Tracking (CBT), which may cause an issue with the on-disk upgrade.

    You see a report similar to:

    The following objects are in use by Change Block Tracking and may encounter issues during upgrade.
    Rerun this script with the 'fixcbt' option if upgrade fails.

    If action is required, follow the instructions in VMware vSAN 6.2 on disk upgrade fails due to CBT enabled virtual disks (2144882) to resolve the CBT issues before proceeding to step 9. If no action is required, proceed with step 9.

  9. Examine the lostAndFound directory on the vSAN datastore, where step 7 created the recovered descriptors. Attach each recovered disk to a non-production virtual machine to check file integrity and determine whether the virtual disk is still needed in your environment.

    Note: Realignment of these objects is still required. Run the on-disk upgrade process again to proceed.

Additional Information

Additional symptoms and log entries
  •  In the error stack, you see entries similar to:

    Failed to realign following Virtual SAN objects: 1e58f256-44f6-0201-3f17-02001823f4e0, f44ff256-11e1-1c0a-a06a-020018428a28, 8959f256-7ab7-f01b-913e-02001823f4e0, db50f256-29a3-2f2f-2404-02001823f4e0, 2358f256-9d3c-8b66-af4b-02001823f4e0, 1c58f256-bf10-b769-fb61-020018428a28, 7559f256-4da3-e787-ecbb-02001823f4e0, e850f256-10e0-eb8d-cac9-020018c78c91, 5858f256-4251-d5aa-d008-02001823f4e0, 0a51f256-f9de-3db4-c75b-020018428a28, 1e58f256-34fc-1cb7-f8aa-020018428a28, dd50f256-fb57-80c0-8894-020018d2303b, ec50f256-9792-1ddb-d757-020018428a28, 534ff256-4b1c-05e5-1092-020018428a28, de57f256-0c14-8bf0-635b-020018c78c91, 1658f256-3ae8-fcf7-a9a3-020018d2303b, due to being locked or lack of vmdk descriptor file, which requires manual fix.

  • In the vmware-vsan-health-service.log file, you see entries similar to:

    Note: The vmware-vsan-health-service.log file is located in these directories:

    • vCenter Server on Windows: %Programdata%\VMware\vCenterServer\logs\vsan-health\vmware-vsan-health-service.log
    • vCenter Server Appliance: /storage/log/vmware/vsan-health/vmware-vsan-health-service.log

    2016-03-24T11:03:00.148Z DEBUG vsan-health[Thread-1223] [VsanRealignClusterLib::RealignClusterV3] Finished namespaces
    2016-03-24T11:03:00.620Z INFO vsan-health[Thread-1223] [VsanRealignClusterLib::RealignClusterV3] After namespace realign 22 objects need realign, previously 22
    2016-03-24T11:03:00.621Z INFO vsan-health[Thread-1223] [VsanRealignClusterLib::RealignClusterV3] Made no progress. 22 objects still need realigning
    2016-03-24T11:03:00.621Z INFO vsan-health[Thread-1223] [VsanRealignClusterLib::RealignClusterV3] (str) [
    '6e94f356-4d25-f804-5792-02001760ad31',
    'fa90f356-e0bf-430b-ccc0-02001760ad31',
    ....
    'e78ff356-bc5d-56ee-aa62-0200170d326c',
    '818df356-3c44-d5f5-cf77-0200170d326c'
    ]
    2016-03-24T11:03:00.623Z ERROR vsan-health[Thread-1223] [VsanVcDiskFormatConverterImpl::_Run] Failed to migrate vsanSparse objects.
    2016-03-24T11:03:00.623Z ERROR vsan-health[Thread-1223] [VsanVcDiskFormatConverterImpl::_Run] Made no progress
    Traceback (most recent call last):
      File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanVcDiskFormatConverterImpl.py", line 1633, in _Run
        self._HandleUserCancellation)
      File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanRealignClusterLib.py", line 335, in RealignClusterV3
        uuidRemaining=objectsNeedingRealign)
    RealignFailed: Made no progress

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
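
The "After namespace realign" entry reports how many objects still need realignment and how many needed it before the pass, so a stalled upgrade shows the same number twice. The following Python sketch reads those two counts from log text. It matches only the message wording shown in the excerpt above, and the function names are hypothetical.

```python
import re

# Wording copied from the log excerpt in this article.
PROGRESS_RE = re.compile(
    r"After namespace realign (\d+) objects need realign, previously (\d+)"
)


def realign_progress(log_text: str) -> list:
    """Return (remaining, previous) count pairs in log order."""
    return [(int(now), int(before)) for now, before in PROGRESS_RE.findall(log_text)]


def is_stalled(log_text: str) -> bool:
    """Return True when the last pass left objects behind without reducing the count."""
    pairs = realign_progress(log_text)
    if not pairs:
        return False
    remaining, previous = pairs[-1]
    return remaining > 0 and remaining >= previous
```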

Monitor the upgrade status
 
To monitor the upgrade status at any point in time, see Monitoring the upgrade status in vSAN (2146218).

Tags

vSAN error, disk upgrade fails, vSAN objects locked, general vSAN error

Update History

04/18/2017 - Added vSAN 6.6 to Products
