ESXi/ESX hosts with visibility to RDM LUNs used by MSCS nodes may take a long time to boot or to rescan LUNs (1016106)

Symptoms

  • ESXi/ESX 4.x and ESXi 5.x hosts take a long time to boot. This time depends on the number of RDMs that are attached to the ESXi/ESX host.

    Note: In a system with 10 RDMs used in an MSCS cluster with two nodes, a reboot of the ESXi/ESX host with the secondary node takes approximately 30 minutes. In a system with fewer RDMs, the reboot time is shorter. For example, if only three RDMs are used, the reboot time is approximately 10 minutes.

  • ESXi intermittently shows the error message "Cannot synchronize host hostname. Operation Timed out." on the Summary tab, and the vSphere Client may fail to start.

  • The boot screen shows the boot process waiting after this message:

    Loading module multiextent.

  • The cluster is running virtual machines that participate in an MSCS cluster using shared RDMs and SCSI reservations across hosts, and a virtual machine on another host is the active cluster node holding a SCSI reservation.

  • Delays appear at these steps:

    • Starting path claiming and SCSI device discovery

      In the VMkernel log of the rebooting ESXi/ESX host (the log file location depends on the ESXi/ESX version), you see entries similar to:

      Sep 24 12:25:36 cs-tse-d54 vmkernel: 0:00:01:57.828 cpu0:4096)WARNING: ScsiCore: 1353: Power-on Reset occurred on naa.6006016045502500176a24d34fbbdf11
      Sep 24 12:25:36 cs-tse-d54 vmkernel: 0:00:01:57.830 cpu0:4096)VMNIX: VmkDev: 2122: Added SCSI device vml0:3:0 (naa.6006016045502500166a24d34fbbdf11)
      Sep 24 12:25:36 cs-tse-d54 vmkernel: 0:00:02:37.842 cpu3:4099)ScsiDeviceIO: 1672: Command 0x1a to device "naa.6006016045502500176a24d34fbbdf11" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0


    • Mounting the partition of the RDM LUNs

      In the VMkernel log of the rebooting ESXi/ESX host, you see entries similar to:

      Sep 24 12:25:37 cs-tse-d54 vmkernel: 0:00:08:58.811 cpu2:4098)WARNING: ScsiCore: 1353: Power-on Reset occurred on naa.600601604550250083489d914fbbdf11
      Sep 24 12:25:37 cs-tse-d54 vmkernel: 0:00:08:58.814 cpu0:4096)VMNIX: VmkDev: 2122: Added SCSI device vml0:9:0 (naa.600601604550250082489d914fbbdf11)
      Sep 24 12:25:37 cs-tse-d54 vmkernel: 0:00:09:38.855 cpu2:4098)ScsiDeviceIO: 1672: Command 0x1a to device "naa.600601604550250083489d914fbbdf11" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
      Sep 24 12:25:37 cs-tse-d54 vmkernel: 0:00:09:38.855 cpu1:4111)ScsiDeviceIO: 4494: Could not detect setting of QErr for device naa.600601604550250083489d914fbbdf11. Error Failure.
      Sep 24 12:25:37 cs-tse-d54 vmkernel: 0:00:10:08.945 cpu1:4111)WARNING: Partition: 801: Partition table read from device naa.600601604550250083489d914fbbdf11 failed: I/O error
      Sep 24 12:25:37 cs-tse-d54 vmkernel: 0:00:10:08.945 cpu1:4111)ScsiDevice: 2200: Successfully registered device "naa.600601604550250083489d914fbbdf11" from plugin "NMP" of type 0


      Oct 5 14:21:03 vmkernel: 47:02:52:19.382 cpu17:9624)WARNING: NMP: nmp_IsSupportedPResvCommand: Unsupported Persistent Reservation Command,service action 0 type 4
      Oct 5 14:21:03 vmkernel: 47:02:52:19.383 cpu17:9624)WARNING: NMP: nmp_IsSupportedPResvCommand: Unsupported Persistent Reservation Command,service action 0 type 4
      Oct 5 14:21:03 vmkernel: 47:02:52:19.383 cpu23:9621)WARNING: NMP: nmp_IsSupportedPResvCommand: Unsupported Persistent Reservation Command,service action 0 type 4
      Oct 5 14:21:03 vmkernel: 47:02:52:19.383 cpu17:9624)WARNING: NMP: nmp_IsSupportedPResvCommand: Unsupported Persistent Reservation Command,service action 0 type 4
      Oct 5 14:21:03 vmkernel: 47:02:52:19.383 cpu12:4108)WARNING: NMP: nmpUpdatePResvStateSuccess: Parameter List Length 54310000 for service action 0 is beyondthe supported value 18
      Oct 5 14:21:03 vmkernel: 47:02:52:19.383 cpu12:4108)WARNING: NMP: nmpUpdatePResvStateSuccess: Parameter List Length 54310000 for service action 0 is beyondthe supported value 18
      Oct 5 14:21:03 vmkernel: 47:02:52:19.383 cpu3:5733)WARNING: NMP: nmpUpdatePResvStateSuccess: Parameter List Length 54310000 for service action 0 is beyondthe supported value 18
      Oct 5 14:21:03 vmkernel: 47:02:52:19.384 cpu12:9738)WARNING: NMP: nmpUpdatePResvStateSuccess: Parameter List Length 54310000 for service action 0 is beyondthe supported value 18
      Oct 5 14:21:05 vmkernel: 47:02:52:21.383 cpu23:9621)WARNING: NMP: nmp_IsSupportedPResvCommand: Unsupported Persistent Reservation Command,service action 0 type 4


  • If you configure the perennially reserved setting on an existing VMFS LUN, you may see these errors in the vmkernel.log file:

    YYYY-MM-DDT13:34:04.247Z cpu4:10169)WARNING: Partition: 1273: Device "naa.XXXXXXXXXXXXXXXXXXXxxxxxxxxxxxxx" with a VMFS partition is marked perennially reserved. This is not supported and may lead to data loss.
    YYYY-MM-DDT13:34:04.248Z cpu4:10169)WARNING: Partition: 1273: Device "naa.XXXXXXXXXXXXXXXXXXXxxxxxxxxxxxxx" with a VMFS partition is marked perennially reserved. This is not supported and may lead to data loss.
    YYYY-MM-DDT13:34:04.255Z cpu4:10169)WARNING: Partition: 1273: Device "naa.XXXXXXXXXXXXXXXXXXXxxxxxxxxxxxxx" with a VMFS partition is marked perennially reserved. This is not supported and may lead to data loss.
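
To see which devices are involved in the discovery delays shown in the symptoms above, the failed commands can be filtered out of a saved copy of the VMkernel log. The following is a minimal sketch, not a VMware-provided tool: the function name stalled_devices is hypothetical, and it assumes the log entries have the same layout as the "Command 0x1a to device ... failed H:0x5" examples in this article.

```shell
# Hypothetical helper: read VMkernel log lines on stdin and print the
# unique device IDs named in 'Command 0x1a to device "..." failed H:0x5'
# entries, assuming the entry layout shown in the symptoms above.
stalled_devices() {
    grep 'Command 0x1a to device' \
        | grep 'failed H:0x5' \
        | sed -n 's/.*to device "\(naa\.[0-9a-fA-F]*\)".*/\1/p' \
        | sort -u
}

# Example (the log path is an assumption and varies by ESXi/ESX version):
#   stalled_devices < /var/log/vmkernel.log
```

Each ID printed is a candidate to compare against the naa IDs of the MSCS RDMs. A device appearing in this list is not by itself proof that it is an MSCS LUN.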

Purpose

This article describes a specific issue. If you experience all of the above symptoms, consult the sections below.

If you are experiencing only some of the symptoms, search the Knowledge Base for your symptoms.

Resolution

ESXi/ESX 4.x

This issue is resolved in the VMware ESXi/ESX 4.1 patch released on 2011-07-28. For more information, see:
In addition to installing the patch, modify this advanced configuration option on the affected ESXi/ESX hosts to speed up the boot process:
  • ESXi/ESX 4.1: Change the advanced option Scsi.CRTimeoutDuringBoot to 1.
  • ESXi/ESX 4.0: Change the advanced option Scsi.UWConflictRetries to 80.
For more information on changing advanced configuration options, see Configuring advanced options for ESXi/ESX (1038578).

On ESXi/ESX 4.1, if rescan times are still long, the best way to resolve the issue is to upgrade the host to ESXi 5.0, which includes both of the fixes above (that is, the patch released on 2011-07-28 and the change of the advanced option Scsi.CRTimeoutDuringBoot to 1).

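Before configuring the perennially reserved setting on an existing LUN, check whether the LUN is mounted as a VMFS LUN. Marking a device that has a VMFS partition as perennially reserved is not supported and may lead to data loss. To list the existing VMFS mappings for the device, run the command:

esxcfg-scsidevs -m | grep naa.XXXXXXXXXXXXXXXXXXX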
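
The VMFS check can be wrapped in a small guard so that a device is only marked when no VMFS mapping is reported for it. This is a minimal sketch under stated assumptions: the function name has_vmfs_mapping is hypothetical, and it simply looks for the device ID anywhere in the esxcfg-scsidevs -m output, as the grep command above does.

```shell
# Hypothetical helper: read 'esxcfg-scsidevs -m' output on stdin and
# succeed (exit status 0) if the given device ID appears in it, which
# indicates that the device backs a VMFS volume.
has_vmfs_mapping() {
    # -F matches the ID literally, so the dot is not a wildcard.
    # -q suppresses output; only the exit status is used.
    grep -qF "$1"
}

# Example use on a host (the naa ID is a placeholder):
#   if esxcfg-scsidevs -m | has_vmfs_mapping naa.XXXXXXXXXXXXXXXXXXX; then
#       echo "VMFS volume found: do not mark this device"
#   fi
```

An exit status of 0 means a VMFS mapping was found and the device should not be marked as perennially reserved.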

The issue "Cannot synchronize host hostname. Operation Timed out." is fixed in ESXi 5.0. VMware recommends upgrading to ESXi 5.0 or later.

ESXi 5.0

ESXi 5.0 uses a different technique to determine whether Raw Device Mapped (RDM) LUNs are used for MSCS cluster devices: it introduces a configuration flag that marks each device participating in an MSCS cluster as "perennially reserved". During boot, the storage mid-layer attempts to discover all devices presented to the ESXi host during the device claiming phase. However, MSCS LUNs with a persistent SCSI reservation lengthen the boot process, because the ESXi host cannot interrogate a LUN while an active MSCS node hosted on another ESXi host holds the reservation on it.

Configuring the device to be perennially reserved is local to each ESXi host, and must be performed on every ESXi 5.0 host that has visibility to each device participating in an MSCS cluster. This improves the boot time for all ESXi hosts that have visibility to the device(s).

In ESXi 5.0, this setting cannot be applied using vSphere Host Profiles. As a result, ESXi 5.0 hosts deployed using vSphere Auto Deploy cannot take advantage of this feature.

Note: The advanced option Scsi.CRTimeoutDuringBoot is no longer valid on ESXi 5.0.

Upgrading to ESXi 5.0

To upgrade to ESXi 5.0:
  1. Prior to upgrading, unpresent all MSCS RDMs from the host:

    1. Determine which RDM LUNs are part of an MSCS cluster.
    2. From the vSphere Client, select a virtual machine that has a mapping to the MSCS cluster RDM devices.
    3. Edit your virtual machine settings and navigate to your Mapped RAW LUNs.
    4. Select Manage Paths to display the device properties of the Mapped RAW LUN and the device identifier (that is, the naa ID).
    5. Take note of the naa ID, which is a globally unique identifier for your shared device.
    6. Unpresent all MSCS RDM devices from the hosts.

  2. Upgrade the hosts to ESXi 5.0. For more information, see Methods of upgrading to ESXi 5.0 (2004501).

  3. Following reboot, use the esxcli command to mark the device as perennially reserved:

    Note: This works even if the LUNs are not currently presented to the host.

    esxcli storage core device setconfig -d naa.id --perennially-reserved=true

  4. Re-present the MSCS RDM devices to the hosts and rescan.

  5. Confirm that the correct devices are marked as perennially reserved by running the command:

    esxcli storage core device list | less

Note: After the devices are marked, rebooting the hosts should no longer be delayed by the MSCS devices.
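
When several RDMs participate in the MSCS cluster, the setconfig command from step 3 has to be run once per device on every host. The loop below is a minimal sketch of that repetition. The function name mark_perennially_reserved and the ESXCLI override variable are hypothetical additions for illustration; only the esxcli command line itself comes from this article.

```shell
# Hypothetical helper: mark every device ID given as an argument as
# perennially reserved, stopping at the first failure.
# ESXCLI is an illustrative override; setting ESXCLI=echo prints the
# commands instead of running them (a dry run).
mark_perennially_reserved() {
    for id in "$@"; do
        ${ESXCLI:-esxcli} storage core device setconfig -d "$id" --perennially-reserved=true || return 1
    done
}

# Example dry run with placeholder IDs:
#   ESXCLI=echo mark_perennially_reserved naa.id1 naa.id2
```

Running the dry run first makes it possible to review the exact device IDs before any host configuration is changed.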

Already upgraded ESXi 5.1/5.5 hosts

To mark the Passive MSCS LUNs as perennially reserved on an already upgraded ESXi 5.1/5.5 host, set the perennially reserved flag in Host Profiles. For more information, see the vSphere MSCS Setup Checklist in the vSphere Documentation Center:
vSphere 5.1 Resource Management Guide
vSphere 5.5 Resource Management Guide



On hosts that run passive MSCS nodes with RDM LUNs, use the esxcli command to mark the device as perennially reserved:

esxcli storage core device setconfig -d naa.id --perennially-reserved=true

Note: Stateless Auto Deploy wipes all settings at boot, so the "perennially reserved" flag cannot be set on such hosts, which leads to a long boot delay.

Already upgraded ESXi 5.0 hosts

To mark the MSCS LUNs as perennially reserved on an already upgraded ESXi 5.0 host, run the same esxcli command as above, following the steps below. All subsequent rescans and boots then run at normal speed.

  1. Determine which RDM LUNs are part of an MSCS cluster.
  2. From the vSphere Client, select a virtual machine that has a mapping to the MSCS cluster RDM devices.
  3. Edit your virtual machine settings and navigate to your Mapped RAW LUNs.
  4. Select Manage Paths to display the device properties of the Mapped RAW LUN and the device identifier (that is, the naa ID).
  5. Take note of the naa ID, which is a globally unique identifier for your shared device.
  6. Use the esxcli command to mark the device as perennially reserved:

    esxcli storage core device setconfig -d naa.id --perennially-reserved=true

  7. To verify that the device is perennially reserved, run this command:

    esxcli storage core device list -d naa.id

    In the output of the esxcli command, search for the entry Is Perennially Reserved: true. This shows that the device is marked as perennially reserved.

  8. Repeat the procedure for each Mapped RAW LUN that is participating in the MSCS cluster.

Note: The configuration is permanently stored with the ESXi host and persists across reboots. To remove the perennially reserved flag, run this command:

esxcli storage core device setconfig -d naa.id --perennially-reserved=false
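
To review the flag on all devices at once rather than one device at a time, the full device list can be filtered for the Is Perennially Reserved: true entry described in step 7. This is a minimal sketch: the function name reserved_devices is hypothetical, and it assumes that each device ID in the esxcli storage core device list output starts at the beginning of a line, with that device's properties indented beneath it.

```shell
# Hypothetical helper: read 'esxcli storage core device list' output on
# stdin and print the ID of every device whose properties include
# 'Is Perennially Reserved: true'.
# Assumes device IDs are unindented and property lines are indented.
reserved_devices() {
    awk '
        /^[^ ]/ { dev = $1 }                          # unindented line: new device ID
        /Is Perennially Reserved: true/ { print dev } # flag set for current device
    '
}

# Example use on a host:
#   esxcli storage core device list | reserved_devices
```

Comparing the printed IDs against the naa IDs noted in step 5 shows whether any MSCS device was missed or any other device was marked by mistake.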

PowerCLI 5.0

esxcli functionality is also available directly through PowerCLI, so you can mark the MSCS LUNs as perennially reserved from a PowerCLI session. Connect to the server, retrieve an esxcli instance, and invoke its methods. For more information, see the VMware vSphere PowerCLI Blog.

To connect to the server, run this command:

Connect-VIServer -Server xxx.xxx.xxx.xxx -User xxxxx -Password xxxxx

To retrieve an esxcli instance for the host, run this command:

$myesxcli = Get-EsxCli -VMHost ESXhost

To list the devices, run this command:

$myesxcli.storage.core.device.list()

To display the parameters of the setconfig method, run this command:

$myesxcli.storage.core.device.setconfig

The output is similar to:

TypeNameOfValue : VMware.VimAutomation.ViCore.Util10Ps.EsxCliExtensionMethod
OverloadDefinitions : {void setconfig(boolean detached, string device, boolean perenniallyreserved)}
MemberType : CodeMethod
Value : void setconfig(boolean detached, string device, boolean perenniallyreserved)
Name : setconfig
IsInstance : True

The method takes three arguments in this order: detached, device, and perenniallyreserved.

To list details by device naa ID, run this command:

$myesxcli.storage.core.device.list("naa.50060160c46036df50060160c46036df")

The output is similar to:

AttachedFilters :
DevfsPath : /vmfs/devices/disks/naa.50060160c46036df50060160c46036df
Device : naa.50060160c46036df50060160c46036df
IsPerenniallyReserved : false
IsPseudo : true

To set the device as perennially reserved, run this command:

$myesxcli.storage.core.device.setconfig($false, "naa.50060160c46036df50060160c46036df", $true)

To verify the parameter update, run this command:

$myesxcli.storage.core.device.list("naa.50060160c46036df50060160c46036df")

The output is similar to:

AttachedFilters :
DevfsPath : /vmfs/devices/disks/naa.50060160c46036df50060160c46036df
Device : naa.50060160c46036df50060160c46036df
IsPerenniallyReserved : true
IsPseudo : true

To remove the perennially reserved flag, run this command:

$myesxcli.storage.core.device.setconfig($false, "naa.50060160c46036df50060160c46036df", $false)


Additional Information

For related information, see Obtaining LUN pathing information for ESX or ESXi hosts (1003973) and Using Tech Support Mode in ESXi 4.1 and ESXi 5.x (1017910).

Note: The PowerCLI and esxcli commands are case sensitive. If the naa.id is specified in uppercase letters when issuing the command, a new device is added on the ESXi host.
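
One way to guard against the uppercase pitfall is to normalize an ID before passing it to esxcli. This is a minimal sketch, assuming the host reports device IDs in lowercase, as in the log excerpts and examples in this article. The function name normalize_naa_id is hypothetical.

```shell
# Hypothetical helper: print the given device ID converted to lowercase.
# Assumes the host lists the device in lowercase, as in this article's
# examples; confirm against the device list output before relying on it.
normalize_naa_id() {
    printf '%s\n' "$1" | tr 'A-Z' 'a-z'
}

# Example use (the ID is the sample device from this article, typed in
# uppercase by mistake):
#   id=$(normalize_naa_id NAA.50060160C46036DF50060160C46036DF)
#   esxcli storage core device setconfig -d "$id" --perennially-reserved=true
```

Checking the normalized ID against the device list before running setconfig remains the safer practice.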


Tags

MSCS-LUN-RDM-Boot

This Article Replaces

1035913

Update History

  • 01/03/2011 - Added ESXi 4.x
  • 02/08/2011 - Clarified steps for 4.0 and 4.1
  • 09/15/2011 - Added information about ESXi 5.0
  • 02/02/2012 - Added link to VMware ESX/ESXi 4.1 patch
  • 02/08/2012 - Added note that PowerCLI and esxcli command is case sensitive
  • 10/09/2012 - Additional log messages in symptoms
