VMware
 

Knowledge Base


SCSI Reservation Failures on SUN StorageTek 9985 and 9990 Arrays

Details

VMware ESX 3.0.x, 3.5, or ESX 4.0 VMkernel logs show SCSI Reservation errors similar to those listed in Troubleshooting SCSI Reservation failures on Virtual Infrastructure 3.x and vSphere 4.0 (1005009). SUN StorageTek 9985 and 9990 arrays exhibit the following symptom: high storage port utilization with low I/O going through the port.

 
Consult with your SUN Storage engineer on how to monitor these events on the array.

Solution

Troubleshooting SCSI Reservation Failures for Physical and Virtual LUNs on SUN StorageTek 9985 and 9990

Storage Port Fan-in Ratio

This is defined as the number of initiators sharing the same Storage Processor Port.

Check with SUN Storage support about how to identify the Fan-in Ratio and what the best practice for it is.

VMware recommends reducing the Fan-in Ratio if you encounter SCSI Reservation Failures.

LUN Type

Queue depth, LUN-to-ESX ratio, LUN-to-VM ratio

Based on log and Fibre Channel trace analysis, one cause of this issue is that the Command Queue is exhausted.

If these symptoms are exhibited, the following best practices work around the problem.

Note: This section does not apply to SUN StorageTek 9985V or 9990V arrays.

Perform one of the following to work around the issue:

  • Reduce the Queue Depth on each Server's HBA connected to that array.
  • Reduce the number of ESX Server hosts sharing a given LUN or set of LUNs on that array.
  • Reduce the number of virtual machines per LUN so that the possible number of hosts accessing a given LUN is reduced. For example, if you limit the number of virtual machines per LUN to 4, the highest number of ESX hosts running these virtual machines does not exceed 4 in a DRS/HA cluster.

    Note: This option may result in using more LUNs of smaller sizes. The maximum number of LUNs accessible by a VMware ESX 3.x or ESX 4.0 host is 256.
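
The trade-off in the note above can be checked with simple arithmetic. The following is a minimal sketch, not a VMware tool: the function name luns_needed and the sample figure of 100 virtual machines are hypothetical, and the estimate ignores any other LUNs already presented to the host.

```shell
# Estimate how many LUNs are needed when capping virtual machines per LUN,
# and compare the result with the 256-LUN limit stated in the note above.
luns_needed() {
    total_vms=$1
    vms_per_lun=$2
    # Integer ceiling division: a partially filled group still needs a LUN.
    echo $(( (total_vms + vms_per_lun - 1) / vms_per_lun ))
}

MAX_LUNS=256
needed=$(luns_needed 100 4)   # hypothetical: 100 virtual machines, 4 per LUN
if [ "$needed" -gt "$MAX_LUNS" ]; then
    echo "$needed LUNs needed, which exceeds the limit of $MAX_LUNS"
else
    echo "$needed LUNs needed, within the limit of $MAX_LUNS"
fi
```

With these sample figures the estimate is 25 LUNs, well under the limit.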

Recommended Queue Depth setting on the HBA

Number of hosts sharing a LUN    Queue Depth Value
8                                2
4                                4
2                                8

The fewer hosts that share a given LUN, the higher the queue depth setting can be.
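
The table can be expressed as a small lookup, which may help when scripting the change across several hosts. This is only a sketch: the function name queue_depth_for_hosts is hypothetical, and it deliberately covers just the three published rows rather than extrapolating to other host counts.

```shell
# Look up the recommended HBA queue depth for a given number of hosts
# sharing a LUN. Only the three rows published in the table are covered;
# any other host count is rejected rather than extrapolated.
queue_depth_for_hosts() {
    case "$1" in
        8) echo 2 ;;
        4) echo 4 ;;
        2) echo 8 ;;
        *) echo "no recommendation in the table for $1 hosts" >&2
           return 1 ;;
    esac
}

queue_depth_for_hosts 4
```

For host counts not listed in the table, consult VMware or SUN Storage support rather than guessing a value.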

Note: VMware has not received similar reports for SUN StorageTek 9985V or 9990V arrays, and the above does not apply to these models. For them, the HBA driver's default Queue Depth is sufficient.

Changing the Queue Depth

Run the following commands at the ESX Server Console (or the RCLI for VMware ESXi 3.5).

Note: These procedures require rebooting the server.

QLogic HBAs

  1. Run the following command to identify the HBA's driver name:

    # vmkload_mod -l | grep qla

    The output is similar to:

    qla2300_707_vmw

  2. Substitute the <driver_name> parameter below with the name from the above output. Substitute the nn parameter with the Queue Depth value determined above.

    # esxcfg-module -s "ql2xmaxqdepth=nn" <driver_name>
    # esxcfg-boot -b
    # reboot

Emulex HBAs

  1. Run the following command to identify the HBA's driver name:

    # vmkload_mod -l | grep lpfcdd

    The output appears similar to:

    lpfcdd_7xx

  2. Substitute the <driver_name> parameter below with the name from the above output. Substitute the nn parameter with the Queue Depth value determined above.

    # esxcfg-module -s "lpfc0_lun_queue_depth=nn" <driver_name>

    If you have two Emulex HBAs in the server, the command is:

    # esxcfg-module -s "lpfc0_lun_queue_depth=nn lpfc1_lun_queue_depth=nn" <driver_name>

    In either case, rebuild the boot configuration and reboot:

    # esxcfg-boot -b
    # reboot
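
The Emulex option string above repeats one entry per HBA, so it can be generated rather than typed by hand. The sketch below only prints the command as a dry run. The function name emulex_qdepth_options is hypothetical, and continuing the lpfcN numbering beyond the two HBAs shown above is an assumption.

```shell
# Build the option string for esxcfg-module for a given number of Emulex
# HBAs, following the lpfc<N>_lun_queue_depth pattern shown above.
# Assumption: the numbering continues (lpfc2, lpfc3, ...) for more than two HBAs.
emulex_qdepth_options() {
    hba_count=$1
    depth=$2
    opts=""
    i=0
    while [ "$i" -lt "$hba_count" ]; do
        # Separate entries with a single space; no leading space on the first.
        opts="${opts:+$opts }lpfc${i}_lun_queue_depth=${depth}"
        i=$((i + 1))
    done
    echo "$opts"
}

# Dry run: print the command instead of executing it.
# lpfcdd_7xx is the sample driver name from the output above.
echo "esxcfg-module -s \"$(emulex_qdepth_options 2 4)\" lpfcdd_7xx"
```

Printing the command first makes it possible to review the option string before running it on the ESX Server Console.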
Items specific to Virtual LUNs
 
SUN StorageTek 9985 and 9990 arrays provide access to physical LUNs (internal to the array) as well as Virtual LUNs, whose physical LUNs are hosted on other arrays behind them (Externalized).
  • A physical LUN that is represented by a Virtual LUN must be on Tier 1 physical disks (Fibre SCSI Disks or Fibre SAS Disks) rated at a minimum of 10K RPM to provide the best I/O performance.
  • LUSE LUNs should NOT be used with Virtual LUNs.

Keywords

SUN, StorageTek, 9990, 9985, SCSI Reservation, Conflict Retries
