Analyzing SCSI Reservation conflicts on VMware Infrastructure 3.x and vSphere 4.x
Details
ESX 3.0.x, ESX 3.5, or ESX 4.0 VMkernel logs contain the following messages:
SCSI: vm 1043: 5522: Sync CR at 64
SCSI: vm 1043: 5522: Sync CR at 48
SCSI: vm 1043: 5522: Sync CR at 32
SCSI: vm 1043: 5522: Sync CR at 16
SCSI: vm 1043: 5522: Sync CR at 0
WARNING: SCSI: 5532: Failing I/O due to too many reservation conflicts
WARNING: SCSI: 5628: status SCSI reservation conflict, rstatus 0xc0de01 for vmhba1:0:7. residual R 919, CR 0, ER 3
WARNING: J3: 1970: Error committing txn to slot 0: SCSI reservation conflict
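To see which LUNs are affected, it can help to tally the conflict warnings per device path. The following Python sketch is illustrative only: the pattern is derived solely from the sample warning shown above, the function name is hypothetical, and other ESX builds may format the message differently.

```python
import re
from collections import Counter

# Pattern based only on the sample warning in this article (an assumption);
# adjust it if your VMkernel log formats the line differently.
CONFLICT_RE = re.compile(
    r"status SCSI reservation conflict, rstatus (?P<rstatus>0x[0-9a-fA-F]+) "
    r"for (?P<device>vmhba\d+:\d+:\d+)"
)

def count_reservation_conflicts(lines):
    """Return a Counter mapping device path to number of conflict warnings."""
    counts = Counter()
    for line in lines:
        match = CONFLICT_RE.search(line)
        if match:
            counts[match.group("device")] += 1
    return counts

# Sample lines copied from the messages listed in this article.
sample = [
    "SCSI: vm 1043: 5522: Sync CR at 0",
    "WARNING: SCSI: 5532: Failing I/O due to too many reservation conflicts",
    "WARNING: SCSI: 5628: status SCSI reservation conflict, rstatus 0xc0de01 "
    "for vmhba1:0:7. residual R 919, CR 0, ER 3",
    "WARNING: J3: 1970: Error committing txn to slot 0: SCSI reservation conflict",
]

# Print devices ordered from most to fewest conflict warnings.
for device, hits in count_reservation_conflicts(sample).most_common():
    print(device, hits)
```

In practice the lines would come from a copy of the VMkernel log rather than an inline list. A device path that dominates the tally is a reasonable first candidate for the steps in the Solution section.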
Solution
There are two main categories of operation under which VMFS makes use of SCSI reservations.
The first category is for VMFS datastore-level operations. These include opening, creating, resignaturing, and expanding or extending a VMFS datastore.
The second category involves acquisition of locks. These are locks related to VMFS specific meta-data (called cluster locks) and locks related to files (including directories). Operations in the second category occur much more frequently than operations in the first category. The following are examples of VMFS operations that require locking metadata:
Creating a VMFS datastore
Expanding a VMFS datastore onto additional extents
Powering on a virtual machine
Acquiring a lock on a file
Creating or deleting a file
Creating a template
Deploying a virtual machine from a template
Creating a new virtual machine
Migrating a virtual machine with VMotion
Growing a file, for example, a Snapshot file or a thin provisioned Virtual Disk
If the VMkernel log contains the messages described in the Details section, follow this procedure:
Note: The list of arrays is not exhaustive and will be revised when other arrays are identified reporting these errors.
Follow these steps to address potential sources of the reservation conflicts:
Try to serialize operations on the shared LUNs. If possible, limit the number of operations on different hosts that require a SCSI reservation at the same time.
Increase the number of LUNs and try to limit the number of ESX hosts accessing the same LUN.
Reduce the number of snapshots, as they cause a lot of SCSI reservations.
Do not schedule backups (VCB or console based) in parallel from the same LUN.
Ensure that the Host Mode setting on the SAN array is correct.
Rescan after removing LUNs. LUNs removed from the system without rescanning can appear as locked.
When storage processors (SPs) fail to release the reservation, either the release request did not reach them (because of hardware, firmware, or pathing problems) or third-party applications running on the service console did not send the release. It is also possible that busy virtual machine operations are still holding the lock.
Note: Use of SATA disks is not recommended in high I/O configurations, or where SATA disks are in use and the above changes do not resolve the problem.
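The advice to serialize operations and to avoid parallel backups from the same LUN can be sketched as a simple staggering scheme. This Python example is a minimal illustration under stated assumptions, not a VMware tool: the function name, the job names, and the second device path (vmhba1:0:8) are hypothetical, and a "slot" stands for whatever scheduling window you use.

```python
from collections import defaultdict

def stagger_by_lun(jobs):
    """Assign each (job_name, lun) pair to the earliest slot in which
    no other job on the same LUN is already scheduled."""
    next_free_slot = defaultdict(int)  # lun -> next unused slot index
    schedule = defaultdict(list)       # slot index -> job names
    for job_name, lun in jobs:
        slot = next_free_slot[lun]
        schedule[slot].append(job_name)
        next_free_slot[lun] = slot + 1
    return [schedule[i] for i in sorted(schedule)]

# Hypothetical backup jobs; two of them target the same LUN.
jobs = [
    ("backup-vm01", "vmhba1:0:7"),
    ("backup-vm02", "vmhba1:0:7"),
    ("backup-vm03", "vmhba1:0:8"),
]

# Jobs on different LUNs may share a slot; jobs on the same LUN never do.
for slot, names in enumerate(stagger_by_lun(jobs)):
    print(slot, names)
```

Jobs on different LUNs are allowed to share a slot because the recommendation above concerns parallel work against the same LUN. A stricter variant could also cap the number of hosts active in each slot.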
If your array is not listed above and none of the above points eliminate the log messages, file a support request with VMware Support and note this KB Article ID in the problem description. For more information, see How to Submit a Support Request.