VMware
 

Knowledge Base


Insight Manager may cause SCSI reservation conflicts

Symptoms

  • Excessive reservation conflicts (0xbad0022) across all LUNs, including hosts that do not have virtual machines running on them
  • In a boot from SAN configuration, the boot LUN (Service Console) becomes mounted as read-only
  • Repeated messages in the VMkernel logs:

    Jun 15 05:42:33 esxhost vmkernel: 11:13:30:24.439 cpu0:1036)BC: 810: FileIO failed with 0x0xbad0006(Limit exceeded)
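
One quick way to gauge how often these symptoms occur is to count the log lines that carry either status code quoted above. The following is a minimal sketch, not a VMware-supplied tool: the function name count_symptom_lines is hypothetical, and the default log path is the /var/log/vmkernel file referenced in this article.

```shell
#!/bin/sh
# Sketch only: count lines in a VMkernel log that contain either status
# code mentioned in this article.
#   0xbad0022 = reservation conflict
#   0xbad0006 = limit exceeded
# count_symptom_lines is a hypothetical helper name, not part of ESX.
count_symptom_lines() {
    log="${1:-/var/log/vmkernel}"
    # -c prints the number of matching lines; each -e adds a pattern.
    grep -c -e '0xbad0022' -e '0xbad0006' "$log"
}
```

Running count_symptom_lines with no argument on each host gives a rough comparison between hosts. A high count on a host that runs no virtual machines matches the first symptom listed above.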

Resolution

In this case, the SCSI reservation conflicts are caused by the Fibre Channel agent (cmafcad) and the cmahost agent that are part of HP Insight Manager for ESX. This was identified through the process of elimination.

 

The cmahost agent queries LUNs at regular intervals. The cmafcad agent also queries each LUN to determine whether it is an HP array. The cmafcad agent should be installed only in environments that use HP arrays exclusively.
 
The HP article c01519875 recommends disabling this agent unless you have one of the following arrays:
  • HP StorageWorks Modular Smart Array (MSA) 1000
  • HP StorageWorks Modular Smart Array (MSA) 1500
  • HP StorageWorks Modular Smart Array (MSA) 2000

Note: For more information, see the full HP article c01519875.

The query process requires a SCSI reservation on the LUN. If the reservation is not released in a timely manner, other ESX hosts in the cluster lose access to the LUN and log the errors shown above in /var/log/vmkernel.
 
When investigating SCSI reservations in conjunction with HPIM agents, the following recommendations apply:
  • Upgrade the HPIM agents to the latest available version. For more information on the latest supported agents, see http://www.vmware.com/support/esx25/doc/sys_mgmt_links.html.
  • If the issue persists, disable the Fibre Channel agent (cmafcad) and the cmahost agent of HP Insight Manager, and contact HP Support.

Disabling the cmafcad and cmahost agents

To disable the cmafcad and cmahost agents:
  1. Log in to the ESX host service console.
  2. Stop all management agents with the commands:

    # service hpasm stop

    # service hpsmhd stop

    Note: Stop the agents before making this change instead of issuing a service restart command afterward. The kill script for these agents checks the exclude list in the cma.conf file and, during a service restart, does not issue kill commands to processes that it does not expect to be running. As a result, the problematic processes continue to run and SCSI reservation conflicts persist until the processes are manually killed or the host is rebooted.


  3. Open the file /opt/compaq/cma.conf in a text editor.
  4. Add the line exclude cmahost cmahostd cmafcad to the top of the file.
  5. Save the file and exit the editor.
  6. Start the management agents on the host with the commands:

    # service hpasm start

    # service hpsmhd start
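
The file edit in steps 3 to 5, and the check for leftover processes described in the note, can be sketched as two small shell functions. This is an illustration under stated assumptions, not an HP or VMware script: the function names prepend_exclude and leftover_agents are hypothetical, the .bak backup copy is an added precaution, and the process names are assumed to match the three names in the exclude line.

```shell
#!/bin/sh
# Sketch only. Line to add, taken from step 4 of the procedure.
EXCLUDE_LINE='exclude cmahost cmahostd cmafcad'

# prepend_exclude FILE (hypothetical helper)
# Puts EXCLUDE_LINE at the top of FILE unless that exact line is already
# present. A copy of the original is kept as FILE.bak.
prepend_exclude() {
    conf="$1"
    # -x matches whole lines, -F treats the pattern as a fixed string.
    if grep -qxF "$EXCLUDE_LINE" "$conf"; then
        return 0
    fi
    cp -p "$conf" "$conf.bak" || return 1
    { printf '%s\n' "$EXCLUDE_LINE"; cat "$conf.bak"; } > "$conf"
}

# leftover_agents (hypothetical helper)
# Reads process names on standard input, one per line, and prints any
# that match the excluded agents. Assumes the process names are the same
# as the names in the exclude line.
leftover_agents() {
    grep -E '^(cmahost|cmahostd|cmafcad)$'
}
```

With the agents stopped as in step 2, prepend_exclude /opt/compaq/cma.conf performs the edit. After the agents are started again in step 6, ps -e -o comm= | leftover_agents prints nothing if none of the excluded agents is running.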

Boot from SAN

This issue is more serious in boot from SAN environments. When the cmafcad agent sends a query down a non-active path to the boot LUN, the array initiates a LUN ownership transfer to the passive controller. Because service console I/O is still going down the active path when the ownership transfer occurs, the service console loses access to its disk. When access to the boot LUN is lost, the ESX host remounts the file system as read-only.
 
A reboot is required to regain read-write access to the boot LUN.
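
To confirm whether the service console root file system has been remounted as read-only, the mount options can be read from the mounts table. The following is a minimal sketch that assumes the standard Linux /proc/mounts layout (device, mount point, file system type, options). The function name root_mount_mode is hypothetical.

```shell
#!/bin/sh
# Sketch only: print "ro" or "rw" for the root file system.
# Reads a mounts table in /proc/mounts format:
#   device mountpoint fstype options dump pass
# root_mount_mode is a hypothetical helper name.
root_mount_mode() {
    mounts="${1:-/proc/mounts}"
    awk '$2 == "/" {
             # If "/" appears more than once, the last entry wins.
             mode = "rw"
             n = split($4, opts, ",")
             for (i = 1; i <= n; i++)
                 if (opts[i] == "ro") mode = "ro"
         }
         END { if (mode != "") print mode }' "$mounts"
}
```

Running root_mount_mode with no argument reads /proc/mounts on the host. A result of ro is consistent with the read-only remount described above.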
