Connectivity to a VMFS5 datastore is lost when using VAAI ATS heartbeat (2113956)

Symptoms

While using the VAAI ATS heartbeat in your environment, you experience these events:
  • An ESXi 5.5 Update 2 or ESXi 6.0 host loses connectivity to a VMFS5 datastore.

  • In the /var/run/log/vobd.log file and in vCenter Server events, you see the VOB message:

    Lost access to volume <uuid><volume name> due to connectivity issues. Recovery attempt is in progress and the outcome will be reported shortly

  • In the /var/run/log/vmkernel.log file, you see the message:

    ATS Miscompare detected between test and set HB images at offset XXX on vol YYY

  • You see error messages indicating an ATS miscompare similar to this in /var/log/vmkernel.log:

    2015-11-20T22:12:47.194Z cpu13:33467)ScsiDeviceIO: 2645: Cmd(0x439dd0d7c400) 0x89, CmdSN 0x2f3dd6 from world 3937473 to dev "naa.50002ac0049412fa" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x1d 0x0.

  • You may also see:

    • Hosts disconnecting from vCenter Server.
    • Virtual machines hanging on I/O operations.

Note: These symptoms are seen in connection with the use of VAAI ATS heartbeat with storage arrays supplied by several different vendors.
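
The vmkernel.log symptom above can be checked for with a short script. This is a minimal sketch, not a VMware-supplied tool: the function names are hypothetical, and the pattern assumes the log line format shown in the example above. It matches SCSI opcode 0x89 (COMPARE AND WRITE, the command behind ATS) failing with sense key 0xe (MISCOMPARE) and additional sense code 0x1d 0x0.

```shell
#!/bin/sh
# Hypothetical helpers: function names are illustrative, not part of ESXi.

# Print vmkernel log lines that look like a failed ATS (COMPARE AND WRITE) command.
# $1: path to a vmkernel log file, for example /var/run/log/vmkernel.log
find_ats_miscompares() {
    grep -E 'Cmd\(0x[0-9a-f]+\) 0x89,.*Valid sense data: 0xe 0x1d 0x0' "$1"
}

# Count matching lines per device so that affected LUNs stand out.
# $1: path to a vmkernel log file
count_ats_miscompares_by_device() {
    find_ats_miscompares "$1" |
        sed -n 's/.*to dev "\([^"]*\)".*/\1/p' |
        sort | uniq -c
}
```

For example, `count_ats_miscompares_by_device /var/run/log/vmkernel.log` would list each device identifier with the number of matching failures. A match is only an indicator; it should be read together with the "Lost access to volume" events described above.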

Purpose

This article explains how to resolve this issue by disabling ATS heartbeat in the ESXi kernel, which reverts heartbeat-related activity to the legacy method.

Cause

A change in the VMFS heartbeat update method was introduced in ESXi 5.5 Update 2, to help optimize the VMFS heartbeat process. Whereas the legacy method involves plain SCSI reads and writes with the VMware ESXi kernel handling validation, the new method offloads the validation step to the storage system. This is similar to other VAAI-related offloads.

This optimization significantly increases the volume of ATS commands that the ESXi kernel issues to the storage system, which in turn increases the load on the storage system. Under certain circumstances, a VMFS heartbeat using ATS may fail with a false ATS miscompare, which causes the ESXi kernel to reverify its access to VMFS datastores. This leads to the "Lost access to volume" messages.

Note:
  • For VMFS5 datastores, the ATS heartbeat setting is on by default.
  • For VMFS3 datastores, the ATS heartbeat setting is off by default.

Resolution

To resolve this issue, you can revert the heartbeat-related activity to the legacy method by disabling ATS heartbeat in the ESXi kernel. If you suspect that ATS heartbeat may be causing an issue with array workload or I/O responsiveness, engage with your storage vendor to determine whether they recommend disabling this function.

To revert the heartbeat to non-ATS mechanisms, disable this feature on ALL hosts sharing the datastore where these errors are seen.

Note: These operations can be safely performed online, while the storage is in use.

For VMFS5 datastores:

To disable ATS heartbeat, run either the CLI command or the PowerCLI command:
  • Command line:

    # esxcli system settings advanced set -i 0 -o /VMFS3/UseATSForHBOnVMFS5

  • PowerCLI:

    Get-AdvancedSetting -Entity VMHost-Name -Name VMFS3.UseATSForHBOnVMFS5 | Set-AdvancedSetting -Value 0 -Confirm:$false

To enable ATS heartbeat, run either the CLI command or the PowerCLI command:
  • Command line:

    # esxcli system settings advanced set -i 1 -o /VMFS3/UseATSForHBOnVMFS5

  • PowerCLI:

    Get-AdvancedSetting -Entity VMHost-Name -Name VMFS3.UseATSForHBOnVMFS5 | Set-AdvancedSetting -Value 1 -Confirm:$false

For VMFS3 datastores:

To disable ATS heartbeat, run either the CLI command or the PowerCLI command:
  • Command line:

    # esxcli system settings advanced set -i 0 -o /VMFS3/UseATSForHBOnVMFS3

  • PowerCLI:

    Get-AdvancedSetting -Entity VMHost-Name -Name VMFS3.UseATSForHBOnVMFS3 | Set-AdvancedSetting -Value 0 -Confirm:$false

To enable ATS heartbeat, run either the CLI command or the PowerCLI command:
  • Command line:

    # esxcli system settings advanced set -i 1 -o /VMFS3/UseATSForHBOnVMFS3

  • PowerCLI:

    Get-AdvancedSetting -Entity VMHost-Name -Name VMFS3.UseATSForHBOnVMFS3 | Set-AdvancedSetting -Value 1 -Confirm:$false

Notes:
  • This change takes effect immediately without reboot.
  • The root node of these options is /VMFS3 regardless of the VMFS version. The last character of the option matches the corresponding VMFS version.
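
Because the setting must be changed on every host that shares the datastore, the command-line step can be wrapped in a loop. The following is a minimal sketch under stated assumptions: SSH access as root is enabled on each host, the host names (esx01, esx02) are placeholders, and the function names and the DRY_RUN variable are hypothetical conveniences introduced here. Only the esxcli command itself comes from this article.

```shell
#!/bin/sh
# Hypothetical helpers: function names and DRY_RUN are illustrative only.

# Build the esxcli command that sets the ATS heartbeat option.
# $1: VMFS version (3 or 5)
# $2: desired value (0 = legacy heartbeat, 1 = ATS heartbeat)
ats_hb_command() {
    case "$1" in
        3|5) ;;
        *) echo "unsupported VMFS version: $1" >&2; return 1 ;;
    esac
    case "$2" in
        0|1) ;;
        *) echo "value must be 0 or 1: $2" >&2; return 1 ;;
    esac
    echo "esxcli system settings advanced set -i $2 -o /VMFS3/UseATSForHBOnVMFS$1"
}

# Run the command on each host given after the version and value.
# With DRY_RUN=1 the ssh invocations are printed instead of executed.
set_ats_hb_on_hosts() {
    version=$1; value=$2; shift 2
    cmd=$(ats_hb_command "$version" "$value") || return 1
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh root@$host $cmd"
        else
            # Assumes SSH is enabled on the host and root login is permitted.
            ssh "root@$host" "$cmd" || echo "failed on $host" >&2
        fi
    done
}
```

Running `DRY_RUN=1 set_ats_hb_on_hosts 5 0 esx01 esx02` first prints the commands without changing anything, which makes it easier to confirm the host list before applying the setting.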

To verify the current value of each option, run these commands:

For VMFS5 datastores:

Run this command:

# esxcli system settings advanced list -o /VMFS3/UseATSForHBOnVMFS5

You see output similar to:

Path: /VMFS3/UseATSForHBOnVMFS5
Type: integer
Int Value: 0 <--- check this value
Default Int Value: 1
Min Value: 0
Max Value: 1
String Value:
Default String Value:
Valid Characters:
Description: Use ATS for HB on ATS supported VMFS5 volumes


For VMFS3 datastores:

Run this command:

# esxcli system settings advanced list -o /VMFS3/UseATSForHBOnVMFS3

You see output similar to:

Path: /VMFS3/UseATSForHBOnVMFS3
Type: integer
Int Value: 0 <--- check this value
Default Int Value: 0
Min Value: 0
Max Value: 1
String Value:
Default String Value:
Valid Characters:
Description: Use ATS for HB on ATS supported VMFS3 volumes
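
The check of the Int Value field can also be scripted. This is a minimal sketch that assumes the output layout shown above; the function names are hypothetical. The parser reads the listing from standard input and deliberately skips the Default Int Value line.

```shell
#!/bin/sh
# Hypothetical helpers: function names are illustrative only.

# Extract the current "Int Value" from `esxcli system settings advanced list`
# output supplied on stdin. Lines such as "Default Int Value" are ignored
# because they do not begin with "Int Value" after optional indentation.
current_int_value() {
    awk -F': *' '/^[[:space:]]*Int Value:/ {
        sub(/[[:space:]]+$/, "", $2)   # drop trailing whitespace
        print $2
        exit
    }'
}

# Succeed (exit status 0) if ATS heartbeat is disabled on this host.
# $1: VMFS version (3 or 5)
ats_hb_is_disabled() {
    value=$(esxcli system settings advanced list \
        -o "/VMFS3/UseATSForHBOnVMFS$1" | current_int_value)
    [ "$value" = "0" ]
}
```

On a host, `ats_hb_is_disabled 5 && echo "ATS heartbeat is off for VMFS5"` would report the state without requiring you to read the full listing.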


Reverting only the VMFS heartbeat activity is preferred over globally disabling VAAI or ATS on applicable storage systems. Although some storage systems require that heartbeat-related activity be reverted to the legacy method, they still handle non-heartbeat ATS commands normally. ATS continues to provide substantial performance and scale benefits even when it is not used for VMFS heartbeats.

Impact/Risks

Disabling ATS heartbeat results in the ESXi host using plain SCSI reads and writes to update its heartbeat on VMFS datastores.

Disabling ATS heartbeat processing has this impact:
  • Acquiring a HB slot (starting heartbeat) – Not impacted by this change.
  • Periodic/routine heartbeat updates – Affected by this change.

Note: This change enables or disables use of the ATS primitive for creating or updating the VMFS heartbeat. It does not change the ATS primitive configuration itself. Changing the ATS primitive configuration would require a host reboot, but this change does not.

Additional Information

This issue is not limited to one vendor. If the datastores are on IBM Storwize and SAN Volume Controller, ATS heartbeat must be disabled per IBM's recommendation. For more information, see the IBM advisory Host Disconnects Using VMware vSphere 5.5.0 Update 2 and vSphere 6.0.

Note: The preceding link was correct as of October 22, 2015. If you find the link is broken, provide feedback and a VMware employee will update the link.

Update History

07-02-15 - mkhalil - Added a note in the "impact" section that this does not affect ATS primitive configuration.
08-02-16 - mkhalil - Removed ESXi 5.1 since this was introduced in 5.5 not earlier.
