Increasing the disk timeout values for a Linux 2.6 virtual machine

Article ID: 310339

Products

VMware vSphere ESXi

Issue/Introduction

The symptoms described in this article occur when the guest operating system timeout values are exceeded for attached storage disks. This can be caused by an underlying storage problem or by brief transient pauses during normal operations (such as path failover). To accommodate transient events, VMware Tools increases the SCSI disk timeout to 60 seconds for Virtual Infrastructure 3 and to 180 seconds for vSphere 4 and higher.

If the default increase is insufficient for a given environment, or if an increase is desired and VMware Tools is not, or cannot be, installed, the SCSI device timeout within the Linux guest operating system can be increased manually.

Note: Increasing the SCSI timeout value is intended to alleviate slow failover and other transient storage conditions. It is not designed to mitigate prolonged underlying storage conditions such as APD/PDL. There may be underlying SAN storage problems that require addressing if:



Symptoms:
  • Inconsistent Linux operating system performance when disks are located on SAN-based datastores.
  • The Linux guest operating system may experience intermittent issues when stored on SAN-based datastores.
  • During a host storage path failover, a Linux guest operating system reports a system error or file system error, encounters a kernel panic, or becomes unresponsive.


Environment

VMware vSphere ESXi 5.1
VMware ESX 4.0.x
VMware ESXi 3.5.x Embedded
VMware ESX Server 3.5.x
VMware ESXi 4.1.x Embedded
VMware ESX Server 3.0.x
VMware ESXi 3.5.x Installable
VMware vSphere ESXi 6.5
VMware ESXi 4.0.x Embedded
VMware vSphere ESXi 5.0
VMware ESXi 4.0.x Installable
VMware vSphere ESXi 5.5
VMware vSphere ESXi 6.0
VMware ESX 4.1.x
VMware ESXi 4.1.x Installable

Resolution

Starting with Linux kernel 2.6.13, the timeout value for a Linux block device can be set using the sysfs interface. This procedure does not apply to kernels earlier than 2.6.13. Increasing the disk timeout value for every disk attached to the virtual machine can prevent the issue from recurring.
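
As a rough illustration of the kernel version requirement, the following shell sketch compares a kernel release string against 2.6.13. The function name kernel_supports_sysfs_timeout is hypothetical and is not part of VMware Tools or any Linux distribution. The check looks only at the version number, so treat the result as a hint rather than a guarantee.

```shell
#!/bin/sh
# Sketch only: kernel_supports_sysfs_timeout is a hypothetical helper name.
# It returns success (0) when the given kernel release is 2.6.13 or later.
kernel_supports_sysfs_timeout() {
    # Drop any suffix after the first "-", e.g. "2.6.18-92.el5" -> "2.6.18".
    release=${1%%-*}

    # Split the release on "." into major, minor, and patch fields.
    old_ifs=$IFS
    IFS=.
    set -- $release
    IFS=$old_ifs
    major=${1:-0}; minor=${2:-0}; patch=${3:-0}

    # Keep only leading digits so that a field such as "13+" compares as 13.
    major=${major%%[!0-9]*}; minor=${minor%%[!0-9]*}; patch=${patch%%[!0-9]*}
    : "${major:=0}" "${minor:=0}" "${patch:=0}"

    # Compare field by field against 2.6.13.
    if [ "$major" -ne 2 ]; then [ "$major" -gt 2 ]; return; fi
    if [ "$minor" -ne 6 ]; then [ "$minor" -gt 6 ]; return; fi
    [ "$patch" -ge 13 ]
}

# Report on the running kernel.
if kernel_supports_sysfs_timeout "$(uname -r)"; then
    echo "Kernel is 2.6.13 or later: the sysfs disk timeout should be available."
else
    echo "Kernel is earlier than 2.6.13: the sysfs disk timeout cannot be set this way."
fi
```

A more direct check on a running guest is to confirm that the timeout attribute file exists for the disk in question under /sys/block.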

  • Check the current timeout for every generic SCSI device in Linux sysfs using the command:

    find /sys/class/scsi_generic/*/device/timeout -exec grep -H . '{}' \;

  • The timeout value for an individual disk can be modified using the sysfs interface. For example:

    echo 180 > /sys/block/sdc/device/timeout

    Note: This change does not persist across reboots.

  • The VMware Tools installer creates a udev rule at /etc/udev/rules.d/99-vmware-scsi-udev.rules that sets the timeout for each VMware virtual disk device and reloads the udev rules so that it takes effect immediately. This rule is applied during each subsequent startup. For example, this is the udev rule from vSphere 4.x:

    # Redhat systems
    ACTION=="add", BUS=="scsi", SYSFS{vendor}=="VMware, " , SYSFS{model}=="VMware Virtual S", RUN+="/bin/sh -c 'echo 180 >/sys$DEVPATH/device/timeout'"

    # Debian systems
    ACTION=="add", SUBSYSTEMS=="scsi", ATTRS{vendor}=="VMware " , ATTRS{model}=="Virtual disk ", RUN+="/bin/sh -c 'echo 180 >/sys$DEVPATH/device/timeout'"

    # SuSE / Ubuntu systems
    ACTION=="add", SUBSYSTEMS=="scsi", ATTRS{vendor}=="VMware, " , ATTRS{model}=="VMware Virtual S", RUN+="/bin/sh -c 'echo 180 >/sys$DEVPATH/device/timeout'"
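
Where VMware Tools is not installed, the per-disk echo command shown earlier can be repeated for every disk. The following is a minimal sketch of that loop, not the method used by VMware Tools. It assumes the virtual disks appear as /sys/block/sd*, as in the sdc example. The function name set_disk_timeouts and its optional second argument (an alternate sysfs root, useful only for a dry run against a scratch directory) are hypothetical. Like the manual echo, the change does not persist across reboots, and writing to /sys requires root privileges.

```shell
#!/bin/sh
# Sketch only: set_disk_timeouts is a hypothetical helper name.
#   $1  timeout in seconds (for example 180)
#   $2  sysfs root, defaults to /sys (an assumption added for dry runs)
set_disk_timeouts() {
    seconds=$1
    sysfs_root=${2:-/sys}

    # Reject anything that is not a plain non-negative integer.
    case $seconds in
        ''|*[!0-9]*)
            echo "usage: set_disk_timeouts SECONDS [SYSFS_ROOT]" >&2
            return 2
            ;;
    esac

    # Assumption: SCSI disks are named sd* under <root>/block.
    for attr in "$sysfs_root"/block/sd*/device/timeout; do
        # If the glob matched nothing it stays literal, so skip non-files.
        [ -f "$attr" ] || continue
        if echo "$seconds" > "$attr"; then
            # Read the value back so the result is visible per disk.
            echo "$attr: $(cat "$attr")"
        else
            echo "could not write $attr" >&2
        fi
    done
}
```

For example, running set_disk_timeouts 180 as root applies a 180-second timeout to each matching disk and prints the value read back from each attribute file.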


Additional Information

For more information, see:

  • Storage path failover might cause kernel panic in Linux kernels if using a virtual LSILogic adapter (Parallel or SAS) (1010759)
  • Troubleshooting a virtual machine outage across multiple hosts connected to the same array