System does not clear RAID controller cache during shutdown
Details
On systems with ESX 4.0 or ESXi 4.0 installed, if the RAID controller battery backup unit is completely discharged after a shutdown, or if a locally attached disk is removed and not returned to the system, data corruption might occur because the RAID controller cache is not cleared while shutting down the server.
Note: The typical battery discharge period is approximately 48 hours.
You may experience these symptoms, depending on what is not flushed from the RAID cache:
- Failure to boot
- Loss of customized configuration
- Data loss
- A message indicating that the cache is cleared (or being flushed) during the power-on self-test (POST) after rebooting ESX 4.0 or ESXi 4.0 systems. This indicates that the write-back cache was not flushed to disk during the previous system shutdown.
This issue can only occur when all of the following conditions are true:
- RAID controllers use these drivers (the attached script helps detect the affected drivers):
  - megaraid2
  - megaraid_sas
  - aacraid
- The cache policy in the controllers is set to write-back cache
Note: To check if the cache policy is set to write-back cache, boot the host machine and check the controller BIOS. For more information, refer to the documentation for the specific RAID controller.
- The RAID controller battery backup unit is completely discharged
- SAS or SCSI storage is directly attached to the server
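The driver condition in the list above amounts to a simple membership test. The following is a minimal sketch of that test only; it is not the attached chkdrvr.pl script. The function name is_affected_driver is hypothetical, and obtaining the name of the driver in use on a given host is outside its scope.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the given driver name is
# one of the three drivers this article lists as affected.
is_affected_driver() {
    case "$1" in
        megaraid2|megaraid_sas|aacraid) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: classify a few driver names. "mptsas" is used here only as an
# illustrative name that is not in this article's list.
for drv in megaraid_sas aacraid mptsas; do
    if is_affected_driver "$drv"; then
        echo "$drv: listed as affected"
    else
        echo "$drv: not listed"
    fi
done
```

A matching driver is only one of the conditions; the cache policy, battery state, and storage attachment conditions must also hold before the issue can occur.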
Solution
ESX and ESXi 4.0 patches that address this issue are now available: VMware ESX 4.0, Patch ESX400-200907401-BG and VMware ESXi 4.0, Patch ESXi400-200907401-BG.
Before downloading the appropriate patch, first ensure that this article applies to your system.
Determining if this article applies to your system
To check if the RAID controller is using any of the affected drivers with local storage attached, use the script attached to this article. It is a Perl script for use in:
- vSphere Management Assistant (vMA) 4.0 virtual appliance that is configured to manage ESX 4.0 and/or ESXi 4.0 hosts
- ESX 4.0 Service Console
- vSphere Command-Line Interface (vCLI) 4.0 (on Windows and Linux)
Note: The script does not work in the ESXi 4.0 Tech Support Mode shell.
Note: The script does not check for installed patches. It only checks the hardware for applicability to this article. Running this script will return the same results with or without the patch installed.
Using the script with the ESX 4.0 Service Console
To use the script with the ESX 4.0 Service Console:
- Download the attached script file.
- Transfer the file to a VMFS volume accessible by the ESX 4.0 hosts to be tested. If no shared storage with a VMFS volume is available, transfer the file to each ESX 4.0 host's local disk (for example, the /root or /tmp directory).
- On the ESX Service Console, expand the file with the commands:
# cd /vmfs/volumes/<volume-name> (or to the location where you stored the file)
# tar zxvf chkdrvr.tgz
- Run the script:
# ./chkdrvr.pl
The output indicates if this article is applicable to your host. If it is applicable, download the appropriate patch. If it is not applicable, do not proceed further.
Using the script with vMA 4.0
Notes:
- For details about installing the vMA 4.0 Virtual Appliance, see the vMA 4.0 Release Notes.
- For details about configuring and using the vMA 4.0 Virtual Appliance, see the vMA 4.0 Guide.
- Using the script with passwords that contain special characters requires escape characters. For example, type Pa\\\$\\\$w0rd instead of Pa$$w0rd (three backslashes before each special character).
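The escaping rule in the last note can be applied mechanically. The following is a minimal sketch, assuming the only special character in the password is the dollar sign shown in the example; the function name escape_dollars is hypothetical, and other special characters are not handled.

```shell
#!/bin/sh
# Hypothetical helper: print the password with three backslashes inserted
# before every "$". Other special characters are NOT handled here.
escape_dollars() {
    # Single quotes keep the shell from altering the sed program.
    # In the sed replacement, each "\\" yields one literal backslash,
    # so six backslashes produce the three that the note asks for.
    printf '%s\n' "$1" | sed 's/\$/\\\\\\$/g'
}

# Single quotes pass the raw password to the function unmodified.
escape_dollars 'Pa$$w0rd'
```

For the sample password this prints Pa\\\$\\\$w0rd, the form given in the note.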
To use the script with vMA 4.0:
- Download the attached script file.
- Transfer it to the vMA 4.0 Appliance.
- Expand the file with the command:
# tar zxvf chkdrvr.tgz
To use the script with vCenter Server as the managed target in vMA 4.0 Virtual Appliance:
- Log on to the vMA 4.0 Virtual Appliance as vi-admin.
- Register the vCenter server as the managed target with the command:
# sudo vifp addserver <vCenter-hostname>
- Enter your vCenter's user name and password.
If the authentication is successful, run the command:
# vifpinit <vCenter-hostname>
The prompt shows the current context is now the vCenter host name or IP address. For example:
[vi-admin@vma4-mk ~][vcenter04.acme.com]$
If you receive an error and the prompt does not look like the example, do not proceed until you verify that you are using the correct credentials for the vCenter Server you attempted to register.
- Run the script:
# ./chkdrvr.pl --vihost <host-name>
where <host-name> is the ESX/ESXi host name as it appears in the vCenter inventory.
The output indicates if this article is applicable to your host. If it is applicable, download the appropriate patch. If it is not applicable, do not proceed further.
- Repeat for each ESX/ESXi host managed by the vCenter Server 4.0 to which you are currently connected.
To use the script with ESX/ESXi 4.0 hosts as the managed targets in vMA 4.0 Virtual Appliance:
- Log on to the vMA 4.0 Virtual Appliance as vi-admin.
- Register the ESX/ESXi 4.0 hosts as the managed targets so that you can use the FastPass facility provided by vMA 4.0. Run the command:
# sudo vifp addserver <ESX/ESXi-hostname>
Enter your ESX/ESXi 4.0 host's user name (with root privilege) and password.
- Repeat the previous step for each ESX/ESXi 4.0 host you manage from this vMA 4.0 Virtual Appliance.
- To verify the list of registered hosts, run the command:
# vifp listservers
- Change the context to the first server that you want to check for this issue. Run the command:
# vifpinit <ESX/ESXi 4.0 host name>
The prompt shows the current context is now the ESX/ESXi host name. For example:
[vi-admin@vma4-mk ~][esxi02.acme.com]$
- Run the script without any arguments:
# ./chkdrvr.pl
Note: If you do not use the FastPass facility, use the following syntax to run the script on vMA 4.0:
# ./chkdrvr.pl --server <ESX/ESXi 4.0 host name> --username <user with root privilege> --password <password>
- Repeat the previous two steps for each ESX/ESXi host to be checked.
- The output indicates if this article is applicable to your host. If it is applicable, download the appropriate patch. If it is not applicable, do not proceed further.
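When many hosts are registered, it can help to write out the per-host commands from the procedure above before working through them. The following is a minimal sketch that only prints the commands and executes nothing; the function name print_check_commands is hypothetical, and esxi03.acme.com is a placeholder host name that does not come from this article.

```shell
#!/bin/sh
# Hypothetical helper: for each host name given as an argument, print the
# two commands from the procedure above (set the context, run the script).
# Nothing is executed; the output is a checklist to run by hand.
print_check_commands() {
    for host in "$@"; do
        echo "vifpinit $host"
        echo "./chkdrvr.pl"
    done
}

# Placeholder host names; substitute the names shown by "vifp listservers".
print_check_commands esxi02.acme.com esxi03.acme.com
```

Printing rather than running keeps the sketch independent of how the vMA session handles context changes, which this article does not describe for scripted use.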
Using the script on vCLI 4.0
Note: On Linux vCLI, using passwords that contain special characters requires escape characters. For example, type Pa\\\$\\\$w0rd instead of Pa$$w0rd (three backslashes before each special character). This is not required when using vCLI on Windows.
To use the script on vCLI 4.0:
- Download the attached script file to the system where vCLI 4.0 is installed.
- Expand the file.
  - On Linux, use the command:
# tar zxvf chkdrvr.tgz
  - On Windows, use a tool such as WinZip or WinRAR, then move the expanded file to:
%ProgramFiles%\VMware\VMware vSphere CLI\bin
Notes:
  - Change to the above directory before proceeding.
  - When you extract the file, you may need to rename it to chkdrvr.pl.
- Run the script.
  - On Linux, run the command:
./chkdrvr.pl --server <ESX/ESXi host name> --username <user with root privilege> --password <password>
  - On Windows, run the command:
chkdrvr.pl --server <ESX/ESXi host name> --username <user with root privilege> --password <password>
- The output indicates if this article is applicable to your host. If it is applicable, download the appropriate patch. If it is not applicable, do not proceed further.
Preventing this issue
To prevent this issue, download and install VMware ESXi 4.0, Patch ESXi400-200907401-BG or VMware ESX 4.0, Patch ESX400-200907401-BG.
If, however, you plan on shutting down the system for an extended period of time before installing the patch, follow this procedure to prevent this issue:
- Reboot the ESX/ESXi host. For more information, see ESX 4.0 and ESXi 4.0 shutdown and reboot commands (1013193).
- During the power-on self-test (POST), press the hot key for Boot Device Order or equivalent. This allows the RAID controller's BIOS to load (which flushes the cache if needed). The system then pauses and displays the list of boot devices.
- Power off the system using the power switch.
Note: Do not remove the direct-attached disks from one system and attach them to another identical system's RAID controller. The cache content on the original system's RAID controller may not have been flushed to the disks if you were unable to follow the above procedure.
Note: If the system's RAID controller is not equipped with a backup battery, and the RAID controller supports "Write-Back" cache even in the absence of a backup battery, DO NOT power off your system until you have followed the above three steps. Otherwise, the "Write-Through" cache option must be used instead to prevent data corruption in case of accidental power failure, regardless of the issue documented in this article.
Attachments
- KB Article: 1012794
- Updated: Nov 3, 2009
- Products: VMware ESX
- Product Versions:
  - VMware ESX 4.0.x
  - VMware ESXi 4.0.x Embedded
  - VMware ESXi 4.0.x Installable

