Using esxtop to identify storage performance issues for ESX / ESXi (multiple versions) (1008205)

Details

This article provides information about esxtop and latency statistics that can be used when troubleshooting performance issues with SAN-connected storage (Fibre Channel or iSCSI). 

Note: In ESXi 5.x, you may see messages indicating that performance has deteriorated. For more information, see Storage device performance deteriorated (2007236).

Solution




The interactive esxtop utility provides I/O metrics for the various devices attached to a VMware ESX host.

Configuring monitoring using esxtop

 To monitor storage performance per HBA:
  1. Start esxtop by typing esxtop at the command line.
  2. Press d to switch to disk view (HBA mode).
  3. To view the entire Device name, press SHIFT + L and enter 36 in Change the name field size.
  4. Press f to modify the fields that are displayed.
  5. Press b, c, d, e, h, and j to toggle the fields and press Enter.
  6. Press s and then 2 to alter the update time to every 2 seconds and press Enter.
  7. See Analyzing esxtop columns for a description of relevant columns.
Note: These options are available only in VMware ESX 3.5 and later.

To monitor storage performance on a per-LUN basis:
  1. Start esxtop by typing esxtop from the command line.
  2. Press u to switch to disk view (LUN mode).
  3. Press f to modify the fields that are displayed.
  4. Press b, c, f, and h to toggle the fields and press Enter.
  5. Press s and then 2 to alter the update time to every 2 seconds and press Enter.
  6. See Analyzing esxtop columns for a description of relevant columns.

To increase the width of the device field in esxtop to show the complete naa id:

  1. Start esxtop by typing esxtop at the command line.
  2. Press u to switch to the disk device display.
  3. Press L to change the name field size.

    Note: Be sure to use an uppercase L.

  4. Enter the value 36 to display the complete naa identifier.

To monitor storage performance on a per-virtual machine basis:
  1. Start esxtop by typing esxtop at the command line.
  2. Press v to switch to disk view (virtual machine mode).
  3. Press f to modify the fields that are displayed.
  4. Press b, d, e, h, and j to toggle the fields and press Enter.
  5. Press s and then 2 to alter the update time to every 2 seconds and press Enter.
  6. See Analyzing esxtop columns for a description of relevant columns.

Analyzing esxtop columns

Refer to this table for relevant columns and descriptions of these values:

Column      Description
CMDS/s      The total number of commands per second. This includes IOPS (Input/Output Operations Per Second) and other SCSI commands, such as SCSI reservations, locks, vendor string requests, and unit attention commands, being sent to or coming from the device or virtual machine being monitored. In most cases CMDS/s = IOPS, unless there are a lot of metadata operations (such as SCSI reservations).
DAVG/cmd    The average response time in milliseconds per command being sent to the device.
KAVG/cmd    The amount of time the command spends in the VMkernel.
GAVG/cmd    The response time as it is perceived by the guest operating system. This number is calculated with the formula: DAVG + KAVG = GAVG.

These columns cover both reads and writes, whereas xAVG/rd is for reads only and xAVG/wr is for writes only. The combined value of these columns is the best way to monitor performance, but a high read or write response time may indicate that the read or write cache is disabled on the array. All arrays perform differently; however, DAVG/cmd, KAVG/cmd, and GAVG/cmd should not exceed 10 milliseconds (ms) for sustained periods of time.
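As a rough illustration of the formula and the 10 ms guideline above, the following Python sketch adds DAVG and KAVG to obtain GAVG and checks whether a series of readings stays above the threshold for several samples in a row. The function names and the choice of five consecutive samples as "sustained" are assumptions made for this example; this article does not define how long a sustained period is.

```python
THRESHOLD_MS = 10.0  # guideline value quoted in this article


def gavg(davg_ms, kavg_ms):
    """Guest-perceived latency, per the formula DAVG + KAVG = GAVG."""
    return davg_ms + kavg_ms


def sustained_breach(values_ms, min_consecutive=5, threshold_ms=THRESHOLD_MS):
    """Return True if values_ms exceeds threshold_ms for at least
    min_consecutive samples in a row.

    min_consecutive is a hypothetical definition of "sustained";
    pick a value that suits the sampling interval in use.
    """
    run = 0
    for value in values_ms:
        # Extend the current run on a breach, otherwise start over.
        run = run + 1 if value > threshold_ms else 0
        if run >= min_consecutive:
            return True
    return False


if __name__ == "__main__":
    # Readings noted by hand from esxtop at a 2-second update interval.
    davg = [4.0, 12.0, 11.5, 13.0, 12.2, 14.1]
    kavg = [0.5, 0.5, 1.0, 0.5, 0.5, 1.0]
    guest = [gavg(d, k) for d, k in zip(davg, kavg)]
    print("GAVG sustained above threshold:", sustained_breach(guest))
```

With a 2-second update interval, five consecutive samples correspond to roughly 10 seconds of elevated latency. A single spike resets the run counter, so brief bursts are not reported.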

Note: VMware ESX 3.0.x does not include direct functionality to monitor individual LUNs or virtual machines using esxtop. Inactive LUNs lower the average for DAVG/cmd, KAVG/cmd, and GAVG/cmd. These values are also visible from the vCenter Server performance charts. For more information, see the Performance Charts section in the Basic System Administration Guide.
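To see why inactive LUNs can hide a problem when only an aggregate figure is available, consider the small Python sketch below. It assumes a simple unweighted mean across LUNs, which is an assumption for illustration only; this article does not describe how esxtop actually aggregates the values, and the per-LUN numbers are made up.

```python
def mean_latency_ms(per_lun_ms):
    """Unweighted mean of per-LUN latency values (illustrative only)."""
    return sum(per_lun_ms) / len(per_lun_ms)


# Hypothetical readings: one busy LUN at 25 ms and three idle LUNs at 0 ms.
busy_only = [25.0]
with_idle = [25.0, 0.0, 0.0, 0.0]

print(mean_latency_ms(busy_only))  # 25.0
print(mean_latency_ms(with_idle))  # 6.25
```

Under this assumption the aggregate drops below the 10 ms guideline even though the one active LUN is well above it, which is why the per-LUN view is preferable where it is available.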

If you experience high latency, investigate the current performance metrics and running configuration of the switches and the SAN targets. Check for errors or log entries that may suggest a delay in operations being sent, received, and acknowledged. This includes the array's ability to process I/O given its spindle count, and the array's ability to handle the load presented to it.

If the response time increases to over 5000 ms (or 5 seconds), VMware ESX will time out the command and abort the operation. These events are logged; abort messages and other SCSI errors can be reviewed in these logs:
  • ESX 3.5 and 4.x – /var/log/vmkernel
  • ESXi 3.5 and 4.x – /var/log/messages
  • ESXi 5.x – /var/log/vmkernel.log
The type of storage logging you may see in these files depends on the configuration of the server. You can find the value of these options by navigating to Host > Configuration > Advanced Settings > SCSI > SCSI.Log* or SCSI.Print*.
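The log locations listed above can be captured in a small lookup, together with a keyword filter for reviewing a log file. The following Python sketch is one possible approach, not a VMware tool: the function names are hypothetical, and the default keyword "abort" is only a guess at text that may appear in the relevant messages, since the exact wording depends on the version and the logging options configured on the host.

```python
from pathlib import Path

# Log locations as listed in this article, keyed by (product, version).
LOG_PATHS = {
    ("ESX", "3.5"): "/var/log/vmkernel",
    ("ESX", "4.x"): "/var/log/vmkernel",
    ("ESXi", "3.5"): "/var/log/messages",
    ("ESXi", "4.x"): "/var/log/messages",
    ("ESXi", "5.x"): "/var/log/vmkernel.log",
}


def matching_lines(lines, keywords=("abort",)):
    """Yield lines that contain any keyword, ignoring case.

    The default keyword is an assumption; adjust it to match the
    messages actually seen in your environment.
    """
    lowered = [keyword.lower() for keyword in keywords]
    for line in lines:
        text = line.lower()
        if any(keyword in text for keyword in lowered):
            yield line.rstrip("\n")


def scan_log(product, version, keywords=("abort",)):
    """Read the log for the given product and version and return matches."""
    path = Path(LOG_PATHS[(product, version)])
    with path.open(errors="replace") as handle:
        return list(matching_lines(handle, keywords))


if __name__ == "__main__":
    for entry in scan_log("ESXi", "5.x"):
        print(entry)
```

The filter is kept separate from the file handling so it can also be applied to a copy of the log that has been transferred to another machine.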

Additional Information

You can also collect performance snapshots using vm-support. For more information, see Collecting performance snapshots using vm-support (1967).
