VMware
 

Knowledge Base

 

Using esxtop to identify storage performance issues

Details

This article describes how to use esxtop and its latency statistics to identify performance issues with SAN-connected storage (either FC or iSCSI).

Solution

The esxtop utility can be used to measure how much I/O is moving across various devices. It is interactive: pressing certain keys changes the view.
 

Configuring monitoring using esxtop

 
To monitor storage performance per HBA:
  1. Start esxtop by typing esxtop at the command line.
  2. Press d to switch to disk view (HBA mode).
  3. Press f to modify the fields that are displayed.
  4. Press b, c, d, e, h, and j to toggle the fields and press Enter.
  5. Press s, then 2 to alter the update time to every 2 seconds and press Enter.
  6. See Analyzing esxtop columns for a description of relevant columns.
To monitor storage performance per LUN:

Note: This option is only available in ESX 3.5 and later.
  1. Start esxtop by typing esxtop from the command line.
  2. Press u to switch to disk view (LUN mode).
  3. Press f to modify the fields that are displayed.
  4. Press b, c, f, and h to toggle the fields and press Enter.
  5. Press s, then 2 to alter the update time to every 2 seconds and press Enter.
  6. See Analyzing esxtop columns for a description of relevant columns. 
To monitor storage performance per virtual machine:

Note: This option is only available in ESX 3.5 and later.
  1. Start esxtop by typing esxtop at the command line.
  2. Press v to switch to disk view (virtual machine mode).
  3. Press f to modify the fields that are displayed.
  4. Press b, d, e, h, and j to toggle the fields and press Enter.
  5. Press s, then 2 to alter the update time to every 2 seconds and press Enter.
  6. See Analyzing esxtop columns for a description of relevant columns.
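
The interactive procedures above can also be approximated with a non-interactive capture. The following is a minimal sketch, not part of the original procedure: it only builds the command line for esxtop batch mode (-b), using the same 2-second update interval as step 5. The helper name build_esxtop_batch_command and the iteration count of 30 are illustrative assumptions.

```python
def build_esxtop_batch_command(delay_seconds=2, iterations=30):
    """Return the argument list for a non-interactive esxtop capture.

    Hypothetical helper for illustration; the defaults mirror the
    2-second update interval used in the interactive steps.
    """
    if delay_seconds < 1 or iterations < 1:
        raise ValueError("delay_seconds and iterations must be positive")
    # -b: batch mode, -d: delay between samples (seconds), -n: number of samples
    return ["esxtop", "-b", "-d", str(delay_seconds), "-n", str(iterations)]


command = build_esxtop_batch_command(delay_seconds=2, iterations=30)
print(" ".join(command))
```

Batch mode writes CSV to standard output, so the command is typically redirected to a file on the host and analyzed afterwards.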

Analyzing esxtop columns

 
The following table lists the relevant columns and a brief description of these values.
 
Column      Description
CMDS/s      The number of IOPS (Input/Output Operations Per Second) being sent to or coming from the device or virtual machine being monitored.
DAVG/cmd    The average response time in milliseconds per command being sent to the device.
KAVG/cmd    The amount of time the command spends in the VMkernel.
GAVG/cmd    The response time as it is perceived by the guest operating system. This number is calculated with the formula: DAVG + KAVG = GAVG.
 
These columns cover both reads and writes, whereas xAVG/rd covers reads only and xAVG/wr covers writes only. The combined values are the best way to monitor performance, but a high read or write response time may indicate that the read or write cache is disabled on the array.
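
The relationship between the latency columns (DAVG + KAVG = GAVG) can be checked with a small calculation. This is an illustrative sketch: the function names and the sample values of 8 ms and 2 ms are assumptions, not figures from a real host.

```python
def guest_latency_ms(davg_ms, kavg_ms):
    """GAVG/cmd as the sum of device (DAVG) and VMkernel (KAVG) latency."""
    return davg_ms + kavg_ms


def kernel_share(davg_ms, kavg_ms):
    """Fraction of the guest-observed latency that is spent in the VMkernel."""
    gavg_ms = guest_latency_ms(davg_ms, kavg_ms)
    return kavg_ms / gavg_ms if gavg_ms else 0.0


# Hypothetical sample values, in milliseconds per command.
gavg = guest_latency_ms(davg_ms=8.0, kavg_ms=2.0)
share = kernel_share(davg_ms=8.0, kavg_ms=2.0)
print(f"GAVG/cmd = {gavg:.1f} ms, VMkernel share = {share:.0%}")
```

Splitting GAVG this way shows whether most of the delay seen by the guest comes from the device side (DAVG) or from time spent in the VMkernel (KAVG).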
 
All arrays perform differently, but DAVG/cmd, KAVG/cmd, and GAVG/cmd should not exceed 10 milliseconds (ms). These values should not exceed 100 ms for a sustained period of time.
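
The two thresholds above can be applied to a series of latency samples as follows. This is a sketch under stated assumptions: the labels "ok", "elevated", and "high" are hypothetical, the sample values are invented, and the article does not define how long "sustained" is, so the code only reports the longest consecutive run above 100 ms.

```python
GUIDELINE_MS = 10.0         # guideline value from the article
SUSTAINED_LIMIT_MS = 100.0  # should not be exceeded for a sustained period


def classify_latency(value_ms):
    """Label one DAVG/KAVG/GAVG sample (labels are illustrative)."""
    if value_ms <= GUIDELINE_MS:
        return "ok"
    if value_ms <= SUSTAINED_LIMIT_MS:
        return "elevated"
    return "high"


def longest_run_above(samples_ms, limit_ms):
    """Length of the longest consecutive run of samples above limit_ms."""
    longest = current = 0
    for value in samples_ms:
        current = current + 1 if value > limit_ms else 0
        longest = max(longest, current)
    return longest


# Hypothetical samples in ms, one per esxtop update interval.
samples = [4.0, 12.5, 130.0, 145.0, 160.0, 9.0]
labels = [classify_latency(value) for value in samples]
run = longest_run_above(samples, SUSTAINED_LIMIT_MS)
print(labels)
print(f"longest run above {SUSTAINED_LIMIT_MS:.0f} ms: {run} samples")
```

With a 2-second update interval, a run of 3 samples corresponds to roughly 6 seconds above the limit.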

Note: ESX 3.0.x cannot monitor individual LUNs or virtual machines. Many inactive LUNs on the HBA can lower the average of DAVG/cmd, KAVG/cmd, and GAVG/cmd.
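
The averaging effect described in the note above can be illustrated with simple arithmetic. This sketch assumes, purely for illustration, that the per-HBA figure behaves like an unweighted mean across the LUNs behind the HBA; the way esxtop actually aggregates these values may differ, and the sample values are invented.

```python
def mean_latency_ms(per_lun_latency_ms):
    """Unweighted mean latency across LUNs (an illustrative assumption)."""
    if not per_lun_latency_ms:
        raise ValueError("at least one LUN value is required")
    return sum(per_lun_latency_ms) / len(per_lun_latency_ms)


# One busy LUN at 40 ms, viewed alone and alongside nine idle LUNs at 0 ms.
busy_only = [40.0]
with_idle = [40.0] + [0.0] * 9

print(mean_latency_ms(busy_only))
print(mean_latency_ms(with_idle))
```

Under this assumption, the same busy LUN appears as 40 ms when viewed alone but as 4 ms in the averaged figure, which is why a per-HBA average can hide a problem on a single LUN.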
 
If you experience high latency, check the switches (either FC or TCP) and the SAN for errors that may indicate a delay in commands being sent to and acknowledged by the SAN. Also consider the array's ability to process I/Os given its spindle count, and its ability to handle the load being presented to it.
 
If the response time exceeds 5000 ms (5 seconds), SCSI aborts appear in the logs. A command that is sent to an array and is not acknowledged within 5000 ms is aborted.
