The high latency is caused by a large I/O block size: the I/O must be split into smaller I/Os before it can be transmitted to the storage device. When this split occurs, latency is measured against the entire command, not against the individual chunks.
The host must therefore wait for every chunk to complete on the array before it reports the latency. The result is a false positive: high latency to the array is observed even though no performance problem actually exists.
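The effect of measuring the whole command can be illustrated with a minimal sketch. The function names, the 512 KB split limit, and the fixed cost of 10 microseconds per KB are hypothetical values chosen for illustration, and the chunks are assumed to complete one after another; none of these figures come from ESX/ESXi itself.

```python
def split_io(io_size_kb, max_io_kb):
    """Split one guest I/O into chunks no larger than max_io_kb (hypothetical helper)."""
    if io_size_kb <= 0 or max_io_kb <= 0:
        raise ValueError("sizes must be positive")
    full, rest = divmod(io_size_kb, max_io_kb)
    return [max_io_kb] * full + ([rest] if rest else [])


def command_latency_us(chunks_kb, us_per_kb):
    """Latency reported for the whole command.

    Assumption: chunks are serviced serially, so the command completes
    only after the last chunk, and its latency is the sum of all chunks.
    """
    return sum(chunk * us_per_kb for chunk in chunks_kb)


if __name__ == "__main__":
    US_PER_KB = 10  # assumed, constant service cost per KB
    chunks = split_io(4096, 512)  # one 4 MB guest I/O, assumed 512 KB limit
    print("chunks:", len(chunks))
    print("latency per chunk (us):", chunks[0] * US_PER_KB)
    print("latency reported for the command (us):", command_latency_us(chunks, US_PER_KB))
```

Under these assumptions each chunk completes as quickly as any other 512 KB I/O, yet the figure reported for the command is several times larger, because it covers all of the chunks.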
Note: Reducing the Disk.DiskMaxIOSize advanced setting on the ESX/ESXi host does not improve the latency results, because the guest operating system is the one issuing the large I/O block size. For more information, see Tuning ESX/ESXi for better storage performance by modifying the maximum I/O block size (1003469).
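The Windows registry may be altered so that the guest issues smaller I/O block sizes, which lowers the reported latency. However, the original high reading was only a false positive, so the lower figure does not indicate a real performance improvement. For more information, see: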
Although a virtual machine migration or virtual machine deployment might be expected to count as a large block operation, the vmkernel actually issues these I/Os with a block size of 32k. This is significantly smaller than what the guest operating system issues, so these operations do not show high latency.