Definition of the advanced NFS options

Article ID: 323325

Products

VMware vSphere ESXi

Issue/Introduction

This article provides information on Advanced NFS settings.

Warning: VMware does not recommend making changes to the default settings.

Environment

VMware vSphere ESXi 5.1
VMware ESX 4.0.x
VMware vSphere ESXi 6.0
VMware ESX Server 3.0.x
VMware ESXi 4.0.x Installable
VMware ESXi 3.5.x Embedded
VMware ESXi 3.5.x Installable
VMware vSphere ESXi 6.7
VMware ESX Server 3.5.x
VMware ESXi 4.1.x Embedded
VMware vSphere ESXi 6.5
VMware vSphere ESXi 5.0
VMware ESX 4.1.x
VMware vSphere ESXi 7.0.0
VMware ESXi 4.0.x Embedded
VMware ESXi 4.1.x Installable
VMware vSphere ESXi 5.5

Resolution

To access the NFS advanced options:
  1. Log in to the VMware vSphere Client.
  2. Select the ESXi host.
  3. Click the Configuration tab.
  4. Click Advanced Settings.
  5. Select NFS.
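
For scripted environments, the same options can also be queried programmatically. Below is a minimal sketch using VMware's pyVmomi Python SDK; the host name and credentials are placeholders, and the trailing-dot prefix passed to QueryOptions is assumed to return the whole NFS subtree of the advanced option hierarchy.

  import ssl
  from pyVim.connect import SmartConnect, Disconnect
  from pyVmomi import vim

  # Placeholder host name and credentials; certificate validation is
  # disabled here for brevity only.
  ctx = ssl._create_unverified_context()
  si = SmartConnect(host='esxi.example.com', user='root', pwd='password', sslContext=ctx)
  try:
      content = si.RetrieveContent()
      view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
      host = view.view[0]  # first ESXi host in the inventory
      # A name ending in '.' is assumed to return the whole NFS option subtree.
      for opt in host.configManager.advancedOption.QueryOptions('NFS.'):
          print(opt.key, '=', opt.value)
  finally:
      Disconnect(si)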
Each setting is listed below with a brief description:

Note: If any of the NFS heartbeat advanced option values are modified, you must reboot the ESXi host.
  • NFS.DiskFileLockUpdateFreq

    The time between updates to the NFS lock file on the NFS server. Increasing this value increases the time it takes to recover stale NFS locks. For more information, see NFS Locking.
     
  • NFS.HeartbeatFrequency

    This sets how often the NFS heartbeat world runs to see if any NFS volumes need a heartbeat request. For more information, see NFS Heartbeats.
     
  • NFS.HeartbeatTimeout

    The amount of time before an outstanding heartbeat request is aborted. For more information, see NFS Heartbeats.
     
  • NFS.HeartbeatDelta

    The amount of time after a successful GETATTR request before the heartbeat world issues a heartbeat request for a volume. If an NFS volume is in an unavailable state, an update is sent every time the heartbeat world runs (NFS.HeartbeatFrequency seconds). For more information, see NFS Heartbeats.
     
  • NFS.HeartbeatMaxFailures

    The number of consecutive heartbeat requests that must fail before the server is marked as unavailable. For more information, see NFS Heartbeats.
 
  • NFS.LockRenewMaxFailureNumber

    The number of lock update failures that must occur before the lock is marked as stale. For more information, see NFS Locking.
     
  • NFS.LockUpdateTimeout

    The amount of time before an outstanding lock update request is aborted. For more information, see NFS Locking.
     
  • NFS.MaxConnPerIP

    This option defines the maximum number of TCP/IP connections per NFS server. This option was removed after ESXi 5.0 (see SunRPC.MaxConnPerIP below).
     
  • SunRPC.MaxConnPerIP

    This option defines the maximum number of TCP/IP connections per NFS server. This option applies to ESXi 5.1 and later.
     
  • NFS.MaxVolumes

    The maximum number of NFS volumes that can be mounted. The TCP/IP heap size might need to be increased to accommodate the number of NFS volumes configured. For more information, see the KB article Increasing the default value that defines the maximum number of NFS mounts on an ESXi/ESX host.
 
  • NFS.ReceiveBufferSize

    This is the size of the receive buffer for NFS sockets. This value is chosen based on internal performance testing. VMware does not recommend adjusting this value.
     
  • NFS.SendBufferSize

    The size of the send buffer for NFS sockets. This value is chosen based on internal performance testing. VMware does not recommend adjusting this value.
     
  • NFS.SyncRetries

    The number of retries before synchronous IO fails (10 seconds per retry).
 
  • NFS.VolumeRemountFrequency

    This determines how often an NFS volume attempts to mount when it is initially unable to mount. After a volume is mounted, it never needs to be remounted; if it loses connectivity to the NFS server, it is marked unavailable but remains mounted.

    Note: Depending on the version of the ESXi host, some of these advanced options might be absent and additional options might be present.
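
Where a change is genuinely required, an advanced option can also be set through the same API. The sketch below (Python with pyVmomi, reusing the connection pattern from the earlier sketch) is illustrative only; as the warning above states, VMware does not recommend changing these defaults, and the key and value in the usage comment are examples.

  from pyVmomi import vim

  def set_advanced_option(host, key, value):
      """Set one advanced option on a vim.HostSystem via its OptionManager."""
      # VMware does not recommend changing these defaults; illustrative only.
      # A plain Python int is assumed to serialize correctly for integer options.
      option = vim.option.OptionValue(key=key, value=value)
      host.configManager.advancedOption.UpdateOptions(changedValue=[option])

  # Hypothetical usage, with 'host' obtained as in the earlier sketch:
  # set_advanced_option(host, 'NFS.MaxVolumes', 32)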


NFS Locking

NFS locking on ESXi does not use the NLM protocol; VMware uses its own locking protocol. NFS locks are implemented by creating lock files on the NFS server. Lock files are named .lck-fileid, where fileid is the value of the fileid field returned from a GETATTR request for the file being locked.

When a lock file is created, an update is sent to the lock file periodically (every NFS.DiskFileLockUpdateFreq seconds) to let other ESX hosts know that the lock is still active. The lock file updates generate small (84-byte) WRITE requests to the NFS server. Changing any of the NFS locking parameters changes how long it takes to recover stale locks. This formula can be used to calculate how long it takes to recover a stale NFS lock:

(NFS.DiskFileLockUpdateFreq * NFS.LockRenewMaxFailureNumber) + NFS.LockUpdateTimeout

This example demonstrates this equation using the default values in VMware ESX 3.5:
X is the length of time it takes to recover from a stale NFS lock.

X = (NFS.DiskFileLockUpdateFreq * NFS.LockRenewMaxFailureNumber) + NFS.LockUpdateTimeout
X = (10 * 3) + 5
X = 35 seconds
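
The same calculation as a small Python sketch, which can be rerun with non-default values:

  # Stale NFS lock recovery time, per the formula above; values are the
  # ESX 3.5 defaults used in the example.
  disk_file_lock_update_freq = 10    # NFS.DiskFileLockUpdateFreq (seconds)
  lock_renew_max_failure_number = 3  # NFS.LockRenewMaxFailureNumber
  lock_update_timeout = 5            # NFS.LockUpdateTimeout (seconds)

  recovery = (disk_file_lock_update_freq * lock_renew_max_failure_number) + lock_update_timeout
  print(recovery)  # 35 seconds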
 
If any of these parameters are modified, it is very important that all ESX hosts in the cluster use identical settings. Having inconsistent NFS lock settings across ESX hosts can result in data corruption.

Note: Locking types can differ between versions of NFS; see NFS protocols.
 

NFS Heartbeats

NFS heartbeats are used to determine whether an NFS volume is still available. NFS heartbeats are actually GETATTR requests on the root file handle of the NFS volume. A system world runs every NFS.HeartbeatFrequency seconds to check whether it needs to issue heartbeat requests for any of the NFS volumes. If a volume is marked available, a heartbeat is issued only if at least NFS.HeartbeatDelta seconds have passed since the volume last completed a successful request (for example, a GETATTR, READ, or READDIR) that does not modify the attributes for that volume. The NFS heartbeat world always issues heartbeats for NFS volumes that are marked unavailable.
 
Use these formulas to calculate the minimum and maximum time for an ESX host to mark an NFS volume as unavailable:

Minimum time to mark a volume down (minTime) = (HeartbeatFrequency * (HeartbeatMaxFailures - 1)) + HeartbeatTimeout

Maximum time to mark a volume down (maxTime) = HeartbeatDelta + HeartbeatFrequency + minTime


Default values for these variables in ESX/ESXi 4.x:
  • HeartbeatFrequency: 9
  • HeartbeatTimeout: 5
  • HeartbeatDelta: 5
  • HeartbeatMaxFailures: 3
Default values for these variables in ESXi 5.x:
  • HeartbeatFrequency: 12
  • HeartbeatTimeout: 5
  • HeartbeatDelta: 5
  • HeartbeatMaxFailures: 10
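
A short Python sketch that applies the minTime/maxTime formulas above to both sets of defaults:

  # Minimum and maximum time to mark an NFS volume unavailable, per the
  # formulas above.
  def volume_down_times(frequency, timeout, delta, max_failures):
      min_time = frequency * (max_failures - 1) + timeout
      max_time = delta + frequency + min_time
      return min_time, max_time

  print(volume_down_times(9, 5, 5, 3))    # ESX/ESXi 4.x defaults -> (23, 37) seconds
  print(volume_down_times(12, 5, 5, 10))  # ESXi 5.x defaults -> (113, 130) seconds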

TCP/IP Heap size

The TCP/IP stack in versions of ESX prior to ESX 4.0 does not handle mbuf allocation failures gracefully (they result in purple diagnostic screens). Because of this, it is very important that the TCP/IP heap size is adjusted to accommodate usage by NFS. Unfortunately, there are many consumers of the TCP/IP stack, and it is difficult to precisely calculate how large the TCP/IP heap needs to be. The safest way to calculate the TCP/IP heap size, given the number of NFS volumes configured, is to linearly scale the default values up to the supported maximum specific to the version of ESXi. For more information, see Increasing the default value that defines the maximum number of NFS mounts on an ESXi/ESX host (2239).

For example, the default number of NFS volumes is 8 and the default min/max sizes of the TCP/IP heap are 6 MB / 30 MB respectively. This means the TCP/IP heap size for a host configured with 32 NFS volumes should have the min/max TCP/IP heap sizes set to 24 MB / 120 MB.
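
The linear scaling described above can be expressed as a short Python sketch (the defaults of 8 volumes and 6 MB / 30 MB heap sizes are taken from the preceding paragraph):

  # Linearly scale the default TCP/IP heap min/max with the configured NFS
  # volume count; do not exceed the supported maximum for the ESXi version.
  def scaled_heap_mb(volumes, default_volumes=8, default_min_mb=6, default_max_mb=30):
      factor = volumes / default_volumes
      return default_min_mb * factor, default_max_mb * factor

  print(scaled_heap_mb(32))  # (24.0, 120.0) MB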

ESXi 6.0, 6.5, 6.7:
The default Net.TcpipHeapMax is 512 MB. The default NFS send and receive socket buffers are 256 KB each, so each socket consumes roughly 512 KB or more. For 256 shares, that is approximately 128 MB, so the default Net.TcpipHeapMax is sufficient even for 256 mounts and does not need to be increased.
 