VMware vSphere 5.x and 6.x support with NetApp MetroCluster (2031038)
What is vMSC?
A VMware vSphere Metro Storage Cluster (vMSC) configuration is a VMware vSphere certified solution that combines synchronous replication with array-based clustering. These solutions are implemented with the goal of providing the same benefits that high-availability clusters provide to a local site, but in a geographically dispersed model with two data centers in different locations. At its core, a vMSC infrastructure is a stretched cluster. The architecture is built on the idea of extending what is defined as local in terms of network and storage, so that these subsystems can span geographies and present a single, common set of infrastructure resources to the vSphere cluster at both sites. In essence, it stretches the network and storage between sites. All supported storage devices are listed in the VMware Storage Compatibility Guide.
What is a NetApp MetroCluster™?
NetApp® MetroCluster™ software is a solution that combines array-based clustering with synchronous replication to deliver continuous availability and zero data loss at the lowest cost. Administration of the array-based cluster is simpler because the dependencies and complexity normally associated with host-based clustering are eliminated. MetroCluster immediately duplicates all your mission-critical data on a transaction-by-transaction basis, providing uninterrupted access to your applications and data. And unlike standard data replication solutions, MetroCluster works seamlessly with your host environment to provide continuous data availability while eliminating the need to create and maintain complicated failover scripts.
NetApp MetroCluster™ enhances the built-in high availability and nondisruptive operations of NetApp hardware and ONTAP® storage software, providing an additional layer of protection for the entire storage and host environment. Whether your environment is composed of standalone servers, high-availability server clusters, or virtualized servers, MetroCluster seamlessly maintains application availability in the face of a total storage outage. Such an outage can result from loss of power, cooling, or network connectivity; a storage array shutdown; or operational error. This is because MetroCluster is an array-based active-active clustered solution, eliminating the need for complex failover scripts, server reboots, or application restarts. In fact, most MetroCluster customers report that their users experience no application interruption when a cluster recovery takes place.
What is MetroCluster Tiebreaker?
NetApp MetroCluster Tiebreaker software checks the reachability of the nodes and the cluster in a MetroCluster configuration to determine whether a site failure has occurred. The Tiebreaker software also triggers an alert under certain conditions. The Tiebreaker software runs on a physical or virtualized Red Hat Enterprise Linux (RHEL) server, and it is best deployed at a third location for independent visibility to all MetroCluster nodes.
The Tiebreaker software monitors each controller in the MetroCluster configuration by establishing redundant connections through multiple paths to a node management LIF and to the cluster management LIF, both hosted on the IP network.
The Tiebreaker software monitors the following components in the MetroCluster configuration:
- Nodes through local node interfaces
- Cluster through the cluster-designated interfaces
- Surviving cluster to evaluate whether it has connectivity to the disaster site (FCVI, storage, and intercluster peering)
When the Tiebreaker software loses its connections to all the nodes in a cluster and to the cluster itself, it declares that cluster "not reachable." Detecting a connection failure takes approximately three to five seconds. If a cluster is unreachable from the Tiebreaker software, the surviving cluster (the cluster that is still reachable) must indicate that all links to the partner cluster are severed before the Tiebreaker software triggers an alert.
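The same connectivity state that the Tiebreaker software evaluates can also be inspected from the ONTAP CLI on either cluster. A minimal sketch follows; it assumes an established MetroCluster configuration and is run from a cluster shell:

```shell
# Show the overall MetroCluster state, including whether the
# partner (remote) cluster is reachable and configured:
metrocluster show

# Run the built-in configuration and connectivity checks, which
# cover the FC-VI links, storage connectivity, and cluster peering:
metrocluster check run

# Review the results of the most recent check:
metrocluster check show
```

These commands are read-only diagnostics and can be run at any time without affecting data availability.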
Automatic Unplanned Switchover in MetroCluster
In a MetroCluster configuration, the system can perform an automatic unplanned switchover (AUSO) in the event of a site failure to provide nondisruptive operations.
In four-node and eight-node MetroCluster configurations, an AUSO can be triggered when all nodes at a site fail.
Because there is no local HA failover available in a two-node MetroCluster configuration, the system performs an AUSO to provide continued operation after a controller failure. This functionality is similar to the HA takeover capability in an HA pair; in a two-node configuration, an AUSO can therefore be triggered by the failure of a single controller.
In the event of an AUSO, disk ownership for the impaired node's pool0 and pool1 disks is changed to the disaster recovery (DR) partner. This ownership change prevents the aggregates from going into a degraded state after the switchover.
After the automatic switchover, you must manually proceed through the healing and switchback operations to return the controller to normal operation.
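The manual healing and switchback sequence can be sketched with the ONTAP CLI as follows. This is run from the surviving (switchover) site after the failed site's hardware is back online; it assumes a standard ONTAP 9 MetroCluster configuration:

```shell
# Phase 1: heal the data aggregates (resynchronizes the mirrored plexes):
metrocluster heal -phase aggregates

# Phase 2: heal the root aggregates of the recovered nodes:
metrocluster heal -phase root-aggregates

# Verify that both heal phases completed successfully before proceeding:
metrocluster operation show

# Return storage and client access to the recovered site:
metrocluster switchback

# Confirm the switchback completed and the configuration is back to normal:
metrocluster show
```

The heal phases must complete in order, and the recovered site's nodes must be booted to the appropriate state before switchback; always confirm each operation with `metrocluster operation show` before issuing the next command.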
MetroCluster is a combined hardware and software solution, with deployment options ranging from two to eight nodes. Specific hardware is required to create the shared storage fabric and inter-site links; consult the NetApp Interoperability Matrix Tool (IMT) for supported hardware. On the software side, MetroCluster is completely integrated into Data ONTAP; no additional tools or licenses are required. After the MetroCluster relationships are established, data and configuration are automatically and continuously replicated between the sites, and no manual effort is required to establish replication for newly provisioned workloads. This not only simplifies administration but also eliminates the possibility of forgetting to replicate critical workloads.
MetroCluster configurations protect data by using two physically separated, mirrored clusters, up to 300 km apart. Each cluster synchronously mirrors the root and data aggregates. When a disaster occurs at one site, an administrator can trigger a switchover and begin serving the mirrored data from the surviving site. In ONTAP 9.2, MetroCluster consists of a two-node HA pair at each site, allowing the majority of planned and unplanned events to be handled by a simple failover and giveback within the local cluster. Only in the event of a disaster (or for testing purposes) is a full switchover to the other site required. The switchover and corresponding switchback operations transfer the entire clustered workload between the sites.
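The switchover described above can be sketched with the ONTAP CLI. The commands below are a hedged example run from the surviving cluster; they assume a standard ONTAP 9 MetroCluster configuration:

```shell
# Planned (negotiated) switchover, e.g. for site maintenance or testing,
# while both sites are still up:
metrocluster switchover

# Unplanned switchover after a disaster, when the partner site is down
# and cannot negotiate:
metrocluster switchover -forced-on-disaster true

# Monitor the progress and completion status of the operation:
metrocluster operation show
```

After a switchover, the surviving site serves all workloads until the healing and switchback procedure returns operations to the recovered site.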
This figure shows the basic MetroCluster configuration. There are two data centers, A and B, separated by up to 300 km, with ISLs running over dedicated fibre links. Each site has a cluster in place consisting of two nodes in an HA pair. In this example, cluster A at site A consists of nodes A1 and A2, and cluster B at site B consists of nodes B1 and B2. The two clusters and sites are connected through two separate networks, which provide the replication transport. The cluster peering network is an IP network used to replicate cluster configuration information between the sites. The shared storage fabric is an FC connection used for storage and NVRAM synchronous replication between the two clusters. All storage is visible to all controllers through the shared storage fabric.
Note: This illustration is a simplified representation and does not show the redundant front-end components, such as Ethernet and Fibre Channel switches.
The vMSC configuration used in this certification program was configured with Uniform Host Access mode. In this configuration, the ESX hosts from a single site are configured to access storage from both sites.
In cases where RDMs are configured for virtual machines residing on NFS volumes, a separate LUN must be configured to hold the RDM mapping files. Ensure this LUN is presented to all the ESXi hosts.
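Visibility of that LUN can be verified from each host with esxcli. A minimal sketch, where the NAA identifier is a hypothetical placeholder for the actual RDM-mapping LUN:

```shell
# On each ESXi host, confirm the RDM mapping LUN is visible.
# The device identifier below is a hypothetical placeholder:
esxcli storage core device list -d naa.600a0980xxxxxxxx

# Alternatively, list all devices and filter for NetApp-backed LUNs:
esxcli storage core device list | grep -i netapp
```

Run this on every host in the cluster; a LUN that is missing from any host will prevent that host from powering on virtual machines that use the RDMs.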
vMSC test scenarios
This table outlines vMSC test scenarios:
| Scenario | NetApp Controller Behavior | VMware HA Behavior |
| --- | --- | --- |
| Controller single path failure | Controller path failover occurs; all LUNs and volumes remain connected. For FC datastores, path failover is triggered from the host, and the next available path to the same controller becomes active. All ESXi iSCSI/NFS sessions remain active in multimode configurations of two or more network interfaces. | No impact |
| ESXi single storage path failure | No impact on LUN and volume availability. The ESXi storage path fails over to the alternative path; all sessions remain active. | No impact |
| Site 1 or Site 2 single storage node failure | Because there is an HA pair at each site, the failure of one node transparently and automatically triggers failover to the partner node. | No impact |
| MCTB VM failure | No impact on LUN and volume availability. All sessions remain active. | No impact |
| MCTB VM single link failure | No impact. Controllers continue to function normally. | No impact |
| Complete Site 1 failure, including ESXi and controller | In the case of a site-wide issue, the MetroCluster switchover operation allows immediate resumption of service by moving storage and client access from the Site 1 cluster to Site 2. The Site 2 partner nodes begin serving data from the mirrored plexes and the sync-destination Storage Virtual Machine (SVM). | Virtual machines on the failed Site 1 ESXi hosts fail. HA restarts them on ESXi hosts at Site 2. |
| Complete Site 2 failure, including ESXi and controller | In the case of a site-wide issue, the MetroCluster switchover operation allows immediate resumption of service by moving storage and client access from the Site 2 cluster to Site 1. The Site 1 partner nodes begin serving data from the mirrored plexes and the sync-destination Storage Virtual Machine (SVM). | Virtual machines on the failed Site 2 ESXi hosts fail. HA restarts them on ESXi hosts at Site 1. |
| Single ESXi host failure (shutdown) | No impact. Controllers continue to function normally. | Virtual machines on the failed ESXi host fail. HA restarts them on surviving ESXi hosts. |
| Multiple ESXi host management network failure | No impact. Controllers continue to function normally. | A new master is elected within the network partition. Virtual machines remain running and do not need to be restarted. |
| Site 1 and Site 2 simultaneous failure (shutdown) and restoration | Controllers boot up and resync; all LUNs and volumes become available, all iSCSI sessions and FC paths to ESXi hosts are re-established, and virtual machines restart successfully. As a best practice, power on the NetApp controllers first and allow the LUNs/volumes to become available before powering on the ESXi hosts. | No impact |
| ESXi management network all ISL links failure | No impact on controllers. LUNs and volumes remain available. | If the HA host isolation response is set to Leave Powered On, virtual machines at each site continue to run because the storage heartbeat is still active. Partitioned hosts at the site without a Fault Domain Manager elect a new master. |
| All storage ISL links failure | No impact on controllers; LUNs and volumes remain available. When the ISL links come back online, the aggregates resync. | No impact |
| System Manager (management server) failure | No impact. Controllers continue to function normally and can be managed from the command line. | No impact |
| vCenter Server failure | No impact. Controllers continue to function normally. | No impact on HA; however, DRS rules cannot be applied. |
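For the path-failure scenarios above, the host-side path state can be verified from each ESXi host with esxcli. A minimal sketch:

```shell
# List each device with its NMP path selection policy and working paths,
# useful after a controller or storage path failover:
esxcli storage nmp device list

# Show the state (active, standby, or dead) of every individual path:
esxcli storage core path list
```

After a path or controller failure, a healthy configuration should show the surviving paths as active and no devices in an all-paths-down state.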
- vSphere Storage DRS or vSphere Storage I/O Control support in a vSphere Metro Storage Cluster environment (2042596)