INFINIDAT InfiniBox Active-Active Replication with VMware vSphere Metro Storage Cluster (vMSC)

Article ID: 330043

Products

VMware vSphere ESXi

Issue/Introduction

This article provides information on using Infinidat InfiniBox Active-Active Replication with VMware vSphere Metro Storage Cluster (vMSC) configuration.

Resolution

Introduction to vMSC

VMware vSphere® Metro Storage Cluster (vMSC) is a specific configuration that combines storage-based synchronous replication with VMware vSphere capabilities. vMSC configurations are commonly referred to as stretched storage clusters or metro storage clusters.

Introduction to InfiniBox Storage Array

The InfiniBox® enterprise storage array delivers faster-than-all-flash performance, high availability, and multi-petabyte scale for mixed application workloads. Zero-impact snapshots and active/active replication dramatically improve business agility, while FIPS-validated data-at-rest encryption eliminates the need to securely erase decommissioned arrays. With InfiniBox, enterprise IT organizations and cloud service providers can exceed their service-level objectives while lowering the cost and complexity of their petabyte-scale storage operations.

For more information, please visit www.infinidat.com.

InfiniBox Active-Active Replication

InfiniBox Active-Active replication provides zero RPO and zero RTO, enabling mission-critical business services to keep operating even through a complete site failure:

  • A symmetric synchronous replication solution, so applications can be geographically clustered.
  • Fully integrated into InfiniBox, allowing simple management of applications spread across data centers.

Hosts may be connected to InfiniBox:

  • Via FC (Fibre Channel): requires InfiniBox release 5.0.0 or above
  • Via iSCSI: requires InfiniBox release 5.5.10 or above
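
The two minimums above can be expressed as a quick compatibility check. The following is a minimal sketch, not an Infinidat utility; it assumes simple dotted release strings such as "5.5.10" and encodes only the two thresholds listed here.

    # Minimal sketch (not an Infinidat tool): check whether an InfiniBox
    # release meets the minimum required for a given connectivity protocol.
    # Assumes simple dotted release strings such as "5.5.10".

    MIN_RELEASE = {
        "FC": (5, 0, 0),      # Fibre Channel requires InfiniBox 5.0.0 or above
        "iSCSI": (5, 5, 10),  # iSCSI requires InfiniBox 5.5.10 or above
    }

    def release_tuple(release):
        """Convert a dotted release string like '5.5.10' into a comparable tuple."""
        return tuple(int(part) for part in release.split("."))

    def supports_protocol(release, protocol):
        """True if this InfiniBox release supports Active-Active over the protocol."""
        return release_tuple(release) >= MIN_RELEASE[protocol]

    print(supports_protocol("5.5.10", "iSCSI"))  # True
    print(supports_protocol("5.0.0", "iSCSI"))   # False: iSCSI needs 5.5.10+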

Requirements

InfiniBox:

  • Two InfiniBox systems running release 5.0 or later
  • Ethernet connectivity between the InfiniBox systems, used for data replication
  • Maximum of 5 ms round-trip time (RTT) latency between the InfiniBox systems (see the latency check sketch below)

VMware vSphere:

  • vSphere version 6.0 EP 20, 6.5, 6.7, or 7.0
  • Specific FC and iSCSI connectivity requirements are described below for each storage access topology
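
As a quick illustration of the latency requirement, here is a minimal sketch that checks measured replication-link RTT samples against the 5 ms maximum. How the samples are collected (e.g., by a network monitoring tool) is out of scope, and the function name is illustrative.

    # Minimal sketch: validate measured replication-link RTT samples against
    # the 5 ms maximum required between the InfiniBox systems.

    MAX_RTT_MS = 5.0

    def rtt_within_limit(samples_ms):
        """True if the worst observed round-trip time stays within the limit."""
        return max(samples_ms) <= MAX_RTT_MS

    print(rtt_within_limit([1.8, 2.1, 3.4]))  # True: link qualifies
    print(rtt_within_limit([1.8, 2.1, 6.2]))  # False: 6.2 ms exceeds 5 ms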

Solution Overview

Deploying VMware vSphere Metro Storage Cluster with InfiniBox Active-Active replication provides a highly available and resilient solution for protecting application availability and data. vMSC solutions are classified into two distinct types, based on a fundamental difference in how the vSphere hosts access the storage systems, which influences design considerations.

  • Uniform host access - vSphere hosts on both sites are all connected to the storage systems across both sites. LUN paths presented to vSphere hosts are stretched across the sites.
  • Non-uniform host access - vSphere hosts at each site are connected only to the local storage system at the same site. LUN paths presented to vSphere hosts from storage nodes are limited to the local site.

InfiniBox Active-Active replication is supported with both uniform and non-uniform host access types. 

Uniform Storage Access

When configuring uniform host access, the ESXi hosts can access the datastores through both InfiniBox systems - the system that exists on the same site (local) and the remote system in the other site. 

  • Typically, the datastore paths to the remote system will be less optimal than the paths to the local system, due to the added latency of traversing the inter-site link.
    • The InfiniBox system can intelligently hint to the ESXi hosts which paths serve I/O optimally (see the path-state sketch below).
    • Therefore, to achieve optimal performance, each host will use only the local paths in its site when issuing I/O to the storage, as long as its local system is online.
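
On ESXi, the effect of this hinting is typically surfaced through ALUA path states, which can be listed with "esxcli storage nmp path list". The following toy filter is a sketch only; it assumes the path records have already been parsed into dictionaries, and the field names and values are illustrative.

    # Hedged sketch: filter a host's paths down to the ones marked optimized.
    # The records are assumed to be pre-parsed; field names are illustrative.

    paths = [
        {"target": "systemA-node1", "state": "active"},              # local, optimized
        {"target": "systemB-node1", "state": "active unoptimized"},  # remote
    ]

    def optimized_paths(path_records):
        """Keep only the paths the array marks as optimized for I/O."""
        return [p for p in path_records if p["state"] == "active"]

    for p in optimized_paths(paths):
        print(p["target"])  # prints only the local, optimized path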

All the ESXi hosts in the vSphere stretched cluster should be connected to both InfiniBox systems. 

  • Each host should use two initiators connected to two separate fabrics/networks.
  • Each initiator should be connected to all three nodes on both InfiniBox systems, six nodes in total.
  • This gives a total of 12 paths from each host to every LUN (see the sketch below).
  • The hosts can access both InfiniBox systems and will be able to see paths to the datastore from both systems.
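
The path arithmetic above can be made explicit with a tiny helper (a sketch; the function and parameter names are illustrative):

    # Minimal sketch of the uniform-access path arithmetic:
    # 2 initiators per host x 3 nodes per system x 2 systems = 12 paths.

    def paths_per_lun(initiators, nodes_per_system, systems):
        """Expected number of paths from one host to each LUN."""
        return initiators * nodes_per_system * systems

    # Uniform access: every host reaches all three nodes of both systems.
    print(paths_per_lun(initiators=2, nodes_per_system=3, systems=2))  # 12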

Storage Failure Scenario

If a system in a specific site goes offline, the ESXi hosts in that site can continue to access the storage over the non-optimized paths to the remote system in the other site.
  • For example, if System A goes offline, the ESXi hosts in Site A can still access the storage through the non-optimized paths to the system in Site B.
  • VMs running on hosts in Site A continue to run uninterrupted. 

Non-Uniform Storage Access

When configuring non-uniform host access, the ESXi hosts on each site can access the storage only through the local InfiniBox system - the system that exists in the same site.
  • Each host will use only the local (optimized) paths in each site when issuing I/O to the storage.
The ESXi hosts in each site are connected only to the InfiniBox system in the same site.
  • Each host should use two initiators connected to two separate fabrics/networks.
  • Each initiator should be connected to all three nodes of the local InfiniBox system.
  • This gives a total of 6 paths from each host to every LUN (see the sketch below).
  • The hosts can access only the local InfiniBox system and will see paths to the datastore only from that system.
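
Using the same illustrative helper as in the uniform-access section, the non-uniform path count works out as follows:

    # Non-uniform access: each host reaches only its local system,
    # so 2 initiators x 3 nodes x 1 system = 6 paths.

    def paths_per_lun(initiators, nodes_per_system, systems):
        """Expected number of paths from one host to each LUN."""
        return initiators * nodes_per_system * systems

    print(paths_per_lun(initiators=2, nodes_per_system=3, systems=1))  # 6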

Storage Failure Scenario

If a system in a specific site goes offline, the ESXi hosts in that site cannot continue to access the storage.
  • Therefore, the affected VMs must be recovered on the hosts in the other site. This can happen automatically through vSphere HA.
  • For example, if System A goes offline, the ESXi hosts in Site A cannot access the storage; vSphere HA can automatically restart the affected VMs from Site A on the hosts in Site B.

How InfiniBox Handles Failures

InfiniBox has two mechanisms to handle failures for Active-Active replication: the witness and the preferred system.
  • If an InfiniBox system becomes unavailable, e.g., due to a power outage of the entire site, the peer system will provide access to all the volumes.
  • If the replication link between the systems fails, datastores will continue to serve I/O on one of the systems. Each datastore has a property in InfiniBox that defines its preferred system, which will remain online.

InfiniBox Witness

The witness is an arbitrator entity residing in a third site (separate from the two InfiniBox systems involved in Active-Active replication) that acts as a quorum in case of replication link failures. The witness is a lightweight, stateless piece of software deployed as a VM.
  • If the witness is down or inaccessible, a replication link failure will result in the InfiniBox systems keeping volumes online according to their preferred-system settings.

Preferred system 

Each volume undergoing Active-Active replication has a preferred-system definition, which the witness uses to make correct decisions.
  • If the witness is not available to the systems, the decision on which side stays active is made per replica, based on the preferred system (see the decision sketch below).
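
The interplay between the witness and the preferred-system setting can be summarized in a small decision sketch. This paraphrases the behavior described above and is not Infinidat's actual algorithm; all names are illustrative.

    # Hedged sketch (not Infinidat's algorithm): on a replication-link
    # failure, decide which system keeps an Active-Active volume online.

    def surviving_system(preferred, other, preferred_is_up, witness_available):
        """Pick the system that keeps serving the volume after a link failure."""
        if witness_available:
            # The witness can tell a dead peer from a dead link, and uses the
            # preferred-system setting to break ties between healthy peers.
            return preferred if preferred_is_up else other
        # Without the witness, the preferred-system setting alone decides.
        return preferred

    # Link failure, witness reachable, preferred system healthy:
    print(surviving_system("system-A", "system-B",
                           preferred_is_up=True, witness_available=True))   # system-A
    # Link failure, witness down: the preference decides on its own.
    print(surviving_system("system-A", "system-B",
                           preferred_is_up=True, witness_available=False))  # system-A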

Storage Failover 

InfiniBox Active-Active replication failover is fully automatic and does not require any storage administrator intervention.

Storage Replication Resynchronization and Recovery

InfiniBox Active-Active recovery is completely automatic; no storage administrator intervention is necessary to trigger a re-sync and recover replication.

If the InfiniBox systems become disconnected, replication internally falls back to async mode. Once connectivity between the systems is restored, synchronization jobs start replicating the missing data to the lagging system. From the disconnection and throughout the re-sync, the Active-Active volumes on the synchronized system continue to serve I/O operations, while the remote side remains in a lagging state until all data is synchronized between the volumes.
  • Once the volumes are nearly in sync, they smoothly transition back to sync replication mode, with no I/O disruption. The host paths to the lagging side are automatically restored, allowing the hosts to perform I/O operations through both systems again.
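
The recovery flow described above can be pictured as a small state machine. This is a sketch paraphrasing the text; the state and event names are illustrative, not InfiniBox internals.

    # Hedged sketch: the replication lifecycle as a tiny state machine.

    TRANSITIONS = {
        ("SYNC", "link_lost"): "ASYNC_FALLBACK",        # systems disconnected
        ("ASYNC_FALLBACK", "link_restored"): "RESYNC",  # jobs copy missing data
        ("RESYNC", "nearly_in_sync"): "SYNC",           # seamless return to sync
    }

    def next_state(state, event):
        """Advance the replication state; unknown events leave it unchanged."""
        return TRANSITIONS.get((state, event), state)

    state = "SYNC"
    for event in ["link_lost", "link_restored", "nearly_in_sync"]:
        state = next_state(state, event)
        print(event, "->", state)
    # link_lost -> ASYNC_FALLBACK
    # link_restored -> RESYNC
    # nearly_in_sync -> SYNC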