Configuring a shared disk resource for Windows Server Failover Cluster (WSFC) and migrating SQL Server Failover Cluster Instance (FCI) from SAN (RDMs) to vSAN

Article ID: 326490

Products

VMware vSAN

Issue/Introduction

vSAN 6.7 Update 3 and later releases support SCSI-3 Persistent Reservations (SCSI3-PR) at the virtual disk level, which Windows Server Failover Cluster (WSFC) requires to arbitrate access to a shared disk between nodes. Support for SCSI-3 PRs makes it possible to configure WSFC with a disk resource shared between VMs natively on vSAN datastores.
This KB article describes how to configure a disk resource shared between VMs natively on vSAN datastores, and the steps to migrate a SQL Server Failover Cluster Instance from SAN to vSAN.
For the supported platform and OS versions, see SQL Server Failover Cluster Instance on VMware vSAN Native.
 

Supported vSphere features:

  • Live vMotion within a cluster (WSFC nodes should reside on different ESXi hosts)
  • vSAN stretched cluster

Unsupported features:

  • Snapshots
  • Live (online) storage vMotion
  • Hot-extension of a shared disk 
  • Hot add memory/vCPU
  • vSphere Fault Tolerance (FT)
  • Suspend or resume of a VM hosting a node of the cluster
  • Mixed environments, such as configurations where one cluster node is running a different version of ESXi than another cluster node
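
The lists above can be turned into a simple pre-flight check. The following is a minimal sketch in plain Python, not a vSphere API call: the `NodeVM` class and all of its field names are hypothetical, and the values would have to be collected from your own inventory. It flags the unsupported features that can be read from a static description of the nodes, plus the host-separation recommendation.

```python
from dataclasses import dataclass


# Hypothetical, simplified description of one WSFC node VM. The field names
# are illustrative only and are not vSphere API properties.
@dataclass
class NodeVM:
    name: str
    esxi_host: str
    esxi_version: str
    has_snapshots: bool = False
    fault_tolerance: bool = False
    memory_hot_add: bool = False
    cpu_hot_add: bool = False


def preflight(nodes):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for vm in nodes:
        if vm.has_snapshots:
            problems.append(f"{vm.name}: snapshots are not supported")
        if vm.fault_tolerance:
            problems.append(f"{vm.name}: vSphere FT is not supported")
        if vm.memory_hot_add or vm.cpu_hot_add:
            problems.append(f"{vm.name}: hot add of memory/vCPU is not supported")
    # Mixed environments: every node must run on the same ESXi version.
    versions = {vm.esxi_version for vm in nodes}
    if len(versions) > 1:
        problems.append("mixed ESXi versions: " + ", ".join(sorted(versions)))
    # WSFC nodes should reside on different ESXi hosts.
    hosts = [vm.esxi_host for vm in nodes]
    if len(set(hosts)) < len(hosts):
        problems.append("two or more nodes share an ESXi host")
    return problems


if __name__ == "__main__":
    sample = [
        NodeVM("VM1", "esx-a", "7.0 U3"),
        NodeVM("VM2", "esx-b", "7.0 U3", has_snapshots=True),
    ]
    for problem in preflight(sample):
        print(problem)
```

Operational restrictions such as online Storage vMotion, hot-extension of a shared disk, and suspend/resume are actions rather than settings, so a static check like this one cannot detect them.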


Resolution

Steps to configure shared disks for SQL Server Failover Cluster on WSFC on vSAN

  1. Ensure that an Active Directory environment is available.
  2. Create the virtual machines (VMs) according to the OS and SQL Server supportability guide, placing the VM home on a preferred datastore.
  3. Power on all VMs, configure the hostnames and IP addresses, and join all VMs to an Active Directory domain. Enable the Failover Clustering feature on each VM.
  4. Power off all VMs (the nodes of the cluster). This is recommended, although adding a shared disk can be an online operation.
  5. Choose one VM to configure the virtual disks (VM1 is used as the example here). Add one or more Paravirtual SCSI controllers (up to four) to the VM, and change SCSI bus sharing to “physical”; see the screenshot below.
  6. Add VMDKs from a vSAN datastore to be used as shared disk resources (Eager Zeroed Thick provisioning is not required). Leave Disk Sharing as No Sharing (the default). Disk Mode must be set to Independent – Persistent. Attach the disks to the controller(s) created in step 5. Note: Do not add the VMDKs for vSAN iSCSI LUNs.
Disk-sharing.png

  7. Proceed with the other VMs (the remaining nodes of the cluster). Add the same number of vSCSI controllers as on the first node.
  8. Share the VMDKs by pointing to the disks on the first cluster node (VM1) using ADD NEW DEVICE -> Existing Hard Disk. Maintain the same disk SCSI IDs as on VM1.
  9. Power on VM1, log in to Windows, and open the Disk Management MMC snap-in. Make sure the added shared disks can be managed by the OS, then initialize and format the disks. If the controllers and disks were added offline, power on the other VMs (the remaining nodes of the cluster).
  10. On VM1, launch the Failover Cluster Manager MMC snap-in and use the Cluster Validation Wizard to validate the nodes of the WSFC.
  11. Create a Windows Failover Cluster; you may choose Disk Only as the quorum of the cluster.
  12. Add all required shared disk resources.
  13. Install the SQL Server cluster on one node, then add the other nodes to the cluster.
  14. After installing the SQL Server cluster, test failover and make sure the SQL Server cluster works as expected.
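
The controller and disk settings in the steps above can be checked for consistency before the cluster is validated. Below is a minimal sketch that works on a hypothetical plain-dictionary description of each node; the keys, the value spellings such as `"pvscsi"` and `"independent_persistent"`, and the sample VMDK path are assumptions made for illustration, not vSphere API names. It also assumes the same disk settings are expected on every node, not only on VM1.

```python
MAX_CONTROLLERS = 4  # the article allows up to four Paravirtual SCSI controllers


def check_node(node):
    """Return problems found in one node's shared-disk settings."""
    problems = []
    name = node["name"]
    controllers = node["controllers"]  # maps SCSI bus number -> settings
    if not 1 <= len(controllers) <= MAX_CONTROLLERS:
        problems.append(f"{name}: expected 1 to {MAX_CONTROLLERS} shared controllers")
    for bus, ctrl in controllers.items():
        if ctrl["type"] != "pvscsi":
            problems.append(f"{name}: controller {bus} is not Paravirtual SCSI")
        if ctrl["bus_sharing"] != "physical":
            problems.append(f"{name}: controller {bus} bus sharing is not physical")
    for disk in node["disks"]:
        slot = f"{disk['bus']}:{disk['unit']}"
        if disk["bus"] not in controllers:
            problems.append(f"{name}: disk {slot} is not on a shared controller")
        if disk["mode"] != "independent_persistent":
            problems.append(f"{name}: disk {slot} is not Independent - Persistent")
        if disk["sharing"] != "no_sharing":
            problems.append(f"{name}: disk {slot} sharing is not No Sharing")
    return problems


def check_cluster(nodes):
    """Check every node, then compare each node's layout with the first node."""
    problems = []
    for node in nodes:
        problems.extend(check_node(node))
    first = nodes[0]
    reference = {(d["bus"], d["unit"]): d["vmdk"] for d in first["disks"]}
    for node in nodes[1:]:
        if len(node["controllers"]) != len(first["controllers"]):
            problems.append(f"{node['name']}: controller count differs from {first['name']}")
        layout = {(d["bus"], d["unit"]): d["vmdk"] for d in node["disks"]}
        if layout != reference:
            problems.append(f"{node['name']}: SCSI IDs or VMDK paths differ from {first['name']}")
    return problems


def make_node(name, unit=0):
    """Build a sample node with one shared controller and one shared disk."""
    return {
        "name": name,
        "controllers": {1: {"type": "pvscsi", "bus_sharing": "physical"}},
        "disks": [{"bus": 1, "unit": unit, "vmdk": "[vsanDatastore] VM1/VM1_1.vmdk",
                   "mode": "independent_persistent", "sharing": "no_sharing"}],
    }


if __name__ == "__main__":
    print(check_cluster([make_node("VM1"), make_node("VM2")]))          # consistent layout
    print(check_cluster([make_node("VM1"), make_node("VM2", unit=1)]))  # mismatched SCSI ID
```

Comparing each node against the first one mirrors the procedure itself, where VM1 owns the disks and every other node points at them with the same SCSI IDs.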
     


Migration Steps from SAN to vSAN

Back up the data before migrating to avoid potential data loss during migration. The migration is an offline operation, and most of its duration is spent copying data. Make sure the offline time window is long enough before moving forward.

To migrate a SQL Server FCI cluster using pRDMs as clustered disk resources to vSAN, follow the steps below:
  1. Stop the SQL Server Cluster Role from the Windows Failover Cluster Manager.
  2. Gracefully shut down all the VMs hosting nodes of the Windows Failover Cluster, either by clicking Power -> Shut Down Guest OS or from within the guest OS.
  3. Migrate the first node of the cluster to vSAN by choosing Change storage only in the Migrate wizard. The migration process converts the pRDMs to VMDKs; apply the desired vSAN policies for the clustered disks in the Migrate wizard.
  4. Power on the first node and validate that the clustered disk resources are visible in the Windows Failover Cluster Manager and that the SQL Server Cluster Role can be started. You may keep it online.
  5. Detach the pRDMs used to host clustered disk resources from all remaining nodes of the cluster, which are not yet migrated to vSAN.
  6. (Optional) Migrate the remaining nodes to vSAN, if the non-shared disks are to be migrated to vSAN as well.
  7. Attach the disk resources back to the remaining nodes of the cluster using ADD NEW DEVICE -> Existing Hard Disk, pointing to the VMDKs of the first node stored on the vSAN datastore. Ensure that the vSCSI controllers hosting the disks are configured to use physical bus sharing.
  8. Start the virtual machines one by one and make sure that the SQL Server Cluster Role is online on the first node. Then fail over from the active node to a passive node to check that the other nodes can start the SQL Server Cluster Role normally.
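
The order of operations matters in the steps above: every node is shut down first, only the first node is converted, and the remaining nodes lose their pRDMs before they are pointed at the new VMDKs. The following is a minimal sketch that walks this order on an in-memory model. The `Node` class and the log strings are hypothetical; nothing here talks to vCenter or Windows.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical in-memory model of one WSFC node VM."""
    name: str
    powered_on: bool = True
    shared_backing: str = "pRDM"  # "pRDM", "VMDK", or "detached"


def migrate_san_to_vsan(nodes, migrate_non_shared=True):
    """Apply the offline SAN-to-vSAN order to the model and return a log."""
    first, rest = nodes[0], nodes[1:]
    log = ["stop SQL Server Cluster Role"]
    for node in nodes:
        node.powered_on = False
        log.append(f"shut down {node.name}")
    # The storage-only migration of the first node converts pRDMs to VMDKs.
    first.shared_backing = "VMDK"
    log.append(f"migrate {first.name} storage to vSAN")
    first.powered_on = True
    log.append(f"power on {first.name} and validate clustered disks")
    for node in rest:
        node.shared_backing = "detached"
        log.append(f"detach pRDMs from {node.name}")
    if migrate_non_shared:  # optional step for the non-shared disks
        for node in rest:
            log.append(f"migrate {node.name} storage to vSAN")
    for node in rest:
        node.shared_backing = "VMDK"
        log.append(f"attach existing VMDKs from {first.name} to {node.name}")
    for node in rest:
        node.powered_on = True
        log.append(f"power on {node.name}")
    log.append("test failover")
    return log


if __name__ == "__main__":
    for entry in migrate_san_to_vsan([Node("VM1"), Node("VM2")]):
        print(entry)
```

A model like this can serve as a checklist generator for a change window, with each log entry confirmed by hand as the corresponding step is carried out.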

After completing the steps above, the SQL Server cluster provides service from vSAN. You can manage the virtual disks through vSAN storage policies, including expanding a disk and changing the policy to follow the best practices for running SQL Server on vSAN.


Failback or Rollback from vSAN to SAN

The recommended approach to migration and rollback is to migrate the first node using the steps above, power on the VM, and start the SQL Server Cluster Role. If the role fails to start because of a disk issue, all other nodes still have the original pRDMs attached and can be started immediately. You can then remove the vSAN virtual disks from the first node and add the pRDM disks back to it.

If you want to roll back the configuration from vSAN to SAN, and the previous steps did not delete data from the RDM disks, follow the steps below to move the disks back to SAN using pRDMs.
  1. Stop the SQL Server Cluster Role from the Windows Failover Cluster Manager.
  2. Gracefully shut down all the VMs of the Windows Failover Cluster by clicking Power -> Shut Down Guest OS.
  3. Remove all the virtual disks from the VMs.
  4. Share the RDM disks across the virtual machines, using the previous settings for the disks under the different SCSI controllers.
  5. Start the virtual machines and bring the SQL Server Cluster Role back online from the Windows Failover Cluster Manager.

Migrating an FCI Cluster between vSAN Datastores (vSAN 6.7U3 and later)

To migrate an FCI cluster between vSAN datastores, for example from one vSAN datastore to another in a different data center (or a different cluster), follow the steps below. The target vSAN datastore should be on version 6.7 Update 3 or later.

  1. Power off all VMs/nodes of the WSFC cluster.
  2. Migrate (using cold Storage vMotion) the first VM/node, together with the disks that were created from that VM.
  3. Remove all shared disks from all other VMs in the WSFC cluster, which have not been migrated yet.
  4. Migrate all other VMs/nodes of the cluster (if the non-shared disks should be migrated as well).
  5. Add the shared disks back to all other VMs/nodes, pointing to the disks from the first migrated VM.
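
Before starting, the version requirement on the target datastore can be checked with a small helper. This is a minimal sketch under two assumptions: the release is available as a string such as `6.7 Update 3`, `6.7U3`, or `7.0` (a hypothetical input format), and comparing (major, minor, update) tuples is precise enough for the purpose. It does not query vSAN itself.

```python
import re

MINIMUM_RELEASE = (6, 7, 3)  # vSAN 6.7 Update 3

_RELEASE = re.compile(r"\s*(\d+)\.(\d+)(?:\s*(?:Update|U)\s*(\d+))?\s*", re.IGNORECASE)


def parse_release(text):
    """Parse '6.7 Update 3', '6.7U3' or '7.0' into (major, minor, update)."""
    match = _RELEASE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized release string: {text!r}")
    major, minor, update = match.groups()
    return (int(major), int(minor), int(update or 0))


def target_supported(text):
    """True when the target release is 6.7 Update 3 or later."""
    return parse_release(text) >= MINIMUM_RELEASE


if __name__ == "__main__":
    for release in ("6.7 Update 2", "6.7U3", "7.0"):
        print(release, target_supported(release))
```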


Additional Information