Tech ONTAP Blogs
A SQL Server multi-subnet failover cluster with four nodes uses two nodes in different availability zones as the active nodes and two nodes in a different region as the passive (DR) nodes. This configuration provides both high availability and disaster recovery with SQL Server 2022. It helps customers save on licensing costs, and in the event of a failover it provides instance-level protection for your databases: SQL Server Agent jobs, certificates, and server logins, which are stored in the system databases and physically reside on the shared storage, move with the instance.
Work with your storage administrator and network architect to ensure everything is properly configured for high availability and disaster recovery (DR).
To create the volumes, the following properties are needed:
In the Storage tiering section, set the following:
Repeat the process to create all the other volumes in the Oregon region. Once completed, all volumes should be successfully created.
Select the file system to create the volumes.
1. Select "Create Volume".
2. Select the storage virtual machine from the drop-down menu if multiple SVMs are available (virgsva).
3. Provide the following details:
o Volume name: data_cp
o Volume style: FlexVol
o Volume size: 20 GB
o Volume type: Data Protection (DP)
In the Storage tiering section set the following:
o Capacity pool tiering policy: none
Click "Create".
Repeat the process to create all the other volumes in the Virginia region. Once completed, all volumes should be successfully created.
3. When opening the iSCSI Initiator for the first time, it will display a warning that the iSCSI service is not running. Click "Yes" to start the iSCSI service and then reopen the iSCSI Initiator.
4. In the iSCSI Initiator, move to the "iSCSI Properties" tab.
5. Click on the "Discovery" tab.
6. Select "Discover Portal".
7. Copy the iSCSI endpoint from the file system.
8. Enter the iSCSI IP address of the file system in the "Discover Target Portal" window.
9. Leave the port at its default value.
10. Click on "Advanced".
11. From the "Local adapter" drop-down menu, select "Microsoft iSCSI Initiator".
12. From the "Initiator IP" drop-down menu, select the appropriate IP address.
13. Click "OK".
14. Repeat the same process with the second IP address.
Follow the same steps on the DR nodes.
1. In the iSCSI Initiator, move to the "iSCSI Properties" tab.
2. Click on the "Discovery" tab.
3. Select "Discover Portal".
4. Copy the iSCSI endpoint from the file system.
5. Enter the iSCSI IP address of the file system.
6. Click on "Advanced".
7. From the "Local adapter" drop-down menu, select "Microsoft iSCSI Initiator".
8. From the "Initiator IP" drop-down menu, select the appropriate IP address.
9. Click "OK".
10. Repeat the same process with the second IP address.
All iSCSI IP addresses are now added.
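For repeatable builds, the same portal discovery can be scripted with Windows PowerShell. This is a minimal sketch only; the portal and initiator IP addresses below are placeholders that must be replaced with your file system's iSCSI endpoints and the node's iSCSI IP.
# Make sure the Microsoft iSCSI Initiator service is running and starts automatically
Set-Service -Name MSiSCSI -StartupType Automatic
Start-Service -Name MSiSCSI
# Placeholder values - replace with the file system's iSCSI endpoints and the local initiator IP
$portalAddresses = @("10.0.1.10", "10.0.2.10")
$initiatorIp = "10.0.1.50"
# Register each iSCSI target portal (equivalent to "Discover Portal" in the GUI)
foreach ($portal in $portalAddresses) {
    New-IscsiTargetPortal -TargetPortalAddress $portal -InitiatorPortalAddress $initiatorIp
}
# Confirm that the portals were added
Get-IscsiTargetPortal | Format-Table TargetPortalAddress, InitiatorPortalAddress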
1. Click on the "Targets" tab.
2. The discovered targets should be listed. Now, let's establish the connection.
3. Select one of the inactive targets and click "Connect".
4. A window will pop open. Check "Enable multi-path" and then click "Advanced".
5. From the "Local adapter" drop-down menu, select "Microsoft iSCSI Initiator".
6. From the "Initiator IP" drop-down menu, select the appropriate IP address.
7. From the "Target portal IP" drop-down menu, select one of the iSCSI IP addresses
8. Click "OK".
9. Repeat the same process with the second iSCSI IP address and click "OK".
10. Both iSCSI IP addresses are now connected.
11. Repeat the same process on all four nodes to connect the iSCSI initiator.
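The target connections can also be established with PowerShell. This is a hedged sketch that assumes the target portals were already discovered as above; it connects every discovered target with multipath enabled so that both iSCSI IP addresses are used.
# Connect every discovered target that is not yet connected, enabling multipath and persistence
Get-IscsiTarget | Where-Object { -not $_.IsConnected } | ForEach-Object {
    Connect-IscsiTarget -NodeAddress $_.NodeAddress -IsMultipathEnabled $true -IsPersistent $true
}
# Verify the sessions
Get-IscsiSession | Format-Table TargetNodeAddress, InitiatorPortalAddress, IsConnected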
• Creating the LUNs
1. LUNs are created using the CLI as the GUI version is not available.
2. Copy the administrator IP from the file system. (TriSubSQL)
3. Log in to any CLI of your choice.
4. Use the following script to create the RW LUN on the active site (Oregon):
lun create -vserver trisqlva -path /vol/data/VoltriData -size 15gb -ostype windows_gpt -space-allocation enabled
lun create -vserver trisqlva -path /vol/log/Voltrilog -size 15gb -ostype windows_gpt -space-allocation enabled
lun create -vserver trisqlva -path /vol/logdirectory/VoltriSNAPLOG -size 15gb -ostype windows_gpt -space-allocation enabled
5. Use the lun show command to verify that the LUNs have been created. The LUNs are created but are currently unmapped.
6. Use the following script to create a LUN igroup and add the initiators to it:
lun igroup create -vserver trisqlva -igroup sqlfcisub -initiator iqn.1991-05.com.microsoft:winfcisqloren01.nimorg.com -protocol iscsi -ostype windows
lun igroup create -vserver trisqlva -igroup sqlfcisub -initiator iqn.1991-05.com.microsoft:winfcisqloren02.nimorg.com -protocol iscsi -ostype windows
igroup add -vserver trisqlva -igroup sqlfcisub -initiator iqn.1991-05.com.microsoft:trisqlfcoren01.nimorg.com
igroup add -vserver trisqlva -igroup sqlfcisub -initiator iqn.1991-05.com.microsoft:trisqlfcoren02.nimorg.com
7. The igroup is fully configured.
8. Mapping the LUNs.
9. Use the following script to map the LUNs:
lun mapping create -vserver trisqlva -path /vol/data/VoltriData -igroup sqlfcisub -lun-id 1
lun mapping create -vserver trisqlva -path /vol/log/Voltrilog -igroup sqlfcisub -lun-id 2
lun mapping create -vserver trisqlva -path /vol/logdirectory/VoltriSNAPLOG -igroup sqlfcisub -lun-id 3
CLI output showing the mapped LUNs.
Bringing the disks online
1. Select any active node. (WinFCISQLoreN01)
2. Click on the Windows icon.
3. Select "Disk Management".
4. In Disk Management, click on "Action" in the menu.
5. Select "Rescan Disks" to pull in the new disks.
6. Right-click on Disk 1.
7. Select "Online" to bring the disk online.
8. Right-click again on Disk 1.
9. Select "Initialize Disk".
10. A pop-up window will open. Select "GPT (GUID Partition Table)" and click "OK".
11. Right-click on the unallocated space of the disk.
12. Select "New Simple Volume".
13. The New Simple Volume Wizard will open. Click "Next".
14. The volume size is populated by default. Click "Next".
15. The drive letter is assigned by default but can be changed from the drop-down menu. Click "Next".
16. Provide the volume label and click "Next".
17. Click "Finish" to complete the disk setup.
18. Repeat the same process for the other two disks.
All disks are completely set up.
19. Open File Explorer.
20. Click on "This PC" to validate the presence of the disks.
21. Move to node two. (WinFCISQLoreN02)
22. Click on the Windows icon.
23. Select "Disk Management".
24. In Disk Management, click on "Action" in the menu.
25. Choose "Rescan Disks" to bring the disks online.
26. The disk mode should be offline since it is a shared disk.
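If you prefer to script the disk preparation on the active node instead of using Disk Management, the following PowerShell sketch does the same thing. It assumes the new LUNs show up as RAW disks; the disk number, drive letter, and label are examples only.
# Rescan the storage bus and list the uninitialized disks
Update-HostStorageCache
Get-Disk | Where-Object PartitionStyle -eq 'RAW'
# Example: bring Disk 1 online, initialize it as GPT, then create and format a volume
Set-Disk -Number 1 -IsOffline $false
Set-Disk -Number 1 -IsReadOnly $false
Initialize-Disk -Number 1 -PartitionStyle GPT
New-Partition -DiskNumber 1 -UseMaximumSize -DriveLetter E |
    Format-Volume -FileSystem NTFS -NewFileSystemLabel "SQLData" -AllocationUnitSize 65536
# Repeat for the remaining disks, adjusting the disk number, drive letter, and label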
Creating the DP LUN and the SnapMirror relationship.
The SnapMirror relationship is created using the CLI. Follow these steps:
1. Copy the administrator IP of the DR file system. (subsqlfsx)
2. Log in to any CLI of your choice.
3. Use the following script to create a SnapMirror relationship between the source and the destination volume:
snapmirror policy create -vserver virgsva -policy sqldr -type async-mirror
snapmirror create -source-path trisqlva:data -destination-path virgsva:data_cp -type xdp -policy sqldr -schedule 10min
snapmirror create -source-path trisqlva:log -destination-path virgsva:log_cp -type xdp -policy sqldr -schedule 10min
snapmirror create -source-path trisqlva:logdirectory -destination-path virgsva:logdirectory_cp -type xdp -policy sqldr -schedule 10min
4. The SnapMirror relationship has been created.
5. Next, initialize the SnapMirror relationship using the following command:
snapmirror initialize -destination-path virgsva:log_cp -source-path trisqlva:log
snapmirror initialize -destination-path virgsva:data_cp -source-path trisqlva:data
snapmirror initialize -destination-path virgsva:logdirectory_cp -source-path trisqlva:logdirectory
6. The SnapMirror relationship is fully initialized.
7. Use the following script to create an igroup and add initiators to it:
lun igroup create -vserver virgsva -igroup subfcivir -initiator iqn.1991-05.com.microsoft:winfcisqlvirn01.nimorg.com -protocol iscsi -ostype windows
lun igroup create -vserver virgsva -igroup subfcivir -initiator iqn.1991-05.com.microsoft:winfcisqlvirn02.nimorg.com -protocol iscsi -ostype windows
igroup add -vserver virgsva -igroup subfcivir -initiator iqn.1991-05.com.microsoft:winfcisqlvirn01.nimorg.com
igroup add -vserver virgsva -igroup subfcivir -initiator iqn.1991-05.com.microsoft:winfcisqlvirn02.nimorg.com
8. The igroup has been created and the initiators have been added.
Mapping the LUNs to the DR nodes.
9. Use the following script to map the LUNs to the DR nodes:
lun mapping create -vserver virgsva -path /vol/data_cp/VoltriData -igroup subfcivir -lun-id 1
lun mapping create -vserver virgsva -path /vol/log_cp/Voltrilog -igroup subfcivir -lun-id 2
lun mapping create -vserver virgsva -path /vol/logdirectory_cp/VoltriSNAPLOG -igroup subfcivir -lun-id 3
10. The LUNs are fully mapped to the DR nodes.
11. Move over to any of the DR nodes (WinFCISQLvirN01)
12. Click on the Windows icon.
13. Right-click on the Windows icon.
14. Select "Disk Management".
15. In Disk Management, click on "Action" in the menu.
16. Select "Rescan Disks" to pull up the disks in the system.
17. The disks should be online.
18. Right-click on Disk 1.
19. Select "Online" to bring the disk online.
20. Repeat the same process to bring all the other disks online.
21. Repeat the same process on the second DR node. (WinFCISQLvirN02)
22. The volumes should all be online.
23. Verify that the volumes are present in File Explorer.
All volumes are completely set up on all four nodes, both the Active nodes (Oregon) and the DR nodes (Virginia).
Cluster validation is a crucial step in ensuring the reliability, performance, and availability of a cluster setup. Cluster validation checks that all nodes in the cluster are configured consistently. This includes network settings, storage configurations, and software versions. Validation ensures that storage is correctly configured and accessible by all nodes in the cluster. This includes verifying that shared storage is properly set up and that there are no issues with disk access or performance.
To validate the cluster, follow these steps:
1. Open the Server Manager.
2. Click on "Tools" in the menu.
3. Select "Failover Cluster Manager".
4. On the upper right side or in the middle of the Failover Cluster Manager, select "Validate Configuration".
5. The "Validate a Configuration Wizard" will open. Click "Next".
6. Browse for all the nodes that will be part of the cluster and add them.
7. Once all the 4 nodes are added to the cluster, click "Next".
8. Select "Run all tests (recommended)" and click "Next".
9. On the confirmation page, it will list all the tests to be run. Click "Next".
10. Allow the tests to run to completion.
1. The cluster validation tests have been completed.
2. After a successful cluster validation, check the box "Create the cluster now using the validated nodes".
3. Click the "Finish" button.
4. The "Create Cluster Wizard" will open. Click "Next".
5. Enter the cluster name following your company's naming conventions.
6. Uncheck the box "Add all eligible storage to the cluster". Click "Next".
7. The cluster is now being created. Allow it to run.
8. Once the process is complete, click "Finish".
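The validation and cluster creation can also be done with the FailoverClusters PowerShell module. The node and cluster names below match the examples in this post, but the static IP addresses are placeholders for the secondary IPs described in the next section.
# Validate the four nodes (equivalent to the Validate a Configuration wizard)
Test-Cluster -Node WinFCISQLoreN01, WinFCISQLoreN02, WinFCISQLvirN01, WinFCISQLvirN02
# Create the multi-subnet cluster without adding eligible storage, with one static IP per subnet
New-Cluster -Name ClusSQLWinFCI `
    -Node WinFCISQLoreN01, WinFCISQLoreN02, WinFCISQLvirN01, WinFCISQLvirN02 `
    -StaticAddress 10.0.1.101, 10.0.2.101, 10.1.1.101, 10.1.2.101 `
    -NoStorage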
Cluster configuration involves adding a static IP, adding storage, and configuring a quorum. To add the cluster IP, follow these steps:
1. Log into your AWS console.
2. Select the node (WinFCISQLoreN01).
3. Click on "Actions".
4. Select "Networking".
5. Select "Manage IP addresses".
6. Expand eth0 and click "Assign new IP address".
7. Enter the IP address following the subnet addressing. We are going to assign two IP addresses:
1. One IP for the Cluster.
2. The second IP for SQL Server installation.
8. Check the box "Allow secondary private IPv4 addresses to be assigned".
9. Click "Save".
10. Click "Confirm".
11. On the "Networking" tab, you can see the secondary IP addresses.
12. Repeat the same process on all nodes in the cluster.
13. Go back to your Windows Cluster Manager.
14. The cluster will be in a failed state. Right-click on the IP address resource and select "Properties".
15. In the properties window, select "Static IP Address" and enter the secondary IP address assigned on the EC2 instance.
16. Click "Apply".
17. Select the "Advanced Policies" tab.
18. Uncheck all the other nodes as possible owners.
19. Click "Apply" and then "OK".
20. Repeat the same procedure for the other 3 nodes.
21. All IP addresses are added.
22. Click on the cluster (ClusSQLWinFCI) to bring the cluster up.
The cluster is online.
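If you want to script this step, the AWS Tools for PowerShell can assign the secondary private IPs and the FailoverClusters module can update the cluster IP address resources. This is only a sketch; the network interface ID, IP addresses, subnet mask, and resource name are placeholders for your environment.
# Assign two secondary private IPs to the node's network interface (placeholder values)
Register-EC2PrivateIpAddress -NetworkInterfaceId eni-0123456789abcdef0 -PrivateIpAddress 10.0.1.101, 10.0.1.102
# Point the failed cluster IP address resource at the secondary IP (placeholder names and values)
Get-ClusterResource -Name "Cluster IP Address" | Set-ClusterParameter -Multiple @{
    "Address"    = "10.0.1.101"
    "SubnetMask" = "255.255.255.0"
    "EnableDhcp" = 0
}
Start-ClusterResource -Name "Cluster IP Address"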
To add the disk to the cluster, follow these steps:
1. Ensure the cluster is online.
2. In the Failover Cluster Manager, click on "Disks".
3. At the upper right, select "Add Disk".
4. All the available disks will be listed. Click "OK".
5. The disks are available online.
6. To rename the disks from "Cluster Disk", follow these steps:
o Ensure the disks are available and online.
o In the Failover Cluster Manager, right-click on the disk you want to rename.
o Select "Properties".
o Enter the new name for the disk.
o Click "Apply" and then "OK".
o Repeat the same procedure for the other disks.
7. All disks are properly configured.
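Adding and renaming the cluster disks can also be scripted with PowerShell; a short sketch (the disk names are examples):
# Add every disk that is visible to all nodes and not yet part of the cluster
Get-ClusterAvailableDisk | Add-ClusterDisk
# Rename the generic "Cluster Disk n" resources to meaningful names (example names)
(Get-ClusterResource -Name "Cluster Disk 1").Name = "SQLData"
(Get-ClusterResource -Name "Cluster Disk 2").Name = "SQLLog"
(Get-ClusterResource -Name "Cluster Disk 3").Name = "SQLSnapLog"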
The cluster is properly configured. The next step is to install SQL Server.
To create a computer object for SQL Server, follow these steps:
1. Open Server Manager.
2. Click on "Tools" in the menu.
3. Select "Active Directory Users and Computers".
4. Right-click on the "Computers" folder.
5. Select "New" and then choose "Computer" from the menu.
6. Enter the computer name following your company's naming conventions and best practices.
Click "OK" to finish.
7. Right-click on the domain name.
8. Choose "Delegate Control".
9. The "Delegation of Control Wizard" will open. Click "Next".
Click "Add".
10. Select "Object Types".
11. Check the box for "Computers" and uncheck the other objects. Click "OK"
.
12. Enter the SQL Server computer object. Click "OK".
13. With the computer object selected, click "Next".
14. Check the box "Create a custom task to delegate”. Click "Next".
15. Check the box "Only the following objects in the folder".
16. Check the boxes for "Computer objects", "Create selected objects in this folder", and "Delete selected objects in this folder". Click "Next".
17. Check the box "Full Control". The other boxes will automatically get checked. Click "Next".
Click on "Finish".
18. To give the cluster full permission to all four nodes, right-click on the cluster computer object in Active Directory.
19. Select "Properties".
20. Select the "Security" tab. Click "Add" to add all four nodes in the cluster.
21. Select all the nodes in the cluster and the service account.
22. Give full permission by checking the "Full Control" box.
23. Click "Apply" and then "OK".
Now that our cluster and computer object have full permissions, we can proceed with installing SQL Server.
To install SQL Server, follow these steps:
1. Open the media file on your computer.
2. Select "Setup".
3. In the SQL Server Installation Center, select "Installation".
4. Click on "New SQL Server failover cluster installation".
5. Allow the setup to run.
6. In the "Edition" tab, enter your edition and product key. Click "Next".
7. On the "License Terms" page, accept the terms. Click "Next".
8. On the "Microsoft Update" page, click "Next".
9. Click "Next" on the "Install Setup Files" page and allow the installation of Failover Cluster Rules. If there are no failures, proceed with the setup. If there are failures, fix them and then proceed.
Note: If you did not run cluster validation, it will fail at this point.
10. Select the key features to install.
11. On the "Instance Configuration" page, enter the SQL Server computer object created in Active Directory with full permission.
12. Select "Named instance" and enter the instance name following your company's naming conventions.
13. On the "Cluster Resource Group" page, click "Next".
14. Select all the disks to be added and click "Next".
15. On the "Cluster Network Configuration" page, enter the secondary IP address that was created on the EC2 instance. Click "Next".
16. On the "Server Configuration" page, enter the SQL Server account name and password with full permission.
17. Click on the check box "Grant Perform Volume Maintenance Task privilege to SQL Server Engine Service". Click "Next".
18. On the "Database Engine Configuration" page, click "Add Current User".
19. Check the box "Mixed Mode" and provide the password. Click "Next".
20. Enter the required information. Click "Next".
21. Click on the "Data Directories" tab.
22. Select the disk created for the data directories. Click "Next".
23. On the "Ready to Install" page, review the summary of all the features that will be installed.
24. Allow the installation to run to completion.
The installation is completed.
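SQL Server setup also supports an unattended failover cluster installation from the command line, which helps keep the nodes consistent. A sketch with example names, drive letters, and accounts; adjust every value to your environment and run it from the installation media folder.
# Unattended "New SQL Server failover cluster installation" (example values only)
.\setup.exe /QS /ACTION=InstallFailoverCluster /FEATURES=SQLEngine `
    /INSTANCENAME=SQLFCI /FAILOVERCLUSTERNETWORKNAME=SQLFCIVNN `
    /FAILOVERCLUSTERIPADDRESSES="IPv4;10.0.1.102;Cluster Network 1;255.255.255.0" `
    /FAILOVERCLUSTERDISKS="SQLData" "SQLLog" "SQLSnapLog" `
    /INSTALLSQLDATADIR="E:\MSSQL" `
    /SQLSVCACCOUNT="NIMORG\sqlsvc" /SQLSVCPASSWORD="*****" `
    /AGTSVCACCOUNT="NIMORG\sqlsvc" /AGTSVCPASSWORD="*****" `
    /SQLSYSADMINACCOUNTS="NIMORG\Administrator" /SQLSVCINSTANTFILEINIT=True /IACCEPTSQLSERVERLICENSETERMS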
To install SQL Server on the second node, follow these steps:
1. Open the media file on the second node.
2. Select "Setup".
3. In the SQL Server Installation Center, select "Installation".
4. Choose "Add a node to a SQL Server failover cluster".
5. Allow the setup process to run.
6. Select "Evaluation" and then click "Next".
7. Check the box to accept the License Terms and then click "Next".
8. Click "Next" on the "Microsoft Update" page.
9. Click "Next" on the "Product Updates" page.
10. Allow the setup files to run, checking for potential problems.
11. The cluster node is selected. Click "Next".
12. Uncheck the box for DHCP.
13. Check the box for IPv4.
14. Enter the secondary IP address belonging to the node. Click "Next".
15. Click "Yes" on the pop-up window.
16. Enter the password for the service account.
17. Check the box "Grant Perform Volume Maintenance Task privilege to SQL Server Engine Service". Click "Next".
18. On the "Ready to Add Node" page, click "Install".
19. Allow the installation to run to completion.
Repeat steps 1-13 on the third node.
Enter the SQL Server IP address for that node (WinFCISQLoreN02).
Repeat the same process on the fourth node (WinFCISQLvirN02).
We have completed the installation of SQL Server.
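The add-node step has a similar unattended form; another sketch with the same example account and placeholder values, run on each node being added.
# Unattended "Add a node to a SQL Server failover cluster" (example values only)
.\setup.exe /QS /ACTION=AddNode /INSTANCENAME=SQLFCI `
    /FAILOVERCLUSTERIPADDRESSES="IPv4;10.0.2.102;Cluster Network 2;255.255.255.0" `
    /SQLSVCACCOUNT="NIMORG\sqlsvc" /SQLSVCPASSWORD="*****" `
    /AGTSVCACCOUNT="NIMORG\sqlsvc" /AGTSVCPASSWORD="*****" `
    /CONFIRMIPDEPENDENCYCHANGE=True /IACCEPTSQLSERVERLICENSETERMS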
To install SQL Server Management Studio (SSMS), follow these steps:
1. Locate the media file for SQL Server Management Studio on your computer.
2. Open the media file.
3. Click on "Install"
4. Allow the installation to run to completion.
5. The setup is completed.
To perform a failover, we need to pause or stop the two active nodes (WinFCISQLoreN01 & WinFCISQLoreN02) in Oregon.
The state of the FCI cluster before the failover.
Cluster management IPs are all active.
Status of the cluster role (Running).
All SQL Server nodes are running.
All the disks/volumes are online.
Checking the status of the SnapMirror relationship.
Deleting 3 tables on the database (DemoDB) before failover (dbo.OrderItems, dbo.orders & dbo.Products).
DemoDB database before failover.
State of the DemoDB database after the tables are deleted.
Pausing the active nodes (WinFCISQLoreN01 & WinFCISQLoreN02).
State of the SQL Server role after failover: Failed
Breaking the SnapMirror Relationship between Active Region and DR Region.
1. Log in to the CLI of the DR File System: subsqlfsx
o Copy the administrator IP of the DR file system.
o Log in to any CLI of your choice using SSH.
Script to break the SnapMirror relationship:
snapmirror break -destination-path virgsva:data_cp
snapmirror break -destination-path virgsva:log_cp
Status of the volumes
In the Failover Cluster Manager, refresh or perform a failover to any node in the DR region to bring the SQL Server resources online.
Failing over to node 1 (WinFCISQLvirN01).
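The same failover can be driven from PowerShell on any cluster node; a sketch using the node names from this post (the SQL Server role name depends on your instance name and is an example here).
# Pause (drain) the two active Oregon nodes
Suspend-ClusterNode -Name WinFCISQLoreN01 -Drain
Suspend-ClusterNode -Name WinFCISQLoreN02 -Drain
# After breaking the SnapMirror relationships, move the SQL Server role to the first DR node
Move-ClusterGroup -Name "SQL Server (SQLFCI)" -Node WinFCISQLvirN01
# Check the state of the role and its resources
Get-ClusterGroup -Name "SQL Server (SQLFCI)"
Get-ClusterGroup -Name "SQL Server (SQLFCI)" | Get-ClusterResource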
Log in to SQL Server Management Studio (SSMS) to verify that all deleted tables remain deleted.
The state of the volumes becomes read/write (RW) at the DR region (Virginia) after breaking the SnapMirror relationship.
Let’s create a database (Dr_DemoDB) at the DR region (Virginia).
Let's create a folder at the volume level.
Bringing the nodes online and resyncing the data.
With all nodes now having read and write access, a failover between the nodes can occur.
The DR_DemoDB database created at the DR nodes is missing at the active region, so we need to resync the data between the active region (Oregon) and the DR region (Virginia).
DR_DemoDB present at the DR node.
All nodes are back online after the failover.
Resyncing the data.
Script to resync the volumes from Virginia (currently the source) to the DR destination in Oregon. Run the resync on the Oregon file system:
snapmirror resync -source-path virgsva:data_cp -destination-path trisqlva:data
snapmirror resync -source-path virgsva:log_cp -destination-path trisqlva:log
Verifying that the data has resynced at the active region (Oregon).
To make the source volume active again, you need to break the relationship, which moves the volume from DP back to RW.
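A hedged sketch of that failback sequence, run over SSH from PowerShell (fsxadmin is the FSx for ONTAP administrative user; the management IP placeholders and volume names follow the examples in this post, and you should confirm each relationship's direction with snapmirror show before running anything):
# On the Oregon file system: break the reversed relationships so the original volumes return to RW
ssh fsxadmin@<oregon-management-ip> "snapmirror break -destination-path trisqlva:data"
ssh fsxadmin@<oregon-management-ip> "snapmirror break -destination-path trisqlva:log"
# On the Virginia file system: resync the original relationships to resume protection toward the DR copies
ssh fsxadmin@<virginia-management-ip> "snapmirror resync -source-path trisqlva:data -destination-path virgsva:data_cp"
ssh fsxadmin@<virginia-management-ip> "snapmirror resync -source-path trisqlva:log -destination-path virgsva:log_cp"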
To install SnapCenter, some prerequisites must be met.
You must have installed the dotnet-hosting 8.5 package and PowerShell 7.4.2.
To install SnapCenter, locate the media file and click "Open".
Click "Next" on the welcome wizard.
Click "Next" again and allow the installation to run to completion.
Click "Finish". The installation took about 5 minutes to complete.
Open the SnapCenter application in any browser of your choice.
Enter the username and password.
Click "Get Started" to follow the configuration steps.
Add storage connections. Click on the hyperlink; it will take you directly to the storage system. Enter both FSx file systems.
File system Virginia
File system Oregon
Both File systems were added.
Add your domain into SnapCenter.
Domain added.
Add your credentials: click "Credentials" at the top, select "New", and enter your credential information.
Add the host: click "Hosts", select "New", enter the host information, check the boxes for Microsoft SQL Server and Microsoft Windows, and click "Submit".
SnapCenter will validate the host to ensure it meets the requirements, register the host, and finally install the SQL Server plug-in.
SQL Server plug-ins are being added to all the hosts.
SQL Server plug-in fully installed.
Click on "Disks" to make sure all the disks are present.
Move to the Resources tab to make sure all the SQL Server resources are present.
Click on "Configure log directory".
The FCI instance is pulled in; click "Browse" and select the file path.
Click "Save". The overall status changes to "Running". SnapCenter is fully configured to back up SQL Server databases.
To create a Full Backup policy, go to Settings where the Policy is selected by default. Click on 'New' to start creating a policy.
Enter the policy name and a brief description of the policy.
Under Policy Type, select Full Backup.
Select the schedule frequency per your RTO/RPO requirements.
Under Replication and Backup, select 'Update SnapMirror after creating a local Snapshot copy' and from the drop-down menu, select 'Hourly'.
Skip the Script and Verification tabs.
Click "Finish" on the Summary page.
Still in the Policy section, click 'New' and enter the name and a detailed description.
Under Policy Type, select Log Backup.
Under Schedule Frequency, select Hourly.
Under Replication and Backup, check the box for 'Update SnapMirror after creating Snapshot copy'.
Move to the Secondary Policy Label and select 'Hourly' from the drop-down menu.
Skip the Script and Verification tabs, and click "Finish" on the Summary tab.
Full and Log Backup policies have been fully created.
Move to the Resource tab and select the database (DemoDB) to back up.
Select the policy from the drop-down menu and click the plus sign to set the schedule.
If there are multiple verification servers, select the appropriate one and click on the Load Locator to pull up the destination volumes.
Click 'Finish' on the Summary page.
Click on 'Backup Now' and then click on 'Backup'.
Backup completed.
SnapMirror backup topology.
Applying the log backup policy.
Select 'Modify' from the top right and click 'Next' on the Protect Database page.
Select the policy from the drop-down menu, click the plus sign, and set the time.
Click on the Load Locator to pull up the volumes.
Skip the Notification tab and click 'Finish' on the Summary tab.
Run an on-demand log backup to verify everything is working correctly, with no failures or warnings.
Select 'Backup Now', choose the log backup policy, and click 'Backup'.
Log backup completed.
Restoring SQL Server Database to the DR Region Using SnapCenter
The following steps must be followed.
• To bring the DR node up and running, the SnapMirror relationship must be broken.
• Activate DR in SnapCenter
• Proceed to restore the database.
Let's delete some tables and then restore the database.
Open node 1 (WinFCISQLoreN01) to view the state of the database before deleting the tables (dbo.Table4 and dbo.Table5).
The state of the database after the tables have been deleted.
Pause the two active nodes to trigger a failover, bringing the DR nodes online.
Script to break the SnapMirror relationship.
snapmirror break -destination-path virgsva:data_cp
snapmirror break -destination-path virgsva:log_cp
snapmirror break -destination-path virgsva:logdirectory_cp
Open the Cluster Manager and manually fail over to the DR node (WinFCISQLvirN01).
On the SnapCenter server, go to Settings, select Global Settings, and click on Disaster Recovery.
Check the box for "Enable Disaster Recovery" and click Apply.
When a pop-up window appears to enable Disaster Recovery, click OK
Move to the Resources tab and select the database to restore.
Since the database is in a failed state, click on Mirror Copies and select the most recent backup copy.
After selecting the most recent backup, click on Restore.
The restore wizard opens with the destination volumes populated. Click Next.
Leave the default setting, "Restore the database to the same host where the backup was created".
Choose one of the four log restore options; in this case, select the option "Restore by a log backup until."
Check the box "Overwrite the database with the same name" during restore.
Click Next on the Post-Restore Options.
Click Next on the Email Settings.
Click Finish on the Summary.
Move to the Monitor tab to track the progress of the job. The restore job has been completed.
Move over to the node (WinFCISQLvirN01) to verify that the tables have been successfully restored.
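To confirm the restore without opening SSMS, a quick check with the SqlServer PowerShell module can list the tables again. A sketch assuming the module is installed and that the instance name below is only an example.
# List the user tables in DemoDB on the failover cluster instance (example instance name)
Invoke-Sqlcmd -ServerInstance "SQLFCIVNN\SQLFCI" -Database "DemoDB" `
    -Query "SELECT s.name AS SchemaName, t.name AS TableName FROM sys.tables t JOIN sys.schemas s ON t.schema_id = s.schema_id ORDER BY t.name;"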