SQL Server

SQL Server High-Availability: How and Why?

Sushil Naik

March 15, 2024

Have questions or want a demo?

We’re here to help! Click the button below and we’ll be in touch.

This guide provides a detailed walkthrough of the procedures to establish a high-availability (HA) configuration for Microsoft SQL Server, guaranteeing database robustness and minimizing downtime during failures or maintenance operations.

Setup

Node configuration

Node 1: Primary Node is in one availability zone
Node 2: Secondary Node is in another availability zone.

Provisioning Process

The provisioning process for MSSQL High Availability (HA) Availability Groups on Windows Server Failover Clusters (WSFC) involves below key steps.

For Domain Independent Failover Cluster Groups

Node 1 and Node 2 need to permit communication over TCP, ICMP, and UDP protocols.
The IP address entries for both Node 1 and Node 2 should be added to the "hosts" file located at C:\windows\system32\drivers\etc on both machines.

For Active Directory Failover Clusters

Create a sense of trust and establish connectivity between two Windows machines within the same Active Directory environment.

Create Windows Failover Clusters and configure AAG

Installing a Windows Server Failover Cluster provides high availability by allowing multiple servers to work together, ensuring seamless service continuity in case of a node failure.

Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools

Add Node 1 and Node 2 to the failover cluster
Create Always-on Availability Group on Node-1 MSSQL Server with FailoverMode=Automatic and CommitMode=Synchronous

Fail-over Scenarios

Scenario 1: Node 1 Failure

If Node 1 (Primary) goes down, the Node 2 takes over as Primary.
Node 2 becomes the new Primary Instance.
When Node 1 recovers, it synchronizes with the Primary (Node 2) and acts as a Failover Replica. Fast-Start Failover is automatically enabled.

Scenario 2: Node 2 Failure

If Node 2 (Standby) goes down, Node 1 continues as the Primary.
DB Role Monitor will update the status as Node 2 down and SRE team will look into it.
Upon Node 2's recovery, it syncs with Node 1 and continues as a Fail-over Replica.

This applies to EBS failure, Compute Failure and AZ failure.

Patching

High availability patching is available for all nodes in both the Primary and Fail-over Replica configurations, ensuring uninterrupted operations.

The patching process follows this sequence:

Apply the new CU version to the Failover Instance (Node 2).
Execute a switchover, designating Node 2 as the new Primary.
Apply the new CU version to Node 1, which was the previous Primary

Have questions or want a demo?

We’re here to help! Click the button below and we’ll be in touch.

Get a Demo

These might also interest you

Oracle

SQL Server High-Availability: How and Why?

Have questions or want a demo?

Setup

Node configuration

Provisioning Process

For Domain Independent Failover Cluster Groups

For Active Directory Failover Clusters

Create Windows Failover Clusters and configure AAG

Fail-over Scenarios

Scenario 1: Node 1 Failure

Scenario 2: Node 2 Failure

Patching

Have questions or want a demo?

Related articles

Best Practices for Oracle Table Design with Tessell

Best Practices for MySQL Table Design with Tessell

What is DBaaS (Database as a Service)?

Get the best performance while being very cost effective.