Tech ONTAP Blogs
Are you moving your self-managed SQL Server instances to Google Cloud but concerned about maintaining high availability, controlling licensing costs, and keeping backup and recovery practices consistent? Google Cloud NetApp Volumes (GCNV) now supports iSCSI shared storage with the Flex Unified service level, so you can deploy the same reliable, cost-saving model in the cloud that you trust on-prem.
Google Cloud NetApp Volumes (GCNV) Flex Unified, with iSCSI block storage, lets you keep the same on-prem Failover Cluster Instance (FCI) model by delivering cloud-native shared iSCSI storage. GCNV provides enterprise-grade performance with ONTAP data-management capabilities, including snapshots, thin clones, backups, and cross-region replication, all in a fully managed service.
This blog walks through the SQL Server HA architectures, how FCI is deployed using GCNV Regional iSCSI volumes, how application‑consistent snapshots and clones work, and how to build cross‑region DR for SQL Server.
Most organizations deploy Microsoft SQL Server (MSSQL) databases using one of two architectures: Failover Cluster Instance (FCI) or Always On Availability Groups (AOAG). Both provide high availability, but FCI generally needs less storage and less expensive licensing. The two architectures are described below.
An FCI is a single SQL Server instance installed across multiple Windows Server Failover Cluster (WSFC) nodes. It uses shared storage, meaning only the active node accesses the database files at any time. If the active node fails, WSFC automatically fails the instance over to another node with zero data loss, because both nodes share the same underlying LUNs. FCI provides instance-level protection, covering logins, jobs, SQL Agent metadata, and system databases.
The AOAG architecture consists of multiple SQL Server instances, each of which keeps its own copy of the data on separate LUNs to maintain high availability. One instance handles reads and writes, and the other instances are read-only. Whenever data changes on the primary instance, MSSQL must replicate the change to all the read-only instances. If the primary read/write instance fails, one of the read-only instances becomes the primary.
Although it maintains high availability, the AOAG architecture requires several copies of the same volume, which increases storage costs. MSSQL replication must also be configured to every read-only instance, which requires expensive MSSQL Enterprise licensing. In addition, to reach the required database performance, the compute instances often have to be larger than MSSQL itself needs in order to avoid a hypervisor bottleneck. The larger compute instances then push the MSSQL Enterprise license costs even higher.
Enterprises that adopt FCI with iSCSI shared storage on Google Cloud NetApp Volumes, by contrast, get high availability while simplifying the architecture and significantly reducing costs. Because the databases do not need to be replicated among several MSSQL instances, enterprises using the FCI architecture can save on storage costs and can deploy the less expensive SQL Server Standard licenses. They can also right-size their VMs to exactly what MSSQL needs, knowing that network performance is not affected by VM size. This results in large cost savings, as described in "Cut SQL Server Costs by up to 50% with Google Cloud NetApp Volumes". As a bonus, we can let the storage layer handle backups and disaster recovery, lowering CPU usage on the VMs.
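To make the storage side of that comparison concrete, here is a minimal sketch that counts database-level copies only. The function names and the 2 TiB, three-replica figures are illustrative assumptions, not numbers from this post, and the sketch deliberately ignores licensing, VM sizing, and any mirroring the storage service performs internally.

```python
def aoag_storage_gib(db_size_gib: int, replicas: int) -> int:
    """AOAG: every replica, primary included, keeps its own copy on its own LUNs."""
    if replicas < 2:
        raise ValueError("an availability group needs at least two replicas")
    return db_size_gib * replicas


def fci_storage_gib(db_size_gib: int) -> int:
    """FCI: all cluster nodes share one copy of the database files."""
    return db_size_gib


if __name__ == "__main__":
    size_gib = 2048  # hypothetical 2 TiB database
    print("AOAG, 3 replicas:", aoag_storage_gib(size_gib, 3), "GiB")
    print("FCI, shared LUNs:", fci_storage_gib(size_gib), "GiB")
```

The gap grows linearly with the number of AOAG replicas, while the FCI figure stays flat no matter how many cluster nodes are added.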
The following steps are required to deploy the FCI architecture in Google Cloud with Google Cloud NetApp Volumes:
These steps mirror traditional on‑prem SAN design, minimizing change and easing migrations of existing SQL deployments.
One of the most powerful high‑availability features for SQL Server FCI deployments on Google Cloud NetApp Volumes is the use of Regional storage pools. These types of pools replicate data synchronously across two independent zones within a region, ensuring that shared block storage remains available even if an entire zone goes down. They deliver 99.99% availability for shared LUNs. This matches the expectations for mission‑critical applications requiring High Availability. Regional pools can be provisioned with independent capacity, throughput (up to 5,120 MiB/s), and IOPS (up to 160k IOPS). Volumes used for databases are deployed within the pools.
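As a quick sanity check when sizing a pool, the sketch below compares a requested configuration with the two ceilings quoted above (5,120 MiB/s and 160k IOPS). The `PoolRequest` type, the function names, and the 8 KiB block size in the usage lines are hypothetical helpers for illustration; they are not part of any GCNV API.

```python
from dataclasses import dataclass

# Ceilings quoted in the text for a Regional pool.
MAX_THROUGHPUT_MIBPS = 5120
MAX_IOPS = 160_000


@dataclass(frozen=True)
class PoolRequest:
    capacity_gib: int
    throughput_mibps: int
    iops: int


def validate_pool_request(req: PoolRequest) -> list[str]:
    """Return a list of problems; an empty list means the request fits."""
    problems = []
    if req.capacity_gib <= 0:
        problems.append("capacity must be positive")
    if not 0 < req.throughput_mibps <= MAX_THROUGHPUT_MIBPS:
        problems.append(f"throughput must be 1..{MAX_THROUGHPUT_MIBPS} MiB/s")
    if not 0 < req.iops <= MAX_IOPS:
        problems.append(f"IOPS must be 1..{MAX_IOPS}")
    return problems


def throughput_for(iops: int, block_kib: int) -> float:
    """Throughput in MiB/s implied by an IOPS figure at a given I/O size."""
    return iops * block_kib / 1024


if __name__ == "__main__":
    print(validate_pool_request(PoolRequest(4096, 1024, 32_000)))
    print(throughput_for(MAX_IOPS, 8), "MiB/s at 8 KiB I/O")
```

Because capacity, throughput, and IOPS are provisioned independently, each dimension is checked on its own rather than derived from the others.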
Since Regional GCNV pools protect the database against a zonal failure, the WSFC nodes are intentionally deployed so that the primary WSFC node resides in the same zone as the primary GCNV pool, while the standby WSFC node is placed in the replica zone. This ensures that SQL Server FCI has consistent shared block storage accessible from both nodes across the two zones.
This design directly complements Windows Server Failover Clustering, enabling fully automated failover with no manual intervention, even during zonal outages.
Let’s take a concrete example:
GCNV synchronously mirrors storage blocks between these two zones, so both nodes see the same shared iSCSI LUNs at all times.
When both zones are available:
This matches the expected behavior of on-prem SANs that provide dual-controller synchronous mirroring, except that it is now cloud-native.
If us‑east1‑b experiences a zonal outage due to infrastructure, networking, or power disruptions, the following will occur on the storage layer:
Google Cloud NetApp Volumes Regional pools are specifically built to ensure availability during zonal failure events. But what happens to the compute (Windows Server Failover Cluster) layer?
Access to the database, both compute and storage layers, is maintained throughout the zonal failover process. This is exactly how customers expect SQL FCI to behave on‑prem — and it now works the same way in Google Cloud.
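The zonal failover described above can be modeled as a small state change on each layer. This is a toy simulation for reasoning about the design, not how WSFC or GCNV are implemented: the `Cluster` class, the node names, and the second zone `us-east1-c` are assumptions made for illustration (only `us-east1-b` appears in the example above).

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    nodes: dict[str, str]            # WSFC node name -> zone
    active_node: str                 # node currently running the SQL instance
    storage_zones: tuple[str, str]   # (zone serving I/O, replica zone)
    failed_zones: set[str] = field(default_factory=set)

    def fail_zone(self, zone: str) -> None:
        self.failed_zones.add(zone)
        # Storage layer: the surviving zone takes over serving the LUNs.
        serving, replica = self.storage_zones
        if serving == zone and replica not in self.failed_zones:
            self.storage_zones = (replica, serving)
        # Compute layer: the instance moves to a node in a healthy zone.
        if self.nodes[self.active_node] in self.failed_zones:
            survivors = [n for n, z in self.nodes.items()
                         if z not in self.failed_zones]
            if survivors:
                self.active_node = survivors[0]

    def database_available(self) -> bool:
        return (self.nodes[self.active_node] not in self.failed_zones
                and self.storage_zones[0] not in self.failed_zones)


if __name__ == "__main__":
    cluster = Cluster(
        nodes={"sql-node-1": "us-east1-b", "sql-node-2": "us-east1-c"},
        active_node="sql-node-1",
        storage_zones=("us-east1-b", "us-east1-c"),
    )
    cluster.fail_zone("us-east1-b")
    print(cluster.active_node, cluster.storage_zones[0],
          cluster.database_available())
```

The model makes one property explicit: the database stays reachable only if both layers end up in a healthy zone, which is why the standby node is placed in the replica zone.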
One of the biggest advantages of running SQL Server on Google Cloud NetApp Volumes is the ability to take instantaneous storage level snapshots for backup, rapid restore, cloning, and Dev/Test.
By default, GCNV creates crash-consistent snapshots. These are suitable for many workloads, but they may miss data that has not yet been flushed from client caches, because client caching is not synchronized with the snapshot. For workloads like SQL Server that maintain their own sophisticated buffer pool and transaction logging, crash consistency is safe, but a restore has to go through crash-recovery roll-forward/roll-back logic.
However, for mission-critical SQL Server databases and FCI deployments, organizations often require application-consistent snapshots, which ensure a clean recovery point that aligns with how SQL Server manages in-flight I/O and page flushing.
GCNV fully supports application-consistent snapshots through a simple workflow that combines SQL Server's native T-SQL quiesce mechanisms with ONTAP's instantaneous snapshot engine. Unlike a backup, an application-consistent snapshot requires no streaming I/O, has negligible performance impact, and uses almost no additional storage capacity initially, because only blocks that change afterward consume new space.
To reach full application consistency, use the following workflow:
This workflow provides a true application‑consistent recovery point without the overhead of a streaming full backup.
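The quiesce, snapshot, and release sequence can be sketched as follows. This assumes the workflow relies on the T-SQL snapshot backup statements introduced in SQL Server 2022 (`SUSPEND_FOR_SNAPSHOT_BACKUP` and `BACKUP ... WITH METADATA_ONLY`); the exact steps are not spelled out in this text, so treat that as an assumption. The `run_sql` and `take_storage_snapshot` callables, the `SalesDB` name, and the file path are hypothetical placeholders for your own SQL client and for whichever GCNV call creates the volume snapshot.

```python
from typing import Callable


def app_consistent_snapshot(
    database: str,
    metadata_file: str,
    run_sql: Callable[[str], None],
    take_storage_snapshot: Callable[[], str],
) -> str:
    """Suspend writes, snapshot the volume, then record the backup metadata."""
    db = "[" + database.replace("]", "]]") + "]"   # bracket-quote the name
    path = metadata_file.replace("'", "''")        # escape single quotes

    # 1. Freeze write I/O so the on-disk image is application-consistent.
    run_sql(f"ALTER DATABASE {db} SET SUSPEND_FOR_SNAPSHOT_BACKUP = ON;")
    try:
        # 2. Take the storage-level snapshot (assumed, caller-supplied hook).
        snapshot_name = take_storage_snapshot()
    except Exception:
        # Do not leave the database frozen if the snapshot step fails.
        run_sql(f"ALTER DATABASE {db} SET SUSPEND_FOR_SNAPSHOT_BACKUP = OFF;")
        raise
    # 3. Record the backup metadata; this also resumes write I/O.
    run_sql(f"BACKUP DATABASE {db} TO DISK = '{path}' WITH METADATA_ONLY, FORMAT;")
    return snapshot_name


if __name__ == "__main__":
    statements: list[str] = []  # dry run: collect the T-SQL instead of running it
    name = app_consistent_snapshot(
        "SalesDB", r"D:\backup\SalesDB.bkm", statements.append, lambda: "snap-001"
    )
    print(name)
    print("\n".join(statements))
```

Passing the SQL runner and the snapshot call in as functions keeps the sketch testable as a dry run and avoids guessing at a specific driver or storage API. The time between steps 1 and 3 should be kept as short as possible, since writes are held for the whole window.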
Restoring from an application-consistent snapshot on GCNV is nearly instantaneous.
Because the snapshot was taken after SQL Server had flushed all buffers, recovery time inside SQL Server is minimized: no crash-recovery roll-back/roll-forward operations are needed.
This enables:
An additional benefit is that any application‑consistent snapshot can be used to create thin clones, giving immediate, writable copies of the database for:
You can now spin up a full environment with production‑grade data—without duplicating terabytes of storage.
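A rough capacity comparison shows why thin clones matter for Dev/Test. The sketch assumes a clone consumes space only for the blocks that diverge from its parent snapshot; the function names and the 4,000 GiB, five-copy, 5% change figures are made-up inputs for illustration, not measurements.

```python
def full_copies_gib(db_size_gib: float, copies: int) -> float:
    """Capacity needed when every environment gets a full physical copy."""
    return db_size_gib * copies


def thin_clones_gib(db_size_gib: float, copies: int, changed_pct: float) -> float:
    """Capacity needed when each clone stores only the blocks it changes."""
    if not 0 <= changed_pct <= 100:
        raise ValueError("changed_pct must be between 0 and 100")
    return db_size_gib * copies * changed_pct / 100


if __name__ == "__main__":
    size, envs = 4000, 5  # hypothetical database size (GiB) and clone count
    print("full copies:", full_copies_gib(size, envs), "GiB")
    print("thin clones:", thin_clones_gib(size, envs, changed_pct=5), "GiB")
```

Actual savings depend on how much each clone diverges over its lifetime, so the change rate is the number worth measuring in your own environment.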
GCNV supports SnapMirror‑based cross‑region replication, allowing SQL Server databases stored on iSCSI LUNs to be replicated asynchronously to another Google Cloud region.
Available replication schedules include a 10-minute interval for low-RPO DR.
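To see what a 10-minute schedule means for recovery point objective (RPO), here is a simplified model of scheduled asynchronous replication: data written just after one transfer starts is protected only once the next transfer finishes, so the worst case is roughly one interval plus one transfer time. The function names, the changed-data volume, and the link speed are assumptions for illustration; real transfer times depend on change rate and available bandwidth.

```python
def transfer_minutes(changed_gib: float, link_mibps: float) -> float:
    """Minutes needed to ship the changed data at a sustained link rate."""
    if link_mibps <= 0:
        raise ValueError("link_mibps must be positive")
    return changed_gib * 1024 / link_mibps / 60


def worst_case_rpo_minutes(interval_min: float, transfer_min: float) -> float:
    """Simplified bound: one full interval plus the time of one transfer."""
    return interval_min + transfer_min


if __name__ == "__main__":
    shipping = transfer_minutes(changed_gib=30, link_mibps=256)
    print("transfer time:", shipping, "min")
    print("worst-case RPO:", worst_case_rpo_minutes(10, shipping), "min")
```

If a transfer ever takes longer than the interval itself, the schedule cannot keep up, and the effective RPO grows beyond this bound.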
This architecture provides:
Google Cloud NetApp Volumes Flex Unified, with iSCSI block storage, brings enterprise‑grade SAN capabilities to Google Cloud—critical for organizations running SQL Server workloads that demand predictable performance and consistent shared storage. Running SQL Server FCI on Google Cloud NetApp Volumes iSCSI gives organizations the best of both worlds: the robustness and familiarity of an on‑prem SAN architecture combined with the agility and cost efficiency of the cloud.
Ready to get started? Head to the Google Cloud NetApp Volumes console and experience the power of NetApp Volumes block storage today! Contact a specialist for more information.