StorageGRID and ONTAP S3 support: Differences, similarities, and integration

Tudor · 2022-11-08

The NetApp® StorageGRID® object storage solution is used on premises to store unstructured data at scale. It helps organizations maintain a geographically dispersed datastore for unstructured data that is scalable and simple to use. StorageGRID offers native support for Amazon Simple Storage Service (Amazon S3) APIs and delivers industry-leading innovations such as automated lifecycle management to store, to secure, to protect, and to preserve unstructured data cost effectively.

For a long time, StorageGRID was the only NetApp portfolio product that supported Amazon S3 APIs, which reminds me of the cliché that if the only thing you have is a hammer, then everything looks like a nail. Like any aphorism, it contains a bit of truth. But then NetApp ONTAP® 9.8 was released, which added S3 support to the plethora of protocols that it already supported. For more information about ONTAP S3 support, I highly recommend the technical report TR-4814.

The good news is that we now have more options and more tools to meet the needs of our customers’ S3 workloads. We now have the industry-leading, object-centric, massively scalable StorageGRID and the unified storage systems that run ONTAP, such as NetApp AFF and FAS.

The overlap in S3 support also means that we need to advise our customers on choosing the right solutions. No environment is static. Data growth and the need for data management are constant. Luckily, there are multiple integration points between ONTAP and StorageGRID, so customers can present their data on the right system, at the right time, and with the correct level of performance.

In this post, I explore the differences between S3 support in ONTAP and in StorageGRID, as well as the integration points that make ONTAP and StorageGRID a solution instead of two separate products.

Architecture

StorageGRID presents a single namespace across multiple, geo-distributed sites. ONTAP has some tricks up its sleeve when using the MetroCluster feature, but a traditional ONTAP cluster is data center–specific. StorageGRID also benefits from having integrated load balancers to distribute load between storage nodes.

Information lifecycle management engine

StorageGRID has powerful policy rules that enable data management and placement across data centers. The level of protection can change over time (for example, new objects get two copies, and older objects are erasure-coded). The data protection that ONTAP gets from its file system (commonly known as the NetApp WAFL® file system) applies to all objects equally, and there is no concept of erasure coding.

Information lifecycle management (ILM) policies, though, do a lot more than customize the data protection schema. In this blog post, after my high-level overview of the two products, I will discuss ILM in more detail.
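To make the idea of protection that changes over time concrete, here is a minimal Python sketch of an age-based rule. It is a model for illustration only, not StorageGRID code or its API. The function name and the 30-day threshold are assumptions, because the example above says only that new objects get two copies and older objects are erasure-coded.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: the post does not say when an object counts as "older".
ERASURE_CODE_AFTER = timedelta(days=30)


def protection_for(ingest_time, now):
    """Return the protection scheme that applies to an object of a given age."""
    age = now - ingest_time
    if age < ERASURE_CODE_AFTER:
        return "2 replicated copies"
    return "erasure-coded"


now = datetime(2022, 11, 8, tzinfo=timezone.utc)
print(protection_for(now - timedelta(days=1), now))   # a new object
print(protection_for(now - timedelta(days=90), now))  # an older object
```

The point of the sketch is that the protection scheme is a function of the object and the time of evaluation, rather than a fixed property of the volume that holds it.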

Tiering

Both products can tier cold data to lower-cost storage, including public cloud. Interestingly, the tiering protocol is also S3, which opens the door to cross-product tiering.

Scalability

By its very nature, StorageGRID can scale to a higher node count and a higher number of tenants, buckets, and users than ONTAP can. For example, StorageGRID supports 10 million buckets and 10 billion users, compared with 12,000 buckets and 96,000 users for ONTAP. The higher node count is relevant because StorageGRID presents a single namespace across a geo-distributed architecture—up to 16 different sites and a total of 200 nodes.

Multiprotocol support

ONTAP is the ultimate unified storage platform. S3 support ups the ante, daring you to find a more versatile storage and data management product. This attribute is truly where ONTAP shines. ONTAP can create an S3 endpoint in two different ways, and it is important to understand the differences.

The first option to create an S3 endpoint in ONTAP is to create a bucket (I use ONTAP System Manager). An object bucket in ONTAP is underpinned by a FlexGroup volume. This approach has the advantage of spreading the load between the ONTAP cluster nodes, and the bucket size can be easily expanded. However, enabling NFS or SMB for a bucket is not possible.

The second option became available with ONTAP 9.12.1. Starting with that release, an S3 endpoint can also be added to a storage VM (SVM) that already supports SMB or NFS. In other words, you can access the same file through any of the three protocols (NFS, SMB, or S3). Although this option is immensely flexible, the underlying data structure is the original file system. Multiprotocol support can be enabled only through this method.

As a simple rule of thumb to memorize: an ONTAP bucket implies the S3 protocol only, whereas multiprotocol support implies a file system that can also be accessed through S3 (requires ONTAP 9.12.1).

S3 API support

StorageGRID gets the check mark here. For example, StorageGRID offers S3 Object Lock and S3 Select APIs. In general, StorageGRID has a wider range of S3 API support and compatibility, to the tune of over 90% of the Amazon S3 APIs being supported. ONTAP has a significantly smaller aperture. As a result, some applications are known to be incompatible with ONTAP, but they do work with StorageGRID.

At the time of this writing, some of the applications that worked only with StorageGRID were Hadoop S3A, MinIO, Splunk, and Veeam. ONTAP is always expanding its list of supported APIs, however, so I do expect further changes there.

StorageGRID ILM, data protection, and metadata—with great flexibility comes great responsibility

Not only did I appropriate Spider-Man’s leitmotif, I dare say I improved it (mine sort of rhymes). The point is that when it comes to StorageGRID, its ILM rules are very powerful and are intertwined with its local data protection and geo-distributed replication (for disaster recovery). ILM policies can also make decisions based on the object metadata (take that, NFS).

A real-world example might help here. Let’s say that I have access to a StorageGRID namespace that spans three geo-distributed sites: New York; Raleigh, North Carolina; and San Jose, California. Because I’m a very important person, my files should be triple-protected. Here is the workflow:

I copy all my files to the New York site, where I am physically located.
I use S3 APIs to add a tag to each file. The tag is called IMPORTANT and the tag value is “Tudor’s Files”.
I then create an ILM policy that states that any object with the tag value of “Tudor’s Files” is to maintain two copies in the New York site, one copy in Raleigh, and another in San Jose.

A couple of notes about the preceding example. First, it is clearly made up when it comes to my importance, but it is not made up when it comes to the flexibility of StorageGRID ILM policies. Second, please note how the StorageGRID concept of local protection and disaster recovery is very fluid. My files are protected, but what about the rest of the objects in the grid? For the StorageGRID admin at large, it is important to understand which objects get what levels of local protection and site recoverability.
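The tag-based rule from the workflow above can be modeled in a few lines of Python. This is a sketch of the matching logic only, not the StorageGRID ILM engine or its API. The class and function names are hypothetical, and so is the default placement for untagged objects, which the example deliberately leaves open.

```python
from dataclasses import dataclass


@dataclass
class IlmRule:
    """Hypothetical model of a rule: match one tag, place copies per site."""
    name: str
    tag_key: str
    tag_value: str
    copies_per_site: dict  # site name -> number of copies


# Assumption: untagged objects fall through to some default placement.
DEFAULT_PLACEMENT = {"New York": 2}


def placement_for(object_tags, rules, default):
    """Return the placement of the first rule whose tag matches, else the default."""
    for rule in rules:
        if object_tags.get(rule.tag_key) == rule.tag_value:
            return rule.copies_per_site
    return default


tudor_rule = IlmRule(
    name="tudor-important",
    tag_key="IMPORTANT",
    tag_value="Tudor's Files",
    copies_per_site={"New York": 2, "Raleigh": 1, "San Jose": 1},
)

print(placement_for({"IMPORTANT": "Tudor's Files"}, [tudor_rule], DEFAULT_PLACEMENT))
print(placement_for({}, [tudor_rule], DEFAULT_PLACEMENT))
```

The second call is the interesting one for a grid administrator: any object that matches no rule gets whatever the default says, which is why it pays to know which objects get which level of protection.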

We should also note how different the StorageGRID ILM protection rules are from ONTAP data protection. ONTAP data protection applies to all objects and files equally. To be exact, the protection applies to WAFL data blocks, which lack any awareness of or opinion about whether they belong to a file, an object, or a LUN. Disaster recovery is similarly block based. This method is very efficient, but the replicated data is in a separate namespace and is not accessible unless and until there is an administrative decision to fail over to the other side. This all-or-nothing approach does have some advantages. It is a well-understood, time-tested approach to disaster recovery that is familiar to seasoned ONTAP administrators. It is also an air-gapped disaster recovery mechanism.

Finally, I will submit to you my ultimate argument as to why the StorageGRID ILM policies are “the key” difference between ONTAP and StorageGRID. ILM policies can be used to remove unwanted data. If you are an ONTAP admin, this idea will horrify you to your very core. But on the flip side, what if you were asked to delete some specific files for legal reasons? Maybe some new privacy rule. Wouldn’t it be nice to be able to identify and subsequently delete objects based on some criteria?

Like I said: with great flexibility comes great responsibility.

If you are interested only in what ONTAP and StorageGRID offer in terms of S3 support, then you can skip the rest of this section (my feelings aside). But I happen to think that knowing how things work helps you better understand the differences between two products that apparently do the same thing. In other words, context is important.

If you look at the ONTAP architecture, especially when it comes to the AFF and FAS hardware, the basic proposition is that you have a resilient and shared (my emphasis) disk subsystem that is dual-pathed to two controllers. (The subsystem is protected by the NetApp RAID DP® feature, and less frequently by NetApp RAID-TEC™ technology or RAID 4). When a write I/O is received, the ONTAP controller stores the I/O in its battery-backed NVRAM and acknowledges the write back to the client. When paired with an SSD or NVMe disk subsystem, this architecture lends itself to blazing-fast response times for S3 workloads. For an informative and humorous read on this topic, I recommend Joe Scott’s blog, Extreme S3 Performance with Confluent Tiered Storage + ONTAP.

The StorageGRID architecture is based on a shared-nothing (again, my emphasis) approach. When StorageGRID appliances are used, the compute node is paired with a NetApp E-Series or EF-Series storage subsystem. The node’s data is therefore protected by local RAID (most often in the form of disk pools, but other RAID configurations are also supported). The main advantage (and a competitive differentiator) is that a local disk failure does not increase cross-grid network traffic. As a matter of fact, when using StorageGRID with our purpose-built appliances, we can offer up to fifteen 9’s of data durability, compared to Amazon’s eleven 9’s.
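To put the two durability figures in perspective, here is a small worked calculation in Python. It assumes that both figures are annual per-object durability (which is how Amazon states its eleven 9's) and uses a hypothetical population of 10 billion objects. Exact fractions are used because ordinary floating point cannot represent fifteen 9's reliably.

```python
from fractions import Fraction


def expected_annual_loss(object_count, nines):
    """Expected number of objects lost per year at a durability of `nines` nines."""
    loss_probability = Fraction(1, 10 ** nines)  # 11 nines -> 1 in 10**11
    return object_count * loss_probability


objects = 10_000_000_000  # hypothetical: 10 billion stored objects
for nines in (11, 15):
    loss = expected_annual_loss(objects, nines)
    print(f"{nines} nines: {float(loss)} objects lost per year on average")
```

Under these assumptions, eleven 9's works out to an average of 0.1 objects per year (one object per decade), and fifteen 9's to 0.00001 objects per year (one object per 100,000 years).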

Unlike ONTAP, the StorageGRID compute node is not HA paired. Therefore, the local data protection is insufficient, and data must be replicated or erasure-code-protected across multiple nodes.

The corollary here is that the network between the StorageGRID nodes is critical to the performance of the grid and to the response time back to the client. Unlike ONTAP, StorageGRID is geo-distributed, which means that objects that are also replicated over WAN deserve extra consideration. Remember that objects are not geo-distributed by default. An ILM policy is necessary to determine the protection schema, and not all objects would necessarily have the same protection. Still, the question is relevant.

Let’s look at Tudor’s important files. As I mentioned, I am a very important person, and I don’t want my write response times in New York to be limited by the network connection to San Jose. In this case, the ILM policy should be configured for dual commit. Dual commit means that StorageGRID writes two copies as quickly as possible before acknowledging the I/O back to the client. Subsequently, the ILM policy replicates my files to Raleigh and to San Jose, but that replication is performed asynchronously so that I can move on to the next very important task of my very important day.
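The ordering that dual commit implies can be sketched with a toy Python model. It is not StorageGRID code: the class and method names are hypothetical, and the model assumes that the two initial copies land at the local site, which matches the New York placement in my example.

```python
from collections import deque


class GridSketch:
    """Toy model: acknowledge after two local copies, replicate to other sites later."""

    def __init__(self, local_site, remote_sites):
        self.local_site = local_site
        self.remote_sites = list(remote_sites)
        self.copies = {}        # object key -> list of sites that hold a copy
        self.pending = deque()  # (key, site) replications that have not run yet

    def put(self, key):
        # Dual commit: two copies are written before the client is acknowledged.
        self.copies[key] = [self.local_site, self.local_site]
        # Remote copies are only queued here, so the WAN is not in the write path.
        for site in self.remote_sites:
            self.pending.append((key, site))
        return "ack"

    def run_background_replication(self):
        # Asynchronous step: drain the queue some time after the acknowledgment.
        while self.pending:
            key, site = self.pending.popleft()
            self.copies[key].append(site)


grid = GridSketch("New York", ["Raleigh", "San Jose"])
print(grid.put("very-important-file"), grid.copies["very-important-file"])
grid.run_background_replication()
print(grid.copies["very-important-file"])
```

The client sees the acknowledgment as soon as `put` returns, while the Raleigh and San Jose copies appear only after the background step has run.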

Hardware requirements

Both ONTAP and StorageGRID can be deployed as software-defined storage on commodity hardware. In this section, however, I discuss only the storage appliances that each product supports.

ONTAP clustering is based on HA pairs, and scale-out is based on additional HA pairs. The minimum ONTAP cluster is two nodes, and it can scale up to 24 nodes (12 HA pairs).

StorageGRID starts with three storage nodes (and one or more administrative nodes) and can scale up to 200 nodes. Let me explain: If you are using our load balancers (SG100 or SG1000), then these nodes also serve as the administrative node. You can also use VMs as the administrative and load balancer nodes. These nodes do not store any object data.

A grid needs a minimum of three storage nodes, and when creating a geo-distributed namespace, you need a minimum of three storage nodes per site as well. The reason is not so much about protecting data (after all, a dual copy does not require three nodes), as it is about the StorageGRID metadata. The metadata is critical for both the protection and the integrity of the stored data, as well as for the ability of our ILM policies to do effective work. The StorageGRID metadata is stored in a Cassandra database, which utilizes a shared-nothing architecture of its own. Cassandra data resides on each storage node in a dedicated partition, and it is configured with a replication factor of three (for StorageGRID, this Cassandra replication factor cannot be modified). Unlike an ONTAP cluster, which grows by HA pairs, a grid can be expanded one node at a time. However, when you are planning a capacity expansion, you must consider the ILM rules that govern the data placement. Here is a good overview of this topic: Guidelines for adding object capacity.
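The node-count rules in this section can be collected into a quick sanity check. The Python sketch below uses only the numbers quoted in this post (2 to 24 ONTAP nodes in HA pairs, at least three storage nodes per site, up to 16 sites and 200 nodes). The function names are hypothetical, and applying the 200-node limit to storage nodes alone is an assumption. This is not a sizing tool.

```python
ONTAP_MIN_NODES, ONTAP_MAX_NODES = 2, 24
GRID_MIN_STORAGE_NODES_PER_SITE = 3  # driven by the metadata replication factor
GRID_MAX_SITES = 16
GRID_MAX_NODES = 200  # assumption: applied here to storage nodes only


def ontap_cluster_issues(node_count):
    """Return a list of problems with a proposed ONTAP cluster size."""
    issues = []
    if node_count % 2 != 0:
        issues.append("nodes come in HA pairs, so the count must be even")
    if not ONTAP_MIN_NODES <= node_count <= ONTAP_MAX_NODES:
        issues.append(f"node count must be between {ONTAP_MIN_NODES} and {ONTAP_MAX_NODES}")
    return issues


def grid_issues(storage_nodes_per_site):
    """Return a list of problems with a proposed grid layout (site -> storage nodes)."""
    issues = []
    if len(storage_nodes_per_site) > GRID_MAX_SITES:
        issues.append(f"more than {GRID_MAX_SITES} sites")
    if sum(storage_nodes_per_site.values()) > GRID_MAX_NODES:
        issues.append(f"more than {GRID_MAX_NODES} nodes in total")
    for site, count in storage_nodes_per_site.items():
        if count < GRID_MIN_STORAGE_NODES_PER_SITE:
            issues.append(f"{site} has fewer than {GRID_MIN_STORAGE_NODES_PER_SITE} storage nodes")
    return issues


print(ontap_cluster_issues(3))
print(grid_issues({"New York": 3, "Raleigh": 3, "San Jose": 2}))
```

Note what the check does not cover: whether an expansion actually adds usable capacity depends on the ILM rules that govern placement, which is the subject of the guidelines mentioned above.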

Making a choice

Choosing the right solution should follow the preceding general guidance. For example, if the primary application is S3 native, StorageGRID is a great starting point. StorageGRID can scale to hundreds of billions of objects behind a native load balancer and can tie multiple locations into a single object store.

On the other hand, if S3 is not the primary use case but is just one of the protocols that need to be supported, ONTAP unified storage tips the scales.

It is also important to understand the Cartesian plane of performance and capacity. Not all data is equally important all the time. As you move up along the capacity axis, having the right system for the right data becomes more important. That brings me to the idea of a solution rather than a product.

Both ONTAP and StorageGRID can scale along the performance and capacity axes. On the ONTAP side, the AFF and FAS product lines break along the flash or spinning media support. StorageGRID has its own flash and non-flash offerings. Both products can be installed on proprietary hardware or on commodity servers, further increasing the number of dollar/TB price points that are available to our customers. Even so, additional value is unlocked when data can move between the two products. The following table summarizes the various integration points between StorageGRID and ONTAP.

This table presents possible ONTAP and StorageGRID configurations only. Both ONTAP and StorageGRID can integrate with other products, including public cloud solutions.

Platform one*	Data management policy	Platform two**	Benefits
ONTAP	FabricPool or NetApp Cloud Tiering	StorageGRID	Tiering of cold data blocks for space and cost management on the primary NAS or SAN tier. As a general rule, we recommend considering StorageGRID as the tiering target when the tiered data exceeds 300TB or when StorageGRID is used for multiple use cases (other S3 applications or tiering and backup).
ONTAP	NetApp S3 SnapMirror®	StorageGRID	Sync functionality for ONTAP S3 buckets to a StorageGRID object store.
ONTAP	SnapMirror Cloud	StorageGRID	Replication of NetApp ONTAP Snapshot™ copies to a StorageGRID bucket.
StorageGRID	StorageGRID CloudMirror	ONTAP	Replication functionality for StorageGRID buckets or objects to a site that does not have a StorageGRID footprint.
StorageGRID	Cloud Storage Pool	ONTAP	Tiering or archiving of StorageGRID objects to a physically separated infrastructure.

* Platform one is the data source.

** Platform two is the data target, as a result of the data management policy action that is specified.

Widening the aperture a little more to include NetApp Cloud Manager orchestration adds two more entries to the preceding table.

Platform one*	Data management policy	Platform two**	Benefits
ONTAP	NetApp Cloud Backup	StorageGRID	Backup (incremental forever) to a StorageGRID object store.
Any platform	NetApp Cloud Sync	Any platform	Data copying by using the S3 protocol.

“We all have powers of one kind or another” (as we learned from Spider-Man)

The key point to remember is flexibility. StorageGRID is a versatile, enterprise-wide object storage solution that provides better availability and data durability than the public cloud. Powerful ILM policies provide in-the-grid data protection, site replication, and data lifecycle functionality. Regardless of the starting point (ONTAP or StorageGRID), it is always possible either to switch between them or to use them together to place data on the correct system with the appropriate level of protection and performance. Large data centers can easily warrant both products, serving primary workloads and tiering from the higher-cost to the lower-cost systems. Backup and vault sites can be the domain of vast object repositories (optimal for StorageGRID). Satellite sites, on the other hand, can benefit from a unified storage management system (ONTAP) that replicates data to core and/or vault locations.

To learn more about S3 support in ONTAP and in StorageGRID, I recommend two sessions from NetApp INSIGHT® 2022:

1020, Objects in ONTAP: Using multiprotocol NAS data with S3 workloads
1205, Object storage is taking over the world

Both sessions are available on demand on the NetApp TV™ streaming platform.