Right sizing Amazon FSx for NetApp ONTAP instances for performance

MicheleP

Deploying a new infrastructure always requires some pre-work to make sure that the hardware, or virtual hardware, you selected will meet all, or at least your most important, requirements. Among those, performance ranks high.

In this post I will illustrate a step-by-step guide that can help you make the right choices when it comes to sizing Amazon FSx for NetApp ONTAP (FSx for ONTAP) appropriately to provide optimal performance for your workload(s).

Some things to keep in mind

Keep in mind that the primary focus of this article is to size your FSx for ONTAP by performance requirements only, putting cost aside. Of course, I’m aware cost is a fundamental factor in these equations, but I want to make sure that you understand all the moving parts that can influence the performance of the solution. Once you come up with a final architecture, you should definitely factor in the costs to make sure that your TCO requirements are also being met.

The reason I want to isolate performance here is that I’ve seen too many cases where a customer has an infrastructure performing so far below expectations that they have to redesign it from the ground up. And when I ask why they chose that infrastructure, the answer is almost never “It was the least expensive,” rather it’s more like “We actually didn’t know how to size.”

I assume that the reader already has a basic understanding of FSx for ONTAP infrastructure, but here is a summary of the most important characteristics.

FSx for ONTAP high availability (HA)

FSx for ONTAP is based on one or more HA pairs, each HA Pair has one Node serving the data and another standing-by. File modifying operations are mirrored between the two Nodes to make sure that in the event of a downtime, planned or unplanned, there will be no disruption for the running workloads.

FSx for ONTAP has two options for HA: deploying in a single Availability Zone (single AZ) or across two AZs within the same AWS Region (multi-AZ).

FSx for ONTAP Gen-1 and Gen-2

As of August 2024, you can deploy FSx for ONTAP Gen-1 (first generation) or, where available, FSx for ONTAP Gen-2 (second generation).

Gen-1 specs:
- Throughput capacities available are: 128 MBps, 256 MBps, 512 MBps, 1024 MBps, 2048 MBps and 4096 MBps.
- Delivered as a single HA pair with 4 GBps throughput capacity and 200,000 op/s max performance.

Gen-2 specs:
- Throughput capacities available are: 384 MBps, 768 MBps, 1,536 MBps, 3,072 MBps and 6,144 MBps.
- Multi-AZ deployment is available with 1 HA pair only that will deliver up to 6 GBps throughput capacity and 200,000 SSD Op/s
- Single-AZ deployments of 384 and 768 MBps throughput capacities use a single HA pair. Deployments with throughput capacities of 1,536 MBps and higher can run up to 12 HA pairs.
- With 12 HA pairs running 6,144 MBps it is possible to get as much as 72 GBps and 2,400,000 SSD Op/s.

Very important to know for the preceding stats: These throughput values refer to the maximum disk throughput. More on this later.

This article will focus on choosing the right FSx for ONTAP throughput capacity for single HA pair deployments only. Also, the SSD tier capacity sizing will not be discussed here.

Performance influencers in FSx for ONTAP

The following picture is a basic representation of a client/server FSx for ONTAP infrastructure. It also shows the three key performance influencers: Network I/O, cache (in-memory and NVMe), and disk I/O. All three are determined by the throughput capacity of the file system chosen during the initial deployment.

Network I/O is the maximum throughput between the client and FSx for ONTAP, expressed in MBps. The network I/O value will be greater than whatever your disk I/O throughput maximum. For instance, a 512 MBps FSx for ONTAP can deliver up to 625 MBps network I/O.

Cache can come in two possible forms. All FSx for ONTAP throughput capacities have an in-memory cache for read workloads. Some of them also have an extension of the in-memory cache, based on NVMe. The greater the throughput selected, the greater the total cache size will be.

Disk I/O is the maximum throughput of the SSD tier, and it’s expressed with both disk MBps and disk Op/s. The chosen throughput capacity will determine the max MBps, while for Op/s there are other factors to consider. Let’s take a look at them.

Explaining disk I/O

When you first deploy FSx for ONTAP, there are a few parameters you need to specify, one of which is the SSD capacity in GB. By default, AWS will assign the file system three disk Op/s per GB provisioned. That means that if you deploy 2048 GB SSD capacity, you will get 6144 disk Op/s maximum for your file system. If you need more than that, you can manually specify the provisioned disk Op/s quantity. But here’s the most important part: each throughput capacity also comes with a max disk Op/s, and that will be the ultimate limit.

Here’s an example: You’ve deployed a Gen-1 128 MBps with 1024 GB SSD capacity. You will get (3 x 1024) = 3,072 disk Op/s set as limit. Mind you, 128 throughput capacity comes with a max disk Op/s set at 6,000 (more on this later). However, because you provisioned 3072, this will be the actual limit. But if you need more, you can manually change it, up to 6,000.

On the other hand, if you’re deploying a Gen-1 128 MBps with 10,240 GB of SSD capacity, you’ll get (3 x 10,240) = 30,720 provisioned disk Op/s. However, the max is still 6,000 disk Op/s, so you won’t be able to use all of the 30,720 disk Op/s.

The choice of which FSx for ONTAP throughput to use is entirely driven by disk I/O performance requirements, MBps and Op/s. You must make your choice assessing both, so that the FSx for ONTAP disk I/O will be capable of providing enough performance to accommodate them.

Explaining caching

Now that we understand disk I/O, I want to quickly go back to caching to explain an important sizing concept: Disk Op/s are consumed only when the workloads access data that is not cached in your file system’s in-memory or NVMe cache. In fact, the read cache can help lower the need for additional provisioned disk Op/s.

Let’s look again at our Gen-1 128 MBps example: In a multi-AZ deployment it will come with in-memory cache plus 150 GB of NVMe cache. Depending on your workload’s read pattern—caching is typically more beneficial for random read workloads—caching allows your total client Op/s to go beyond the set limits of 6,000 disk Op/s and 128 MBps. The ultimate limit for MBps is determined by network I/O.

One other influencer: Burst credits

There is one other performance influencer that we haven’t mentioned earlier: Burst credits. All the throughput values I show and discuss in this article are called “baseline values,” which is the minimum throughput that you can expect from your file system, 24/7. However, the file system can operate at higher throughput values, especially to support spiky workloads. That’s where burst credits come in.

For certain lengths of time, FSx for ONTAP lets you burst both network I/O and disk I/O to higher bandwidths. This bursting is accomplished with a credit mechanism, which allocates throughput and Op/s based on average utilization. Note that not all throughput capacities support bursting: Gen-1 supports 128 and 256 MBps in all regions, and 512 MBps in selected regions. Gen-2 supports burst credits for 384, 768 and 1,536 MBps.

AWS provides tables with both baseline and burst values, for each throughput capacity. Before we proceed, I want to make it clear that you should never use burst values when sizing, only use the baseline. Burst is meant to accommodate occasional spikes, not to serve your average workload characteristics.

As you can see there are three tables (you will find it in the “Impact of throughput capacity on performance” section.):

One table shows the maximum throughput values for read and write for both Gen-1 and Gen-2.
The other two show the network I/O, disk I/O, baseline, and burst values for all Gen-1 and Gen-2 throughput capacities.

These last two tables are the ones you will use for sizing (be sure to check the table with the region you are interested in). Please note also that these tables always show data for single-AZ and multi-AZ, which we will discuss next.

Single-AZ or multi-AZ?

It’s important to understand that the choice between single-AZ or multi-AZ deployment is not just about infrastructure redundancy. It’s also about performance.

Some key differences between single-AZ and multi-AZ deployments:

Caching: All multi-AZ throughput capacities come with a NVMe read cache. This is an extension of in-memory cache, and FSx for ONTAP will primarily use it to serve metadata and random read workload. So, if your read workload is composed of random reads (say 25% or more), you should consider this factor. Also, operations served by cache have a shorter response time.

Note that with single-AZ, the NVMe cache is supported by the Gen-1 2,048 and 4,096 throughput capacities, and the Gen-2 6,144 MBps throughput capacities.

Maximum write throughput: As we’ve previously seen in the AWS table, the single-AZ maximum write throughput is lower compared to what multi-AZ delivers.

On the other hand, there is no difference between single- or multi-AZ options when it comes to sequential read workload.

With regards to infrastructure redundancy and applications, AWS recommends choosing single-AZ if you require a “cost-optimized solution for use cases such as development and test environments, or storing secondary copies of data that is already stored on premises or in other AWS Regions, by only replicating data within a single-AZ.”

AWS says to use multi-AZ “for use cases such as business-critical production workloads that require high availability to shared ONTAP file data and need storage with built-in replication across AZs.”

Correct sizing, yes, but relaxed

One of the great, if not the greatest, benefits of cloud computing is how easy it is to adjust your infrastructure to changes in your workload(s), and FSx for ONTAP is no different. I want to emphasize that it is very important to perform correct sizing from the get-go but remember that in most cases there is room for adjustments.

AWS allows customers to change some FSx for ONTAP performance characteristics after the initial deployment, such as scaling throughput capacity up or down. Yes, up or down! You may have oversized your FSx for ONTAP and now understand scaling down will give you the same performance at a lower cost. Or maybe you underestimated and undersized your FSx for ONTAP, in which case you can scale up. What else can you change? In general, you can also change throughput capacity and adjust disk provisioned Op/s.

I also want to reiterate that there are some characteristics that you cannot change after your initial deployment:

Gen-1 vs. Gen-2
Multi vs. single-AZ
Number of HA pairs for multi-AZ
Number of HA pairs for single-AZ Gen-2 smaller than 1,536 MBps

If you need to change one of these, then you must deploy a new FSx for ONTAP and migrate the data.

What information do you need to perform a correct FSx for ONTAP sizing?

To size FSx for ONTAP correctly, you’ll need what we call workload characteristics. These include:

Peak read MBps
Peak write MBps
Peak combined read + write MBps
Peak combined read + write Op/s
Is read workload random or sequential? Or a mix?
The AWS Region

You may wonder why this information is needed. Let’s begin with read and write: Any storage system will behave differently when handling these two types of operations. Writes typically are more CPU intensive and, as discussed already, for FSx for ONTAP we need to take into consideration the time and resources needed for the replication to the stand-by node. On the other hand, read operations are more memory (cache) and disk intensive.

You may also be wondering why we’re using peak numbers and not averages. Both can be valid, but in my experience choosing peak is the better option so that FSx for ONTAP will be able to serve all workload scenarios without degradation.

About maximum reads and writes

As you have seen in AWS tables, maximum read MBps is identical between single-AZ and multi-AZ, and it’s equal to disk I/O MBps baseline value. However, maximum write (MW) MBps is not identical between single-AZ and multi-AZ deployments (multi-AZ can go higher).

I want to highlight here a basic rule to follow for determining MW:

For Gen-1 in US East (N.Virginia), US East (Ohio), US West (Oregon), and Europe (Ireland)

Single-AZ MW = MIN (disk I/O throughput, 1,000 MBps)
Multi-AZ MW = MIN (disk I/O throughput, 1,800 MBps)

For Gen-1 in all other regions

Single-AZ MW = MIN (disk I/O throughput, 750 MBps)
Multi-AZ MW = MIN (disk I/O throughput, 1,300 MBps)

For Gen-2 in all supported regions

Single-AZ MW = MIN (Disk I/O Throughput, 1,100 MBps)
Multi-AZ MW = MIN (Disk I/O Throughput, 2,100 MBps)

Note: Disk I/O throughput is the disk I/O baseline MBps for each throughput capacity.

On top of this, you also should list any other requirements you may have, such as “Required infrastructure redundancy.” Now that you have everything, you can move forward.

As an example, let’s say you assessed both MBps and Op/s and decided that a 512 MBps Gen-1 FSx for ONTAP file system would be the ideal choice. The next step is to choose between single-AZ and multi-AZ deployment. Here you must evaluate what’s more important for you. Is it infrastructure redundancy or random read latency? Once all these questions have been answered, you’ll be able to make the right choice.

Let’s perform a couple of example sizings

Example 1

After looking at your monitoring platform for the last week of data, you came up with the following:

Peak read MBps: 400
Peak write MBps: 450
Peak combined read + write MBps: 500
Peak combined read + write Op/s: 30,000
80% random read
Single-AZ preferred, for cost
Region is ca-central-1

About the MBps and Op/s numbers: Looking at AWS tables, the minimum throughput capacity you need is 1,024 MBps. Why isn’t 512 MBps good enough? Because of disk Op/s: A 512 MBps deployment can serve up to 18,750 Op/s in ca-central-1, which will not be enough. So, you need to pick 1,024 MBps size, which can serve up to 40,000 disk Op/s.

Moving forward, you can see that both a 1,024 MBps single-AZ and a 1,024 MBps multi-AZ deployment can accommodate your peak MBps-Op/s requirements. So, which one are you going to choose?

80% of the read workload is random, which is quite high, so it might make sense to have an NVMe cache which for 1,024 MBps is only available using multi-AZ
Costs for single-AZ are preferable.

My recommendation is to have all the information available before making your final call. In a case like this, I would perform a PoC with both solutions, write down the results, and then get all the stakeholders together to assess the differences. The choice will depend on what’s more important:

Is cost the most important factor? You need single-AZ.
Is your application’s read latency more important? The PoC result will certainly prove that multi-AZ will be better.
Are you deploying a business-critical application? Then you may want to follow the AWS recommendations and go for multi-AZ.

The above sizing was done with Gen-1. If we were to deploy with Gen-2, then what? Let’s see.

We need 30,000 disk Op/s, and want the smallest capacity that can satisfy this requirement. Based on the AWS performance specifications table, we can see that 768 MBps can serve up to 25,000 disk Op/s. That’s not enough. Let’s move one larger: 1,536 MBps can serve up to 50,000 disk Op/s, which is good, so that’s your choice. You can now go through the same decision-making process and eventually compare costs.

Example 2

Here your primary workload is sequentially writing to FSx for ONTAP, with no reads. You also need between 700 and 800 MBps. There’s no preference for infrastructure redundancy and any AWS Region is acceptable as long as it’s in the US.

We know we need around 700 MBps, so at least 1,024 throughput capacity is required. Let’s look at our options, using the rules specified above.

With Gen-1, in US East or US West:

Single-AZ = MIN (1,024, 1,000) = 1,000 MBps. A 1024 MBps could be fine.
Multi-AZ = MIN (1024, 1800) = 1,024 MBps. Same, use the 1024 MBps size FSx for ONTAP.

With Gen-1, all other regions:

Single-AZ = MIN (1024, 750) = 750 MBps. May not be enough, this may not be a viable solution.
Multi-AZ = MIN (1024, 1300) = 1,024 MBps. Again, choose the 1,024 MBps option.

At this point we have a choice between three solutions. Which one? Again, the advice is to run a PoC to verify you can meet all your workload requirements and then add AWS recommendations and TCO to the mix. Is it a critical application? Choose multi-AZ. Is it a backup-only workload? Choose single-AZ. Once again, similar calculations can be done in the case of Gen-2.

Conclusions

In my experience sizing cloud resources is a little trickier than on-prem sizing, but it is more flexible, allowing you to make fixes later. But still, that’s not an excuse to size incorrectly.

With this article I hope I have provided enough information to allow you to size your new—or existing—FSx for ONTAP workloads in a more effective way.

To find out more check out the performance documentation for FSx for ONTAP.