The Role of QoS in Delivering Storage as a Service (Part 2)



By Eduardo Rivera, Senior Storage Architect, NetApp IT


NetApp IT is on a journey to offer its customers storage as a service. In part one of this blog series, I discussed how we embraced IO Density to help us better profile and deploy our applications across our storage infrastructure. We developed a three-tier service catalog that offers storage as a predictable and easily consumable service to our customers. The second step in this journey was tapping into the power of clustered Data ONTAP®’s adaptive Quality of Service (QoS) feature to assure performance stability.


QoS—Corralling the Herd

The adoption of clustered Data ONTAP’s QoS feature is a key component of our storage-as-a-service model. In a nutshell, QoS enables us to place IO limits on volumes in order to keep the applications using those volumes within their IOPS “swim lane.” (QoS can also be applied at the storage virtual machine, or SVM, level and at the file level.) This prevents one application from starving other applications of performance resources within the same storage array. QoS can be implemented dynamically and without interruption to application data access.


In our storage catalog model, we assign a QoS policy per volume for all the volumes that exist within a given cluster. The QoS policies themselves enforce a particular IOPS/TB objective. Hence, if we have a volume that is consuming 1TB of capacity and the service level objective (SLO) is to provide 2048 IOPS/TB, the QoS policy for that volume would set an IOPS limit of 2048. If that same volume later grows to 2TB of consumed space, the QoS policy would be adjusted to a limit of 4096 IOPS to maintain an effective 2048 IOPS/TB. In a live environment with hundreds, or even thousands, of individual volumes, where storage consumption varies continuously as applications write and delete data, manually managing all the QoS policies would be close to impossible. This is where Adaptive QoS comes in.
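The arithmetic in the example above can be written out in a few lines. This is only an illustrative sketch, not NetApp's tooling: the function name `iops_limit` and the choice to round to the nearest whole IOPS are assumptions made for the example.

```python
def iops_limit(consumed_tb: float, slo_iops_per_tb: int) -> int:
    """Return the absolute IOPS ceiling for a volume.

    consumed_tb      -- capacity the volume currently consumes, in TB
    slo_iops_per_tb  -- service level objective, in IOPS per TB
    """
    # The limit scales linearly with consumed capacity so that the
    # effective IOPS/TB stays at the SLO. Rounding is an assumption.
    return round(consumed_tb * slo_iops_per_tb)


print(iops_limit(1.0, 2048))  # 2048
print(iops_limit(2.0, 2048))  # 4096
```

The point of the sketch is that the SLO stays fixed while the absolute limit follows consumed capacity.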


Adaptive QoS is a tool developed by NetApp. Its sole purpose is to monitor consumption per volume and dynamically adjust each volume’s QoS policy so that it matches the desired IOPS/TB SLO. With this tool, we are able to provision volumes at will and not worry about having to manage all the necessary QoS policies.
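Conceptually, a tool like this runs a reconciliation pass: read each volume's consumed capacity, compute the limit the SLO implies, and update any policy that has drifted. The sketch below shows that idea only. It is not the actual Adaptive QoS implementation; the `Volume` class, the `reconcile` function, and the in-memory data are all hypothetical stand-ins for whatever the real tool reads from and writes to the cluster.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical in-memory view of a volume and its QoS policy."""
    name: str
    consumed_tb: float
    iops_limit: int  # limit currently set in the volume's QoS policy


def reconcile(volumes, slo_iops_per_tb):
    """Bring every volume's limit in line with the IOPS/TB SLO.

    Updates the Volume objects in place and returns a dict of
    {volume name: (old limit, new limit)} for the volumes that changed.
    """
    changes = {}
    for vol in volumes:
        target = round(vol.consumed_tb * slo_iops_per_tb)
        if target != vol.iops_limit:
            changes[vol.name] = (vol.iops_limit, target)
            vol.iops_limit = target
    return changes


volumes = [
    Volume("vol_a", consumed_tb=1.0, iops_limit=2048),  # already correct
    Volume("vol_b", consumed_tb=2.0, iops_limit=2048),  # grew from 1TB to 2TB
]
print(reconcile(volumes, slo_iops_per_tb=2048))
```

In this sketch only `vol_b` is reported as changed, since `vol_a` already matches its SLO. A real tool would run a pass like this on a schedule and apply the changes through the storage system's management interface.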


With QoS and Adaptive QoS, we are able to easily provide predictable storage performance tiers upon which we can build the actual storage service catalog.


Building the Storage Service Catalog

With the pre-defined service levels and the ability to manage IO demand with Adaptive QoS, we were able to build a storage infrastructure that delivers not only capacity but also predictable performance. Leveraging clustered Data ONTAP’s ability to cluster together controllers and disks that offer various combinations of capacity and performance, we built clusters using different FAS and AFF building blocks to deliver varying tiers of performance. Then Adaptive QoS was used to enforce the performance SLO per volume depending on where that volume resides.


Moving a volume between service levels is also quite simple using clustered Data ONTAP’s vol-move feature. Adaptive QoS is smart enough to adjust the policy based on where the volume sits. Because we define a service level per aggregate, a single cluster can host multiple service levels, and we can move data between them. Addressing changes in performance requirements is easy; we just move the volume to a higher performing high availability (HA) pair using vol-move.
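One way to picture a service level per aggregate is as a lookup from the aggregate a volume lives on to the IOPS/TB objective for that tier. The sketch below assumes that model. The aggregate names and the IOPS/TB figures for the lower and upper tiers are made up for illustration (only 2048 IOPS/TB appears as an example in this post), and `limit_for` is a hypothetical helper, not part of any NetApp tool.

```python
# Hypothetical mapping of aggregate -> IOPS/TB objective.
# Names and the 512 / 8192 values are illustrative assumptions.
AGGREGATE_SLO = {
    "aggr_capacity": 512,
    "aggr_standard": 2048,
    "aggr_performance": 8192,
}


def limit_for(consumed_tb: float, aggregate: str) -> int:
    """Return the IOPS limit a volume should get on the given aggregate."""
    return round(consumed_tb * AGGREGATE_SLO[aggregate])


# A 2TB volume before and after a move to a higher performing aggregate.
print(limit_for(2.0, "aggr_standard"))     # 4096
print(limit_for(2.0, "aggr_performance"))  # 16384
```

Under this model, the volume's data and size are unchanged by the move; only the aggregate it resides on, and therefore the objective applied to it, is different.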


Data-Driven Design

Together, IO Density and QoS have revolutionized how we view our storage. They have made us much more agile. The IO Density metric forces us to think about storage holistically because we operate according to a data-driven, rather than experience-based, storage model. Instead of asking whether we have enough capacity or enough performance, we check whether we have enough of both. If we nail it, they run out at the same time.
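The idea of capacity and performance running out together can be expressed as a simple comparison of two utilization ratios. The following is a rough sketch under stated assumptions: the function `limiting_resource`, the 5% tolerance, and the sample numbers are all hypothetical and are not drawn from NetApp IT's environment.

```python
def limiting_resource(used_tb, total_tb, used_iops, total_iops, tolerance=0.05):
    """Report which resource a storage tier is on track to exhaust first.

    Returns "balanced" when capacity and performance utilization are
    within `tolerance` of each other, otherwise "capacity" or
    "performance" for whichever is more heavily used.
    """
    capacity_util = used_tb / total_tb
    performance_util = used_iops / total_iops
    if abs(capacity_util - performance_util) <= tolerance:
        return "balanced"
    return "capacity" if capacity_util > performance_util else "performance"


# 50% of capacity and 50% of IOPS consumed: both run out together.
print(limiting_resource(50, 100, 100_000, 200_000))  # balanced
# 90% of capacity but only 25% of IOPS consumed.
print(limiting_resource(90, 100, 50_000, 200_000))   # capacity
```

A tier that reports "capacity" is hosting workloads with a lower IO density than the hardware was sized for, and one that reports "performance" is hosting workloads with a higher IO density.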


The same is true with the QoS service level approach. Our storage infrastructure is much simpler to manage. Clustered Data ONTAP gives us granular control of resources at the volume level; our QoS policies now act as the controller. Best of all, this new storage approach should enable us to deliver a storage service model that is far more cost efficient than in the past while supporting application performance requirements.


The NetApp-on-NetApp blog series features advice from subject matter experts from NetApp IT who share their real-world experiences using NetApp’s industry-leading storage solutions to support business goals. Want to learn more about the program? Visit or download the new ebook, "IT Infrastructure Stability: Building a Foundation for Agility in Business Apps."

