Active IQ Unified Manager Discussions

How to monitor SSD cache to know when to add more?


System we have:


Clustered Data ONTAP 8.3

An 82TB SAS Flash Pool aggregate with a 4-SSD disk group providing 558.14GB of flash space


I would like to know how to go about monitoring the flash storage and deciding when we need to add more disks. What tools and metrics should we use?


We have OnCommand Unified Manager 6.2 with Performance Manager 1.1 implemented, but we are new to these products.





Section 5.2 in TR-4070 may help. Other parts of this report are also helpful for cache sizing.




To know when to add more SSD, the Automated Workload Analyzer (AWA) feature can also be helpful. Section 6 of the same TR mentioned earlier discusses it.


If you want to see this graphically, and lots more, then you could also go for the advanced performance monitoring solution of Harvest + Graphite + Grafana.  I wrote a blog here that has a video of what you get and the steps you need to take to make it happen.  It could be overkill just for this cache monitoring, but if you want more details across the board it could be a good fit.


Here is an example of the flash pool stats it captures (pardon the boring graph and crazy lines when there is an IO but the SE lab is pretty idle):




The AWA panel is blank because AWA hasn't been started on an aggregate.  If you start it (see the TR for details on how) you get something like this (from a different lab system that is busy but has no flash pool today):



So AWA is telling me that if I had ~400 GB of SSD, I would serve 81% of read IOPS and 42% of write IOPS from it.  We can also see that over time it is relatively steady, so my active data size is stable and not influenced by some large batch jobs or similar activity.
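
To read those two percentages as a single number, here is a rough sketch that blends them by the workload's read/write mix. The 81% and 42% figures are the AWA estimates above; the 70% read share and the function name `blended_offload` are hypothetical and only there for illustration, so substitute the mix you actually observe on the aggregate.

```python
def blended_offload(read_hit, write_hit, read_fraction):
    """Estimate the share of total IOPS served from SSD.

    read_hit      -- fraction of read IOPS AWA predicts would be served from SSD
    write_hit     -- fraction of write IOPS AWA predicts would be served from SSD
    read_fraction -- share of total IOPS that are reads (assumed, not from AWA)
    """
    for name, value in (("read_hit", read_hit),
                        ("write_hit", write_hit),
                        ("read_fraction", read_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    # Weight each hit rate by how much of the workload it applies to.
    return read_hit * read_fraction + write_hit * (1.0 - read_fraction)


# AWA estimates from the post: 81% of reads, 42% of writes at ~400 GB of SSD.
# The 70/30 read/write mix is a made-up example value.
estimate = blended_offload(read_hit=0.81, write_hit=0.42, read_fraction=0.70)
print(f"Estimated share of all IOPS served from SSD: {estimate:.1%}")
```

This is simple weighted arithmetic, not something AWA reports itself. It only helps compare a projected cache size against the mix of reads and writes you see in practice.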


Hope this helps!



Chris Madden

Storage Architect, NetApp EMEA (and author of Harvest)

Blog: It all begins with data




Thank you for showing me this. I just finished watching the video and it rocks. This is exactly what we need as a service provider. The NetApp Harvest tool has been a long time coming... away with perfstats!!!





Thanks for pointing me to this TR. I ran AWA a couple of weeks ago and it helped us. We're below the estimates, and I am working on adding more cache.

