Hopefully I don't start a religious war here, but I have a question about interoperability between the V-Series filers (a V3160 in our case) and FAST Cache on the EMC Clariion CX4 series, specifically whether it's a good idea to enable FAST Cache on LUNs which are presented to the filers. More specifically, we have, by and large, two classes of data: unstructured data (files) and virtual machines. Would enabling FAST Cache for either class of data potentially improve performance?
The short answer is that we don't yet fully understand the performance implications of a FAST Cache-enabled storage pool.
The long answer is that our testing to this point has been primarily to ensure interoperability with that feature. My job this summer is to gauge the performance benefits and develop a set of recommendations for optimal performance. We should have a Tech Report (TR) early this Fall/late Summer. I'll be sure to let you know as soon as we do.
Thanks, Dan, it's good to know you guys are working on this. My understanding of how FAST Cache works (and this is as a non-specialist) is that the SAN analyzes usage of individual blocks in a FAST Cache-enabled LUN and promotes frequently-used blocks to the flash drive cache. EMC also mentions that FAST Cache works best with i/o that is non-sequential but has medium to high locality. Given what you know about WAFL, would you guess that enabling FAST Cache could be worthwhile, or would it be a waste of resources? I ask because we have immediate performance problems, and waiting 3-6 months for a recommendation is not ideal.
The bottleneck is the back-end disks, which is leading to high DRAM cache utilization and degraded performance during usage spikes. We can throw more disks at the problem, but we would prefer to find a less wasteful solution.
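To make the promotion idea above concrete, here is a minimal toy model of "promote a block to flash once it has been accessed often enough". It is only a sketch: the class name, the threshold of 3 accesses, the capacity, and the FIFO eviction are all illustrative assumptions, not EMC's actual algorithm.

```python
from collections import Counter


class PromotionCache:
    """Toy model: promote a block to flash after `threshold` accesses.

    threshold, capacity, and FIFO eviction are illustrative assumptions.
    """

    def __init__(self, threshold=3, capacity=4):
        self.threshold = threshold
        self.capacity = capacity
        self.access_counts = Counter()
        self.promoted = []  # oldest promoted block first

    def access(self, block):
        """Return True on a flash hit, False if the access goes to disk."""
        if block in self.promoted:
            return True
        self.access_counts[block] += 1
        if self.access_counts[block] >= self.threshold:
            if len(self.promoted) >= self.capacity:
                evicted = self.promoted.pop(0)
                self.access_counts[evicted] = 0
            self.promoted.append(block)
        return False


def hit_ratio(cache, trace):
    """Fraction of accesses in `trace` served from the flash tier."""
    hits = sum(cache.access(block) for block in trace)
    return hits / len(trace)


# A purely sequential scan never revisits a block, so nothing is promoted.
sequential = hit_ratio(PromotionCache(), list(range(100)))
# A small, frequently revisited working set is promoted and then hits.
local = hit_ratio(PromotionCache(), [1, 2, 3] * 10)
print(sequential, local)
```

In this toy model the sequential trace gets no benefit while the high-locality trace does, which is consistent with EMC's guidance quoted above; whether WAFL's I/O pattern to the back-end LUNs looks more like one or the other is exactly the open question.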
Here is my experience so far. I turned on FAST Cache for a single-plex aggregate hosting an NFS-mounted VMware datastore, and the write hit ratio is hanging out at about .125, and the read hit ratio is hanging out around .65 (both estimates are very rough based on a quick eyeball of the performance chart), with variances down to .5 and up to .89 for reads and .05 and .36 for writes. What's interesting (to me, anyway), is that the ratios are almost precisely reversed for the SP Cache: read hits are down in the same area as FAST Cache write hits, while write hits are actually even higher than FAST Cache read hits. Looking at the hits per second, FAST Cache read hits are significant, at >40, while everything else pales in comparison.
Overall, though, it seems as though roughly 80% of reads are coming out of either SP Cache or FAST Cache (mostly the latter), while almost all of the writes are hitting cache at some point. I know that conventional wisdom is that WAFL doesn't really benefit from write caching (or so I have read), but given that the LUN service time is relatively minuscule (topping out at 4 ms and generally staying between 1-2 ms), while the response time is generally >10x that amount, most of the data must be coming from cache, so FAST Cache does seem to be boosting read performance. That's my conclusion, anyway, speaking as very much a novice when it comes to storage performance tuning.
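For anyone who wants to redo the "roughly 80% of reads" arithmetic with their own numbers, here is a small sketch. It assumes I don't know how the array's charts define the two ratios, so it covers both readings: either both ratios are fractions of all reads (and simply add), or the FAST Cache ratio is measured only over reads that already missed SP Cache. The function name and the flag are made up for illustration; the inputs are the eyeballed values from the post above.

```python
def combined_read_hit(sp_hit, fast_hit, fast_measured_on_sp_misses):
    """Estimate the fraction of reads served from either cache tier.

    sp_hit   -- SP Cache read hit ratio (0..1)
    fast_hit -- FAST Cache read hit ratio (0..1)
    fast_measured_on_sp_misses -- True if fast_hit is a fraction of SP Cache
        misses only; False if both ratios are fractions of all reads.
    """
    if fast_measured_on_sp_misses:
        return sp_hit + (1.0 - sp_hit) * fast_hit
    return sp_hit + fast_hit


# Rough values read off the chart: SP Cache reads ~0.125, FAST Cache reads ~0.65
print(combined_read_hit(0.125, 0.65, fast_measured_on_sp_misses=False))
print(combined_read_hit(0.125, 0.65, fast_measured_on_sp_misses=True))
```

Under the additive reading the estimate is about 0.78, and under the layered reading about 0.69, so the "roughly 80%" figure corresponds to the additive interpretation.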
The slide deck about NetApp Flash Pool differentiates between random writes (not benefiting from Flash Pool) and random overwrites (benefiting from Flash Pool). Whilst I think I understand the difference between the former and the latter, I always thought WAFL (write anywhere, hello?) could accelerate all types of writes.
Were the desktops deduped? How much space within ONTAP were they using? Just making sure, since if they were deduped, you may have as little as 20G of data, which would fit entirely in ours and the array's cache.
For perfstat, run it with these options: -F -S -f [hostname] -l [login] -t 1 -i 5
What are you using to drive the load? Is this just a subset of production desktops? Or is the load artificial?
To a point, yes, we can accelerate any write. From the start, NetApp has always written to cache (system memory) and logged the writes to NVRAM. Since the writes are logged to a battery-backed set of DIMMs, we can send the ACK back to the host without waiting on the disk spindles at all.
However, if we can't de-stage cache quickly enough (before it fills up again), then we are at the mercy of the spindles. That is where we expect Flash Pool to help. In most cases, Flash Pool will allow us to get SAS drive performance out of SATA devices. Flash Cache currently does this for reads, but writes are still constrained by the slower SATA spindles. Flash Pool will do it for both.
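The de-stage point above can be illustrated with a very rough sketch: writes are absorbed at full speed while the write buffer has room, and once it fills, accepted throughput drops to whatever the back end can drain. This is a toy model, not how ONTAP actually schedules consistency points; the function name and all the numbers are hypothetical.

```python
def simulate_write_buffer(ingest_mb_s, destage_mb_s, buffer_mb, seconds):
    """Return the write rate (MB/s) accepted in each one-second step.

    ingest_mb_s  -- rate at which hosts try to write
    destage_mb_s -- rate at which the back end drains the buffer
    buffer_mb    -- size of the write buffer
    """
    fill = 0.0
    accepted = []
    for _ in range(seconds):
        # Room left in the buffer plus whatever is drained during this second.
        free = buffer_mb - fill + destage_mb_s
        take = min(ingest_mb_s, free)
        fill = max(0.0, fill + take - destage_mb_s)
        accepted.append(take)
    return accepted


# Hypothetical numbers: hosts push 100 MB/s, back end drains 40 MB/s.
print(simulate_write_buffer(100, 40, 120, 5))
# Same load against a back end that can keep up.
print(simulate_write_buffer(100, 150, 120, 5))
```

In the first run the buffer hides the slow back end for two seconds and then the hosts see the spindle rate; in the second run the buffer never fills, which is the effect a faster de-stage target is meant to have.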