I have been using flashpool for over a year now with a production workload. Spec: 2x 3220 7-mode, 8.1.3P3, 12x 100GB SSD, 24x 450GB 10k, 24x 2TB 7.2k.
It seems good at keeping data hot that is required by VM OSs. But we get next to nothing out of the SSDs for writes. Our average write is about 80 bytes (not KB), and we do 7-8k IOPS during the day. The workload is perfect for SSDs (we used to use SSD), but less than 10% of writes go to the SSDs with flashpool.
We have latency in the 100s of milliseconds, which substantially affects production workloads, and disk utilisation of 100%.
We lodged a ticket when we first purchased the product due to the low SSD write hit rate, and we were told that writes would not go to SSD if there is capacity to go to SAS or SATA. That made perfect sense, but there is certainly no IOPS capacity left any more. During the sales process, NetApp's estimate was that it could cover 46k IOPS per controller with our setup.
Price-wise it is comparable to other vendors' pure SSD options. I have lodged a ticket with NetApp, and I will hold my judgement until they have had a proper chance to resolve the situation. But at this stage it looks like just great marketing.
As per WAFL functionality, write operations are already converted into sequential stripes before being committed to the RAID layer, and Flash Pool only optimizes overwrites that are random, not sequential, which means it does not optimize random new writes either.
So here is my confusion: which data is eligible for Flash Pool optimization? That is, how is an overwrite of a block defined, given that WAFL always writes to new blocks? When and how does an overwrite of a block happen?
IOPS are only a small portion of what you want to look at. The same benchmark software can show very few IOPS or an unreasonably high IOPS count with the right tweaking. Your main concern should be latency. I'd rather have extremely low IOPS and extremely low ms response time than super high IOPS and just mediocre ms response time.
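To put some numbers on the IOPS versus latency point: Little's Law ties the two together through the number of outstanding I/Os. Below is a minimal Python sketch of that relationship. The function name and the example queue depths and latencies are made up for illustration and are not measurements from any system in this thread.

```python
def iops_at(outstanding_ios: int, latency_ms: float) -> float:
    """Little's Law: throughput = I/Os in flight / average response time."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    # Convert milliseconds to seconds by multiplying the numerator by 1000.
    return outstanding_ios * 1000.0 / latency_ms


# Hypothetical numbers, chosen only to illustrate the trade-off.
low_queue = iops_at(outstanding_ios=1, latency_ms=0.5)
deep_queue = iops_at(outstanding_ios=64, latency_ms=20.0)

print(f"1 outstanding I/O at 0.5 ms:  {low_queue:.0f} IOPS")
print(f"64 outstanding I/Os at 20 ms: {deep_queue:.0f} IOPS")
```

The deep-queue run reports more IOPS even though each individual I/O takes 40 times longer, which is why an IOPS figure on its own says little without the latency next to it.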
The “stats” command is used to show Flash Pool performance and diagnostics counters in Data ONTAP operating in 7-Mode. The “stats” command is used in the node CLI for Data ONTAP operating in Cluster-Mode. A stats preset “stats show -p hybrid_aggr” shows interactive statistics from the command line.
Can you tell us more about the workload you're running? How long are you running sqlio?
I knew about the "stats" command and I am using it. According to the "stats show -p hybrid_aggr" output, there is some activity on the SSDs for write and read operations. But my main concern is as follows.
1- I first created a normal aggregate with 20 disks, tested it with sqlio using the given parameters, and collected the output.
2- Then I converted the aggregate to a hybrid aggregate and added the Flash Pool with default settings for write and read operations, as described in TR-4070. Then I ran the same sqlio test with the same parameters and noticed that there was no IOPS gain.
That confuses me.
Note: I have attached the sqlio parameters below.
A couple of things. Flash Pool accelerates only random writes. Also, your random write block size is too large for the algorithm to pick up. Change your block size to 16k or less and retest. Run the test for at least 15 minutes. The IOPS should grow the longer the test runs.
Use the command "stats show -p hybrid_aggr -i 1" to show the reads and/or writes replaced...
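The advice above boils down to three checks on a test plan. Here is a small Python sketch of them. The function and parameter names are hypothetical, and the thresholds (16 KB, 15 minutes) are just the rules of thumb quoted in this reply, not documented limits.

```python
from typing import List

MAX_BLOCK_KB = 16          # rule of thumb from this thread, not a documented limit
MIN_DURATION_S = 15 * 60   # rule of thumb from this thread, not a documented limit


def flash_pool_test_warnings(block_kb: int, is_random: bool, duration_s: int) -> List[str]:
    """Return a list of reasons why a test may show no Flash Pool write benefit."""
    warnings = []
    if not is_random:
        warnings.append("workload is sequential; only random writes are accelerated")
    if block_kb > MAX_BLOCK_KB:
        warnings.append(f"block size {block_kb} KB is larger than {MAX_BLOCK_KB} KB")
    if duration_s < MIN_DURATION_S:
        warnings.append(f"run of {duration_s} s is shorter than {MIN_DURATION_S} s")
    return warnings


# Hypothetical test plan, only for illustration.
for warning in flash_pool_test_warnings(block_kb=64, is_random=True, duration_s=60):
    print("WARNING:", warning)
```

A plan such as 8 KB random writes for 15 minutes returns an empty list, so nothing is printed for it.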
Exactly. I think abnormal blocks (due to size) are excluded from Flash Pool acceleration. Random writes/reads often use an 8K block size. It would be useful to show the CPU usage during the workloads. Flash Pool should use more CPU, which could explain why sequential writes/reads are worse with Flash Pool... not sure.
Ousturali, although it's an old post: pay attention to the Flash Pool policies, and be careful with SQLIO and the system cache (use -BN to disable the system cache; I mention this because the random write numbers on SATA look high to me).
I noticed these tests are only running for 60 seconds. If a steady amount of random I/O is occurring on the aggregate, the Flash Pool should warm in a matter of 10–15 minutes as a general estimate.
Remember that writes should be acknowledged at memory speeds if your system is properly architected. Part of that architecture is making sure you have sufficient drives in the subsystem to keep the disk drives from becoming the bottleneck for the system. Flash Pool is not about making writes get acknowledged at faster than memory speeds (there isn’t such a thing) but rather about the number of drives needed in the storage system to satisfy the workload requirements (keeping the disks from becoming the bottleneck). With Flash Pool you can use fewer spindles to address those random workloads that generally are the most susceptible to becoming disk bound.
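The "fewer spindles" argument can be made concrete with some rough arithmetic. The sketch below is a back-of-the-envelope estimate only. The per-drive IOPS figure and the fraction of I/O served from SSD are assumed inputs that you would have to measure on your own system, the function name is hypothetical, and the calculation ignores RAID overhead and WAFL's write coalescing.

```python
import math


def drives_needed(target_iops: float, iops_per_drive: float,
                  ssd_served_fraction: float = 0.0) -> int:
    """Estimate how many spinning drives are needed for the I/O that reaches disk."""
    if not 0.0 <= ssd_served_fraction <= 1.0:
        raise ValueError("ssd_served_fraction must be between 0 and 1")
    if iops_per_drive <= 0:
        raise ValueError("iops_per_drive must be positive")
    # Only the I/O not served by SSD has to be handled by the spinning drives.
    disk_iops = target_iops * (1.0 - ssd_served_fraction)
    return math.ceil(disk_iops / iops_per_drive)


# Assumed inputs for illustration: 8000 IOPS target, 140 IOPS per 10k drive.
print(drives_needed(8000, 140))                           # no SSD help
print(drives_needed(8000, 140, ssd_served_fraction=0.7))  # 70% served by SSD
```

With these assumed inputs the estimate drops from 58 drives to 18. The real numbers depend entirely on how much of the workload the SSDs actually serve.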
You can tune the specific volume you're worried about with the "priority" command which requires advanced privileges (priv set advanced) and see if that helps, but I wouldn't do this unless you have a good reason to do so.
But as far as I can tell, Flash Pool is doing its job.
Hope this helps, let me know if you have any other questions.