2015-12-15 01:31 PM
I'm in the process of creating a flash pool on our existing NetApp infrastructure. The original configuration was a FAS3220 HA pair with two DS4246 shelves full of SATA drives. It runs pure VMware on VMFS LUNs; needless to say, disk latency is an issue with all of these SATA disks.
I purchased a new DS2246 with 20 SAS and 4 SSD drives. My plan at this point is to add the 4 SSDs to one of the existing aggregates, create the flash pool, and move any data requiring higher IOPS to a volume within this aggregate. As a test, before attempting to add the SSDs to a production aggregate, I have created a test aggregate (aggr1_SAS) with my 20 SAS drives and a RAID group size of 12 (see below).
toaster> aggr status aggr1_SAS
Aggr State Status Options
aggr1_SAS online raid_dp, aggr raidsize=12, hybrid_enabled=on
Plex /aggr1_SAS/plex0: online, normal, active
RAID group /aggr1_SAS/plex0/rg0: normal, block checksums
RAID group /aggr1_SAS/plex0/rg1: normal, block checksums
toaster> sysconfig -V
volume aggr1_SAS (2 RAID groups):
group 1: 4 disks
group 0: 16 disks
volume aggr0 (2 RAID groups):
group 1: 11 disks
group 0: 11 disks
For starters, I'm not sure why, with the raidsize set to 12, it shows 16 disks in group 0. Shouldn't that be 12 and 8, given how Data ONTAP fills up RAID groups by default? My next concern, and probably my biggest issue at the moment, is this error I receive when I attempt to add the 4 SSD drives to the new aggregate.
toaster> aggr add aggr1_SAS -T SSD 4
Note: preparing to add 2 data disks and 2 parity disks.
Continue? ([y]es, [n]o, or [p]review RAID layout) y
aggr add: Cannot perform the operation: the total size of SSD disks in Flash Pools would exceed the system limit on this node. Limit: 600.00 GB. Attempted size: 744.73 GB.
I have read through NetApp flash pool implementation documents and haven't seen anywhere that there is a 600GB limit on the size of a flash pool. Can anyone help me figure this out? Thanks in advance.
2015-12-15 06:30 PM
The Hardware Universe is what shows the maximum Flash Cache and Flash Pool sizes. Here is a screenshot for the FAS3220 with cDOT 8.1.4:
Upgrading to cDOT 8.3+ increases the value to 6.25TiB and divides the SSDs into disk pools, which means you might be able to get more use out of the SSDs. I wrote an article on it here if you're interested.
Hope that helps.
2015-12-16 06:51 AM
So then at 8.1.4P6, with 400GB SSDs, that really leaves me unable to create a flash pool at all, since the required RAID group is going to put me over this 600GB limit almost every time.
I'm looking at the Hardware Universe to see which version of 7-mode will allow me to accomplish what I need, but I can't find the page you posted that shows the system cache limits. Where is that? Will a version like 8.2.4 7-mode allow me to do what I am trying to get done? We plan on going to cDOT 8.3 next year, but because going from 7-mode to cDOT is a disruptive upgrade, I'm going to need a partner to assist with that project. It won't be happening right away, and I need to get this flash pool set up sooner rather than later.
Do I have any options that are still within the capabilities of 7-mode?
This is frustrating. I wish our NetApp SE had thought about any of this before just selling us a disk shelf with the intent of increasing IOPS, and now I can't even use it. Of course I have only myself to blame for not looking into it first, but I guess I thought our assigned NetApp SE would have done his due diligence to make sure that what he was selling us would at least be usable in our current environment.
2015-12-16 07:04 AM
You can create the Flash Pool RAID group using only 3 of the disks: 2 parity + 1 data, with the fourth left unused as a spare.
The screenshot I took came from the Hardware Universe. At the top you'll want to select the platforms dropdown, then "FAS/V-Series". Down below, check the box for the version of ONTAP and the model of controller you want to see and click "Show Results". In the new screen, there will be a number of links in the column for the system/ONTAP version combo, click "System Cache Limits", which will show the maximum supported Flash Cache and Flash Pool for your hardware + ONTAP version.
My apologies, I didn't realize you were on 7-mode, here is the value for Data ONTAP 8.2.4 7-mode:
So, updating to 8.2.4 will allow you to create a Flash Pool aggregate which uses all four of the SSDs in your shelf. Keep in mind, though, that if you already have Flash Cache installed, that still might not be true.
With regard to transition, have you seen/heard about Copy Free Transition? We talked a bit about it on the podcast, it may be worth investigating for you as it is a path to cDOT that is, potentially, minimally disruptive.
2015-12-16 07:18 AM - edited 2015-12-16 07:22 AM
Ok I see it now. Thanks; I had gotten that far but missed the link for system cache limits.
So according to this, 8.2.4 will allow up to 1.56 TB of flash pool (and to answer your last statement: no, we do not have any Flash Cache installed). With that being said, I could essentially take my 4 SSDs, add them to the aggregate in RAID-4 without a spare, and end up with 3 disks for caching and 1 for parity, correct? That would tolerate a single drive failure, which shouldn't be a problem since this HA pair is local and I could swap the failed drive quickly.
Yes I have seen a little bit about copy free transition but not enough to understand what exactly is done to make it copy free and therefore non-disruptive. I'll look that up again and read more into it.
2015-12-16 07:27 AM
You are correct, Data ONTAP 8.2.4 7-mode would allow you to use the disks in that configuration.
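To make the arithmetic behind the layouts discussed so far concrete, here is a minimal sketch. It assumes the limit is checked against data disks only, which is inferred from the error message (744.73 GB attempted for "2 data disks and 2 parity disks"), and it treats the 1.56 TB figure as roughly 1560 GB. The function names are made up for illustration and are not part of any NetApp tool.

```python
# Usable size per SSD, inferred from the error message:
# 744.73 GB was the attempted size for 2 data disks.
SSD_USABLE_GB = 744.73 / 2


def cache_size_gb(data_disks, usable_gb=SSD_USABLE_GB):
    """Capacity counted toward the Flash Pool limit (assumption: data disks only)."""
    return data_disks * usable_gb


def fits(data_disks, limit_gb, usable_gb=SSD_USABLE_GB):
    """True if the SSD data disks stay within the per-node limit."""
    return cache_size_gb(data_disks, usable_gb) <= limit_gb


# (description, SSD data disks, limit in GB); 1560 GB approximates 1.56 TB.
layouts = [
    ("8.1.4, RAID-DP, 2 data + 2 parity", 2, 600.0),
    ("8.1.4, RAID-DP, 1 data + 2 parity + 1 spare", 1, 600.0),
    ("8.2.4, RAID-4, 3 data + 1 parity", 3, 1560.0),
]

for description, data_disks, limit_gb in layouts:
    size = cache_size_gb(data_disks)
    verdict = "fits" if fits(data_disks, limit_gb) else "exceeds limit"
    print(f"{description}: {size:.2f} GB of {limit_gb:.2f} GB -> {verdict}")
```

Under those assumptions, only the first layout goes over the limit, which matches the error you received.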
I do want to warn you of the danger of RAID-4...having 2 disks fail, or a single disk fail and a read error during rebuild, will lead to data loss. If two disks fail in the SSD RAID group it would cause the entire aggregate to lose data. Even with 4-hour parts replacement, it's still sometimes a bit scary.
TR-4070 has some great recommendations on sizing the Flash Pool in chapter 6. Assuming you update to Data ONTAP 8.2.1+ you will be able to use the Automated Workload Analysis tool to gauge how much Flash Pool capacity is appropriate...you may discover that you only need 2 drives worth of capacity to drastically improve performance, in which case you would still be able to use RAID-DP for the SSD RAID group.
2015-12-16 07:31 AM
Good point. I could've sworn I read that in the documentation. So, just to reiterate what you said and what I seem to recall reading: if the SSD RAID group in a flash pool fails due to disk failures, it will in fact fail the entire aggregate that it is a part of?
Thanks for all of your help, Andrew. You've given me some great information.
2015-12-16 07:33 AM
One last question, back to my original post. Do you know why when I set the raid group size to 12, once I added the disks it added 16 disks to RG0 and 4 to RG1 instead of something like 12 in RG0 and 8 in RG1?
2015-12-16 07:47 AM
If the Flash Pool RAID group fails, it will cause the entire aggregate to fail and be taken offline.
With regard to the RAID group sizes, I'm honestly not sure. The only thing that comes to mind is that if the 16 disk RG already existed, then it could have filled it before creating a new 12 disk RG. I don't have a 7-mode system to test with at the moment, so please don't quote me on that... You might try checking "aggr status -r" and see what it reports.
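For comparison while you look at that output, here is a rough sketch of the fill-in-order model your question assumes: each RAID group is filled up to raidsize before the next one is started. This is only a simplified model for illustration, not how Data ONTAP is documented to allocate disks, and `fill_raid_groups` is a hypothetical helper.

```python
def fill_raid_groups(disk_count, raidsize):
    """Split disks into groups, filling each to raidsize before starting the next."""
    if raidsize <= 0:
        raise ValueError("raidsize must be positive")
    groups = []
    remaining = disk_count
    while remaining > 0:
        size = min(raidsize, remaining)
        groups.append(size)
        remaining -= size
    return groups


# The layout expected in the question: 20 disks with raidsize 12.
print(fill_raid_groups(20, 12))  # [12, 8]

# The same model with raidsize 16 gives the layout sysconfig -V reported.
print(fill_raid_groups(20, 16))  # [16, 4]
```

The 16 and 4 split is what this model gives for a raidsize of 16 rather than 12. Whether a raidsize of 16 was in effect when the disks were added is not something I can confirm from here.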
2015-12-16 08:04 AM
It was a brand new aggregate I created yesterday as a test, so the RAID groups were created by ONTAP when I added the 20 SAS disks to the aggregate. That was how it automatically laid out the groups even though I specified 12 as the raidsize, which is why it seemed so weird that it ended up with 16 in rg0 and 4 in rg1.
I guess I'll look into that more after I get these upgrades completed, so I can get my flash pool running and alleviate the IOPS issues we are dealing with.
Thanks again for your help.