New to NetApp - FAS2020 and XenServer questions

brendanheading · ‎2010-03-28

Hi guys,

I work for a small business (~100 employees) and we're stepping into the heady world of SANs. We've been doing some VM consolidation for a while, and hosting this from standard servers with locally attached disks and software iSCSI terminators/NFS. We've been burnt a bit on investments on basic RAID arrays that didn't work out well (the vendor shall remain nameless!) so I think I can make the case for getting a platform that's a bit more stable.

We are considering purchasing a FAS2020. I'm attracted to this product in particular because it supports iSCSI and CIFS/NFS in the same box. The primary function will be to serve XenServer and Hyper-V virtual machines, and I'm also hoping that we'll be able to boot the Xen hypervisor (and our other non-virtualized servers) over iSCSI, so our servers will be diskless. I'm also hoping that we can serve standard NFS and CIFS volumes from the same FAS2020. In other words, I want to try to consolidate all my storage in our small "datacentre" into one place. At the moment there are about ten VMs and around about 20-25 servers which we may be able to consolidate. We've currently got about 3TB of data being served out over NFS/CIFS, but the server workloads are very light indeed - really just documents, images, etc.

We don't really need XenServer Essentials because our VMs tend to be long-lived (we're not regularly creating/copying/deleting the VMs), so I'm happy enough to create a couple of standard iSCSI or NFS SRs, and create volumes manually within that in the way we do right now. We have an add-on script that snapshots running VMs and copies the backup off to another server where it is streamed to tape.

It's important to me that the investment will stand for as long as possible. With the "other vendor" we were sold what turned out to be an end of line device that used PATA disks. High density PATA disks are getting hard to find. The array obviously has no cool stuff such as dedup/snapshots/etc.

I've been busy researching the FAS2020 capabilities and have a few questions. I'd be delighted if I could obtain clarity on these matters as these will help me put a sound business case to my management.

(before we get into that, are there any reputable resellers of this equipment in the UK ? I've emailed several suppliers who appear on NetApp's UK reseller list, with details of my requirements, storage needs etc, but not one of them has replied or attempted to call me! That's kind of surprising in a recession .. if you are a reputable seller and you're reading this, please reply with your details and I'll be in touch)

1. What's the difference between the FAS2020 and the FAS2020A ?

2. Is it true that each controller requires three disks for its own use and, therefore, in an active-active configuration with dual controllers, six out of the total 12 disks will not be available for storage ? If so, what is the reason for this ?

3. If (2) is true, is it therefore possible to use small/cheaper disks for the three required for each controller, and use larger/higher performance disks for the array itself ?

4. Is it true that each controller must have at least one Aggregate assigned to it, and therefore in an active-active configuration I must create two Aggregates ?

(the alleged limitations with dual controllers are leading me to believe that it would be better for me to get a pair of FAS2020s and have them mirrored, failing over in the event of a controller failure - which is something we would probably want to do in the event of controller failures anyway)

5. Are hot spares global across the whole appliance including any expansion shelves, or are they per-Aggregate ?

6. I assume that I cannot use commodity disks in a NetApp caddy, and that I must use NetApp-branded disks ?

7. It looks like the FAS2020 is approaching EOL and that it is being heavily discounted. Can I expect that I will be able to obtain new disks and shelves for this product well into the future ? I'd expect to get at least five years of service life from it.

8. If I create a Snapshot, is the snapshot allocated out of the storage assigned to the FlexVol ?

9. If I understand correctly, my configuration steps would be :

(a) Create an Aggregate, which is NetApp's term for a RAID volume. If I understand correctly, it looks like RAID-DP is the way to go. A disk can obviously be present in only one Aggregate at any one time.

(b) On top of the Aggregate I can create as many FlexVols as I like, up to a reasonable limit, which can be of any size, and which can be overcommitted. I must keep the FlexVol to a maximum of 1TB to take advantage of de-duplication.

(c) The FlexVol can then be served out as either an iSCSI target or an NFS volume. I can then configure XenServer to use this as an SR.

Have I got this right ?
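To make the layering in steps (a) to (c) concrete, here is a toy model of it, not NetApp tooling. The class names, volume names and the 4000GB usable figure are all made up for illustration, and the 1TB dedup ceiling is just the figure quoted in (b), which may differ by model.

```python
from dataclasses import dataclass, field

# Assumption: the 1TB dedup ceiling quoted in step (b); it may differ by model.
DEDUP_LIMIT_GB = 1000


@dataclass
class FlexVol:
    name: str
    size_gb: int
    protocol: str  # "iscsi" or "nfs"

    def dedup_eligible(self) -> bool:
        # A volume larger than the ceiling cannot use dedup in this model.
        return self.size_gb <= DEDUP_LIMIT_GB


@dataclass
class Aggregate:
    name: str
    usable_gb: int
    volumes: list = field(default_factory=list)

    def add(self, vol: FlexVol) -> None:
        # Thin provisioning: nothing stops the volume sizes exceeding usable_gb.
        self.volumes.append(vol)

    def overcommit_ratio(self) -> float:
        # Total provisioned size divided by what the aggregate can really hold.
        return sum(v.size_gb for v in self.volumes) / self.usable_gb


# Hypothetical numbers, purely for illustration.
aggr = Aggregate("aggr1", usable_gb=4000)
aggr.add(FlexVol("xen_sr", 1000, "nfs"))
aggr.add(FlexVol("hyperv_luns", 2000, "iscsi"))
aggr.add(FlexVol("home_dirs", 3000, "nfs"))

ratio = aggr.overcommit_ratio()
print(f"overcommit ratio: {ratio}")
for v in aggr.volumes:
    print(v.name, "dedup ok" if v.dedup_eligible() else "too big for dedup")
```

A ratio above 1.0 means the FlexVols are overcommitted against the Aggregate, which is the situation described in (b).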

10. What's the deal with OS upgrades; is this free or is there an extra charge for newer functionality ?

adamfox · ‎2010-03-28

Wow. I'll do my best to answer these. NetApp does things a bit differently so these questions are quite common and usually handled by a Systems Engineer. Since you've had a tough time getting a reseller, I'll help with your questions as I'm an SE in the US.

1. The difference between those models is that the A model is an HA pair in the same chassis. This is a very typical configuration for production environments because high availability isn't really optional anymore.

2. It's not exactly true. What's true is that each controller must have a root volume. However, that root volume can live in an aggregate that is shared with other data volumes.

3. Hopefully #2 answered this. Keep in mind that on a per GB basis, you typically get either small or cheap, not both. The cheaper disks tend to be SATA, which are large, while the more expensive disks tend to be FC/SAS, which are smaller (but faster).

4. Yes, according to #2. As to whether it makes sense to get 2 non-HA systems and mirror them, I'm not sure you will end up ahead. I suppose it depends on what problem you are trying to solve. A normal HA solution will require 1 aggregate per head, but so would 2 non-HA controllers. Plus the latter would require 2x the storage in order to replicate, as well as a license for the replication software. So if cost and efficiency are the goals, a standard HA config wins hands down: the latter requires more storage and software, and your storage utilisation (before any thin provisioning and dedup) starts at 50% and goes down from there because of data replication. You'll get some, perhaps much, of that back with thin provisioning and dedup, but you'd get those savings on the HA solution as well, so it's fair to take them out when comparing. Also, with the HA pair, failover is built-in and automatic. With the 2nd config I expect there to be some manual intervention to get the failover.

5. Hot spares are global across the appliance, including expansion shelves, assuming the same speed disks. For example, you wouldn't have a 7200 rpm SATA disk fill in for a 15K rpm FC disk. However, expect to have one spare per controller.

6. Yes, this is true. Supporting any disk out there would be a support nightmare. Trust me on this.

7. We have no word as yet on when the 2020 will go EOL. Availability of new drives for EOL hardware usually depends on the suppliers of those drives, and we strive to offer them as long as we can. Replacement disks under your support contract will continue for 5 years after the EOA announcement.

8. Yes. Snapshot blocks are not physically moved from the volume. This is a good thing because it is a big reason why NTAP snapshots do not adversely affect performance. Moving blocks around takes lots of extra iops. ONTAP supports up to 255 snapshots per volume and you can use all of them as long as you have the space without hurting your performance. This is not the case for many storage systems out there.

9. Yes, that is a good outline of the provisioning process. The dedup limit goes up as you go up in models (and sometimes in OS rev).

10. Upgrades are included in the SW support contract. So along with access to tech support, knowledge bases, etc., you get a software subscription that entitles you to download new versions as well as patches to existing releases.

Hope this helps. If you aren't getting much help from the resellers directly, I'd suggest that you try a local NetApp sales office as they should be able to refer you.

-- Adam Fox

brendanheading · ‎2010-03-28

Adam,

Thank you very much for that very helpful reply, much appreciated. It clears a lot up. I'd agree that the A variant is probably the one to go for since HA is a requirement for us, especially since we're doing a lot of consolidation. I also like your thinking about using HA rather than mirroring two arrays. Having two arrays does allow us to place them both in different sites. We might yet consider that as a future upgrade.

I assume when you say each controller has to have a root volume, it is not possible (or best) to put both root volumes on the same aggregate ? Your answer to #4 implies that an aggregate must be assigned to each controller, so there would have to be a minimum of two aggregates. That's fine by me, slightly less flexible than I'd like but not insurmountable.

My questions about disk allocation to controllers come from a number of postings over on the XenServer forums that I found while googling on this subject. Specifically, this one :

http://forums.citrix.com/thread.jspa?threadID=249451&start=15&tstart=0

Quote : "Also we were fitted with dual controllers with 6 assigned to each controller. 3 drives went to the OS for controller A, 3 more drives went to the OS for conroller B, and out of the remaining 6 we had 2 set aside for hot spares. Out of 12TB of drives we ended up with something crazy like 1TB of acutal usable data."

I'm sure the guy (who repeats this on several different threads) does not intend to be malicious, but it sounds like he is badly misinformed.

One of the dealers on the internet is quoting a good price for a refurbished expansion shelf with 12x 500GB drives installed. So, let's say I go for a FAS2020A and a shelf giving a total of 24 500GB SATA disks. What would be the optimal configuration with a good degree of reliability ? I hope you're going to say I should have two spares and two 11-spindle RAID-DP aggregates. If so, assuming two parity disks per aggregate, that would give me 18 spindles' worth of storage = 9TB.
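For anyone checking the sums, here is that arithmetic as a rough sketch. It assumes one spare per controller, one RAID-DP aggregate per controller and two parity disks per aggregate, and it ignores formatting and other overheads; the function and parameter names are made up for illustration.

```python
def raid_dp_layout(total_disks, disk_gb, controllers=2,
                   spares_per_controller=1, parity_per_aggregate=2):
    """Raw (pre-formatting) capacity for one RAID-DP aggregate per controller."""
    spares = controllers * spares_per_controller
    # The remaining disks are split evenly across the aggregates.
    disks_per_aggregate = (total_disks - spares) // controllers
    data_disks = controllers * (disks_per_aggregate - parity_per_aggregate)
    return {
        "spares": spares,
        "disks_per_aggregate": disks_per_aggregate,
        "data_disks": data_disks,
        "raw_tb": data_disks * disk_gb / 1000,  # decimal TB, as disks are sold
    }


# FAS2020A plus one shelf: 24 x 500GB SATA disks.
layout = raid_dp_layout(total_disks=24, disk_gb=500)
print(layout)
```

With 24 disks this gives two spares, two 11-disk aggregates and 18 data spindles, which is where the 9TB raw figure comes from.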

adamfox · ‎2010-03-28

No problem.

There is certainly no problem in getting a 2nd location later. One of the nice parts of the NetApp unified architecture is that you can replicate between any models. So you could buy a bigger unit later and redeploy your older unit elsewhere.

Each controller must have at least one aggregate but that aggregate can contain both a root and various non-root volumes. It is not necessary to dedicate an aggregate for root volumes. So you may have a single aggr on each controller of a few TBs. Then you can set aside a few GBs for a root volume, then create as many data volumes as you like.

You are close on your allocations. I would recommend 1 spare per controller. 2 parity per controller and the rest is data. Now your usable space will be less than 1TB per pair of data drives due to things like formatting and other overheads. An SE can help you with those calculations. Of course you'll get much of that back in other efficiencies.

As for the config of the guy in the Citrix forum, I can't speak to it specifically; however, it would appear from that sound bite that it could have been done with a much higher yield.

Follow-ups welcome.

-- Adam Fox

brendanheading · ‎2010-03-28

Adam,

Much appreciated once again, thank you.

Understood on the storage allocation. I was trying to get an idea of the raw storage (before formatting but after RAID/spares/etc overheads) to compare it with the pair of arrays we've already got when I'm making my case. Those arrays are configured as pairs of mirrored disks used as JBOD (it doesn't do stacked RAID in hardware!), so the yield is somewhat lower; 4.5TB per array (before formatting). So what I'm figuring is that a setup with 24 500GB disks will give me roughly the same amount of storage as our current 28-disk setup has, with better performance due to using RAID-DP across lots of spindles. And that's before the benefits associated with dedupe and over-allocation kick in.

I will try to get in touch with the local NetApp office here to see if I can get a reseller to talk to me.

regards

Brendan

jonathanp · ‎2010-03-29

Hi Brendan

I stumbled across your post this afternoon and wanted to post a brief reply with my contact details, as I work for a NetApp partner in the UK and would be keen to discuss things with you in greater detail.

In the first instance our company website is www.cetus-solutions.com and our head office number is 0161 848 4315. I will drop you a private message with my email address and mobile number so you can contact me directly when you have an opportunity to do so.

Kind regards

Jonathan