
NFS and iSCSI network design

jyarborough

Let me preface this by saying I know it is primarily a VMware question (and it has been posted there also) but I really need help with the NetApp side as well.

I am having a really hard time architecting a solution for an environment I am working on.  I have found documentation for either NFS or iSCSI on its own, but cannot figure out the best way to run both at the same time with the hardware we have.  We are dealing with the following:

  • 5 x ESXi 4.1 hosts
    • 4 x 1Gb physical NICs per host for VM traffic (not relevant here)
    • 4 x 1Gb physical NICs per host dedicated to storage
  • 2 x Catalyst 3750G-E
    • In a stack; there are not enough physical ports for both VM traffic and storage, so the stack is used only for VM traffic
  • 2 x Catalyst 2960G
    • Separate switches dedicated to storage traffic
    • No connectivity between them; we could possibly bond 4 x 1Gb interfaces together as an interconnect if it would help
  • 2 x NetApp FAS2050s
    • One is a clustered unit with two heads (a total of 4 x 1Gb physical NICs, 2 per head)
    • The other is a single head (a total of 2 x 1Gb physical NICs)
  • 1 x StoreVault S500
    • 2 x 1Gb physical NICs

The storage devices are dedicated to VMware, so all that will be on them are VMFS datastores (over iSCSI) and NFS datastores.

We need to use both iSCSI and NFS on the above devices, and I cannot wrap my head around how the vifs and vSwitches are going to look.  One note is that we are working with vSphere Enterprise Plus, so we do have vNetwork Distributed Switches and the ability to use "Route based on physical NIC load", which sounds really intriguing for the NFS traffic.  For iSCSI, I have always been told there should be a 1-to-1 mapping between VMkernel port and physical NIC, and that each path to the SAN should be on a separate subnet, to ensure that the traffic is sent/received on the expected interfaces and to allow for proper multipathing.  From what I can tell, multipathing is not possible with NFS, and it is recommended to team all the physical NICs under a single VMkernel port; the NFS traffic then balances across datastores (different target IPs).
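
(For reference, this is roughly how I understand the 1-to-1 binding is done from the ESXi 4.1 command line. The portgroup names, vmk/vmhba numbers, and addresses below are just placeholders, and this assumes standard vSwitch portgroups rather than a vDS.)

    # One VMkernel port per iSCSI path; each portgroup (iSCSI-1..iSCSI-4) already
    # exists with a single active vmnic, and each path sits on its own subnet.
    esxcfg-vmknic -a -i 10.0.1.11 -n 255.255.255.0 iSCSI-1
    esxcfg-vmknic -a -i 10.0.2.11 -n 255.255.255.0 iSCSI-2
    esxcfg-vmknic -a -i 10.0.3.11 -n 255.255.255.0 iSCSI-3
    esxcfg-vmknic -a -i 10.0.4.11 -n 255.255.255.0 iSCSI-4

    # Bind each VMkernel port to the software iSCSI adapter (the vmhba number varies per host)
    esxcli swiscsi nic add -n vmk1 -d vmhba33
    esxcli swiscsi nic add -n vmk2 -d vmhba33
    esxcli swiscsi nic add -n vmk3 -d vmhba33
    esxcli swiscsi nic add -n vmk4 -d vmhba33

    # Verify the bindings
    esxcli swiscsi nic list -d vmhba33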

Anyway, I guess a couple of the questions I am struggling with are:

  1. Do I use vNetwork Distributed Switches for storage traffic, or do I need to stick with traditional vSwitches for some reason?
  2. Given that we have 4 paths set up in the iSCSI configuration, how can NFS compete with that unless we have several (at least 4) NFS exports?  Maybe I do not understand iSCSI multipathing entirely.
  3. How would the networking configuration look as far as vSwitches, etc.?  I envision it as a single vSwitch with "Route based on physical NIC load" as the teaming policy: 4 x VMkernel ports, each with a single active physical NIC and its own subnet, for iSCSI, and 1 x VMkernel port with all 4 physical NICs active using the inherited "Route based on physical NIC load" for NFS.  On the FAS/StoreVault side I am still confused about using VIFs with aliases vs. individual interfaces.  It seems like individual interfaces make sense for iSCSI, while LACP or multimode VIFs make sense for NFS.
  4. Where does LACP come into play?  I know the 2960Gs do not do cross-switch LACP, so do I do a 2-port LACP bundle from each switch back to each host?
  5. Am I trying to make a bowling ball fit through a garden hose?  Do I need to get the storage traffic onto the stack and do cross-stack LACP?  Do I need to break the NFS and iSCSI traffic up into two separate vSwitches with different physical NICs?

I am very much open to suggestions and to any advice or articles that will help clarify things.  I've combed through numerous articles, only to find more questions that needed answers.  They all seem to cover one protocol or the other, but never both in the same environment.

Any help would be greatly appreciated!

Thanks!

8 REPLIES

jasonburrell

A couple of considerations:

I would suggest stacking the 3750s and using the 2960s for VM traffic; this will make your life easier when it comes to load balancing and to riding out a switch failure.  As for your other points, my comments are inline.

  • Do I use vNetwork Distributed Switches for storage traffic or do I need to stick with traditional switches for some reason?
  • A vDS could reduce your complexity because you would not have to worry about creating etherchannels, and you can standardize across your environment using a single management interface.  Your vCenter would need to be set up in advance of this, so you might want to keep your vCenter server off of the vDS if you plan on making it virtual.
  • Given that we have 4 paths set up in the iSCSI configuration, how can NFS compete with that unless we have several (at least 4) NFS exports?  Maybe I do not understand iSCSI multipathing entirely.
  • I think you should etherchannel the NICs on each of your storage controllers (this requires that you stack the switches), making sure one port is connected to switch1 and the other to switch2; this will prevent an outage in case of a switch failure.  Then for the FAS devices you would create one multimode vif per controller with one alias, for a total of 2 IPs, so the IP hash will use both NICs.  Then you would spread the NFS exports evenly across the IPs (there is a rough sketch of this at the end of this reply).  I'm not 100% sure about iSCSI, but I believe you would want to create individual interfaces and then use multipathing for redundancy (check out link 1 below on how to do this for the FAS devices).
  • How would the networking configuration look as far as vSwitch, etc.?  I envision it as a single vSwitch with "Route based on physical NIC load" as the teaming.  4 x VMkernel ports with a single physical NIC active in each and a different subnet in each for iSCSI.  1 x VMkernel port with all 4 physical NICs active using the inherited "Route based on physical NIC load".  On the FAS/StoreVault side I am still confused about using VIFs with aliases vs. individual interfaces.  Seems like individual interfaces for iSCSI make sense while LACP or multimode VIFs make sense for NFS.
  • I would create one vDS with all 4 NICs attached and 2 VMkernel ports, one for NFS and one for iSCSI, both allowed to use all NICs.
  • Where does LACP come into play?  I know the 2960Gs do not do cross-switch LACP, so do I do a 2-port LACP from each switch back to each host?
  • This is why I recommend stacking the 3750s; that way you can run a port channel across the two switch members (the switch-side sketch at the end of this reply shows it).  The only other option would be to etherchannel each FAS controller to a single switch and then treat a switch failure as a controller failure, because if switch1 dies, so does all access to the datastores behind it.  I'm not sure how to set up the FAS to treat a network outage as a controller failure; you might want to look into that.
  • Am I trying to make a bowling ball fit through a garden hose?  Do I need to get the storage traffic onto the stack and do cross-stack LACP?  Do I need to break up the NFS and iSCSI traffic into two separate vSwitches with different physical NICs?

As I said earlier, you should stack the switches.  I don't think it is worth breaking the iSCSI traffic out onto different physical NICs; I think VLANs will work fine.  You will probably be fine with 4 NICs for storage traffic.  I would suggest monitoring the VM traffic and seeing whether you really need 4 NICs for it; you would be surprised how little VM traffic there is on a normal network.  People assume they need 4 NICs for 30 VMs, but most of the time they do not.
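
To make the vif/alias idea a bit more concrete, here is a rough 7-mode sketch for one of the FAS2050 controllers. The interface names, IPs, VLAN, and volume names are only placeholders, so adjust for your environment, and put the ifconfig lines in /etc/rc so they persist:

    # Static multimode (etherchannel) vif across both onboard NICs, IP-hash load balancing
    vif create multi vif0 -b ip e0a e0b

    # Primary address plus one alias so the IP hash can spread load over both links;
    # "partner" only applies to the dual-head unit (cluster failover)
    ifconfig vif0 192.168.10.10 netmask 255.255.255.0 partner vif0
    ifconfig vif0 alias 192.168.10.11 netmask 255.255.255.0

    # Export the NFS volumes, then mount half of the datastores against .10 and half
    # against .11 from the ESXi side so both addresses carry traffic
    exportfs -p rw=192.168.10.0/24,root=192.168.10.0/24 /vol/nfs_ds1
    exportfs -p rw=192.168.10.0/24,root=192.168.10.0/24 /vol/nfs_ds2

And the matching channel on the stacked 3750s, one member port in each stack switch. Cross-stack LACP support depends on the IOS release, so a static channel ("mode on") paired with a "multi" vif is the safe combination; the VLAN and port numbers are again just examples:

    interface GigabitEthernet1/0/1
     description FAS2050-A e0a
     switchport mode access
     switchport access vlan 100
     channel-group 10 mode on
    !
    interface GigabitEthernet2/0/1
     description FAS2050-A e0b
     switchport mode access
     switchport access vlan 100
     channel-group 10 mode on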

 

ocionetapp

I've been using NFS on NetApp over redundant 10Gb Ethernet links (active/standby) with ESX 4.1 for about six months now. It rocks. I've not seen utilization on the NetApps approach 4Gb yet, and IOPS are still relatively low (~200 VMs and climbing fast).

But I'm being forced to put up some servers utilizing 8 x 1 Gig links in a manner similar to your environment.

I can't imagine why anyone would use iSCSI rather than NFS anymore, particularly now that "Route based on physical NIC load" is a teaming option in vSphere Enterprise Plus.

I've run tests with a VMware distributed switch (not the Cisco Nexus 1000V) on a single portgroup and VLAN, with physical uplinks to TWO DIFFERENT SWITCHES, without creating an L2 loop. It works fine without 802.3ad LACP, proprietary Cisco EtherChannel, or spanning tree. In fact the docs say "Route based on physical NIC load" is not compatible with LACP or EtherChannel. VMware takes care of the placement, putting VM1 on the first link, then VM2 on the second link, and so on. It then looks for any physical port that stays above roughly 75% utilization for an extended period of time and starts to move VMs off of that port. Now, that's extremely cool. I haven't found anywhere I can change that threshold, and I only know about it from reading.

So you put the same (multiple) VLAN tags on two different ports on two different physical switches, and you have an active/active load-based team (LBT). You can scale to four or six uplinks balanced across physical switches as well. This means that high-end switches which can do LACP (or EtherChannel) across two switches are no longer necessary.

In my new design, which will house Oracle and MSSQL VMs (don't ask), I'm going active/standby to two different switches for the management console, plus two sets of 3 x 1Gb uplinks to different physical switches for both data and storage, no LACP at all, and that's it. NFS has its own portgroup, and with ESX 4.1 I have QoS (or whatever VMware calls it), so I assign priority to IP storage and let it fly.

You do have to be careful configuring your portgroups. Hope this gives you some ideas. VMware + NetApp + NFS is really very nice. I suppose you could throw iSCSI in there as well.

erick_moore

I agree that NFS is the way to go for vSphere datastores, but iSCSI still has its place. For starters, the storage QoS you mention (vSphere Storage I/O Control) doesn't work with NFS; currently it is only available for block-based protocols / VMFS volumes. The other area where iSCSI has a leg up is the ability to aggregate bandwidth. Using iSCSI MPIO in vSphere, you can aggregate bandwidth with round-robin I/O down all available interfaces. Until we see pNFS support we won't have that option with NFS datastores.
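
For anyone curious, on vSphere 4.x the round-robin policy is set per device. A rough example (the naa identifier below is just a placeholder for your datastore LUN):

    # List the devices claimed by NMP and check their current path selection policy
    esxcli nmp device list

    # Set round-robin on a specific LUN so I/O rotates across all bound iSCSI paths
    esxcli nmp device setpolicy --device naa.60a98000486e2f34576f2f51 --psp VMW_PSP_RR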

Erick

radek_kubka

Using iSCSI MPIO in vSphere you can aggregate bandwidth with round-robin I/O down all available interfaces.

To be fair to NFS, a similar end goal can be achieved by using multiple NFS datastores and VMkernel ports, and connecting each of them via a different vmnic.
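
For example, on the ESX side each datastore would simply be mounted against a different filer address so the sessions land on different links (the addresses and export paths below are made up):

    # Mount each NFS datastore against a different target IP / alias on the filer
    esxcfg-nas -a -o 192.168.10.10 -s /vol/nfs_ds1 nfs_ds1
    esxcfg-nas -a -o 192.168.10.11 -s /vol/nfs_ds2 nfs_ds2

    # List the NAS datastores to confirm
    esxcfg-nas -l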

ocionetapp

The vSphere "QoS" I mentioned is probably a misnomer: it is based on IP, and not dependent on file systems or block-level storage. Edit the dvSwitch properties and there's the "Enable Network I/O Control" checkbox. Use that with NFS and it may mistakenly be called QoS; I suppose it really is not. The bottom line is that you can protect your NFS bandwidth, and while that's not as granular as Cisco QoS, it sure is nice.

IP is IP, and at least in the VMware realm I see little advantage in using IP to reach LUNs, except that it is cheaper than Fibre Channel. In the last year or so I've come to shy away from LUNs altogether, regardless of protocol, at least for vSphere.

Using LACP (or EtherChannel) with "Route based on IP hash" for NFS on a VMkernel port is as effective as aggregating 1Gb iSCSI uplinks. There is no balancing or round-robin about it, just aggregation. And newer 10Gb environments pretty much render aggregation moot. The tests and graphs I've seen show pretty much identical performance between iSCSI and NFS, but iSCSI has more limitations in my mind. But I'll admit I'm biased (*nix guy). In my 10GbE blade environment I've tested NFS-based VMs against clones sitting on LUNs accessed via 8Gb HBAs. I'm happy. I will continue to use Fibre Channel when somebody insists on it, but otherwise not. It's getting so that the bottleneck is neither the protocol nor the bandwidth, but the storage device itself: IOPS. So... more VMs, more storage devices. Must be a nice business to be in.

The trickiest thing about using IP storage of any sort is getting separate, redundant physical paths to storage, and that is critical in virtual environments. Fibre Channel makes that a foregone conclusion, but you may have a hard time convincing network managers (real network managers) that you need separate physical IP pathing. Generally they change their minds when hundreds of VM servers fall off the network due to a single 10Gb switch failure.

radek_kubka

The vSphere "QoS" I mentioned is probably a misnomer: it is based on IP, and not dependent on file systems or block-level storage.

This is the feature which does work on iSCSI, but doesn't work (yet?) on NFS datastores:

http://www.vmware.com/products/storage-io-control/

Arguably FlexShare can deliver similar functionality, provided all datastores are on NetApp.

Regards,
Radek

ocionetapp

I am not referring to "Storage I/O" whatsoever.

I am referring to the capability to limit network I/O. With NFS it amounts to the same thing: I am putting limits on IP traffic, not on storage.

I am using it successfully, and it has nothing to do with the storage APIs. Nothing. It is strictly a matter of a VMkernel port getting more bandwidth than anything else in a dvSwitch.

erick_moore

Yes, Network I/O Control is a nice feature on distributed vSwitches. That said, if you have a runaway process on a VM connected to an NFS datastore, that single VM can consume all your bandwidth to that datastore. This is where an iSCSI-, FC-, or FCoE-mounted VMFS datastore with Storage I/O Control will be of benefit. You can run both Network I/O Control and Storage I/O Control together for optimal performance, guaranteeing bandwidth (Network I/O Control) and preventing a single VM from starving others for disk I/O (Storage I/O Control). NFS datastores do not yet have that capability in vSphere. If you have storage SLAs based on latency, then at this point VMFS volumes are the only way you can make such a guarantee. So while NFS is an awesome choice, and one I strongly advocate, iSCSI does still have its place. I certainly think the tools you have outlined here make NFS a very attractive and simple choice for customers thinking about IP storage in a vSphere environment.

Erick
