Ethernet design question for VMware

nasmanrox

We are setting up a 2-node cluster running cDOT 8.3.1 for production, which will replicate data via SnapMirror to the DR site. As I don't have much experience running VMware over NFS, I'd like your input.

 

Here is what I'm proposing

 

1. Datastore volumes and Oracle DB volumes will have a dedicated 4 x 10G LACP link in RED (flow control off, jumbo frames; MTU 9000)

2. CIFS user home folder volumes / typical Windows app volumes will have another dedicated 4 x 10G LACP link in BLUE (flow control off, no jumbo frames; MTU 1500)

3. SnapMirror to the DR site will have dedicated 4 x 1G links in GREEN (paired in an ifgrp on each node). The reason for this is that I recall NetApp's best practice is to have multiple separate links instead of one LACP link made of 4 x 1G ports in order to maximize throughput.
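
For reference, this is roughly what I have in mind on the cluster shell for item 1 (the node names, ports, and broadcast-domain name below are placeholders, not our actual naming):

# 4 x 10G LACP ifgrp per node for the RED (datastore / Oracle DB) network
network port ifgrp create -node node01 -ifgrp a0a -distr-func ip -mode multimode_lacp
network port ifgrp add-port -node node01 -ifgrp a0a -port e0e
network port ifgrp add-port -node node01 -ifgrp a0a -port e0f
network port ifgrp add-port -node node01 -ifgrp a0a -port e0g
network port ifgrp add-port -node node01 -ifgrp a0a -port e0h

# flow control off on the member ports
network port modify -node node01 -port e0e,e0f,e0g,e0h -flowcontrol-admin none

# jumbo frames set at the broadcast-domain level (MTU 9000 is pushed to the member ports)
network port broadcast-domain create -broadcast-domain bd_red -mtu 9000 -ports node01:a0a,node02:a0a

(the same, minus the MTU 9000, would be repeated for the BLUE network, and again on node02)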

 


Q1. Is this setup overkill, with two separate LACP links for 1 and 2?

Q2. As we add more datastore volumes in the future, is it better to have a dedicated LIF and IP for each datastore volume? I remember one of the NetApp docs suggested doing so.

Q3. Is it recommended to have separate VLANs for datastore and DB traffic? I'm planning to set up separate VLANs for VMware ESXi traffic and CIFS user folder / Windows data traffic.


We're not going to use private IPs for the datastore volumes, but I believe separating the VMware traffic from the CIFS traffic in 2 should be sufficient. Any gotchas you can point out or recommend?

 

Capture_cluster.PNG


asulliva

Hello @nasmanrox,

 

I think it's important to remember that best practices are recommendations, not laws.  Make sure you are making the right decisions for your infrastructure and, more importantly, your application requirements based on real needs, not just because it's what the best practices say.

 

  1. Overkill is relative.  LACP is great in that it provides redundancy for the links...if one (or more) fails, then it keeps on trucking without any intervention.  If you haven't already, you'll want to make sure that you're connecting to more than one switch (e.g. Cisco Nexus VPC link) for the best high availability.
    1. Remember that LACP (or any of the other link aggregation technologies) will still only provide the maximum throughput of a single link for any point-to-point connection.  For example, ESXi host -> NFS datastore will always hash to the same link in the LACP port channel.  That being said, the hosts (with different vmkernel interfaces) and multiple datastores (with different LIFs) will result in different hashes, and therefore spread the traffic across all the links.
    2. There is some debate in the wider VMware community about the benefits of jumbo frames.  The general consensus is that they do provide some performance benefit; however, the increased configuration complexity often means no net gain for the business.  Ultimately, it's up to you and your network admin.  One notable exception: I prefer jumbo frames for iSCSI traffic.  iSCSI frames are decoded by the CPU in VMware (unless you have a hardware iSCSI card), so lots of small frames mean more CPU consumption on the ESXi host than fewer, larger frames.
  2. The vSphere best practices (TR-4333) recommend creating a LIF per datastore to maximise flexibility and, potentially, performance.  In reality, this can be difficult for two reasons: first, it's potentially a lot of IP addresses, and second, it's a lot of management overhead.  The thought process is that you want datastore traffic to always go directly to the node that hosts the volume, to prevent performance issues on the cluster network (having the LIF on a different node would mean the traffic traverses the cluster network from the ingress/egress node to the node hosting the volume)...so, if you do a vol move, you would also want to move the LIF with it to the new node.
    1. For simplicity I prefer to use a LIF per node for NFS datastores (see the sketch after this list).  I don't vol move volumes to different nodes except temporarily, so this alleviates any concerns I have about using the cluster network.
    2. There were some edge cases that caused a noticeable latency spike and increased CPU utilization on the controllers in early clustered Data ONTAP versions; however, that has been resolved in recent versions.
    3. That being said, there's some merit in simplicity and keeping as few devices in the data path as possible.  So the real answer is how much risk your organization is willing to accept by potentially having traffic traverse the cluster network (this is pretty low risk), and, if it does traverse the cluster network, whether it will impact other things.
  3. VLAN separation is a matter of preference and company policy.  Layer 2 separation of the traffic provides greater flexibility and isolation compared to using the same layer 2 domain.  Remember that NFS datastore traffic is unencrypted, so any data the VMs are reading/writing that's not encrypted by the guest could potentially be snooped.
    1. I made some other comments about shared layer 2 segments here.
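
To make 2.1 (and the VLAN point in 3) concrete, here is a rough sketch of what that looks like on the cluster shell, assuming a tagged VLAN on top of the LACP ifgrp; the SVM name, VLAN ID, ports, and addresses are just examples, adjust for your environment:

# one tagged VLAN per node on top of the LACP ifgrp for NFS datastore traffic
network port vlan create -node node01 -vlan-name a0a-100
network port vlan create -node node02 -vlan-name a0a-100

# the VLAN ports get their own broadcast domain (which also sets the MTU and the default failover group)
network port broadcast-domain create -broadcast-domain bd_nfs -mtu 9000 -ports node01:a0a-100,node02:a0a-100

# one NFS LIF per node; mount datastores hosted on node01 through nfs_node01, and so on
network interface create -vserver svm_vmware -lif nfs_node01 -role data -data-protocol nfs -home-node node01 -home-port a0a-100 -address 192.168.100.11 -netmask 255.255.255.0
network interface create -vserver svm_vmware -lif nfs_node02 -role data -data-protocol nfs -home-node node02 -home-port a0a-100 -address 192.168.100.12 -netmask 255.255.255.0

If you ever do a permanent vol move to the other node, you would then remount the datastore against that node's LIF (or, if you went with LIF-per-datastore, move the dedicated LIF along with the volume).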

Hope that helps.  Please feel free to reach out if you have any questions.

 

Andrew


nasmanrox

@asulliva

 

Thanks for your input.  What are your thoughts on my SnapMirror link setup to the DR site?  Does 2 x 1G LACP sound about right?  That is, two 1Gb LACP links to DR instead of one LACP link made up of 4 x 1Gb ports.

asulliva

If you don't have more than 1Gb of bandwidth between the primary and DR sites, then I don't think it will have a significant impact.  Just make sure it's set up for redundancy...that could be as simple as ensuring the broadcast domain has 2+ ports (connected to different switches) on each node and the LIF is assigned to the correct failover group, so that if one link fails the LIF will move to another port.

 

LACP would accomplish the same thing, with the added benefit of spreading the traffic across more than one link...remember, though, that a single flow from the primary node's intercluster LIF IP address to the DR node's intercluster LIF IP address will always hash to just one link in the LACP aggregate, so a single flow will only ever get the maximum throughput of a single link.  Assuming the DR site is an HA pair, and the volumes are spread evenly across the nodes, it would approximately balance the connections across the links.  Keep in mind, though, that only the connections are balanced, not the amount of traffic...e.g. if one volume is 1TB in size and another is 1GB in size, they may use different links (assuming the DR volumes are on different nodes), but one of those connections will be busy for a much longer period of time.
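
As a rough example of the simple (non-LACP) option on the primary cluster (the port names, broadcast-domain name, and addresses below are placeholders):

# two intercluster-capable ports per node in one broadcast domain, each cabled to a different switch
network port broadcast-domain create -broadcast-domain bd_ic -mtu 1500 -ports node01:e0c,node01:e0d,node02:e0c,node02:e0d

# one intercluster LIF per node; with two ports per node in the domain, the LIF can fail over to the node's other port if a link dies
network interface create -vserver cluster1 -lif ic_node01 -role intercluster -home-node node01 -home-port e0c -address 10.10.10.11 -netmask 255.255.255.0
network interface create -vserver cluster1 -lif ic_node02 -role intercluster -home-node node02 -home-port e0c -address 10.10.10.12 -netmask 255.255.255.0

The DR cluster gets the equivalent, and the cluster peering and SnapMirror relationships then ride on top of those LIFs.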

 

Andrew


nasmanrox

One more question.  Is it recommended to have separate SVMs for the VMware datastore volumes and the CIFS user home folder volumes?  Or is it OK to share the same SVM as long as different VLANs are used?

 

 

asulliva

My personal opinion is that SVMs are for separation of privileges when you're delegating storage management tasks.  For example, if the team managing the CIFS/SMB shares is separate from the team managing the VMware datastores, then separate their permissions using SVMs.  If it's the same team managing all of it, or storage management isn't delegated from the storage admin team, then there's no significant reason for multiple SVMs.

 

That being said, there are some edge cases...for example, if you want to use SVM DR and have the different data types fail over independently, then it certainly makes sense to use multiple SVMs.
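
If you did decide to split them, the SVM separation itself is easy to stand up, e.g. (the SVM names, root volume names, and aggregates below are placeholders):

# one SVM for the VMware/NFS datastores, one for the CIFS home folders
vserver create -vserver svm_vmware -rootvolume svm_vmware_root -aggregate aggr1_node01 -rootvolume-security-style unix
vserver create -vserver svm_cifs -rootvolume svm_cifs_root -aggregate aggr1_node02 -rootvolume-security-style ntfs

Each SVM then gets its own LIFs, export policies or shares, and (if you're delegating) its own vsadmin account.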

 

Andrew


ARUM

Delegation and SVM DR are good reasons to create multiple SVMs. Security best practices and IPspaces, too.

 

As datastores are usually on a private (non-routed) network and CIFS shares are accessible to users, I would not create a single SVM for both. In fact, with NFS, by default the protocol is accessible from every LIF attached to an SVM; only the export policy prevents users from accessing the data. With iSCSI, the reachable IP addresses are chosen by the admin.
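
For example (the policy name and subnet below are placeholders), the export policy is what limits which clients can actually mount the datastore volumes:

# allow only the ESXi NFS vmkernel subnet; anything not matched by a rule is denied
vserver export-policy create -vserver svm_vmware -policyname esxi_only
vserver export-policy rule create -vserver svm_vmware -policyname esxi_only -clientmatch 192.168.100.0/24 -protocol nfs -rorule sys -rwrule sys -superuser sys
volume modify -vserver svm_vmware -volume datastore01 -policy esxi_only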

 

I think there are lots of reasons for multiple SVMs:

 

Multitenancy

Delegation

Security

SVM limits

Features (e.g. Infinite Volume)

SVM DR

Ease of use, readability, organisation

...

RPHELANIN

You should cable each filer to both switches in an LACP configuration, as opposed to each filer being directly connected to a single switch in a port channel.

 

4 x 10G LACP for CIFS is probably overkill, but if you have the ports then why not...

 

A dedicated LIF per volume is advantageous; however, it can become cumbersome to manage when you have lots of volumes.

 

You should always at least separate protocols with VLANs. Separating datastore and DB traffic is better practice than not doing it.

nasmanrox

I made some changes to the Visio diagram.  Does this look good?

 

new CL setup.vsd


@RPHELANIN wrote:

You should cable each filer to each switch in lacp config as opposed to each filer being directly connected to a single switch in a port channel.

 

 

 

 

RPHELANIN

Looks better 🙂
