Currently we have two major types of access: NFS access from various hosts (Linux, Unix, etc.) and VMware datastores. Both use the same main SVM. The two types of access are on two different primary subnets, via two different VLANs.
Which part do I need to look into to find out whether there are any issues with this design? Since we are using two different VLANs, should we be okay using the same SVM?
SVMs are only containers. VLANs are assigned to ports, and you can then use them when creating the LIF within the SVM. But what exactly is your question? Whether you should separate VMware more from the rest of your hosts? Well, you could, but you need not.
Thanks for the correction. When I said VLANs, I really should have said LIFs, which are built on VLANs.
As I said, although we have several SVMs, the 2 major applications access volumes through the same SVM, so most client traffic goes through the LIFs on that SVM.
I am wondering whether this configuration would cause any traffic issues on the LIFs of this SVM. The rest of the SVMs are lightly used. Shouldn't we at least separate these two kinds of traffic onto two different SVMs, and thus different LIFs? Or would this configuration cause any other issues?
Hopefully you understand me better this time; if not, tell me what I am missing.
Your two major applications sharing the same SVM is not an issue in itself. They seem to be going through two different LIFs, which is fine.
However, I would look at the port or the interface group on which these LIFs are defined. I'm assuming you have a single port or interface group on which you've created your two VLANs, one for NAS file serving and another for NFS datastore.
Also, I would look at the volumes/aggregates you are using for these two applications. Are these volumes on the same aggregate? Are the aggregates on the same node?
The best way to minimize or eliminate physical contention (CPU, disk I/O, network bandwidth) is to pay careful attention to the alignment between your logical constructs (LIF/VLAN/ifgrp) and your physical resources (port/volume/aggregate/node), and to make sure your logical constructs don't overlap on the same physical resources. Back to your situation: even if your NAS file serving volumes sit on one aggregate on one node and your VMware datastore volumes are on another node, if your LIFs have been defined on one interface group or one port, then the two applications may contend for network bandwidth and, to some degree, CPU resources. It would be best to put these LIFs on two different ports or interface groups, even if they stay in the same SVM. As HONIG2012 pointed out, an SVM is a container.
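To make the alignment check above concrete, here is a minimal Python sketch that takes a hand-written inventory of LIFs and reports any port or ifgrp that carries more than one workload. The LIF names, workload labels, and port names (a0a, a0b) are hypothetical placeholders, not taken from the actual cluster; you would fill the list in from your own configuration.

```python
from collections import defaultdict

# Hypothetical inventory: (lif name, workload, home node, home port or ifgrp).
# Replace these rows with the values from your own cluster.
LIFS = [
    ("nas_lif_1", "nfs", "node1", "a0a"),
    ("vmware_lif_1", "vmware", "node1", "a0a"),
    ("nas_lif_2", "nfs", "node2", "a0a"),
    ("vmware_lif_2", "vmware", "node2", "a0b"),
]


def shared_ports(lifs):
    """Return {(node, port): [workloads]} for ports used by 2+ workloads."""
    by_port = defaultdict(set)
    for _lif, workload, node, port in lifs:
        by_port[(node, port)].add(workload)
    return {
        key: sorted(workloads)
        for key, workloads in by_port.items()
        if len(workloads) > 1
    }


for (node, port), workloads in shared_ports(LIFS).items():
    print(f"{node}:{port} is shared by {', '.join(workloads)}")
```

A port showing up in the report does not prove there is contention; it only marks where the two workloads could compete for the same bandwidth.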
Thanks a lot for your message, it was very helpful.
These two types of access go through 2 different LIFs, which are based on two different VLANs on the same ifgrp. The ifgrp is built on 4 physical ports of 10 Gbit each, in multimode-lacp mode. It is commonly assumed here that contention on 4x10 Gbit should not be a concern.
What tools / commands can I use to verify or monitor if we have any contention on the ifgrp or ports?
Having a four port ifgrp does reduce the outbound contention rate to 25%, assuming even distribution of target endpoints (IP, MAC, Port), but you could be unlucky and have your major NAS client and ESX host have the same modulo 4 value. If you're using IP address as the load balancing scheme for your ifgrp, a quick check on the addresses of your clients should give you a good idea.
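The quick address check can be sketched in a few lines of Python. This is only a simplified model, assuming the ifgrp picks the outbound link as (client address XOR LIF address) modulo the number of links; the exact hash your ONTAP version applies may differ, and all IP addresses below are made up for illustration.

```python
import ipaddress
from collections import defaultdict


def link_index(client_ip: str, lif_ip: str, links: int = 4) -> int:
    """Simplified model (an assumption, not the exact ONTAP hash):
    XOR the two addresses and take the result modulo the link count."""
    a = int(ipaddress.ip_address(client_ip))
    b = int(ipaddress.ip_address(lif_ip))
    return (a ^ b) % links


def group_by_link(pairs, links: int = 4):
    """Group (client, lif) address pairs by the link the model selects."""
    buckets = defaultdict(list)
    for client_ip, lif_ip in pairs:
        buckets[link_index(client_ip, lif_ip, links)].append((client_ip, lif_ip))
    return dict(buckets)


# Hypothetical addresses: one busy NAS client and two ESX hosts.
PAIRS = [
    ("10.1.1.21", "10.1.1.10"),  # NAS client -> NFS LIF
    ("10.2.2.33", "10.2.2.10"),  # ESX host   -> VMware LIF
    ("10.2.2.34", "10.2.2.10"),  # ESX host   -> VMware LIF
]

for link, members in sorted(group_by_link(PAIRS).items()):
    print(f"link {link}: {members}")
```

If your heaviest NAS client and your heaviest ESX host land in the same bucket under this model, that is the "unlucky" case described above and worth a closer look on the real ports.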
A surefire way to make sure your NAS and VMware traffic won't contend is to create a separate ifgrp (on another node is the ultimate) and make sure the VMware datastores come from a different aggregate sitting on a different node as well.
These commands seem helpful, and I will look into them further.
Configurations here look like the following:
LIFNFS-1 and LIFVMWARE-1 on node1: these 2 LIFs go via 2 different VLANs down to the same ifgrp (4 physical ports on node1).
LIFNFS-2 and LIFVMWARE-2 on node2: same as above, but with 4 physical ports on node2.
LIFNFS-1 and LIFNFS-2 are DNS balanced, and so are LIFVMWARE-1 and LIFVMWARE-2.
If I understand correctly, with this configuration, access from the same type of client should be balanced well between the 2 ifgrps on 2 different nodes. The only contention might happen when the two different types of access coincidentally hit LIFs/VLANs on the same node, and therefore the same ifgrp; the contention will then be at the node level. When this happens, access should still be balanced across the 4 physical ports.
Your understanding of ifgrp and LIFs/VLANs is correct.
However, I have a couple of comments on your current setup. First, your use of LIFs/VLANs over a 4-member ifgrp on one node and 4 physical ports on another node is puzzling. Why not use ifgrps on both nodes?
Second, DNS balancing LIFNFS-1 and LIFNFS-2 seems reasonable enough, especially if the NFS clients coming into these LIFs are relatively transient. However, for LIFVMWARE-1 and LIFVMWARE-2, I would be more careful about which ESX host mounts off of which LIF. Assuming equal I/O resources on both nodes, if you have datastore1 and LIFVMWARE-1 on node1 and datastore2 and LIFVMWARE-2 on node2, then I would have half of your ESX hosts mount off of LIFVMWARE-1:datastore1 and the other half mount off of LIFVMWARE-2:datastore2. I'm also assuming all your ESX hosts carry roughly equal load.
By introducing DNS load balancing, you could end up with mounts of LIFVMWARE-1:datastore2 and LIFVMWARE-2:datastore1, which would lead to indirect access to your volumes. Indirect access is where the client's request comes in through node1's LIF and goes over the cluster interconnect before getting to the datastore on node2. That means more CPU overhead to process the cluster interconnect traversal, which leads to higher latency. Just not good in general. If you have your datastore volumes on one node, then just use the local LIF, and use the second LIF as a failover LIF.
Proper system sizing effort would have given you a good balance among CPU, disk, and network pipes, so you wouldn't have to rely on contortions like this. If you've added a lot more disk shelves over time, then make sure your network bandwidth is also scaled up to allow your storage controller to keep up.
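The direct versus indirect distinction above can be checked mechanically once you know which node each LIF and each datastore volume lives on. Here is a minimal Python sketch under that assumption: the LIF, datastore, and node names follow the example in this post, while the ESX host names (esx01 and so on) and the mount list are hypothetical and would have to come from your own environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mount:
    host: str       # ESX host doing the mount (hypothetical names below)
    lif: str        # LIF the host mounts through
    datastore: str  # datastore volume being mounted


# Node currently hosting each LIF and each datastore volume,
# following the layout described in the post above.
LIF_NODE = {"LIFVMWARE-1": "node1", "LIFVMWARE-2": "node2"}
DATASTORE_NODE = {"datastore1": "node1", "datastore2": "node2"}


def indirect_mounts(mounts, lif_node, datastore_node):
    """Return mounts whose LIF and datastore sit on different nodes,
    i.e. requests that would cross the cluster interconnect."""
    return [m for m in mounts if lif_node[m.lif] != datastore_node[m.datastore]]


MOUNTS = [
    Mount("esx01", "LIFVMWARE-1", "datastore1"),  # direct
    Mount("esx02", "LIFVMWARE-2", "datastore2"),  # direct
    Mount("esx03", "LIFVMWARE-1", "datastore2"),  # indirect
]

for m in indirect_mounts(MOUNTS, LIF_NODE, DATASTORE_NODE):
    print(f"{m.host}: {m.lif}:{m.datastore} goes over the cluster interconnect")
```

Keep in mind that a LIF can move to another node on failover or migration, so the node map should reflect where each LIF currently is, not only where it was created.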