Network and Storage Protocols
I'm playing with DOT 8 and our brand new FAS3240 HA pair.
For the 1st time, we have set up the e0M interface.
The network setup is simple.
One IP address for data on production network (vif ifgrp single over two ifgrp multi).
SP and e0M: one IP address for each, on the same admin network.
Default GW is the prod network GW of course.
Until now, on all our NetApp systems, daily admin stuff was always done thru the data interface.
We are used to console access thru SP/RLM/BMC, but only in case of a real problem or a DOT upgrade.
I was considering using e0M for daily admin stuff but I'm facing a GW problem.
Incoming packets on the e0M interface get answered thru the data interface (unless the source IP address is also on the admin net, but it's not) because the default GW is on the data network.
Consequence: problems on the data interface would also impact the e0M interface.
So in this situation, having an e0M interface is pretty useless for me!
How can I make the reply packets (for packets that came in on e0M) go out thru the same e0M interface?
There is a KB article, "How to specify default routers, manage multiple routers and create redundant routing schemes".
I thought it could help me but it relies on /etc/dgateways.
Haven't we known for YEARS that /etc/dgateways is deprecated?
Does anyone have a clue how to make e0M really useful?
Generally the management network will be different from the data network.
If you have a separate network/sub-network for management purposes, then you can assign an IP in that particular network range to e0M and create a static route on the filer as instructed by the network team.
Hope this will help a little.
Having an interface in the management network automatically adds a static route for this network to the route list.
Communication between a host in the management network and the filer's e0M interface will not make use of the default GW, and all the traffic will go thru e0M. Of course.
But not all the communication flows aimed at e0M will be coming from the management network (in my case, I would even say NONE).
Path for sent packet:
Host A in network N1 that is not mgmt net --> N1 GW --> ... --> filer e0M in mgmt network
way back for replies:
filer data interface --> default GW, ie data network GW --> ... --> Host A in network N1
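To make the asymmetric path above concrete, here is a minimal sketch of a plain longest-prefix routing lookup. All addresses and the interface name `data_vif` are made up for illustration (the thread gives none), and this models generic IP routing only, not ONTAP internals or fastpath.

```python
import ipaddress

# Hypothetical routing table: (destination network, egress interface, next hop).
# None as next hop means the network is directly connected.
ROUTES = [
    (ipaddress.ip_network("10.10.0.0/24"), "data_vif", None),      # production/data net
    (ipaddress.ip_network("10.20.0.0/24"), "e0M", None),           # management net
    (ipaddress.ip_network("0.0.0.0/0"), "data_vif", "10.10.0.1"),  # default GW on data net
]

def lookup(dest, routes=ROUTES):
    """Return (interface, next_hop) of the most specific route matching dest."""
    addr = ipaddress.ip_address(dest)
    matches = [r for r in routes if addr in r[0]]
    best = max(matches, key=lambda r: r[0].prefixlen)
    return best[1], best[2]

# A host on the management net is reached directly over e0M.
print(lookup("10.20.0.50"))
# Host A in some other network N1 only matches the default route,
# so the reply leaves thru the data interface even if the request came in on e0M.
print(lookup("192.168.5.7"))
```

The lookup never sees which interface the request arrived on, which is exactly why the reply to Host A ends up on the data side.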
Since ONTAP adds the gateway to a physical interface it can get confusing sometimes. Do you have "options ip.fastpath.enable" set to off? It sounds like the default of on was changed to off. When it is on, the response goes out the same port it came in on. We used to turn it off on pretty much all installs, but with e0M it is sometimes easier to leave it on in order to not have to modify routing, which could affect the data network.
No, "options ip.fastpath.enable" has not been modified and is set to "on" since the beginning.
I read a few things about this option without finding out how it could solve my problem.
I still see e0M "RECEIVE Total frames" counter increasing while the same counter for TRANSMIT never changes...
I've added a static route to my management host's network, with the e0M network GW as the GW for this net.
It works, with huge drawbacks:
- it only works for hosts from this network
- my management host is also a "data customer" of my filer. In that case, packets are sent to the data interface with the big pipe, and "come back" from the e0M...
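The second drawback can be sketched with a routing table that has one extra static route. This is a pure routing-table view with invented addresses and an invented interface name `data_vif`; it deliberately ignores fastpath, so it only shows what the table alone would do.

```python
import ipaddress

def lookup(dest, routes):
    """Longest-prefix match: return the egress interface for dest."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, iface) for net, iface in routes if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Hypothetical table before the change: everything unknown goes to the data GW.
routes = [
    (ipaddress.ip_network("10.10.0.0/24"), "data_vif"),  # data net
    (ipaddress.ip_network("10.20.0.0/24"), "e0M"),       # management net
    (ipaddress.ip_network("0.0.0.0/0"), "data_vif"),     # default route
]

mgmt_host = "172.16.1.15"   # hypothetical management host, outside both nets
other_host = "172.16.2.30"  # hypothetical host in a different network

before = lookup(mgmt_host, routes)

# Static route for the management host's network, via the e0M side.
routes.append((ipaddress.ip_network("172.16.1.0/24"), "e0M"))

after = lookup(mgmt_host, routes)
print(before, "->", after)
# Hosts in any other network are still answered thru the data interface.
print(lookup(other_host, routes))
```

Because the table is keyed on destination only, every reply to the management host's network follows the new route, including replies to data traffic that arrived on the data interface.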
With fastpath enabled, any user traffic that comes in the data interface will go back out that interface and not use routing (the exception, we have found, is an NFS mount request, which doesn't use fastpath until after the mount completes). I agree the static route gets ugly when it is outside the management network in this case.
Is routed on or turned off? Do you have a case open, and have you run a packet trace (pktt) to check what is going in/out of e0M? This might be a burt that support can track to see whether it has been fixed.
Already deactivated routed for some tests but I didn't notice any difference.
Maybe I should give it a try with routed deactivated in /etc/rc and a filer reboot.
I can play with pktt also, but I'm afraid wireshark will show me what I can already clearly see with ifstat...
With fastpath enabled, any user traffic that comes in the data interface will go back out that interface and not use routing
I remember a KB article (I do not have the number handy) which says that fastpath applies to NFS traffic only. Unfortunately there is very little information which explains how fastpath really works.
I think it applies to all protocols including management traffic... anything that comes in from a host. We did find that NFS mount requests use routing, not fastpath... a customer with 2 interfaces on the same subnet... the mount request came in on .3 and the response went out on .2, and the client wasn't able to mount... but once mounted, all traffic in on .3 came back out on .3. I agree completely... It would be nice to have a full description of fastpath and what it does or doesn't do with routing for all scenarios and protocols.
The interaction of mounting and fastpath depends on the OS used. Solaris will always use UDP during the initial part of mounting and fastpath has no effect on it: https://kb.netapp.com/support/index?page=content&id=3010148
But for Red Hat Linux, for example, the behaviour is different. If you do not explicitly specify proto=tcp in the mount options, Linux will use UDP. In some Red Hat releases it will switch to TCP if the reply from the filer does not arrive. If you explicitly specify proto=tcp, Red Hat will immediately use TCP in the mount request and fastpath will be used for the reply!
In general fastpath will be used for TCP connections and, for UDP, only for data traffic. The reply to ICMP traffic will not use fastpath.
Another issue one should be aware of is that fastpath only applies to connections initiated by a client. This is important to know if you use iSCSI and your initiators are not in the same network as your filer. With iSCSI the NetApp filer regularly sends out a keep alive. Since this is not client initiated, fastpath is not applicable and the filer routing table will be used. If the filer does not explicitly have an entry for your initiator subnet, the default gateway will be used and, depending on your network setup, you run the risk of the keep alive not arriving at the initiator. When this happens the filer will drop the iSCSI connection after 10 failed keep alive attempts!
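The rules described in this post can be written down as a small decision function. This is only a toy model of what the thread reports (TCP and UDP data use fastpath, ICMP and filer-initiated traffic use the routing table); it is not NetApp's implementation, and the `Reply` type, its fields and the interface name `data_vif` are invented for the sketch.

```python
from dataclasses import dataclass

@dataclass
class Reply:
    protocol: str              # "tcp", "udp" or "icmp"
    client_initiated: bool     # False for traffic the filer starts itself
    ingress_iface: str         # interface the request arrived on ("" if none)
    udp_is_data: bool = True   # False for e.g. the UDP part of an NFS mount

def uses_fastpath(r, fastpath_enabled=True):
    """Apply the rules reported in the thread; everything else is routed."""
    if not fastpath_enabled or not r.client_initiated:
        return False
    if r.protocol == "tcp":
        return True
    if r.protocol == "udp":
        return r.udp_is_data
    return False  # ICMP and anything else falls back to the routing table

def egress_iface(r, routed_iface="data_vif", fastpath_enabled=True):
    """routed_iface stands for whatever the routing table would pick."""
    return r.ingress_iface if uses_fastpath(r, fastpath_enabled) else routed_iface

telnet = Reply("tcp", True, "e0M")
ping = Reply("icmp", True, "e0M")
keepalive = Reply("tcp", False, "")  # filer-initiated iSCSI keep alive

print(egress_iface(telnet))                          # same port it came in on
print(egress_iface(telnet, fastpath_enabled=False))  # routing table decides
print(egress_iface(ping))                            # routing table decides
print(egress_iface(keepalive))                       # routing table decides
```

The model is intentionally coarse: it treats every client-initiated TCP flow the same way, so it cannot account for protocol-specific behaviour beyond the cases listed above.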
I finally got it working.
It really depends on the protocol in use. I was doing my counter tests with ping and obviously it was not a good idea...
To sum up:
- routed/RIP seems to be pointless... (as long as you provide the mandatory GWs)
- it works with telnet protocol, but if fastpath is disabled, it won't !
- it doesn't work with ping or RSH (I *think* the Solaris ping I'm using is ICMP based, not UDP port based)
So it CAN work as expected, thanks to fastpath, in some cases only.
But fastpath is obscure and it does not handle every protocol the same way.
My opinion is that fastpath has been designed with the UNIQUE idea of bypassing the routing tables for performance ONLY.
Very short data flows (especially single packets)? fastpath doesn't care and won't handle them!
So, it can help in some situations but I would only consider it as a positive side effect.
Routing abilities are still poor and this is the true problem underneath.
Thanks everyone for participating and guiding me the right way.
Let’s be honest: the problem you have is in no way specific to NetApp and will happen on every other system I have worked with. There is no system that would by default ignore the routing table for random destinations. If you have a requirement that specific networks must be reached via a specific gateway, you have to tell the system about it. There is no workaround. Really.
I'm not asking to ignore default routing.
I'm not talking about data flows initiated by the filer.
I complain that an output packet that is a reply to an input packet aimed at e0M is not always sent back thru the same e0M interface.
I was probably wrong to write "poor routing abilities". You're right. More tunable/powerful routing abilities are probably rare.
To me it's actually worse. What is missing here is not advanced routing abilities but very basic routing behaviour !