Network and Storage Protocols
I don't want to undo and redo a bunch of broadcast domain config on 3 new A400s, but I'm running out of hair trying to understand how to get the SVMs to listen on port 10000 on specific LIFs (the 25Gb ones). It looks like a total game of chance, or did they hardcode ONTAP to only listen on e0M or the cluster mgmt LIF? Where's the procedure to tell the array "listen on these LIFs for NDMP requests"?
CAB connectivity and discovery into Commvault is a piece of cake, but I simply can't understand the logic behind forcing the traffic over e0M.
thanks
Unfortunately there is no clear information (i.e. a KB) on this. One KB suggests: "Ensure the backup application (DMA) is not connecting to a LIF hosted on e0M for the NDMP control connection. Check with your backup application vendor for assistance."
However, if the DMA must connect to the cluster_mgmt LIF (e.g. one hosted on e0M), the only way out is to create the cluster_mgmt LIF on a port other than e0M.
When configuring CommVault for NDMP CAB, the CommVault server should connect to the cluster_mgmt LIF. If a name service has been configured to resolve the NetApp cluster name to the cluster_mgmt LIF IP, the CommVault server can use the cluster name when adding the NAS client. Alternatively, use the cluster_mgmt LIF IP address instead of a name when adding the NAS client.
So, if Commvault has to talk to the cluster mgmt service, IMO that is a result of limitations inherent in ONTAP, not Commvault. I mean, if ONTAP can't listen (control port) on any LIF other than cluster-mgmt, i.e. e0M, then it's literally impossible to use any other LIF on the array, isn't it?
Maybe someone can help me pin this down: I am OK with Commvault doing control over e0M, because 1Gb is plenty for commands and such. But it appears to me the real crux of the problem is how to force all data transfer over non-cluster-mgmt LIFs. Where is this configurable? I see no NDMP tunables other than "data port range", which doesn't seem to do anything other than make FW rules more concise. Am I missing something here or what?
thanks again!
I think I misunderstood your initial query. It's not a limitation on the ONTAP side. The control connection has to talk to a management interface (hence cluster_mgmt comes into the picture). Older systems had a 100 Mb e0M port, so that could have been a concern, but all modern filers come with a 1Gb interface for e0M, which is more than enough for the NDMP control exchange.
I thought you were concerned about the bandwidth of the NDMP control connection, which is why I mentioned moving cluster_mgmt to a non-e0M port. It looks like you are actually concerned about data being backed up via the cluster_mgmt LIF, and that can be controlled. There is plenty of documentation on this on the NetApp support/KB site and on the Commvault side.
There are two types of communication that are used by NDMP Backups configured with CAB: Control connection and Data connection.
Control connection: management calls (cluster_mgmt).
Data connection: the backup data itself (this could use your intercluster LIFs, or the data LIFs for a data Vserver).
For DATA_CONNECT: There is a setting called 'ndmp Preferred Interface Role', which lets you control which LIFs are used for the "data" backup. In fact, you can also control the specific ports used for the actual data transfer. Please see the two KBs linked below.
The default value of 'ndmp Preferred Interface Role' for a data Vserver is intercluster, data. That means it cannot use the cluster_mgmt interface for data backup. For the admin Vserver the first preference is intercluster, so if the subnet of your intercluster LIFs can talk to the DMA, those LIFs will be used.
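To make the role-preference idea concrete, here is a minimal Python sketch of how an ordered role list could drive LIF selection. It is only a toy model of the behaviour described above, not ONTAP's actual algorithm (the real choice order depends on more than role, see the LIF choice order KB). The `Lif` class, the `reachable` flag, the LIF names and `pick_data_lif` are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Lif:
    name: str        # hypothetical LIF name
    role: str        # e.g. "intercluster", "data", "cluster-mgmt"
    reachable: bool  # assumption: True if this LIF's subnet can reach the DMA


def pick_data_lif(lifs: Sequence[Lif], preferred_roles: Sequence[str]) -> Optional[Lif]:
    """Return the first reachable LIF whose role appears earliest in preferred_roles."""
    for role in preferred_roles:
        for lif in lifs:
            if lif.role == role and lif.reachable:
                return lif
    # No LIF with a preferred role can reach the DMA.
    return None


lifs = [
    Lif("cluster_mgmt", "cluster-mgmt", True),
    Lif("svm1_data_25g", "data", True),
    Lif("ic_25g", "intercluster", True),
]
chosen = pick_data_lif(lifs, ["intercluster", "data"])
print(chosen.name if chosen else "no eligible LIF")
```

With a preference list of intercluster, data, the cluster_mgmt LIF is never eligible in this model, which matches the point above that a data Vserver with the default setting does not send backup data over cluster_mgmt.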
For Ports:
How to configure NDMP server to listen on specific ports for data connections
https://kb.netapp.com/Advice_and_Troubleshooting/Data_Protection_and_Security/NDMP/How_to_configure_NDMP_server_to_listen_on_specific_ports_for_data_c...
You can test whether the communications are happening on the right LIFs. For example, netstat on the Media Agent could show the following:
C:\Windows\system32>netstat -anp tcp | findstr [NAS clust_mgmt IP]
TCP [Media Agent]:51896 [NAS-Clust_mgmt-IP]:10000 ESTABLISHED <---- Cluster Mgmt LIF
C:\Windows\system32>netstat -anp tcp | findstr [NAS Intercluster/data IP]
TCP [Media Agent]:51929 [NAS-Intercluster-IP]:36583 ESTABLISHED <----- Intercluster LIF
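If you capture that netstat output to a file, a small script can sort the connections per LIF. The following is a rough Python sketch under a few assumptions: IPv4 addresses, the usual Windows netstat column order (protocol, local address, remote address, state), and the heuristic that remote port 10000 means control and anything else means data. The IPs and LIF names are placeholders, and `classify_connection` is a made-up helper, not part of any NetApp or Commvault tool.

```python
from typing import Dict, Optional, Tuple

NDMP_CONTROL_PORT = 10000  # default NDMP control port, as seen in the thread


def classify_connection(line: str, lif_by_ip: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Map one netstat line to (lif_name, 'control' or 'data'), or None if unrelated."""
    fields = line.split()
    if len(fields) < 4 or fields[0].upper() != "TCP":
        return None
    # Remote address is the third column, formatted as ip:port.
    remote_ip, _, remote_port = fields[2].rpartition(":")
    if remote_ip not in lif_by_ip or not remote_port.isdigit():
        return None
    kind = "control" if int(remote_port) == NDMP_CONTROL_PORT else "data"
    return lif_by_ip[remote_ip], kind


# Placeholder addresses (documentation ranges), not real LIF IPs.
lif_by_ip = {"192.0.2.10": "cluster_mgmt", "198.51.100.20": "intercluster_25g"}
sample = [
    "  TCP    203.0.113.5:51896    192.0.2.10:10000       ESTABLISHED",
    "  TCP    203.0.113.5:51929    198.51.100.20:36583    ESTABLISHED",
]
for line in sample:
    print(classify_connection(line, lif_by_ip))
```

Treat the port-based split as a hint only: a connection to port 10000 on a LIF other than cluster_mgmt is worth a closer look.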
This KB is also good for understanding NDMP CAB backup and its other dependencies.
https://kb.netapp.com/Advice_and_Troubleshooting/Data_Protection_and_Security/NDMP/What_is_the_LIF_Choice_order_for_Cluster_Aware_backups_in_NDMP%3F
I am sure Commvault support can also help with this, as they have dealt with NDMP control and data connections for a large customer base.
OK I do see that the intercluster LIFs I have setup are listening on 10000.
I think I now see the problem which I missed earlier - even if I tell the cluster to use specific data ports, those will simply not be used if it (the cluster) decides it can communicate over 10000 for data. This I have reproduced. While not a huge deal it seems to leave some room for improvement.
The other thing, should people run into this in the future: it appears to me that for this to use the preferred paths (not e0M), the array must be added to CVLT using one of the better-connected LIFs (intercluster, assuming connectivity is doable). If the cluster is discovered using an FQDN that happens to resolve to the e0M IP, it tries to force everything down that path (and the default port) regardless of the -data-port-range being set.
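For anyone wanting to check this before adding the NAS client, here is a small Python sketch that looks up what a name resolves to from the Media Agent and flags any address that belongs to a management LIF. The hostname and IPs are placeholders, `resolved_ips` and `mgmt_hits` are hypothetical helper names, and the `resolver` parameter only exists so the lookup can be swapped out for testing.

```python
import socket
from typing import Callable, Iterable, List


def resolved_ips(hostname: str, resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the sorted, de-duplicated addresses that hostname resolves to."""
    infos = resolver(hostname, None)
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})


def mgmt_hits(hostname: str, mgmt_ips: Iterable[str],
              resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the resolved addresses that are known management (e.g. e0M) LIF IPs."""
    known = set(mgmt_ips)
    return [ip for ip in resolved_ips(hostname, resolver) if ip in known]


def fake_resolver(host, port):
    # Stand-in for DNS so the example runs without network access.
    return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.10", 0))]


hits = mgmt_hits("cluster1.example.com", {"192.0.2.10"}, resolver=fake_resolver)
print("resolves to mgmt LIF:", hits)
```

On a real Media Agent you would drop the `resolver` argument so the system resolver is used. A non-empty result would suggest the name points at the management LIF, which is the situation described above.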