Hi, does anyone have experience moving large (e.g. 2 TB) volumes to another node? These volumes are NFSv3 datastores for VMware 5.5, and there are lots of VMs running on them. I'm hoping I can just migrate the volumes without any noticeable impact on the VMs.
If anyone has experience, good or bad, please share. I know this is possible but have no real-life experience yet, as we only just transitioned to cDOT. We run 8.3 on a 4-node cluster of FAS3250s.
I mean I want to move a FlexVol from node2/aggr1 to node4/aggr1. I know I can use "volume move", but I just want to ask whether anyone has done this with very large, busy volumes serving datastores to VMware.
Just wondering if anyone managed to use vol move on a big NFS volume like this, and how it went.
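For anyone finding this later, the move itself is a single command in cDOT. A minimal sketch, assuming illustrative SVM/volume/aggregate names (substitute your own):

```
cluster1::> volume move start -vserver vs1 -volume ds_nfs01 -destination-aggregate node4_aggr1

cluster1::> volume move show -vserver vs1 -volume ds_nfs01
```

The second command reports the move's state and percent complete while the transfer runs in the background.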
I've found volume moves work very well and are non-disruptive for the most part. I have both NAS and SAN protocols in play and regularly move sizable volumes: 8 TB NFS ESX datastores, 25 TB general file volumes accessed via CIFS and NFS, and 8 TB+ volumes with LUNs for database servers.
The volume move engine puts additional load on the system and disks, so it can affect performance during the move. When possible, it's best to run moves when there is minimal other load on the target volume/system. But at certain sizes you sometimes have to let a move run for extended periods - I've watched a 40 TB volume move take 12 days given the other load on the system.
Volume moves are limited per node in two ways. First, moves run at lower priority; otherwise a single move running at full possible speed could starve other workloads on the same aggregates or nodes. Second, each node only gets so many volume move endpoint slots. Both the source and the target count as a slot, so moves within a single node count two slots against that node. Thus you can queue up a number of moves if you need to, and they will process in a measured fashion.
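To illustrate the queueing behavior (volume and aggregate names here are illustrative): you can kick off several moves back to back, and any that exceed a node's available endpoint slots simply wait in the queue until a slot frees up:

```
cluster1::> volume move start -vserver vs1 -volume vol_a -destination-aggregate node4_aggr1
cluster1::> volume move start -vserver vs1 -volume vol_b -destination-aggregate node4_aggr1
cluster1::> volume move start -vserver vs1 -volume vol_c -destination-aggregate node4_aggr1

cluster1::> volume move show
```

The last command lists every move with its state, so you can watch queued moves pick up as earlier ones finish.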
As a standard data management mechanism, our "NAS" cluster has two nodes that basically serve as a holding area for archive-style data - loaded up with capacity MSATA disks, but a small node pair. The other nodes in the cluster store the active data on larger controllers with NL-SAS capacity disks. Part of our standard operations is to move data back and forth between these two logical storage tiers within the cluster based on level of activity. A typical week sees around 20 TB of data shifted using this system (across multiple volumes). It's also common to rebalance volumes onto different aggregates within a tier as sizes or I/O patterns change. All the moves take place against a background of around 1,500 volumes total on this four-node cluster, with space efficiency, full SnapMirror replication, and lots of user access.
The one issue I've encountered is due to volume "container" size. In our NAS cluster we have node types with different maximum volume sizes. What is not apparent is that WAFL has an underlying "container" mechanism that impacts the size. As a volume grows, WAFL increases the size of the logical container. The logical container never shrinks (thin provisioning notwithstanding) even if the actual data does. The container size is more a function of metadata and internal structures needed, and can reflect things like a max-files setting that was manually increased beyond the standard.

The kicker is that while we only had 26 TB of real data (which should fit anywhere), the source volume had grown to a 100 TB-capable container (the max on the node in question) - likely because the actual user data had been larger at some previous point and then shrunk back down. Attempting to move that volume to a node with a 70 TB volume limit didn't fail exactly; it just didn't go. The volume move stalled without doing anything. It would show an error if you displayed all the data, but it sat in the queue doing nothing. It took a query via diagnostic-mode APIs to pull the container size of the volume and confirm the issue. The only way to move that volume was the old-fashioned manual way - Robocopy. Thankfully, ODX-enabled access allowed the data to move in a few days without killing the network.
Given your homogeneous node cluster, and unless your datastores are under significant steady load, volume moves in cDOT will be just fine.
I have just performed a 10 TB vol move to an aggregate on a different controller stack within the cluster, and it's pretty nerve-wracking - especially when you start seeing All Paths Down (APD) alerts in your hypervisor (in our case VMware). It took a good 20 hours for the move, mostly because we have a 5% change-log deduplication policy applied to this volume. It's unnerving to see those APD alerts, so next time I am going to put the datastore into maintenance mode first from the hypervisor (which will migrate all VMs and their associated files out of that datastore), then execute the volume move on the NetApp side.
The host settings are applied correctly on all of our hosts. We are still on ESXi 5.5, so that could be it. The APD lasted only micro/seconds, so it was "almost" transparent. It occurred during the cutover phase of the vol move.
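If the cutover timing is the worry, note that `volume move start` lets you control when cutover happens. A hedged sketch with illustrative names: `-cutover-action wait` tells the move to copy the data but hold before cutover until you trigger it manually, so you can pick a quiet window instead of letting it fire mid-day:

```
cluster1::> volume move start -vserver vs1 -volume ds_nfs01 -destination-aggregate node4_aggr1 -cutover-action wait

(later, during a quiet period)
cluster1::> volume move trigger-cutover -vserver vs1 -volume ds_nfs01
```

There is also a `-cutover-window` option to bound how long a cutover attempt may take; check the `volume move start` man page for your ONTAP release for the exact defaults.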
When using ESX/ESXi NFS clients with NetApp storage controllers, you might experience the following issues:
Issue 1: Intermittent NFS APDs on VMware ESXi 5.5 U1
When running ESXi 5.5 Update 1, the ESXi host intermittently loses connectivity to NFS storage and an All Paths Down (APD) condition to NFS volumes is observed.
This issue is resolved in ESXi 5.5 Express Patch 04. Refer to VMware KB articles http://kb.vmware.com/kb/2076392 or http://kb.vmware.com/kb/2077414 for instructions to install the patch.
Issue 2: Random disconnection of NFS exports under workloads with an excessive number of requests
On some NetApp storage controllers with NFS enabled, you might experience the following symptoms:
- The NFS datastores appear to be unavailable (greyed out) in vCenter Server, or when accessed through the vSphere Client.
- The NFS shares disappear and reappear again after a few minutes.
- Virtual machines located on the NFS datastore are in a hung/paused state while the NFS datastore is unavailable.
This issue is most often seen after a host upgrade to ESXi 5.x or the addition of an ESXi 5.x host to the environment.
Chasing down the bug report trail indicates that the second issue is corrected in all up-to-date cDOT releases.
If you will remain on ESXi 5.5 with NFS, you definitely want the indicated patch. For me, a typical datastore is about 2-3 TB used (the higher end hits 8 TB), all NFS, and I move them at will as needed without any concern for system load. VMware never notices anything out of the ordinary.
Hope this helps.
Lead Storage Engineer | Consilio LLC
NCIE SAN Clustered, Data Protection