ONTAP Discussions

Decommission nodes/LIFs from a Cluster and remount NFS

netappmagic
6,456 Views

We want to decommission two nodes and the LIFs homed on these nodes. The problem is that some VMware NFS datastores and other NFS file systems are using these LIFs/IPs. Is there any way to remove these LIFs/IPs non-disruptively?

My understanding is that we would need downtime to remount the NFS datastores or file systems.

 

Thanks!

 

10 REPLIES

dbenadib
6,451 Views

Hi lad,

 

Basically, you are able to move the LIFs from node to node, which is a non-disruptive process. So if you find that you need to keep some LIFs, move them to another node.

BTW, you can validate LIF usage using the CLI: network connections active show

Cheers !

netappmagic
6,436 Views


 

There is a concern with that approach: all network traffic will go to the two nodes that I move these LIFs to, which would cause an unbalanced load.

So, in the long run, it would be better to clean them up and avoid confusion. Removing them will require downtime.

 

Make sense?

dbenadib
6,429 Views

Can you clarify?

 

How many nodes do you have in your cluster?

What kind of maintenance are you performing?






netappmagic
6,417 Views

I have a total of 8 nodes in the cluster. I am planning to replace two of them by adding two new ones first, then taking the two old ones out. In the end, there will still be 8 nodes.

 

I understand that I can move all the LIFs to the other two nodes (HA pair) without interruption. However, as I said, moving the LIFs will also move all the connections for the NFS datastores and file systems along with them, which will put a lot of load on those two nodes and cause an unbalanced load.

 

I am thinking of manually unmounting the NFS exports that connect through the two nodes going away, and remounting them through the two new nodes. That will mean service downtime.

Make sense?

dbenadib
6,411 Views

I would do the following:

Add the 2 new nodes (the cluster will have 10 nodes)

Validate that the new nodes are correctly connected to the relevant networks

Create intercluster LIFs (if SnapMirror is in use)

Move all volumes to the new HA pair 1:1 (that way you ensure the same level of performance)

Move / rebalance all LIFs to the new HA pair 1:1 (that way you ensure the same level of performance)

Ensure that no volumes reside on the old HA pair

Clean up the old nodes

Remove intercluster LIFs

Delete aggregates

Disable HA

Move epsilon out of this HA pair

Evict node by node
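
The "ensure that no volumes reside on the old HA pair" and epsilon steps above can be expressed as a pre-eviction check. What follows is a minimal Python sketch, not a NetApp tool: the inventory dictionaries, the node and volume names, and the function name `removal_blockers` are all hypothetical, and the data would have to be collected separately (for example from CLI output or the REST API).

```python
def removal_blockers(old_nodes, volume_node, lif_home_node, epsilon_node):
    """Return human-readable reasons why the old nodes are not yet safe to evict.

    volume_node:   {volume name: node currently hosting it}  (hypothetical inventory)
    lif_home_node: {LIF name: home node of the LIF}
    epsilon_node:  name of the node holding epsilon, or None
    """
    old = set(old_nodes)
    blockers = []
    # Any volume left on an old node blocks removal.
    for vol, node in sorted(volume_node.items()):
        if node in old:
            blockers.append(f"volume {vol} still resides on {node}")
    # Any LIF still homed on an old node needs to be re-homed or deleted first.
    for lif, node in sorted(lif_home_node.items()):
        if node in old:
            blockers.append(f"LIF {lif} is still homed on {node}")
    # Epsilon has to live on a node that stays in the cluster.
    if epsilon_node in old:
        blockers.append(f"epsilon is still held by {epsilon_node}")
    return blockers


blockers = removal_blockers(
    old_nodes=["node1", "node2"],
    volume_node={"ds_vmware01": "node9", "home_dirs": "node2"},
    lif_home_node={"nfs-lif-node1": "node1", "nfs-lif-node9": "node9"},
    epsilon_node="node3",
)
for reason in blockers:
    print(reason)
```

An empty list means none of the three conditions is outstanding; it does not replace the checks ONTAP itself performs when a node is removed.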

netappmagic
6,405 Views

Your steps look very good.

 

However, there is one piece of information I had not shared with you: the two new nodes have already been added to the cluster, so there are now 10 nodes, with new LIFs and new everything. Load is already balanced across all 10 nodes.

 

Now I just need to remove the two old nodes. I can move all their LIFs to the other nodes, but that will unbalance the load and leave all the old LIFs with the old naming convention (e.g., nfs-lif-node1, nfs-lif-node2, ...) in the cluster forever, even though node1 and node2 will be gone. That is why I am thinking of taking downtime and removing all the old LIFs.

 

Make sense?

 

 

 

 

dbenadib
6,400 Views

It makes sense. The only issue with that is the downtime.

 

If you want to avoid downtime, you have to migrate the LIFs. After that, balance the LIFs across the nodes (for better performance, ensure that each LIF and the volumes it serves reside on the same node) and rename them according to your naming convention.
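
The migrate-then-rename sequence could be scripted along the following lines. This is a sketch under assumptions: the SVM, node, port, and new LIF names are made up, and the helper `rehome_commands` is hypothetical. It only prints `network interface modify`, `revert`, and `rename` command lines for review; check the exact syntax against your ONTAP release before running anything.

```python
def rehome_commands(vserver, lif, new_node, new_port, new_name=None):
    """Build the ONTAP CLI lines to re-home one data LIF and optionally rename it."""
    base = f"-vserver {vserver} -lif {lif}"
    commands = [
        # Point the LIF's home at the new node and port.
        f"network interface modify {base} -home-node {new_node} -home-port {new_port}",
        # Send the LIF to its (new) home port.
        f"network interface revert {base}",
    ]
    if new_name and new_name != lif:
        # Rename so the LIF name no longer refers to a node that is going away.
        commands.append(f"network interface rename {base} -newname {new_name}")
    return commands


# Hypothetical plan: (SVM, old LIF name, new home node, new home port, new LIF name)
plan = [
    ("svm_nfs", "nfs-lif-node1", "node9", "e0c", "nfs-lif-node9b"),
    ("svm_nfs", "nfs-lif-node2", "node10", "e0c", "nfs-lif-node10b"),
]
for entry in plan:
    for line in rehome_commands(*entry):
        print(line)
```

Generating the commands as text first makes it easy to review the 1:1 mapping of old LIFs to new nodes before anything changes on the cluster.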

 

BR

netappmagic
6,349 Views

Thank you!

aborzenkov
6,334 Views

@netappmagic wrote:

Make sense?


No. You cannot remove nodes that host volumes. If the volumes have already been relocated to other nodes, any traffic to LIFs on these nodes goes over the cluster interconnect to the other node(s). So moving the LIFs to the nodes that actually host the volumes will improve the situation by avoiding indirection via the cluster interconnect.
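
To make the indirection point concrete: a mount is served directly when the LIF it uses currently sits on the node that hosts the volume, and indirectly when requests have to cross the cluster interconnect. A small Python sketch that flags indirect mounts, using hypothetical inventory data and names (`indirect_mounts`, the mount labels, and the volume names are illustrative only):

```python
def indirect_mounts(mounts, lif_node, volume_node):
    """Return (mount label, LIF node, volume node) for every mount served indirectly.

    mounts:      {mount label: (LIF name, volume name)}
    lif_node:    {LIF name: node where the LIF currently resides}
    volume_node: {volume name: node hosting the volume}
    """
    indirect = []
    for label, (lif, volume) in sorted(mounts.items()):
        # Different nodes means the request is forwarded over the interconnect.
        if lif_node[lif] != volume_node[volume]:
            indirect.append((label, lif_node[lif], volume_node[volume]))
    return indirect


mounts = {
    "esx-datastore-01": ("nfs-lif-node1", "ds_vmware01"),
    "app-share": ("nfs-lif-node9", "app_data"),
}
lif_node = {"nfs-lif-node1": "node1", "nfs-lif-node9": "node9"}
volume_node = {"ds_vmware01": "node9", "app_data": "node9"}

for label, via, host in indirect_mounts(mounts, lif_node, volume_node):
    print(f"{label}: enters at {via}, volume on {host} (crosses the interconnect)")
```

In this example the volume behind `esx-datastore-01` has already moved to node9 while its LIF is still on node1, so migrating that LIF to node9 removes the extra hop.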

netappmagic
5,972 Views

@aborzenkov 

You are right. Unfortunately, we didn't do what you suggested. Now throughput to the LIFs on two specific nodes is much heavier than on the others.

 

Question: How do I know whether throughput to these LIFs is too heavy and causing performance issues? In other words, how heavy is too heavy? As far as I can tell, there is no way to see latency per LIF.

 

 
