Unjoin Node from a Cluster

sheelnidhig · 2014-02-14

I am trying to unjoin a node from a cluster but it failed with the following message:

Error: command failed: [Job 1378] Job failed: Failed to disable cluster-scoped metadata volume support prior to unjoin operation. Reason: entry doesn't exist.

How can we fix this?

I tried rebooting the node.

I set the eligibility to false.

It is also not the epsilon node.

Am I missing anything?

ismopuuronen · 2014-02-17

Hi,

Have you deleted or moved all the data volumes to different nodes and destroyed all aggregates except root?

Migrate (or delete) the LIFs from this node, modify the failover groups so they don't contain any ports from this node, and remember to disable storage failover.

Port sets and VLANs also need to be deleted from this node.

Epsilon you mentioned already.

I think that should be all.
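
A rough sketch of how these checks might look from the clustershell. The prompt "cluster1" and the node name "node-01" are placeholders, not taken from the original poster's system, and parameters can differ between Data ONTAP releases, so treat this as a starting point rather than a verified procedure:

```
# "cluster1" and "node-01" are placeholder names
# Volumes still hosted on the node, and aggregates (check the Nodes column)
cluster1::> volume show -node node-01
cluster1::> storage aggregate show

# LIFs that are homed on the node or currently running on it
cluster1::> network interface show -home-node node-01
cluster1::> network interface show -curr-node node-01

# Failover groups, VLANs and port sets that may still reference the node
cluster1::> network interface failover-groups show
cluster1::> network port vlan show -node node-01
cluster1::> lun portset show

# Storage failover state, then disable it for the node
cluster1::> storage failover show
cluster1::> storage failover modify -node node-01 -enabled false

# Epsilon and eligibility are shown at the advanced privilege level
cluster1::> set -privilege advanced
cluster1::*> cluster show
```

Anything these commands still list for the node is a candidate for what blocks the unjoin job.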

Br.

Ismo.

sheelnidhig · 2014-02-18

Yes, I have moved everything; otherwise we would have received an error message.

So for now we do not have any volume or LIF hosted on this node.

sheelnidhig · 2014-02-25

I saw that some ports from the node were still members of some failover groups.

After I cleaned up the failover groups and tried again, it worked.
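
For anyone hitting the same thing, a hedged sketch of what that cleanup could look like. The names "cluster1", "fg_data", "node-01", "e0c" and "vs1" are made up for illustration, and the syntax changed between releases as far as I recall, so check the man page for your version before running anything:

```
# All names below are hypothetical examples
# List every failover group and look for ports that belong to the node
cluster1::> network interface failover-groups show

# Data ONTAP 8.2 and earlier: remove a single node:port entry from a group
cluster1::> network interface failover-groups delete -failover-group fg_data -node node-01 -port e0c

# Data ONTAP 8.3 and later: failover groups belong to a Vserver and hold targets
cluster1::> network interface failover-groups remove-targets -vserver vs1 -failover-group fg_data -targets node-01:e0c
```

Only one of the two removal forms applies to a given release.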

Sheel

D_BEREZENKO · 2015-08-30

I had the same issue "Failed to disable cluster-scoped metadata volume support prior to unjoin operation":

clA::*> cluster unjoin -node clA-01 -force true

Warning: This command will forcibly unjoin node "clA-01" from the cluster. You must unjoin the failover partner as well.
         This will permanently remove from the cluster any volumes that remain on that node and any logical interfaces
         with that node as the home-node. Contact support for additional guidance.
Do you want to continue? {y|n}: y
[Job 1819] Checking prerequisites
Error: command failed: [Job 1819] Job failed: Failed to disable cluster-scoped metadata volume support prior to unjoin
       operation. Reason: All cluster-scoped volumes are still online.

I removed all the volumes on the affected aggregates and the LIFs from the affected nodes, changed the home ports of the affected LIFs, and removed the nodes' ports from manually created failover groups.

Then I took offline a data aggregate connected to the node to be unjoined.

clA::*> aggr offline aggr05

At last the job finished successfully for one node!

clA::*> cluster unjoin -node clA-01

Warning: This command will unjoin node "clA-01" from the cluster. You must unjoin the failover partner as well. After the
         node is successfully unjoined, erase its configuration and initialize all disks by using the "Clean configuration
         and initialize all disks (4)" option from the boot menu.
Do you want to continue? {y|n}: y

[Job 1843] Cleaning cluster database
[Job 1843] Job succeeded: Cluster unjoin succeeded
If applicable, also unjoin the node's HA partner, and then clean its configuration and initialize all disks via the boot menu. Run "debug vreport" to address any remaining aggregate or volume issues.

Then I found that a system volume was still online on a data aggregate connected to a node to be unjoined, and it cannot be taken offline:

clA::*> vol show -aggregate aggr06
Vserver   Volume       Aggregate    State      Type       Size  Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
clA       MDV_aud_02e7b476ab4541e5b3b0a12357288980
                       aggr06       online     RW          2GB     1.90GB    5%

I tried to move it and found that it is an audit log volume:

vol move start -volume MDV_aud_02e7b476ab4541e5b3b0a12357288980 -destination-aggregate aggr01

Error: command failed: Volume "MDV_aud_02e7b476ab4541e5b3b0a12357288980" in Vserver "vsm01" cannot be moved. Reason: This
       volume is configured to hold audit logs specific to the current aggregate.

Then I deleted the (CIFS) audit logging configuration:

 vserver audit delete -vserver vsm01
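
A sketch of the audit step for anyone repeating this, assuming the Vserver name vsm01 from the error message above. As far as I know, the documented procedure disables auditing before deleting the configuration. Deleting it discards the audit settings for that Vserver, so note them down first if auditing has to be recreated later:

```
# Assumes the Vserver "vsm01" from the error message above
# Record the current audit configuration before removing it
clA::*> vserver audit show -vserver vsm01

# Disable auditing, then delete the configuration
clA::*> vserver audit disable -vserver vsm01
clA::*> vserver audit delete -vserver vsm01

# Check whether any audit metadata volumes (MDV_aud_*) remain
clA::*> volume show -volume MDV_aud_*
```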

Just in case, I also disabled system auditing (this was probably unnecessary):

clA::*> security audit show
               Auditing State for              Auditing State for
               Set Requests:                   Get Requests:
               ------------------              ------------------
    CLI:       on                             off
    ONTAPI:    on                             off
    SNMP:      on                             off

clA::*> security audit modify -cliset off
clA::*> security audit modify -ontapiset off
clA::*> security audit modify -snmpset off

And the system volume disappeared:

clA::*> vol show -aggregate aggr06
There are no entries matching your query.

And at last the second node was unjoined successfully:

clA::*> cluster unjoin -node clA-02

Warning: This command will unjoin node "clA-02" from the cluster. You must unjoin the failover partner as well. After the
         node is successfully unjoined, erase its configuration and initialize all disks by using the "Clean configuration
         and initialize all disks (4)" option from the boot menu.
Do you want to continue? {y|n}: y

[Job 1864] Cleaning cluster database
[Job 1864] Job succeeded: Cluster unjoin succeeded
If applicable, also unjoin the node's HA partner, and then clean its configuration and initialize all disks via the boot menu. Run "debug vreport" to address any remaining aggregate or volume issues.

Finally, I restored the security audit settings:

clA::*> security audit modify -cliset on
clA::*> security audit modify -ontapiset on
clA::*> security audit modify -snmpset on