
Remove unhealthy nodes from the cluster

LearnfromNetApp

The system is a FAS2750 running ONTAP 9.16. While controller A (node 1) had taken over controller B (node 2), I booted node 2, entered the boot menu, and selected option 4 to clear the configuration. During the restart, node 2 stopped at "waiting for giveback". I then performed the giveback from node 1, and node 2 came up as a brand-new node (node 3). On node 1, I recreated the cluster and let node 3 rejoin it.

The old node 2 configuration (cluster IP, node IP, LIFs, vserver, and so on) is still in place. I want to remove the failed node 2, but the prerequisite is to delete or migrate the vserver LIFs and other configuration related to node 2. However, every attempt to modify the cluster IP, node IP, LIFs, and so on fails with:

Error: command failed: RPC: Couldn't make connection [from mgwd on node "FAS2750-03" (VSID: -1) to vifmgr at 169.254.157.6]

I then tried to force the removal with this command:
FAS2750::*> cluster remove-node -node FAS2750-02 -force true

Warning: This command will forcibly remove node "FAS2750-02" from the cluster.
You must remove the failover partner as well. This will permanently
remove from the cluster volumes that remain on that node and logical
interfaces with that node as the home-node. Contact support for
additional guidance.
Do you want to continue? {y|n}: y
[Job 1064] Cleaning cluster database
Error: command failed: [Job 1064] Job failed: Failed to delete SAN
configuration for lif with id 1032 from its current node (FAS2750-02):No
nodes are available to process the command. Verify that all nodes are
healthy using the "cluster show" command, then try the command again.

The system health alert output also shows:
FAS2750::*> system he al show
Node: FAS2750-01
Alert ID: ClusterSwitchlessConfig_Alert
Resource: FAS2750
Severity: Major
Indication Time: Wed May 13 16:39:20 2026
Suppress: false
Acknowledge: false
Probable Cause: No cluster switch is detected and the switchless
option is not enabled.
Possible Effect: Communication problems and cluster connectivity issues
occur.
Corrective Actions: 1) If the cluster network is configured as a two-node
switchless cluster (TNSC), enable switchless detection
by using the "network options detect-switchless-cluster
modify -enabled true" command. No further action is
required.
2) If the cluster network is configured with cluster
switches, the nodes fail to detect the switches.
Ensure that the network interfaces on the cluster
switches connected to the node cluster ports are enabled
on both sides.
If the errors are corrected, stop. No further action is
required. Otherwise, continue to step 3.
3) Check the physical connections between the nodes and
the cluster switches. Replace network cables with
known-good cables. If the errors are corrected, stop.
No further action is required. Otherwise, continue to step 4.
4) Ensure that either CDP (for Cisco switches) or ISDP
(for NetApp CN1610 and Broadcom BES-53248 switches) is
enabled on the cluster switches.


FAS2750::*> network options detect-switchless-cluster show

Enable Switchless Cluster Detection: true

 

Could someone with experience please take a look and advise how to remove node 2 cleanly? Thank you very much.
