Accepted Solution

Unified Manager virtual appliance 7.1 - cannot remove cluster, hung discovery


Hi all, a previously monitored cluster recently started giving polling errors (I don't recall the exact error, but it was something like 'password for performance is wrong or polling failed').  I confirmed the SSL certs are still valid until 2018 (I had regenerated them months ago).  Rediscovery and editing the credentials both failed, so I tried removing the cluster altogether and initiating a fresh discovery.  The new discovery never completed and lists the following status:

 

Pairing Status (Not Paired)

Health: Discovering

Performance: Poll completed

 

Two weeks later, discovery is still hanging and I cannot remove or edit the cluster because of the error "Cluster settings cannot be modified when discovery is in progress."  I've already tried the instructions in the following thread (on both UM and PM), but the 'sudo' command returns no results, so that's not my problem.

http://community.netapp.com/t5/OnCommand-Storage-Management-Software-Discussions/Unable-to-remove-the-cluster-from-OCUM-7-1/m-p/129111

 

Monitoring and polling of all other clusters (same hardware and OS) is working properly; only this one cluster is experiencing the issue.  On the affected system, access via the SysMgr GUI, SSH, and PowerShell all works fine without errors.  Any other ideas?

 

Virtual Appliances UM 7.1, PM 7.1 w/ full integration

 

Thanks,

 

Re: Unified Manager virtual appliance 7.1 - cannot remove cluster, hung discovery

hi

 

Try this to see if it helps:

https://kb.netapp.com/support/s/article/ka11A0000001XlGQAU/An-incomplete-removal-of-a-cluster-via-the-UM-dashboard-prevents-further-collection-when-th...

 

thanks

 

Jeff

 


Re: Unified Manager virtual appliance 7.1 - cannot remove cluster, hung discovery

Thank you for this!  My symptoms were slightly different in that I had no cluster with ID '-1'.  In my case, the cluster with ID '19' was the problematic one: it still appeared on OPM (it hadn't been deleted) but did not appear on UM because the re-discovery process was hung.

 

So I modified the script attached to the article to remove the cluster with ID 19, then rebooted both vApps.  Since the script removed the old instance of the cluster from OPM, UM was able to finish its initial discovery.  OPM re-discovery also completed, so we're back to normal here.

 

Thank you!