Re: Message OPM "performance collection failed" + Pairing status "bad"

Chris_AUBIN · ‎2018-09-11

Hello,

On my OnCommand Unified Manager, I get the classic message:

"The cluster xxxx (IP address) is not reachable through OPM or OPM performance collection failed. Cluster xxxx performance events have stopped being sent by Performance Manager server yyyy. This condition could be caused by a communication issue between Unified Manager and Performance Manager, or between Performance Manager and cluster xxxx."

I also have:

Pairing Status (Bad): The credentials provided by Performance Manager for the user with Event Publisher role are either incorrect or Performance poll is not complete or has failed.

Health: Poll completed
Performance: Poll completed
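
Since the message points at a communication problem, a basic reachability test run from the Unified Manager server can help rule out the network. This is only a minimal sketch: it assumes the servers talk over HTTPS on port 443 (not confirmed in this thread) and that `curl` is installed. The `can_reach` helper and the host name in the usage comment are hypothetical.

```shell
# Return 0 if an HTTPS endpoint answers at all, non-zero otherwise.
# $1 = host name or IP address, $2 = port (defaults to 443, an assumption).
# --insecure skips certificate validation, so this only tests that the
# TCP connection and TLS handshake succeed, not that the certificate is valid.
can_reach() {
  curl --silent --insecure --output /dev/null \
       --connect-timeout 5 "https://$1:${2:-443}/"
}

# Hypothetical usage on the Unified Manager server:
#   can_reach opm.example.com && echo "OPM reachable" || echo "OPM not reachable"
```

A successful check only shows that the network path is open. It says nothing about the pairing credentials named in the status message.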

Well, I am not sure of anything for now, but everything seems to work after all, including monitoring and events. I should point out that we monitor only one node with two heads, A and B, not two nodes, on this platform. It is not really a cluster of several nodes.

I didn't install Unified Manager or Performance Manager myself, so I do not know the application. Under "Management" > "Users", I see 0 users added. In addition, the "Add" user button is greyed out, so I couldn't add anyone even if I wanted to. The SSL certificate is valid from 2017 to 2022. In the "Performance" tab, "Events" shows an "Event State" column with the value "Obsolete", but the detected time is correct...

Do you have any idea why I get these messages? If OPM performance collection fails, I guess my data should not be updating correctly. Or is a user missing from the configuration?

Best regards,

Christian

Chris_AUBIN · ‎2018-09-14

Hello,

Any ideas?

Thanks!

Best regards,

Christian

niels · ‎2018-09-14

What versions of OCUM and OPM are you running? They seem to be rather old if those are two separate servers.

OCUM and OPM have been joined into a single server with OCUM 7.0. The stand-alone versions went EOS in January.

Looks like you should upgrade, or if you don't care about historical data too much, re-install.

The minimum recommended version would be 7.3. You would need several interim steps to go from whatever version you have to 7.3, many of which are EOA/EOS and can only be obtained through a special process.

regards, Niels

Chris_AUBIN · ‎2018-09-17

Hi,

First, thanks for your reply! Well... I will try to verify that today. I'll let you know as soon as I find out :).

Best regards,

Christian

Chris_AUBIN · ‎2018-09-17

Hello,

------------------------------------------------------------

So... On Unified Manager:

------------------------------------------------------------

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.3 (Maipo)

# rpm -qa (...)

ocie-serverbase-7.1.0-2017.05.J2300.x86_64
ocie-server-7.1.0-2016.05.J2345.x86_64
ocie-au-7.1.0-2016.05.J2345.x86_64

netapp-platform-base-7.1.0-2017.05.J2300.x86_64
netapp-application-server-7.1.0-2017.05.J2300.x86_64
netapp-node-4.4.7-1706041255.x86_64
netapp-ocum-7.1-1706041255.x86_64

------------------------------------------------------------

On Performance Manager:

------------------------------------------------------------

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.3 (Maipo)

# rpm -qa (...)

ocie-serverbase-7.1.0-2017.05.J2300.x86_64
ocie-server-7.1.0-2016.05.J2345.x86_64
ocie-au-7.1.0-2016.05.J2345.x86_64

netapp-platform-base-7.1.0-2017.05.J2300.x86_64
netapp-application-server-7.1.0-2017.05.J2300.x86_64
netapp-opm-7.1.0-2017.01.N170602.0930.x86_64

------------------------------------------------------------

At first glance, the versions seem consistent with each other. What do you think? I suppose I can check whether a more recent version is available.
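
For a side-by-side check, the two package lists can also be compared mechanically. This is a minimal sketch, assuming the output of `rpm -qa` was saved to one file per server. The file names and the `pkg_versions` helper are hypothetical and not part of OCUM or OPM.

```shell
# Print "name version" for every ocie-* / netapp-* package found in a
# saved `rpm -qa` listing ($1 = path to the listing).
# Assumes the usual name-version-release.arch layout, in which neither
# the version nor the release part contains a dash.
pkg_versions() {
  grep -E '^(ocie|netapp)-' "$1" |
    sed -E 's/^(.*)-([^-]+)-[^-]+$/\1 \2/' |
    sort
}

# Hypothetical usage, after copying both listings to one machine:
#   rpm -qa > ocum-pkgs.txt        # on the Unified Manager server
#   rpm -qa > opm-pkgs.txt         # on the Performance Manager server
#   pkg_versions ocum-pkgs.txt > ocum.txt
#   pkg_versions opm-pkgs.txt  > opm.txt
#   diff ocum.txt opm.txt          # shows packages that differ or exist on one side only
```

Product-specific packages such as netapp-ocum and netapp-opm will always show up in the diff. The shared ocie-* and netapp-platform-* packages are the ones worth comparing.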

Best regards,

Christian

niels · ‎2018-09-17

Hi Christian,

As you run OCUM/OPM 7.1, it's not as old as I initially thought.

Nonetheless, 7.1 will become EOS at the end of the year. Until then, you can still open a support call to get your issue corrected.

In the past, when I observed similar communication issues, they were caused by inconsistent IDs in the OCUM and OPM databases, so that the clusters could not be matched between the two.

Support has a procedure to get those corrected in case it's the actual cause.

Please have it investigated by NetApp support.

Once that is cleared, I suggest upgrading.

You would need to go through OCUM 7.2, which is the first version in which OCUM and OPM are merged into a single server. OCUM 7.2 contains a workflow to import historical performance data from your existing 7.1 OPM.

Once done, plan to upgrade to at least 7.3.

You can potentially just upgrade directly if you don't care about historical performance data.

Once upgraded to 7.2 or later, OCUM will immediately start collecting perf data from your clusters, including up to the past 13 days from the ONTAP performance archiver files.

As OCUM and OPM are merged, there is no communication between the servers that could fail.

Kind regards, Niels

Chris_AUBIN · ‎2018-09-17

Hi,

Thanks! So, to summarize:

1. We check with NetApp support to get that fixed.

2. Once done, we upgrade to 7.2. Then OCUM will automatically migrate the data from OPM to itself.

3. We upgrade to 7.3.

If we can't check with NetApp, we can upgrade OCUM anyway. But in this case, OCUM will keep its data, while OPM will lose its data and start from scratch.

I will search online for a specific procedure to upgrade OCUM to 7.2.

One question: can OCUM work without OPM? If we stop OPM and keep OCUM running without changing its version, what will happen? Will we just get a new error message? Or must we change the configuration on OCUM to allow the application to work alone?

Well, I am not sure I will be able to open a ticket with NetApp because, even though the filer is supported by NetApp, that doesn't seem to be the case for OCUM/OPM. I tried a few days ago, and the service replied that we have to pay to get an answer...

Best regards,

Christian

niels · ‎2018-09-17

Hi Christian,

OCUM and OPM don't have their own support entitlement.

As long as you have a NetApp FAS or AFF controller under service, you can log a case for OCUM and OPM and are entitled to software upgrades.

With OCUM 7.1 and earlier, you can run OCUM without pairing it with OPM. You just won't have performance data and performance-related alerting. Health and capacity data will still be kept in OCUM.

You would need to de-register OPM from OCUM. That will clean up the relationship and remove all performance information from the DB.

https://library.netapp.com/ecmdocs/ECMLP2553755/html/GUID-0E700C6E-D202-4B2D-BB17-092E4DF28C18.html

Depending on the level of integration between OCUM and OPM, you may require support assistance to de-register the instance.

Starting with OCUM 7.2, OPM is integrated with OCUM. You cannot disable performance data collection.

regards, Niels