I have a problem with OPM 1.1 and a new CDOT cluster running ONTAP 8.2P2, which is located in another network. I also have an older cluster running 8.2.2P1 on the same network as OPM, and it works fine.
Anyway, I opened port TCP/443 from OPM towards the NetApp, and after that the discovery process started to work. After a few minutes, though, I started to get the following error message on the "Manage Data Sources" page: Communication problem with the cluster: xx.xx.xx.xx, Failed to download Archive files. Error: ',xx.xx.xx.xx: Download wait timed out after 240000 millisecods. ' on try 37 out of 37.
Today those error messages are gone but I still get this error message on OPM: "Cluster_name (xx.xx.xx.xx)" is unreachable. Performance Manager is no longer monitoring this cluster.
I've created a dedicated user account for this purpose and given it "ontapi" rights.
Is there some other port towards the NetApp that has to be opened?
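For anyone who wants to rule out the firewall first, here is a minimal Python sketch (not a NetApp tool) that tests whether a TCP connection to port 443 on the cluster management address can be opened from the OPM host. The function name `port_is_open` and the default timeout are my own choices for illustration. A successful connect only shows that the network path is open; it says nothing about whether the account has the right permissions.

```python
import socket


def port_is_open(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    `port_is_open` is a hypothetical helper, not part of any NetApp software.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run it from the OPM server with the cluster management IP, for example `port_is_open("xx.xx.xx.xx")`. If it returns False, the problem is on the network or firewall side rather than in OPM itself.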
After adding the "http" permission to my OPM user, it also works with this user. During the last few days of testing OPM we sometimes had problems with reachability of the cluster, but this lasts only a few minutes and then everything works fine again. Maybe this has something to do with the burt? Where can I get information about the burt? Is there publicly accessible documentation?
The message we get in this case: 11:24 AM, 8 Oct : Cluster xxxx (xxxx) is unreachable. Performance Manager is no longer monitoring this cluster.
Maybe you have an idea where this effect comes from.
Yes, everything is getting collected. When this problem occurs, the OnCommand Unified Manager dashboard shows a reachability risk for the cluster, and it takes a long time to switch back to green after the problem is solved, which is very disturbing. If you click on the risk, no event is shown, maybe because the problem has already been resolved. That's really annoying when you expect it to turn green immediately after the problem is solved.
I have the same problem on OCP 2.0.0RC1 talking to a CDOT 8.3 cluster. Status "Network Acccess_failure". Status Message "Communication Problem with the Cluster. Failed to Download Archive Files. Error 'clustername:Download wait timed out after:240000 milliseconds.' on try 37 out of 37." What does this mean, and how can I solve it? I have other CDOT clusters running 8.3, 8.3.1 and even 8.2.3 which are working fine with OPM. I don't see stats for this particular cluster in OPM: no IOPS, latency, utilization, nothing. Please advise.
I now have this issue. The cluster was being monitored fine until I added two new nodes (AFF8080s) that were on 8.3.2. I have since downgraded them to 8.3.1 so the cluster versions are now in sync, but OPM V2.0.0 is saying it "Failed to download Archive files".
We also found that when you enable FULL integration between OCUM and OCPM, the OCUM credentials are then used for OCPM, and they *may* not have the right permissions (in our case, the account didn't have HTTP access, only ONTAPI).