Oncommand System Manager slow while DFM is running

We have just started having a problem where Oncommand System Manager is running extremely slowly.

It has never been very fast, but now it has become almost unusable.

It happens for everyone who tries to connect, with various versions around 3.1.1.

 

The interesting bit is that if we turn off DFM 5.1, which monitors the filers, it's lightning fast for everyone.

 

I have also tried removing the filers from DFM one by one, and once the clustered pair is removed, access to it from System Manager speeds up again.

 

It must be something DFM is doing, like excessive API calls or SNMP requests, but I am having difficulty troubleshooting.

Any help much appreciated. 

Re: Oncommand System Manager slow while DFM is running

Hi,

 

What do you mean by DFM? I have the same problem.

 

Thanks!

Re: Oncommand System Manager slow while DFM is running

The term "DFM" refers to DataFabric Manager, the original product name for what is now called Unified Manager 5.x (a.k.a. Operations Manager).

 

In fact, it sounds as if the DFM service and System Manager are installed on the very same server/VM.

This is not a supported configuration. Other services may not be deployed alongside Unified Manager on the same server.

 

As you noticed, the DFM service is quite resource intensive and issues lots of API and SNMP queries for continuous monitoring. The bigger the environment to monitor (not only systems, but objects within these systems), the more resources it needs.

 

First and foremost, both DFM and System Manager run a web server. Having both on the same server might cause conflicts.

Additionally, System Manager issues SNMP queries and API calls itself. It might very well be that the combined load of the DFM monitoring service and the System Manager calls forces ONTAP to pause communication.

ONTAP has an internal setting that pauses communication when a certain number of calls per second from the same source IP is detected. This prevents the controller from becoming unresponsive to client protocol I/O, for example during a DDoS attack, because it is busy servicing SNMP and API calls.
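
To illustrate why two tools sharing one source IP can trip such a limit sooner than either would alone, here is a minimal Python sketch of a per-source sliding-window throttle. It is only an illustration of the general idea, not ONTAP's actual mechanism: the class name, the threshold of 5 calls, the one-second window and the IP addresses are all assumptions made up for the example.

```python
from collections import defaultdict, deque


class PerSourceThrottle:
    """Hypothetical per-source-IP call limiter (NOT ONTAP's real implementation)."""

    def __init__(self, max_calls: int, window_s: float = 1.0):
        self.max_calls = max_calls      # assumed threshold, illustrative only
        self.window_s = window_s        # assumed window length in seconds
        self._recent = defaultdict(deque)  # source IP -> timestamps of accepted calls

    def allow(self, source_ip: str, now: float) -> bool:
        """Return True if a call from source_ip at time `now` is accepted."""
        q = self._recent[source_ip]
        # Forget accepted calls that have fallen out of the window.
        while q and now - q[0] >= self.window_s:
            q.popleft()
        if len(q) >= self.max_calls:
            return False  # too many recent calls from this IP: pause it
        q.append(now)
        return True
```

With a limit of 5 calls per second, four monitoring calls from 10.0.0.5 leave room for only one GUI call from the same address within that second; the next one is refused, while a call from a different address is still accepted.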

 

You should install System Manager on a server/VM separate from the DFM service.

 

Also, do you have significant latency between the server System Manager is installed on and the storage controller(s)? As System Manager needs to issue lots of API calls, some of them sequentially rather than in parallel, to get inventory information, every millisecond of latency reduces its responsiveness. But since you said it works OK once the DFM service is disabled, I assume that's not the case and it's really more related to the server/VM.
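
As a rough back-of-the-envelope model of the latency point above, the following Python sketch estimates how long a batch of API calls takes when each call has to wait for a network round trip. The function name and all numbers (200 calls, 5 ms processing time per call) are assumptions for illustration, not measured System Manager behaviour.

```python
import math


def inventory_refresh_seconds(num_calls: int, rtt_ms: float,
                              service_ms: float, parallelism: int = 1) -> float:
    """Estimate wall-clock seconds for num_calls API calls.

    Calls are issued in batches of `parallelism`; each batch costs one
    round trip plus the per-call processing time (a simplifying assumption).
    """
    batches = math.ceil(num_calls / parallelism)
    return batches * (rtt_ms + service_ms) / 1000.0


# Hypothetical figures: 200 sequential calls, 5 ms processing each.
local = inventory_refresh_seconds(200, rtt_ms=1.0, service_ms=5.0)
remote = inventory_refresh_seconds(200, rtt_ms=50.0, service_ms=5.0)
```

Under these assumed figures, the same 200 sequential calls take about 1.2 seconds at 1 ms round-trip time and about 11 seconds at 50 ms, which is why latency matters much more for sequential calls than for parallel ones.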

 

In case it *is* a VM, are the resources reserved? If the VM has to allocate CPU and RAM dynamically, the hypervisor may slow it down when those resources are not available.

 

Kind regards, Niels

Re: Oncommand System Manager slow while DFM is running

Hi Niels,

 

I can confirm this issue happens even when DFM and System Manager are not on the same server. It is very predictable and can be reproduced every time. If I disable only performance monitoring in DFM, the issue is resolved, so it must be a setting in DFM performance monitoring that is causing the problem.

 

I must stress that we upgraded ONTAP, DFM and System Manager to the latest versions at the time, as advised by support, but this did not help.

 

ONTAP Release 8.1.4P8

DFM 5.2.0.17147 (5.2R1P1)

System Manager 3.1

 

I only use System Manager installed locally on my device.

 

I have ruled out issues with the VM or latency to the filer.

 

Increasing the following option helped by making System Manager usable, but it has not fixed the problem:

httpd.admin.max_connections  1023

 

I am certain there is a setting or issue in DFM performance monitoring which is causing excessive or problematic connections.

Re: Oncommand System Manager slow while DFM is running

Hi Tom,

 

Could you please send me the output of the commands "dfm diag" and "dfm host diag <filername>" for each filer?

No need to post it here, just send it to me: niels at netapp dot com. I'll take a look and see whether anything obvious stands out.

If not, you may need to open a support case.
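
If there are many filers, collecting that output could be scripted. The following is a minimal Python sketch, assuming the `dfm` CLI is on the PATH of the DFM server; the function names, the output directory and the filer names `filer1` and `filer2` are hypothetical placeholders, and only the two commands quoted above are used.

```python
import subprocess
from pathlib import Path


def build_diag_commands(filers):
    """Return the argv lists: one "dfm diag", then "dfm host diag <filer>" per filer."""
    commands = [["dfm", "diag"]]
    commands += [["dfm", "host", "diag", name] for name in filers]
    return commands


def collect(filers, out_dir="dfm_diag_output"):
    """Run each command and save its output to one text file per command."""
    out = Path(out_dir)  # hypothetical output directory
    out.mkdir(exist_ok=True)
    for argv in build_diag_commands(filers):
        result = subprocess.run(argv, capture_output=True, text=True)
        (out / ("_".join(argv) + ".txt")).write_text(result.stdout + result.stderr)


# Example (run on the DFM server; filer names are placeholders):
# collect(["filer1", "filer2"])
```

The resulting text files can then be attached to the e-mail or to a support case.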

 

Kind regards, Niels