ADVICE - splitting a DFM large install base from one server to two new servers

I submitted a post about the preferred version of DFM to upgrade to. This one is about what we intend to do after the primary upgrade is complete, and about some questions that have come up.

SITUATION:

We are monitoring (with DFM 4.0D12) 100-plus controllers worldwide, mainly on 7.3.2P4 with some on 8.0.1P2. The primary functions are monitoring, alerting, and Performance Advisor. They have around 2200 volumes and 14400 qtrees in total, and this is expected to grow steadily (in 12 months the qtrees could reach 30000).

PROBLEMS:

There have been numerous cases where the monitor.db service has stopped, and backups "kill" Performance Advisor while the DFM database is being backed up.

DESIRED SOLUTION:

Upgrade their current DFM server to the latest version of DFM (leaning towards 4.0.1D5 or later). Then evacuate the current server to two new servers. One server will monitor, alert, and report on all filers under monitoring. The other server will run only Performance Advisor; backup will be disabled on it to prevent loss of monitoring. We would take a backup of the current, upgraded DFM server to populate the new servers; both will start with exact copies of the original database and will then be trimmed down to perform their specific functions.

QUESTIONS:

1.     Is it possible to configure DFM to just be a Performance Advisor system only ( turn off all un-needed monitoring or functions ) which can allow PA to still gather performance stats?  A bare-bones PA system

2.     On the system performing monitoring and alerting; can we blow away the performance advisor database?

3.     They are going to use two modern servers with 24 cores and 24 GB RAM. They are asking whether DFM will take advantage of the cores on the system: "dfm about" detects them, but does it actually use "more than one core"?

Thanks for your time, Emanuel

Re: ADVICE - splitting a DFM large install base from one server to two new servers

QUESTIONS:

1.     Is it possible to configure DFM to just be a Performance Advisor system only ( turn off all un-needed monitoring or functions ) which can allow PA to still gather performance stats?  A bare-bones PA system

I can think of switching off the following monitors, which are not required by PA: dfmon, ccmon, hostRbacmon and userquota mon.

2.     On the system performing monitoring and alerting; can we blow away the performance advisor database?

PA data is not stored in the database; it is stored as flat files in the perfdir, whose location is shown in the output of the "dfm about" CLI command.

3.     They are going to use two modern servers with 24 cores and 24 GB RAM. They are asking whether DFM will take advantage of the cores on the system: "dfm about" detects them, but does it actually use "more than one core"?

we use all available cores

Regards

adai

Re: ADVICE - splitting a DFM large install base from one server to two new servers

How exactly do we do this?

"I can think of switching off the following monitoring.dfmon and ccmon hostRbacmon and userquota mon, which are not required by PA"

Re: ADVICE - splitting a DFM large install base from one server to two new servers

Run dfm options list | grep -i moninterval to see the monitoring interval options, then set the ones you do not need to off. For example:

[root@lnx ~]# dfm options set hostRBACMonInterval=off
Changed host RBAC monitoring interval to Off.
[root@lnx ~]#

In fact, you can turn off all monitoring other than the discovery ones, provided you are going to dedicate this server only to Performance Advisor.

regards

adai
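A minimal sketch of scripting the step above, for anyone who wants to review the commands before running them. It assumes that "dfm options list" prints the option name as the first whitespace-separated field on each line, which should be checked against your DFM version. The helper name gen_off_commands and its default keep pattern "discover" are hypothetical; the pattern is only a crude guess at how the discovery-related options are named, so adjust it after looking at your own option list.

```shell
#!/bin/sh
# Print (do not run) "dfm options set <name>=off" for every monitoring
# interval option read from stdin, skipping names that contain KEEP_PATTERN.
# Assumption: the option name is the first whitespace-separated field.
gen_off_commands() {
    keep_pattern="${1:-discover}"   # hypothetical default, adjust as needed
    awk -v keep="$keep_pattern" '
        BEGIN { keep = tolower(keep) }
        tolower($1) ~ /moninterval/ && index(tolower($1), keep) == 0 {
            printf "dfm options set %s=off\n", $1
        }'
}

# Usage on a real DFM server (review the output before piping it to sh):
#   dfm options list | gen_off_commands
```

Printing the commands instead of executing them keeps this a dry run, so nothing is changed on the server until the list has been checked by hand.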

Re: ADVICE - splitting a DFM large install base from one server to two new servers

It would be interesting if DFM had specific modes selectable from the options menu, so that it turns all the relevant options on or off.

-- Performance Monitoring Only

-- Alerting Only

-- Reporting Only

-- etc

I know this may not be entirely possible, since DFM needs to poll data in order to use it for other purposes.

Re: ADVICE - splitting a DFM large install base from one server to two new servers

We have a setup that has probably half the number of controllers and a lot fewer qtrees, but we see the same behavior during backup. We basically lose almost all DFM functionality during backups. Is there any ongoing work to fix this?

We tried using snapshot backup once, but it seemed to balloon the storage requirements immensely. Our local NetApp partner opened a case, but nothing ever came of it. In fact, we've almost never gotten any results from DFM support...

(Sorry to piggy-back your issue... but we've thought of splitting up monitoring as well, just haven't gotten that far)

Re: ADVICE - splitting a DFM large install base from one server to two new servers

Can you get us the output of the following ?

dfm volume list -a | wc -l and dfm volume list | wc -l

dfm qtree list -a | wc -l and dfm qtree list | wc -l

dfm lun list -a | wc -l and dfm lun list | wc -l

If the -a count is more than twice the plain count for any of these, that could explain why your snapshot backup is consuming more space.

Open a case with NGS to prune your perf data and the DFM database, which will reduce your storage requirement for snapshot-based backup.

It's the only way. IIRC, only Performance Advisor data is not collected for the entire duration of the backup; the other functions keep working.

Regards

adai

Re: ADVICE - splitting a DFM large install base from one server to two new servers

Hi,

Here is the information... after some delay:

[root@dfm ~]# dfm volume list -a | wc -l

17894

[root@dfm ~]# dfm volume list | wc -l

1146

[root@dfm ~]# dfm qtree list -a | wc -l

20926

[root@dfm ~]# dfm qtree list | wc -l

1214

[root@dfm ~]# dfm lun list -a | wc -l

8149

[root@dfm ~]# dfm lun list | wc -l

683

I am not sure, however, how you conclude that this has some effect on the size of a snapshot backup.  If taking a snapshot simply means setting the database in "backup mode" (however this is done with Sybase), quiescing the filesystem, and taking a snapshot, why should the snapshot balloon?

The performance data is on a different NFS mount (I think, back in the day, we just moved the database back to local storage...). In any case, the details fail me now and there was never any real solution from NetApp support, so we gave up and used our limited time on things that gave us results. Sybase should really work on an NFS mount (and has worked for us internally in testing), but that doesn't seem to be supported.

Now, there are some KB articles on taking snapshot backup, but it doesn't seem to be included in the standard documentation.  Even if it were, having the database on one of the systems that one is trying to manage can be a complicated (or just terrible) solution.  It seems to me that if I had a database solution that was offline for 30-45 minutes daily because of poor design choices, then I probably couldn't say that I have an enterprise class solution. Either the method of sampling data or the backup method needs to be changed.
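A rough worked example of the 2x check suggested earlier in the thread, using the counts posted above. The helper name ratio is hypothetical. It assumes the -a listing also includes objects that DFM has marked as deleted (my understanding of the flag, worth verifying against the CLI help), and the results are approximate because wc -l also counts any header lines in the listing.

```shell
#!/bin/sh
# ratio ALL LISTED: print ALL / LISTED to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

ratio 17894 1146   # volumes: prints 15.6
ratio 20926 1214   # qtrees:  prints 17.2
ratio 8149 683     # LUNs:    prints 11.9
```

All three ratios are well above the 2x threshold mentioned earlier, so under that rule of thumb this database would be a candidate for pruning.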

Re: ADVICE - splitting a DFM large install base from one server to two new servers

Hi,


3.     They are going to use two modern servers with 24 cores and 24 GB RAM. They are asking whether DFM will take advantage of the cores on the system: "dfm about" detects them, but does it actually use "more than one core"?

we use all available cores


Are you sure?

I've conducted detailed tests using DFM 3.8.1D14 (on Windows Server 2003 R2 Enterprise Edition), and it appears that at least the dfmserver part is *NOT* scalable. As that part is responsible for communication with SnapDrive agents, any bottleneck there causes a lot of 'HTTP Post Errors', which in turn wreaks havoc on the whole NetApp stack (SnapManagers, NMC GUIs, etc.). My measurements indicated that, for example, a non-virtualized server with 4 AMD cores can sometimes be bottlenecked at 7 requests/s (a pretty low value; typical web servers, other than the embedded libzapid built into dfmserver, can handle much more). What's even more interesting is that one can fully reproduce this in a lab (simulating a DFM server under extreme load). At this point I'm not sure whether it is specific to dfmserver on Windows or whether it is an overall application design/implementation/architecture problem. What is also interesting is that I couldn't drive dfmserver.exe to report more than 25% total CPU usage on a 4-way machine ...

It looks to me as if any more serious deployment of the whole NetApp stack (with SMO/SMSAP/SD with RBAC) is going to hit this problem.

Is there any work happening on allowing DFM to scale up instead of only scaling out?

-Jakub.
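A small sketch of why the 25% figure above is suggestive. If a process can keep only one core busy, one saturated core out of N shows up as 100/N percent of total CPU. This is an inference from the reported number, not something confirmed in the thread, and the helper name one_core_share is hypothetical.

```shell
#!/bin/sh
# Total CPU percentage reported when exactly one of N cores is saturated.
one_core_share() {
    awk -v n="$1" 'BEGIN { printf "%.1f\n", 100 / n }'
}

one_core_share 4    # the 4-way test machine: prints 25.0
one_core_share 24   # the planned 24-core servers: prints 4.2
```

If the same limit applied on a 24-core server, dfmserver would top out at roughly 4% of total CPU, so watching per-process CPU after the migration would be one way to check whether the extra cores are actually being used.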

Re: ADVICE - splitting a DFM large install base from one server to two new servers

Hello

Another good question is ... will Protection Manager, version 4 and later on, support large memory on these host systems?  Customers are purchasing 16+ GB memory systems, some are using over 32GB.