Solved: Purging the DFM Database

kofchur · ‎2011-10-07

I have a customer who has been experiencing a lot of performance problems with reporting on his Operations Manager server (4.0.2). After some investigating, I came across this in TR-3440, the OnCommand 5.0 Sizing Guide:

<<When the difference between the following commands is more than 2x, contact NGS to prune your db to improve response times.

<<dfm volume list -a and the volume list is 3x or more

<<dfm qtree list -a and qtree list is 2x or more

<<dfm lun list -a and lun list is 2x or more

<<dfm host list -a and host list is 2x or more

Going by the 4.0 Sizing Guide (the older TR-3440), the maximum number of volumes and qtrees for a setup with DFM, ProtMgr, and ProvMgr is 1650 volumes and 5800 qtrees. This customer of mine also has Performance Advisor running, and the output of the above commands is 85k+ for volumes (51x over the limit) and 48k+ for qtrees (8x over the limit)! So it appears that the performance problems are, at least to some extent, due to the massive amounts of historical data lingering around, which should therefore be purged.
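For anyone who wants to redo the "x over the limit" arithmetic with their own numbers, here is a minimal sketch. The limits are the ones quoted from the 4.0 Sizing Guide above. The observed counts are rounded down from "85k+" and "48k+", and the function name is just an illustrative choice.

```python
# Limits quoted from the 4.0 Sizing Guide for DFM + ProtMgr + ProvMgr.
LIMITS = {"volumes": 1650, "qtrees": 5800}

# Approximate counts from the `-a` list commands ("85k+" and "48k+"),
# rounded down here for illustration.
OBSERVED = {"volumes": 85_000, "qtrees": 48_000}


def over_limit_factor(observed: int, limit: int) -> float:
    """Return how many times the observed count is of the sizing limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return observed / limit


for kind, limit in LIMITS.items():
    factor = over_limit_factor(OBSERVED[kind], limit)
    print(f"{kind}: {OBSERVED[kind]} objects, {factor:.1f}x the limit of {limit}")
```

With these inputs the factors come out at roughly 51x for volumes and 8x for qtrees, which matches the figures above.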

Now, I know all the warnings about purging a filer object: doing so also purges every record that it owns, so it could take a long time and you would be better off just reinstalling the database. However, since I'm not intending to purge the filer object but just volumes and qtrees, what would the impact be? Will it take a long time and/or will it impact performance?

Also, if reinstalling the database would be the fastest and best way, is there a way to export the ProtMgr configs and reimport them back into the newly installed database? Otherwise we would have to document all the datasets and manually reconfigure them (yuck!).

Thanks for any input!

todd

adaikkap · ‎2011-10-08

The numbers you are talking about, 1650 volumes and 5800 qtrees, are for a DFM with all licenses enabled (i.e. Ops-Mgr, Perf Advisor, Prot Mgr, and Prov Mgr), which is sized for just 40 nodes.

But the numbers are 10K volumes and 50K qtrees for a DFM server doing only OM + PA with 250 nodes.

If the difference between dfm volume list and dfm volume list -a is more than 3x, and similarly for the others, then it's time to purge those deleted instances from the DB.

You won't be able to access them anyway unless you use the -a option in the CLI, so it makes perfect sense to run the DB pruning.
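The rule of thumb above can be sketched as a small check. This assumes you have already counted the objects reported by the plain list command and by the same command with -a (how you count them depends on the CLI output format, which is not shown in this thread). The thresholds follow the sizing-guide quote: 3x for volumes, 2x for qtrees, LUNs, and hosts. The function and variable names, and the example counts, are hypothetical.

```python
# Thresholds from the sizing-guide quote: 3x for volumes, 2x for the rest.
THRESHOLDS = {"volume": 3.0, "qtree": 2.0, "lun": 2.0, "host": 2.0}


def prune_recommended(kind: str, listed: int, listed_all: int) -> bool:
    """True when the `-a` count is at least the threshold multiple of the plain count.

    listed     -- objects reported by e.g. `dfm volume list`
    listed_all -- objects reported by e.g. `dfm volume list -a`
    """
    if listed <= 0:
        # No active objects at all: any mark-deleted object is pure overhead.
        return listed_all > 0
    return listed_all / listed >= THRESHOLDS[kind]


# Hypothetical counts, for illustration only.
print(prune_recommended("volume", listed=1_200, listed_all=85_000))
print(prune_recommended("qtree", listed=5_000, listed_all=7_000))
```

The first call reports that pruning is worth pursuing, the second that it is not; substitute your own counts per object type.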

You should also think about moving to OnCommand 5.0, which has a 64-bit architecture and takes advantage of the available compute and memory.

Please do open an NGS case and get your DB and perf data cleaned up; after that you will definitely feel the difference. BTW, do you have a lot of SnapManagers in your environment?

If that's the case, the FlexClone volumes/LUNs are the cause of a lot of these deleted instances.

Regards

adai

mark_schuren · ‎2011-11-21

I have the same problem with OC 5.0 (upgraded over time from several previous DFM versions).

My deleted object count is also very high. Yes, it's the SnapManagers and their temporary vol clones / LUN clones that fill up the database.

Is there an official way to prune the DB from deleted objects?

Something like "DELETE FROM volumes WHERE Deleted = Yes" ?

I'd like to prune the DB of our own lab DFM - not a production customer environment...

hadrian · ‎2011-11-29

The official way to prune the DB from deleted objects is to open a support case with NetApp Global Support.

I've been researching it for days now to see if there is any way to avoid opening the case, but there isn't =).

Hope this helps,

Hadrian

brothanb1 · ‎2011-12-01

I also would like an official way to prune the database without having to open a support case.

mark_schuren · ‎2011-12-01

OK, can we talk about the unofficial way...? Just a couple of SQL statements?

As I said it's meant for lab environment only.

hadrian · ‎2011-12-01

I looked for days with no luck. The solution is to send the deleted object count over to NGS and work with them via webex to get it done.

Hadrian

kofchur · ‎2012-01-11

Adai,

This whole issue of having all these artifact objects is due to SnapManager. Now I have almost 100k volume objects in total, both managed and "deleted," 53k qtrees, and about 6000 LUNs. We migrated off SMO, but we still have SMVI, which has slowed down the growth, but I still need to migrate to a new DFM instance (hence my other posts that you have been responding to).

Anyway, I would strongly urge the OnCommand development team to integrate some intelligence into DFM to ignore these clones that are created in the backup process of Snapmanager so large environments like mine don't run into this again.

I ran a test in our lab of purging 600 volume objects and reloading the database. It took 50 minutes to do this. With as many objects as I would have to purge in production, this would take forever.
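To put a rough number on "forever", here is a back-of-the-envelope sketch that scales the lab result linearly. Linear scaling is an assumption (the 50 minutes also included the database reload, which may not grow with object count), and the 85,000-object production figure is only a hypothetical input, not a measured purge count.

```python
def estimated_purge_minutes(
    objects: int,
    sample_objects: int = 600,     # lab test: 600 volume objects purged
    sample_minutes: float = 50.0,  # lab test: 50 minutes incl. DB reload
) -> float:
    """Linearly extrapolate purge time from the lab measurement."""
    if sample_objects <= 0:
        raise ValueError("sample_objects must be positive")
    return objects * sample_minutes / sample_objects


# Hypothetical production volume count, for illustration only.
minutes = estimated_purge_minutes(85_000)
print(f"{minutes:.0f} minutes, about {minutes / 60:.0f} hours")
```

Under these assumptions the estimate is about 118 hours, i.e. close to five days of purging.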

I opened a case with NGS and after looking at my situation, they agreed that the best course was to just migrate to a new DFM instance.

--todd

adaikkap · ‎2013-03-05

We now have a purge tool that takes care of cleaning up all of this: dfmpurge, which removes all these stale instances but requires downtime. The utility has 2 modes and also gives an estimate of the downtime required. In most cases it shouldn't take more than 30 minutes to clean up.

Please take a look at this video (3.43 mins) and read the KB on how this tool works.

Video Link: DFM Purge Tool: How to Video

KB link: https://kb.netapp.com/support/index?page=content&id=1014077

Link to tool chest: http://support.netapp.com/NOW/download/tools/dfmpurge/

Regards

adai

francoisbnc · ‎2013-03-06

Hi Adai,

The tool looks for these kinds of objects:

Qtree        13658 ( 99%)   13573 ( 99%)
Volume        9018 ( 99%)    8966 ( 99%)
Interface       32 (100%)      32 (100%)
FCP Target      17 (100%)      17 (100%)
Aggregate        4 (100%)       4 (100%)

What about the events table? In our case it is pretty huge:

WHERE eventDeleted IS NULL: COUNT(events.eventStatus) = 12158538

WHERE eventDeleted IS NOT NULL: COUNT(events.eventStatus) = 36293

adaikkap · ‎2013-03-06

Hi Francois,

As the KB article clearly says, this tool only deletes mark-deleted objects. The events pruning that you are looking for is coming in OCUM 5.2, which is currently in BETA. In 5.2 we purge the mark-deleted objects as well as events. This will happen both during the upgrade to 5.2 and every time you restore your DB.

francoisbnc · ‎2013-03-07

I'm really looking forward to this version and sincerely hope the major bugs will be fixed this time.

francois

adaikkap · ‎2013-03-07

Hi Francois,

Is there a list of specific bugs that you are looking to have fixed? If so, can you share them?

Regards

adai

francoisbnc · ‎2013-03-13

Hi Adai,

Sorry for the delay.

Yes I do.

- Provisioning more flexible, with compression, vol autosize on SAN prov, snap autodelete for NAS and SAN as well.

- Very slow NMC after reaching 250 Datasets, even with a VM with 12 GB RAM and 6 CPUs

- Some snapshots not purged / deleted on secondary and primary

- Events and database not purged correctly. (probably fixed in 5.2)

- Conform multiple Dataset in one action

- Cannot provision NAS Dataset with volume imported or added manually

- Current space breakout display wrong (LUN Information on NAS prov dataset)

Cheers,

francois

adaikkap · ‎2013-03-17

Hi Francois,

- Provisioning more flexible, with compression, vol autosize on SAN prov, snap autodelete for NAS and SAN as well.

As you know, for NAS we allow Autogrow and Auto Delete to be independently selected starting with OCUM 5.1. There are no changes in OCUM 5.2. You can check this in the BETA pages:

https://communities.netapp.com/docs/DOC-22967

- Some snapshots not purged / deleted on secondary and primary

Can you raise a case? This seems like an issue.

- Events and database not purged correctly. (probably fixed in 5.2)

Yes.

- Conform multiple Dataset in one action

You should be able to script it using the CLI or Linux one-liners.

- Cannot provision NAS Dataset with volume imported or added manually

Yes, this is not available in 5.2 either.

- Current space breakout display wrong (LUN Information on NAS prov dataset)

IIRC this is fixed in OCUM 5.1

http://support.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=493743
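On scripting the conform step for several datasets: a minimal sketch of the looping part is below. It only expands a command template once per dataset name. The template is a placeholder (the example uses echo as a stand-in), so substitute whatever conform command your DFM version documents; the dataset names and the function name are hypothetical too.

```python
import shlex


def build_commands(template: str, datasets: list[str]) -> list[list[str]]:
    """Expand a command template once per dataset, returning argv lists.

    The template is split into arguments first, then "{dataset}" is
    substituted, so a dataset name containing spaces stays one argument.
    """
    commands = []
    for name in datasets:
        argv = [part.replace("{dataset}", name) for part in shlex.split(template)]
        commands.append(argv)
    return commands


# "echo conform" is a stand-in, NOT a real DFM command.
TEMPLATE = "echo conform {dataset}"
DATASETS = ["ds_sql_01", "ds exch 02"]  # hypothetical dataset names

for argv in build_commands(TEMPLATE, DATASETS):
    print(shlex.join(argv))
```

To actually run the commands rather than print them, each argv list can be passed to subprocess.run(argv, check=True) once the real command has been filled in.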

Regards

adai

francoisbnc · ‎2013-03-24

Hi Adai,

For SAN and NAS provisioning there is no choice other than commitment=destroy for snapshot autodelete.

That is a big problem for customers that have SnapMirror and SnapVault relationships over slow links, because some base snapshots are deleted when volumes become full, and it then takes weeks to redo an initial baseline for 500 GB-1 TB volumes.

Regards,

francois

clilescapario · ‎2013-06-03

This "feature" has always boggled my mind: PM is managing the mirrors/vaults, but the provisioning side will happily shoot the protection side in the foot and remove the base snaps. I understand it's to keep the volume online, but it would be nice if there were other options for commitment.

adaikkap · ‎2013-06-06

Hi

Can you open a support case with NetApp for this?

Regards

adai

mark_schuren · ‎2013-03-07

So in 5.2 the purging is done automatically during upgrade and DB restore ONLY?

Why is that not possible online during normal operations? An option like 'purge deleted objects automatically after xx weeks' would be very much appreciated...

Mark

-- Mark Schuren, Ultra Consulting Network GmbH

adaikkap · ‎2013-03-07

Hi Mark

That's right. In OCUM 5.2, which is currently in BETA (OnCommand Unified Manager 5.2 Beta Program), it's only done during an upgrade to 5.2 or a restore in 5.2.

Yes, the online option should be possible, but I'm not sure which release it's targeted for.

Regards

adai