ONTAP Discussions

Protection Manager NOT automatically deleting snapshots

dboutorwincat
19,619 Views

Hello All,

I'm having a problem with Protection Manager. The protection policy for a given dataset is NOT deleting old snapshots; we use the out-of-the-box "DR Backup" policy with minor customizations. On the primary side, through PM, I chose to keep 2 days of hourly (on the hour) snapshots, 10 days of daily backups, and 3 weeklies. The problem is that the outdated snapshots are not going away, so the volume quickly reaches the 255-snapshot limit, and NetBackup then fails to run because it cannot take its NDMP snapshot. We run DOT 7.3.1 on the primary 3020 cluster and Ops Mgr 4.0. Desperately looking for some advice.

-Dennis

1 ACCEPTED SOLUTION

rshiva
19,614 Views

Hi Dennis,


I went through some of your log files, and I believe this might be the problem:

- First, I wanted to make sure that the snapshots accumulating in your volume were created through Protection Manager. Judging by the snapshot names, I'm pretty sure they were created by PM.

- That said, Protection Manager has a component called the "conformance engine", which is responsible for conforming the dataset's members to the policy's parameters (in this case, expiring backup versions (snapshots) based on the retention period and retention count specified in your protection policy). However, the conformance engine was not doing its job, which is why I had to request the log files.

- In the DFM Diag file, you have the following error message:

            Installation Directory           C:/Program Files (x86)/NetApp/DataFabric Manager/DFM
                                 Error: 2.67 GB free (8.9%)

- The consequence of this error is that your DFM Monitor service will be stopped

- I checked DFMMonitor.log, and here are some of the messages:


       Sep 10 03:29:02 [DFMMonitor:ERROR]: [3012:0xc08]: The file system containing C:/Program Files (x86)/NetApp/DataFabric Manager/DFM/data/ has only 9.4% free space remaining, and the monitor requires 10.0%; suspending monitoring.

       Sep 13 04:18:18 [DFMMonitor:ERROR]: [3012:0xa3c]: The file system containing C:/Program Files (x86)/NetApp/DataFabric Manager/DFM/data/ has only 10.0% free space remaining, and the monitor requires 10.0%; suspending monitoring. 

       Sep 13 04:20:03 [DFMMonitor:ERROR]: [3012:0xc08]: The file system containing C:/Program Files (x86)/NetApp/DataFabric Manager/DFM/data/ has only 10.0% free space remaining, and the monitor requires 10.0%; suspending monitoring.

       Sep 13 06:55:48 [DFMMonitor:ERROR]: [3012:0xa3c]: The file system containing C:/Program Files (x86)/NetApp/DataFabric Manager/DFM/data/ has only 10.0% free space remaining, and the monitor requires 10.0%; suspending monitoring.

       Sep 13 06:57:13 [DFMMonitor:ERROR]: [3012:0xbec]: The file system containing C:/Program Files (x86)/NetApp/DataFabric Manager/DFM/data/ has only 10.0% free space remaining, and the monitor requires 10.0%; suspending monitoring.

       Sep 13 07:27:18 [DFMMonitor:ERROR]: [3012:0xa3c]: The file system containing C:/Program Files (x86)/NetApp/DataFabric Manager/DFM/data/ has only 10.0% free space remaining, and the monitor requires 10.0%; suspending monitoring.

       Sep 13 07:29:15 [DFMMonitor:ERROR]: [3012:0xbec]: The file system containing C:/Program Files (x86)/NetApp/DataFabric Manager/DFM/data/ has only 10.0% free space remaining, and the monitor requires 10.0%; suspending monitoring.

       Sep 13 07:58:48 [DFMMonitor:ERROR]: [3012:0xa3c]: The file system containing C:/Program Files (x86)/NetApp/DataFabric Manager/DFM/data/ has only 10.0% free space remaining, and the monitor requires 10.0%; suspending monitoring.

       Sep 13 08:30:18 [DFMMonitor:ERROR]: [3012:0xa3c]: The file system containing C:/Program Files (x86)/NetApp/DataFabric Manager/DFM/data/ has only 10.0% free space remaining, and the monitor requires 10.0%; suspending monitoring.

       .....

       Oct 01 14:21:02 [DFMMonitor:ERROR]: [2992:0xbe4]: The file system containing C:/Program Files (x86)/NetApp/DataFabric Manager/DFM/data/ has only 8.9% free space remaining, and the monitor requires 10.0%; suspending monitoring.

- The conformance engine depends on the DFM Monitor service, and since the service is down, DFM was not able to track those expired snapshots
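If you want to scan a long DFMMonitor.log for these suspensions rather than read it by eye, here is a minimal Python sketch. The regular expression is my own and is based only on the message format quoted above; the function name is hypothetical and not part of DFM.

```python
import re

# Matches the "suspending monitoring" messages quoted above.
# The pattern is inferred from those sample lines only (an assumption).
LINE_RE = re.compile(
    r"(?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}) \[DFMMonitor:ERROR\]: .*?"
    r"has only (?P<free>\d+(?:\.\d+)?)% free space remaining, "
    r"and the monitor requires (?P<required>\d+(?:\.\d+)?)%"
)

def parse_suspensions(log_text):
    """Return (timestamp, free_percent, required_percent) for each match."""
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            events.append((
                match.group("stamp"),
                float(match.group("free")),
                float(match.group("required")),
            ))
    return events

# Two of the lines from the log excerpt above.
sample = """\
       Sep 10 03:29:02 [DFMMonitor:ERROR]: [3012:0xc08]: The file system containing C:/Program Files (x86)/NetApp/DataFabric Manager/DFM/data/ has only 9.4% free space remaining, and the monitor requires 10.0%; suspending monitoring.
       Oct 01 14:21:02 [DFMMonitor:ERROR]: [2992:0xbe4]: The file system containing C:/Program Files (x86)/NetApp/DataFabric Manager/DFM/data/ has only 8.9% free space remaining, and the monitor requires 10.0%; suspending monitoring.
"""

for stamp, free, required in parse_suspensions(sample):
    print(f"{stamp}: {free}% free, {required}% required")
```

The first and last timestamps in the result give a rough idea of how long monitoring has been suspended.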

Solution:

Free up some space. Right now you have 8.9% free space, which isn't enough: the minimum free space required for the DFM Monitor service to run is 10%. Free up as much space as possible; once that is done, the DFM Monitor should resume service. You can then run an on-demand conformance check of your dataset:

dfpm dataset conform <dataset_name_or_id>

That should expire your old snapshots automatically.
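To get a rough idea of how much space has to be freed, the figures from the diag output (2.67 GB free, 8.9%) can be plugged into a small Python sketch. The helper names are hypothetical, the volume size is inferred from rounded percentages, and the 10% threshold is the one quoted in the log, so treat the result as an estimate only.

```python
import shutil

def space_to_free_gb(free_gb, free_pct, required_pct=10.0):
    """Estimate how many GB must be freed to reach required_pct free.

    The total size is inferred from the reported free space and percentage,
    so rounding in the reported figures carries over into the result.
    """
    total_gb = free_gb / (free_pct / 100.0)
    required_free_gb = total_gb * required_pct / 100.0
    return max(0.0, required_free_gb - free_gb)

def free_percent(path):
    """Current free space of the file system holding path, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total

# Figures from the diag output above: 2.67 GB free, reported as 8.9%.
print(f"Estimated space to free: {space_to_free_gb(2.67, 8.9):.2f} GB")

# Live check; "." is a placeholder for the DFM installation directory.
print(f"Free on this file system: {free_percent('.'):.1f}%")
```

Since the log also shows suspensions at a displayed 10.0%, it seems wise to aim comfortably above the threshold rather than right at it.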

Hope that helps.

Thanks and regards

Shiva Raja

         


16 REPLIES

gary_thomas
19,530 Views

Check with 'snapvault snap sched' to see whether any manual SnapVault schedules are running.

rshiva
19,530 Views

Hi,

Adding to Gary's point: the retention time specified in your protection policy only cleans up snapshots created by Protection Manager. So, if you have hourly snapshots retained for 2 days (24x2=48), daily snapshots retained for 10 days (1x10=10), and weekly snapshots retained for 3 weeks (1x3=3), you should have somewhere around 58 - 61 snapshots (depending on the snapshot schedule).
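The arithmetic above can be written out as a short Python sketch. The dictionary layout and function name are just for illustration; the numbers are the ones from the retention settings described in this thread. It gives the upper figure (61); the lower end of the range depends on how the schedules overlap, which is not modelled here.

```python
# (snapshots taken per period, number of periods retained) per class,
# using the retention described in this thread.
retention = {
    "hourly": (24, 2),   # 24 per day, kept for 2 days
    "daily": (1, 10),    # 1 per day, kept for 10 days
    "weekly": (1, 3),    # 1 per week, kept for 3 weeks
}

def expected_snapshots(retention):
    """Upper bound on retained snapshots per class, plus the total."""
    per_class = {name: rate * periods for name, (rate, periods) in retention.items()}
    return per_class, sum(per_class.values())

per_class, total = expected_snapshots(retention)
print(per_class)
print("total:", total)
```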

Once a snapshot (or a set of snapshots, for multiple volumes in a dataset) is created on a particular schedule by Protection Manager, a backup version is created for that dataset, and the snapshot or snapshots are registered as members of that backup version. The backup version has its "retention-type" attribute (Hourly/Daily/Weekly/Monthly) set according to the schedule on which the member snapshots were created, and backup versions are expired based on the protection policy's retention time.

Hence, if you have snapshots created outside Protection Manager, they won't be cleaned up by Protection Manager; you have to delete them manually.
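To make the expiry idea concrete, here is a rough Python model. It is only a sketch and not Protection Manager's actual code: it assumes that a backup version is expired once it is older than the retention duration for its retention type, and that the retention count acts as a minimum number of newest versions to keep. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class BackupVersion:
    created: datetime
    retention_type: str  # "hourly", "daily", "weekly" or "monthly"

def select_expired(versions, now, durations, min_counts):
    """Return the versions that would be expired under the assumed rule.

    durations:  retention_type -> timedelta (maximum age)
    min_counts: retention_type -> minimum number of newest versions to keep
    """
    by_type = {}
    for version in versions:
        by_type.setdefault(version.retention_type, []).append(version)

    expired = []
    for rtype, group in by_type.items():
        group.sort(key=lambda v: v.created, reverse=True)  # newest first
        keep_at_least = min_counts.get(rtype, 0)
        for index, version in enumerate(group):
            too_old = now - version.created > durations[rtype]
            if index >= keep_at_least and too_old:
                expired.append(version)
    return expired

# Example: one hourly version per hour over the last 72 hours,
# retained for 2 days with a minimum count of 2.
now = datetime(2010, 10, 1, 19, 0)
versions = [BackupVersion(now - timedelta(hours=h), "hourly") for h in range(72)]
expired = select_expired(
    versions, now,
    durations={"hourly": timedelta(days=2)},
    min_counts={"hourly": 2},
)
print(len(expired), "hourly versions would be expired")
```

Snapshots that are not members of any backup version never enter this list, which is why anything created outside Protection Manager is left alone.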

Thanks and regards

Shiva Raja

dboutorwincat
19,530 Views

Thanks Shiva, all snapshots are being taken exclusively by PM.   I'm thinking about turning on snapshot autodelete, but that is just a band aid.

-Dennis

rshiva
19,530 Views

Hi Dennis,

Can you please send me the output of the following commands:

on the dfm server:

dfpm dataset list -x <Name_of_the_dataset_where_you_are_having_problems>

dfpm policy node get "<Name_of_the_protection_policy_associated_to_the_Dataset>"

on the filer:

snap list <The_secondary_Volume_where_PM_Snapshots_are_not_getting_Deleted>

Thanks and regards

Shiva Raja 

dboutorwincat
19,530 Views

Thanks Shiva,

Here is the output. By the way, the problem of snapshots not being deleted is happening on both the primary and DR filers, so I have included the output of the 'snap list' command on the primary volume 'windows'. Keep in mind that I have been manually deleting a bunch of hourly snapshots so that the nightly NDMP backups can run, which is why there is a hodgepodge of snapshots.

on the dfm server:

dfpm dataset list -x <Name_of_the_dataset_where_you_are_having_problems>

Id:                              717
Name:                            AA-NAS01 Backup and DR
Policy:                          DR Back up
Description:                    
Owner:                          
Contact:                        
Volume Qtree Name Prefix:       
DR Capable:                      Yes
DR State:                        ready
Requires Non Disruptive Restore: No

Node details:

   Node Name:           Primary data
   Resource Pools:      DR-NAS01
   Provisioning Policy:
   Time Zone:          
   DR Capable:          No
   vFiler:            

   Node Name:           DR Backup
   Resource Pools:      DR-NAS01
   Provisioning Policy:
   Time Zone:          
   DR Capable:          Yes
   vFiler:            

dfpm policy node get "<Name_of_the_protection_policy_associated_to_the_Dataset>"

Node Id:                    1
Node Name:                  Primary data
Hourly Retention Count:     2
Hourly Retention Duration:  39600
Daily Retention Count:      2
Daily Retention Duration:   907200
Weekly Retention Count:     1
Weekly Retention Duration:  1814400
Monthly Retention Count:    0
Monthly Retention Duration: 0
Backup Script Path:        
Backup Script Run As:      
Failover Script Path:      
Failover Script Run As:    
Snapshot Schedule Id:       47
Snapshot Schedule Name:     Sunday at midnight with daily and hourly
Warning Lag Enabled:        Yes
Warning Lag Threshold:      129600
Error Lag Enabled:          Yes
Error Lag Threshold:        172800

Node Id:                    2
Node Name:                  DR Backup
Hourly Retention Count:     0
Hourly Retention Duration:  0
Daily Retention Count:      2
Daily Retention Duration:   1209600
Weekly Retention Count:     2
Weekly Retention Duration:  15724800
Monthly Retention Count:    1
Monthly Retention Duration: 0
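The retention durations above are easier to read once converted. The Python sketch below assumes the values are in seconds (the output does not state the unit, but 1814400 seconds is exactly 3 weeks, which matches the "3 weeklies" mentioned earlier). The labels and function name are illustrative only.

```python
from datetime import timedelta

# Values copied from the policy output above; unit assumed to be seconds.
durations_seconds = {
    "Primary data / Hourly": 39600,
    "Primary data / Daily": 907200,
    "Primary data / Weekly": 1814400,
    "DR Backup / Daily": 1209600,
    "DR Backup / Weekly": 15724800,
}

def describe(seconds):
    """Format a duration in seconds as whole days and hours."""
    delta = timedelta(seconds=seconds)
    return f"{delta.days} d {delta.seconds // 3600} h"

for label, seconds in durations_seconds.items():
    print(f"{label}: {describe(seconds)}")
```

Under that assumption the primary hourly retention works out to 11 hours rather than the 2 days mentioned in the original post, which may be worth double-checking against the policy in the GUI.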

on the filer:

snap list <The_secondary_Volume_where_PM_Snapshots_are_not_getting_Deleted>

0% ( 0%)    0% ( 0%)  Oct 01 15:16  dfpm_base(dataset-id-717)conn1.0 (snapmirror)
  0% ( 0%)    0% ( 0%)  Oct 01 15:14  2010-10-01 19:14:56 daily_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 15:00  2010-10-01 19:00:06 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 14:07  dfpm_base(dataset-id-717)conn1.1
  0% ( 0%)    0% ( 0%)  Oct 01 14:00  2010-10-01 18:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 13:00  2010-10-01 17:00:09 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 12:00  2010-10-01 16:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 11:00  2010-10-01 15:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 10:00  2010-10-01 14:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 09:00  2010-10-01 13:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 08:00  2010-10-01 12:00:07 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 07:00  2010-10-01 11:00:07 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 06:00  2010-10-01 10:00:07 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 05:00  2010-10-01 09:00:06 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 04:00  2010-10-01 08:00:11 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 03:00  2010-10-01 07:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 02:00  2010-10-01 06:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 01:00  2010-10-01 05:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 00:00  2010-10-01 04:00:09 daily_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 23:00  2010-10-01 03:00:09 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 22:00  2010-10-01 02:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 21:00  2010-10-01 01:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 20:00  2010-10-01 00:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 19:00  2010-09-30 23:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 18:00  2010-09-30 22:00:07 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 17:00  2010-09-30 21:00:07 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 16:00  2010-09-30 20:00:07 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 15:00  2010-09-30 19:00:06 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 14:00  2010-09-30 18:00:06 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 13:00  2010-09-30 17:00:11 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 12:00  2010-09-30 16:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 11:00  2010-09-30 15:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 10:00  2010-09-30 14:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 09:00  2010-09-30 13:00:09 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 08:00  2010-09-30 12:00:09 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 07:00  2010-09-30 11:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 06:00  2010-09-30 10:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 05:00  2010-09-30 09:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 04:00  2010-09-30 08:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 03:00  2010-09-30 07:00:07 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 02:00  2010-09-30 06:00:06 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 01:00  2010-09-30 05:00:06 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 30 00:00  2010-09-30 04:00:06 daily_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 23:00  2010-09-30 03:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 22:00  2010-09-30 02:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 21:00  2010-09-30 01:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 20:00  2010-09-30 00:00:09 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 19:00  2010-09-29 23:00:09 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 18:00  2010-09-29 22:00:09 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 17:00  2010-09-29 21:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 16:00  2010-09-29 20:00:09 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 15:00  2010-09-29 19:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 14:00  2010-09-29 18:00:07 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 13:00  2010-09-29 17:00:07 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 12:00  2010-09-29 16:00:07 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 11:00  2010-09-29 15:00:06 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 10:00  2010-09-29 14:00:06 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 09:00  2010-09-29 13:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 08:00  2010-09-29 12:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 07:00  2010-09-29 11:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 06:00  2010-09-29 10:00:09 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 05:00  2010-09-29 09:00:09 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 04:00  2010-09-29 08:00:09 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 03:00  2010-09-29 07:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 02:00  2010-09-29 06:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 01:00  2010-09-29 05:00:08 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 29 00:00  2010-09-29 04:00:07 daily_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 28 23:00  2010-09-29 03:00:07 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 28 22:00  2010-09-29 02:00:07 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 28 21:00  2010-09-29 01:00:06 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 28 20:00  2010-09-29 00:00:06 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 28 19:00  2010-09-28 23:00:11 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 28 18:00  2010-09-28 22:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 28 17:00  2010-09-28 21:00:11 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 28 16:00  2010-09-28 20:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 28 15:00  2010-09-28 19:00:09 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 25 00:00  2010-09-25 04:00:09 daily_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Sep 24 11:30  2010-09-24 15:30:43 daily_AA-NAS01_windows.-.kcsdata.ntdata.personal
  1% ( 1%)    1% ( 1%)  Sep 22 00:00  2010-09-22 04:00:10 daily_AA-NAS01_windows.-.kcsdata.ntdata.personal
  1% ( 0%)    1% ( 0%)  Sep 21 00:00  2010-09-21 04:00:07 daily_AA-NAS01_windows.-.kcsdata.ntdata.personal
  4% ( 3%)    4% ( 3%)  Sep 20 00:00  2010-09-20 04:00:08 daily_AA-NAS01_windows.-.kcsdata.ntdata.personal
  4% ( 0%)    4% ( 0%)  Sep 19 00:00  2010-09-19 04:00:10 weekly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  4% ( 0%)    4% ( 0%)  Sep 18 00:00  2010-09-18 04:00:06 daily_AA-NAS01_windows.-.kcsdata.ntdata.personal
  4% ( 0%)    4% ( 0%)  Sep 17 00:00  2010-09-17 04:00:08 daily_AA-NAS01_windows.-.kcsdata.ntdata.personal
  5% ( 0%)    4% ( 0%)  Sep 16 00:00  2010-09-16 04:00:10 daily_AA-NAS01_windows.-.kcsdata.ntdata.personal
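For a long listing like this, it can help to count the snapshots per retention class and compare the totals with what the policy should retain. This is a small Python sketch; the name pattern is inferred from the snapshot names in the listing above and the function name is hypothetical.

```python
import re
from collections import Counter

# PM snapshot names in the listing look like:
#   2010-10-01 19:00:06 hourly_AA-NAS01_windows...
# The pattern below is inferred from that listing (an assumption).
NAME_RE = re.compile(
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} (hourly|daily|weekly|monthly)_"
)

def count_by_class(snap_list_output):
    """Count snapshots per retention class; other snapshots are ignored."""
    counts = Counter()
    for line in snap_list_output.splitlines():
        match = NAME_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# A few lines taken from the listing above.
sample = """\
  0% ( 0%)    0% ( 0%)  Oct 01 15:16  dfpm_base(dataset-id-717)conn1.0 (snapmirror)
  0% ( 0%)    0% ( 0%)  Oct 01 15:14  2010-10-01 19:14:56 daily_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 15:00  2010-10-01 19:00:06 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  0% ( 0%)    0% ( 0%)  Oct 01 14:00  2010-10-01 18:00:10 hourly_AA-NAS01_windows.-.kcsdata.ntdata.personal
  4% ( 0%)    4% ( 0%)  Sep 19 00:00  2010-09-19 04:00:10 weekly_AA-NAS01_windows.-.kcsdata.ntdata.personal
"""

print(dict(count_by_class(sample)))
```

The dfpm_base snapshot used by SnapMirror does not match the pattern, so it is left out of the counts.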

rshiva
19,530 Views

Hi,

Can you please zip and forward the log directory (in the DFM installation directory), and also send me the output of the following commands on the DFM server:

dfm diag

dfpm backup list <dataset_name_or_id>

Thanks and regards

Shiva Raja

dboutorwincat
19,529 Views

Shiva,

I’m hoping this email gets to you. I just obtained the logs from the customer.

Best Regards,

Dennis Boutorwick | Solutions Architect

41050 West Eleven Mile Road | Novi, MI 48375

Mobile: 248.910.5907

email: dennis.boutorwick@tatatechnologies.com

website: www.tatatechnologies.com

dboutorwincat
18,371 Views

Thanks Shiva,

I’ve alerted the customer, and they have allocated an additional 20GB for the root drive of their DFM server. Will I have to restart the DFM services for the conformance engine to start doing its job again?

I will have access to their server in the morning (Eastern Standard Time) and report back to you.

Much thanks for all the help so far, by the way – really really appreciate it.


rshiva
18,371 Views

Hi Dennis,

I bet the DFM Monitor service is down, so it's better to restart the services. Once that is done, check that all the services are running, check the output of dfm diag, and make sure that you don't get any free-space error messages. Then try running an on-demand conformance check on any one dataset and look at the results.

And besides, no formalities 🙂 That's why we have this community - to help each other :-).

Thanks and regards

Shiva Raja

dboutorwincat
18,372 Views

Yep, adding more space to the root drive of the DFM server fixed the problem.

Much thanks again, you are a genius.


dboutorwincat
19,530 Views

Thanks for the response Gary.  There are no manual snapvault schedules - they are all controlled by PM.   In fact, PM is taking the snapshots when it is supposed to, just not deleting the old ones it has taken.  Happening on all the filers that PM is managing (8 total filers, so you can see where this problem is snowballing out of control).  Weird stuff.

jerome_barrelet
18,371 Views

Hi,

We are experiencing a snapshot deletion issue on our nearline storage; every backup is controlled by DFM 4.0.2D2.

DFM forgets to delete some expired snapshots (shown in blue in my screenshot).

It happens only on CIFS, and on all our volumes (~120 volumes), but it's not always the same snapshots.

I tried a 'DFPM conform' command, but it didn't work. I checked my retention settings and everything is correct (see the screenshot below).

I would like to know if anybody else has the same issue.

Thanks in advance

Jerome

nowiccnow
18,371 Views

Hi there,

I have the same problem as Jerome above.

Besides DFM, we use SME, SMSQL, SMVI, and SnapVault (the latest version of everything).

Some snapshots (mostly hourly ones) seem to stay on our volumes here and there for no reason.

We have fewer volumes than poor Jerome, but we are clearly disappointed at having to manually remove rogue snapshots every day or week on our 40 volumes.

We opened a case this year, but the NetApp engineers didn't help much, struggling with dfmdc & co.

They said that 4.0.2D2 should resolve our issue because we were on 4.0.1*, and it does NOT.

A question for Jerome:

Which GUI does your first screenshot come from?

I didn't find it in NMC. Is it in System Manager? (the heavy GUI equivalent of the HTTP one?)

regards,
C.

p_schmitter
18,371 Views

I will be back in the office on 11.08.2011.

Your mail will not be forwarded.

In urgent cases, please contact Mr. Michael Lengowski.

jerome_barrelet
15,735 Views

Hi,

To answer your question, it's just System Manager version 2 Beta.

We have also opened a case with NetApp; I will let you know if I get a better result.

Jerome
