Data Backup and Recovery

No space left on a volume. After deleting some snapshots through FilerView and backups through SnapManager for SQL, I didn't see any change in the volume's available size.

weissasset

Initially I got errors (0xc00408d3, 0xc00402c2, and 0x00402206) during a daily snapshot of a SQL database through SnapManager for SQL. I called NetApp support many times but didn't get much help. It's been 3 weeks since the initial call. I really need help to get this issue solved. I deleted all backups through SnapManager for SQL, but I don't see any change in the volume's available size. I also deleted some snapshots through FilerView and still didn't see any change in the volume's available size.

What have I not done right?


BrendonHiggins

I have had this problem before and had to grow the volume in order for it to sort itself out. Once it is running again you can shrink it back. {FlexVol assumed} The problem is with how LUNs use reserve. The snaps go into the volume's data space first, then fill up the reserve space.

If you connect to the console you can use the command

df -V -g

To view all the volumes on the filer and how the space {in GB} is used. Use the command

df -A -g

To see how much space you have in the aggregates. Once aggregates are above 80% full you are heading for pain, but that's another story...

vol size {volname} +5g

Will grow the volume by 5 GB

vol size {volname} -5g

Will shrink it back by 5 GB

Good luck

weissasset

Are you saying that I have to increase the volume for it to clean itself up (shrink back)? I'm new to NetApp SAN. I just increased the volume by 10 GB. How long will it take to shrink back? How do I know whether increasing by 10 GB is enough?

Thanks,

weissasset

I increased the volume by 10GB. Now I can make a snapshot without any error. I deleted about 2/3 of all snapshots. It's been over 5 hours, but the volume consumption has not gone down.

stetson

This can happen in two scenarios that I have seen thus far:

  1. Deletions sometimes take some time to complete on heavily utilized systems.
  2. If the aggregate is already over-committed, there is no space to give back.

The latter seems to be the most probable. To confirm this theory, try these two commands:

  1. "aggr show_space"
  2. "vol status -v" and look for "guarantee"

In the "aggr" command, the output is self-explanatory. In the latter, if you see ANY volume that has guarantee set to "none", that is in essence a volume that can be sized beyond the physical confines of the aggregate.
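As a rough illustration of what "over-committed" means here, the sketch below adds up the sizes of the volumes in an aggregate and compares the total with the aggregate's usable capacity. The helper name, volume names, and sizes are all hypothetical; on a real filer the numbers would be read off "aggr show_space" and "vol status -v".

```python
def is_overcommitted(aggregate_gb, volume_sizes_gb):
    """Return True if the volumes' combined size exceeds the aggregate."""
    return sum(volume_sizes_gb.values()) > aggregate_gb

# Hypothetical example values, not taken from this thread.
volumes = {"vol_data": 400, "vol_log": 300, "vol_thin": 500}

# 400 + 300 + 500 = 1200 GB of volumes in a 1000 GB aggregate.
print(is_overcommitted(1000, volumes))
```

This can only happen when at least one volume has no space guarantee, which is why the guarantee setting is worth checking.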

Please confirm.

weissasset

All volumes show "guarantee=volume".

stetson

What version of ONTAP? Do the numbers in "aggr show_space" look as expected?

weissasset

I think the version is 7.2.4L1. I expected the log volume (the trouble volume) to consume less space, since I deleted about 2/3 of the snapshots within it. It's been over 5 hours, but the volume consumption has not gone down. I also increased the volume by 10GB, and now I can make a snapshot without any error, but I don't see any change in the volume consumption.

stetson

Check space reservations on your luns. If you have space reservations enabled, the luns will require twice their size in the volume for lun snapshots. There is a whole section on space reservations in the ONTAP documentation which will do you more justice than I can do in this little box 😉

Here's a link with respect to your ONTAP release:

http://now.netapp.com/NOW/knowledge/docs/ontap/rel724L1/html/ontap/bsag/4cr-f3.htm

The command "lun show -v" will tell you if space reservation is enabled on your luns.

Let me know if this seems like the issue; I'm now suspecting it is.

weissasset

Yes, space reservation is enabled on all luns. Is that the best practice? The SAN was implemented by a well-known Boston consulting firm, and I was strongly advised to stay with these settings. I'm very new to NetApp SAN. I got the snapshot to work by increasing the size of the volume. I also deleted both SQL backups and snapshots, but I don't see any increase in available volume space. It's been a week since I deleted them. How do I reclaim the space after deleting the snapshots?

stetson

It's typically the best practice to enable space reservations on luns for the reasons cited in the documentation link I referenced. There is also a concept called thin provisioning, which among other things disables space reservations. Each has its advantages and drawbacks, which I recommend you become intimately familiar with before changing anything. Here is a link on Thin Provisioning:

http://www.netapp.com/us/library/technical-reports/tr-3483.html

Just for further clarification. This is about volumes with luns, right? And the problem is that you are deleting data inside your luns (from the host) and some lun snapshots and not getting space back on the storage system?

Please clarify.

weissasset

1. I used FilerView: Volumes -> Snapshots -> Manage, then selected the snapshots (two thirds of them) within the trouble volume and deleted them.

2. I also used SnapManager for SQL Server -> Action -> Delete Backup... -> selected the LUN within the trouble volume, chose to delete the oldest backups in excess of "1", then clicked Delete. (I'm not really sure whether this method deletes data inside the LUN or not. A NetApp support engineer told me that this deletes the snapshots, which didn't make sense to me.)

stetson

Data inside the lun can only be managed from the host that is using the lun.

But I think the remaining disconnect is between what you expect to see and what you are actually seeing. Perhaps there is no space to be recovered at all. You can take 200 snapshots of your lun this very minute and take up hardly any space at all beyond what you are already using. Then delete them once more and not see any gains.

Do understand that with luns, the first snapshot will take up exactly the space of the lun. So a 300GB lun with snapshots will require a volume of 600GB or more with space reservations. With a 300GB lun in a 550GB volume, you WILL NOT be able to take a snapshot and you WILL get a space error.

But do the math and look at the volume, lun and snapshots carefully and consider what I just said. Is there really space to recover?
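The arithmetic in this post can be written down as a small sketch. It models only the "twice the LUN size" rule described above; the function names and the reserve_pct parameter are illustrative assumptions, not ONTAP setting names, and the model ignores the space taken by changed blocks once snapshots start to diverge.

```python
def min_volume_size_gb(lun_gb, reserve_pct=100):
    """Smallest volume that holds the LUN plus its snapshot reserve.

    reserve_pct=100 models the "twice the LUN size" rule from the post;
    it is an assumed parameter for this sketch.
    """
    return lun_gb * (100 + reserve_pct) / 100

def can_take_first_snapshot(lun_gb, volume_gb, reserve_pct=100):
    """True if the volume is large enough under the rule above."""
    return volume_gb >= min_volume_size_gb(lun_gb, reserve_pct)

print(min_volume_size_gb(300))            # size needed for a 300GB lun
print(can_take_first_snapshot(300, 550))  # the 550GB volume case
print(can_take_first_snapshot(300, 600))  # the 600GB volume case
```

Under these assumptions the 550GB volume fails the check and the 600GB volume passes, which is the same conclusion as the worked example in the post.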

See if these links help:

http://now.netapp.com/NOW/knowledge/docs/ontap/rel724L1/html/ontap/onlinebk/2snap8.htm

https://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb39654

http://now.netapp.com/NOW/knowledge/docs/ontap/rel724L1/html/ontap/bsag/4cr-f5.htm

weissasset

I think I'm getting close to the solution. Thanks for your advice and info. But I still can't determine which snapshot consumes the most space on the volume. When I deleted the snapshot with 28% used 15% total, the 28% used and 15% total moved to the previous snapshot. Do I have to delete ALL snapshots in order to free up the space? The following are the results of the df and snap list commands.

bossana>df

/vol/mexico_log2/            73400320  71608900  1791420   98%  /vol/mexico_log2/
/vol/mexico_log2/.snapshot          0  12355224        0  ---%  /vol/mexico_log2/.snapshot

-----------------------------------------------------------------------------------------------------------

BOSSANA> snap list mexico_log2
Volume mexico_log2
working...

  %/used       %/total  date          name
----------  ----------  ------------  --------
  0% ( 0%)    0% ( 0%)  Sep 08 12:57  sqlsnap__mexico_09-08-2008_12.56.24
 28% (28%)   15% (15%)  Aug 19 01:15  sqlsnap__mexico_08-19-2008_01.15.00__daily
 29% ( 2%)   16% ( 1%)  Aug 18 01:15  sqlsnap__mexico_08-18-2008_01.15.00__daily
 29% ( 0%)   16% ( 0%)  Aug 17 12:30  sqlsnap__mexico_08-17-2008_12.30.00__weekly
 29% ( 0%)   16% ( 0%)  Aug 17 01:15  sqlsnap__mexico_08-17-2008_01.15.00__daily
 29% ( 0%)   16% ( 0%)  Aug 16 01:15  sqlsnap__mexico_08-16-2008_01.15.00__daily
 30% ( 0%)   16% ( 0%)  Aug 15 01:15  sqlsnap__mexico_08-15-2008_01.15.00__daily
 30% ( 1%)   17% ( 0%)  Aug 14 01:15  sqlsnap__mexico_08-14-2008_01.15.00__daily
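The df figures above can be sanity-checked with a few lines of arithmetic. The sketch below assumes df reported sizes in kilobytes (no -g or -m flag was given) and simply converts them to GB; the variable and function names are illustrative.

```python
KB_PER_GB = 1024 * 1024  # assumes df reports sizes in 1 KB units

def kb_to_gb(kb):
    """Convert a size in kilobytes to gigabytes."""
    return kb / KB_PER_GB

# Values copied from the df output for /vol/mexico_log2/.
total_kb = 73400320
used_kb = 71608900
avail_kb = 1791420
snap_used_kb = 12355224  # "used" column of the .snapshot line

pct_used = used_kb / total_kb * 100
snap_pct_of_total = snap_used_kb / total_kb * 100

print(f"volume size:    {kb_to_gb(total_kb):.1f} GB")
print(f"used:           {kb_to_gb(used_kb):.1f} GB ({pct_used:.1f}%)")
print(f"available:      {kb_to_gb(avail_kb):.1f} GB")
print(f"snapshot usage: {kb_to_gb(snap_used_kb):.1f} GB "
      f"({snap_pct_of_total:.1f}% of volume)")
```

Under that assumption the volume is 70 GB, and the snapshots account for roughly 17% of it, which lines up with the cumulative %/total figure on the oldest snapshot in the snap list output.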

stetson

There are two ONTAP commands that allow you to see that:

snap delta

snap reclaimable

Play with those and see how you can determine what you are looking for.

Lemme know .....

weissasset

Thanks for all your help. Those commands and article links you gave me were very useful. They led me to my solution. I'd like to emphasize that this is only my solution; it may not suit others. I simply turned on SNAP AUTODELETE for the trouble volume and it started deleting snapshots until it reached 20% free space (the default). Since this is a SQL server for research, not production, I can take more risks. I need to learn more about managing space on a NetApp SAN for the future. BTW, I set up a nightly backup job on the SQL server for all databases. Even if I lose this server, I still have backups of all the databases.
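For readers wondering what "deleting snapshots until it reached 20% free space" amounts to, here is a deliberately simplified model. It assumes snapshots are removed oldest first and that each one frees a known amount of space; the function name, snapshot names, and sizes are hypothetical, and the real autodelete feature has further settings (trigger, delete order, and so on) that this sketch does not cover.

```python
def simulate_autodelete(volume_gb, free_gb, snapshots, target_free_pct=20):
    """Delete snapshots oldest-first until the free-space target is met.

    snapshots: list of (name, reclaimable_gb) tuples, oldest first.
    Returns the names deleted and the resulting free space in GB.
    """
    deleted = []
    remaining = list(snapshots)
    while remaining and free_gb / volume_gb * 100 < target_free_pct:
        name, reclaim_gb = remaining.pop(0)
        free_gb += reclaim_gb
        deleted.append(name)
    return deleted, free_gb

# Hypothetical volume: 100 GB with 5 GB free and four snapshots.
snaps = [("snap_oldest", 4), ("snap_older", 6),
         ("snap_old", 8), ("snap_recent", 3)]
deleted, free_after = simulate_autodelete(100, 5, snaps)
print(deleted, free_after)
```

In this example the three oldest snapshots are removed (5 + 4 + 6 + 8 = 23 GB free) and the most recent one survives, because the 20% target is already met.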

stetson

Awesome. You are quite welcome. Glad to hear this is resolved for you.
