ONTAP Discussions
So we had a bunch of flexible volumes, all mounted via nfs:
/vol/vm_dns
/vol/vm_vpn
/vol/vm_misc
In order to deduplicate them on DataONTAP 7.3.3, we jammed them all into a single volume:
/vol/vms/vm_dns
/vol/vms/vm_vpn
/vol/vms/vm_misc
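To turn deduplication on for the consolidated volume, the 7-mode commands look roughly like this (a sketch, not necessarily the exact invocation we used):
netapp0a> sis on /vol/vms
netapp0a> sis start -s /vol/vms
netapp0a> sis status /vol/vms
(The -s flag makes the first run scan the data already in the volume instead of only new writes.)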
Just 2 quick thoughts:
Hope that helped.
Sebastian
We're using qtree quotas -- nonetheless, when the volume accidentally filled up, each VM got the lesser of its qtree quota or the remaining free space in the volume, so they all failed at once. Separate volumes would give better isolation.
We're using volume snapmirror.
Each of these qtrees consists of thousands of files (these are Xen VMs with thousands of NFS files, not a single virtualized container à la VMware).
Indeed, you have to choose.
But how big are your old volumes (qtrees)? I think that when they are 500 GB - 1 TB, it makes sense to keep separate volumes.
But why do you want qtrees? Do you have separate backup retention policies? We use one backup policy for all our VMs: not that we need the same policy for every VM, but it's easy, and then you can use SnapMirror.
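(For illustration only: with volume SnapMirror, the whole VM volume is one line in /etc/snapmirror.conf on the destination filer. The destination name and the nightly schedule below are made up:)
netapp0a:xen_vms  netapp0b:xen_vms_mirror  -  0 23 * *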
Reinoud
We want qtrees because otherwise we get 0% dedupe. Though, at 16% dedupe, it hardly seems worth it:
netapp0a> df -ih /vol/xen_vms/
Filesystem iused ifree %iused Mounted on
/vol/xen_vms/ 8309590 13826990 38% /vol/xen_vms/
netapp0a> df -h /vol/xen_vms/
Filesystem total used avail capacity Mounted on
/vol/xen_vms/ 254GB 154GB 99GB 61% /vol/xen_vms/
/vol/xen_vms/.snapshot 0KB 380MB 0KB ---% /vol/xen_vms/.snapshot
netapp0a> df -sh /vol/xen_vms/
Filesystem used saved %saved
/vol/xen_vms/ 154GB 30GB 16%
rdfile /etc/quotas
#Auto-generated by setup Wed Oct 17 21:40:24 GMT 2007
"/vol/xen_vms/fisheye.svc" tree 6291456K - 6291456K 6291456K -
"/vol/xen_vms/cacti.svc" tree 4194304K - 4194304K 4194304K -
"/vol/xen_vms/xfer.util" tree 8388608K - 8388608K 8388608K -....
Each of these is a Xen VM running either Debian Etch or Debian Lenny.
Your dedupe ratio seems rather low. What OS are your VMs running? What is your dedupe schedule?
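For reference, on 7-mode something like the following shows the current schedule and kicks off a scan of the data already in the volume (path taken from your df output; adjust as needed):
netapp0a> sis config /vol/xen_vms
netapp0a> sis start -s /vol/xen_vms
netapp0a> sis status /vol/xen_vms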
Thanks,
Mitchell
mitchells wrote:
Your dedupe ratio seems rather low. What OS are your VMs running? What is your dedupe schedule?
That is probably because he does not have that many VMs in the volume (150 GB is not much), and perhaps they are not all running the same OS.
Each of these is a Xen VM running either Debian Etch or Debian Lenny. Etch and Lenny are different enough that they probably don't dedupe much against each other.
I would make sure that you have block alignment between your guest filesystems, your partition table inside the VMDK, and the 4K boundary of WAFL.
I would also take a look at the placement of your guest OS swap and your VM swap.
Did you clone your Etch and Lenny installs or is every install fresh from media?
Thanks,
Mitchell
As far as I can tell, there is no block alignment possible for an NFS volume. (Note: VMware is not installed.)
Swap is on local disk.
Each VM is a copy of a prior one (not always the same prior one, but definitely a prior one).
There is no block alignment for an NFS volume as such. NFS handles file semantics, which means WAFL is going to lay the files down aligned. However, if you are storing a virtual hard drive (a VHD in the Xen world), the partitions and filesystems inside it need to be aligned with WAFL's 4K blocks.
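A rough way to check alignment from inside a Linux guest (assuming the virtual disk shows up as /dev/xvda):
# fdisk -lu /dev/xvda
With 512-byte sectors, a partition start sector that is a multiple of 8 sits on a 4K boundary.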
We are running our Xen VMs in a style which is both old-school and different from how most of the world does them. Instead of using loop-back filesystems (à la .vmdk files), we store the actual files which comprise the VMs directly on WAFL and then use NFS root to boot directly from the filer. This has a number of advantages (and disadvantages). If we ever need to access an individual file from a VM, we don't need any intermediary software to get at it: we simply mount the volume somewhere else and all the files are right there. It also makes backups both more useful (again, because we can directly back up each file) and a lot slower (because we're backing up millions of small files instead of hundreds of large .vmdk files). But that is neither here nor there, except that it makes discussion of large .vmdk files inapplicable to us.
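For the curious, a domU config for this style of setup looks roughly like the following; the kernel/initrd versions, memory size, and bridge name are illustrative, and the qtree name is just one of ours picked as an example:
# /etc/xen/cacti.svc -- illustrative sketch only
kernel  = "/boot/vmlinuz-2.6.26-2-xen-686"
ramdisk = "/boot/initrd.img-2.6.26-2-xen-686"
memory  = 256
name    = "cacti.svc"
vif     = [ "bridge=xenbr0" ]
root    = "/dev/nfs"
extra   = "nfsroot=netapp0a:/vol/xen_vms/cacti.svc ip=dhcp"
The guest kernel (or its initrd) needs NFS-root support for this to boot.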
Given our setup, what we really want is for each VM to live inside its own volume, not inside a qtree in one big "VMs" volume. But in order to take advantage of de-dupe, we were forced to move from individual volumes into one big volume. And then that one big volume filled up, which caused all of our VMs to go down at once.
What we really want is for de-dupe to work across volumes.
You are using VHDs, correct?
Deduplication occurs at the volume level, so there is no way to span the savings across volumes. If you are trying to avoid the volume filling up, you can implement a snapshot autodelete policy and a volume autogrow policy.
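In 7-mode that would be something along these lines (volume name assumed, sizes illustrative):
netapp0a> snap autodelete xen_vms on
netapp0a> vol autosize xen_vms -m 400g -i 20g on
netapp0a> vol options xen_vms try_first volume_grow
The try_first option just decides which of the two policies kicks in first as the volume approaches full.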
Hasn't deduplication worked across volumes since Data ONTAP 7.3?
The data still has to be in the same aggregate, but not the same volume.
So essentially you should be able to go back to the volume structure you had before.
From the Data ONTAP 7.3 release notes:
In Data ONTAP 7.3 and later, the fingerprint database and the change logs used by the deduplication process are located outside the volume, in the aggregate. The fingerprint database and the change logs form the deduplication metadata. Because the metadata resides in the aggregate outside the volume, it is not included in the FlexVol volume Snapshot copies. This change enables deduplication to achieve higher space savings.
Hasn't deduplication worked across volumes since Data ONTAP 7.3?
Nope - sadly that's not the case.
Despite the fact that the metadata has been shifted into the aggregate, the fingerprint database is still maintained on a per-volume basis, and the scope of deduplication is still a single volume.
Have a look at TR-3505, starting at page 7:
http://www.netapp.com/us/library/technical-reports/tr-3505.html
Regards,
Radek