
De-dupe across volumes (not within a volume)

So we had a bunch of flexible volumes, all mounted via NFS:

  /vol/vm_dns

  /vol/vm_vpn

  /vol/vm_misc

In order to deduplicate them on Data ONTAP 7.3.3, we jammed them all into a single volume as qtrees:

  /vol/vms/vm_dns

  /vol/vms/vm_vpn

  /vol/vms/vm_misc

And that worked, but now we're really feeling the cost.
First, copying a qtree takes radically longer than copying a volume of the same size.  Second, our storage is no longer independent.  We had a SnapMirror slowdown, and all of a sudden each qtree was limited to the lesser of its quota or the space available in the overall volume (in other words, all the VMs ran to 0% free disk space at once).  And we no longer have the same tools to view each VM's storage (e.g. vol status, df, etc.).
Other than monitoring disk space better... is there anything we can do to regain the waffly goodness of individual volumes for each of our VMs?

Re: De-dupe across volumes (not within a volume)

Just 2 quick thoughts:

  • Get familiar with (tree) quotas
    • I wouldn't set hard quotas, but you could use the soft ones for warnings/reporting
  • Don't use Qtree SnapMirror (QSM); use Volume SnapMirror (VSM) instead if possible.
    • QSM walks through the file system tree, so it takes a lot longer and is not dedupe aware.
    • VSM, on the other hand, is fast and dedupe aware (and you could use compression). It might be even faster than before (because of the 'global' dedupe...)

Hope that helped.

Sebastian

Re: De-dupe across volumes (not within a volume)

Indeed, you have to choose.

But how big are your old volumes (now qtrees)? I think that when they are 500 GB - 1 TB, it makes sense to make separate volumes.

But why do you want qtrees? Do you have separate backup retention policies? We use one backup policy for all our VMs: not because every VM needs the same policy, but because it's easy and then you can use SnapMirror.

Reinoud

Re: De-dupe across volumes (not within a volume)

We're using qtree quotas -- nonetheless when the volume got accidentally filled each vm got the lesser of the qtree quota or the available space in the volume.  Thus they all failed at once.  Separate volumes would be more isolated.
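The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the "lesser of quota or volume space" rule as I understand it, not anything Data ONTAP exposes; the function name and the usage figures are made up, and only the quota sizes come from our /etc/quotas.

```python
def effective_free_kb(quota_kb, qtree_used_kb, volume_avail_kb):
    """Space a qtree can still write: the smaller of its remaining
    quota and the space left in the containing volume."""
    quota_left = max(quota_kb - qtree_used_kb, 0)
    return min(quota_left, max(volume_avail_kb, 0))


# Quota sizes are from the thread; the "used" figures are hypothetical.
qtrees = {
    "fisheye.svc": (6291456, 3000000),
    "cacti.svc": (4194304, 1000000),
}

# Compare a healthy volume with one that has been filled completely.
for volume_avail_kb in (50_000_000, 0):
    for name, (quota_kb, used_kb) in qtrees.items():
        free_kb = effective_free_kb(quota_kb, used_kb, volume_avail_kb)
        print(f"{name}: volume avail {volume_avail_kb} KB -> qtree free {free_kb} KB")
```

Once the volume's available space reaches zero, every qtree reports zero free regardless of its own quota, which is the "all failed at once" effect.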

We're using volume snapmirror.

Each of these qtrees consists of thousands of files (these are Xen VMs with thousands of NFS files, not a single virtualized container à la VMware).

Re: De-dupe across volumes (not within a volume)

We want qtrees because otherwise we get 0% dedupe.  Though, at 16% dedupe, it hardly seems worth it:

netapp0a> df -ih /vol/xen_vms/
Filesystem               iused      ifree  %iused  Mounted on
/vol/xen_vms/          8309590   13826990     38%  /vol/xen_vms/
netapp0a> df -h /vol/xen_vms/
Filesystem               total       used   avail capacity  Mounted on
/vol/xen_vms/            254GB      154GB    99GB      61%  /vol/xen_vms/
/vol/xen_vms/.snapshot     0KB      380MB     0KB     ---%  /vol/xen_vms/.snapshot
netapp0a> df -sh /vol/xen_vms/
Filesystem                used      saved  %saved
/vol/xen_vms/            154GB       30GB     16%
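For reference, the 16% figure is consistent with %saved being computed as saved / (used + saved), i.e. savings relative to what the data would occupy without dedupe. That formula is my assumption from the numbers above, and the function name below is just for illustration.

```python
def percent_saved(used_gb, saved_gb):
    """Dedupe savings as a share of the logical (pre-dedupe) size.

    Assumes %saved = saved / (used + saved), which matches the
    df -sh figures quoted in this post.
    """
    logical_gb = used_gb + saved_gb
    if logical_gb == 0:
        return 0.0
    return 100.0 * saved_gb / logical_gb


# Figures from the df -sh output: 154GB used, 30GB saved.
print(round(percent_saved(154, 30)))
```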


rdfile /etc/quotas
#Auto-generated by setup Wed Oct 17 21:40:24 GMT 2007
"/vol/xen_vms/fisheye.svc"    tree    6291456K    -    6291456K    6291456K    -
"/vol/xen_vms/cacti.svc"    tree    4194304K    -    4194304K    4194304K    -
"/vol/xen_vms/xfer.util"    tree    8388608K    -    8388608K    8388608K    -....

Each of these is a Xen VM running either Debian Etch or Debian Lenny.
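
To keep an eye on how much space the tree quotas promise in total, the quota entries above can be parsed with a short script. This is a rough sketch: it assumes the usual 7-mode column order (target, type, disk, files, thold, sdisk, sfile), treats a size without a suffix as KB, and only uses the two complete entries quoted here. The class and function names are made up.

```python
import shlex
from typing import NamedTuple, Optional


class TreeQuota(NamedTuple):
    target: str
    disk_kb: Optional[int]       # hard disk limit, None if "-"
    soft_disk_kb: Optional[int]  # soft disk limit, None if "-"


def parse_size_kb(field):
    """Convert a size such as '6291456K' to KB; '-' means no limit."""
    if field == "-":
        return None
    units = {"K": 1, "M": 1024, "G": 1024 ** 2}
    suffix = field[-1].upper()
    if suffix in units:
        return int(field[:-1]) * units[suffix]
    return int(field)  # assumption: a bare number is already in KB


def parse_tree_quotas(text):
    """Return the tree quota entries found in /etc/quotas-style text."""
    quotas = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = shlex.split(line)      # strips the quotes around the target
        fields += ["-"] * (7 - len(fields))  # pad omitted trailing columns
        target, qtype, disk, _files, _thold, sdisk, _sfile = fields[:7]
        if qtype == "tree":
            quotas.append(TreeQuota(target, parse_size_kb(disk), parse_size_kb(sdisk)))
    return quotas


sample = '''#Auto-generated by setup Wed Oct 17 21:40:24 GMT 2007
"/vol/xen_vms/fisheye.svc"    tree    6291456K    -    6291456K    6291456K    -
"/vol/xen_vms/cacti.svc"    tree    4194304K    -    4194304K    4194304K    -
'''

quotas = parse_tree_quotas(sample)
total_kb = sum(q.disk_kb for q in quotas if q.disk_kb is not None)
print(f"{len(quotas)} tree quotas, {total_kb / 1024 ** 2:.1f} GB of hard limits")
```

Comparing that total against the volume size from df shows how far the volume is overcommitted.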


Re: De-dupe across volumes (not within a volume)

Your dedupe ratio seems rather low.  What OS are your VMs running?  What is your dedupe schedule?

Thanks,

Mitchell

Re: De-dupe across volumes (not within a volume)

mitchells wrote:

Your dedupe ratio seems rather low.  What OS are your VMs running?  What is your dedupe schedule?

That is probably because he does not have that many VMs in the volume (150 GB is not much). And perhaps not all the same OS.

Re: De-dupe across volumes (not within a volume)

Each of these is a Xen VM running either Debian Etch or Debian Lenny.  Etch and Lenny are different enough that they probably don't dedupe much against each other.

Re: De-dupe across volumes (not within a volume)

I would make sure that you have block alignment between your guest filesystems, your partition table inside the VMDK, and the 4K boundary of WAFL.
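
As a quick illustration of what "aligned" means here: a partition is aligned when its starting byte offset is a multiple of the 4K block size. The sketch below assumes 512-byte sectors and a guest with a partitioned disk image; the function name and the example start sectors are only for illustration.

```python
WAFL_BLOCK_BYTES = 4096  # the 4K block size mentioned above


def is_aligned(start_sector, sector_bytes=512, block_bytes=WAFL_BLOCK_BYTES):
    """True if the partition's first byte falls on a block boundary."""
    return (start_sector * sector_bytes) % block_bytes == 0


# Start sector 63 is the classic legacy MS-DOS partition offset;
# 2048 (1 MiB) is the default used by newer partitioning tools.
for start in (63, 2048):
    print(start, is_aligned(start))
```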

I would also take a look at the placement of your guest OS swap and your VM swap.

Did you clone your Etch and Lenny installs or is every install fresh from media?

Thanks,

Mitchell

Re: De-dupe across volumes (not within a volume)

As far as I can tell, there is no block alignment possible for an NFS volume.  (Note: VMware is not installed.)

Swap is on local disk.

Each VM is a copy of a prior one (not always the same prior one, but definitely a prior one).