I have a RHEL5 VM on vSphere 4.1, running InterSystems Caché. All storage is currently in VMDK files on an NFS datastore on a FAS2040 (7.3.5P1). Every night, Caché does a full dump of its database (~120GB) to an independent VMDK located on a volume that does not do snapshots; this dump is subsequently backed up to tape using Backup Exec Remote Agent. Each dump overwrites the previous one. I'm wondering if it's possible to create a volume specifically for those backups, export it via NFS, mount the NFS share directly inside the VM, run the dump, then snap it and have asis deduplicate the data between nightly backups (the data doesn't change that much from night to night), so that I could keep older backups online in the .snapshot folder, rather than restore from tape when I need one.
I think I'm doing something wrong there, but I can't figure out what. For the time being, I'm testing with Windows and CIFS - I created a 10GB volume, enabled deduplication, shared it via CIFS, and copied a 1GB file there. Manually ran asis on the volume, created a snapshot, then copied the same file there again, overwriting the original - but it's the very same file. Ran asis again, and it didn't find anything to deduplicate, and now I had a total of 20% used space on the volume. Took another snapshot, ran asis - nothing. Copied the file again, ran asis again - still nothing, and 30% used. What's the proper way to have asis deduplicate between current data and a snapshot and/or existing snapshots? Is there one?
I created the volume from system manager and checked 'enable deduplication' on creation. To start deduplication I ran 'sis start /vol/test_bak' from CLI. I just tried the same thing with the -s switch, but it didn't help - I've got a 1GB snapshot and 1GB data on the volume, and asis can't find anything to deduplicate.
netapp2> df -hs test_bak
Filesystem                  used      saved    %saved
/vol/test_bak/            2053MB        0MB        0%

netapp2> df -h test_bak
Filesystem                 total       used      avail  capacity  Mounted on
/vol/test_bak/              10GB     2053MB     8186MB       20%  /vol/test_bak/
/vol/test_bak/.snapshot      0MB     1026MB        0MB      ---%  /vol/test_bak/.snapshot

netapp2> sis status /vol/test_bak
Path              State     Status   Progress
/vol/test_bak     Enabled   Idle     Idle for 00:18:27
The *file* itself does not have any data to be deduplicated - it's an encrypted archive. However, the snapshot copy and the live copy are exactly the same file - what I need is for the filer to deduplicate the live data against snapshot(s), if that is at all possible.
You cannot dedupe from Snapshot to Active File System.
The data in the Active File System (AFS) can be deduplicated, if the content allows it (and you are right, the kind of file you are testing with does not). Snapshots are a READ-ONLY copy of the AFS inodes (incl. pointers) and therefore cannot be deduped separately.
All you can do is dedupe the data in the AFS and then snapshot the volume; that saves space in the snapshot as well, because the blocks are already multi-pointered by the AFS dedupe process.
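To make the accounting concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not how WAFL or asis is actually implemented: the class ToyVolume, its methods, and the 4-block "file" are all made up for illustration. The one assumption it encodes is that dedupe only merges blocks referenced by the active file system, while snapshots keep pinning whatever blocks they pointed to when they were taken.

```python
class ToyVolume:
    """Toy block-accounting model (hypothetical; not the real WAFL layout)."""

    def __init__(self):
        self.blocks = {}      # block id -> content
        self.files = {}       # active file system: file name -> list of block ids
        self.snapshots = []   # each snapshot: frozen file name -> block ids map
        self._next_id = 0

    def write(self, name, chunks):
        """Write or overwrite a file; every write allocates fresh blocks."""
        ids = []
        for chunk in chunks:
            self.blocks[self._next_id] = chunk
            ids.append(self._next_id)
            self._next_id += 1
        self.files[name] = ids

    def snapshot(self):
        """Freeze the current pointers; the referenced blocks stay pinned."""
        self.snapshots.append({n: list(ids) for n, ids in self.files.items()})

    def dedupe_active(self):
        """Merge identical blocks, looking only at the active file system."""
        canonical = {}  # content -> id of the block that is kept
        for name, ids in self.files.items():
            self.files[name] = [canonical.setdefault(self.blocks[i], i) for i in ids]

    def used_blocks(self):
        """Count blocks still referenced by the AFS or by any snapshot."""
        live = {i for ids in self.files.values() for i in ids}
        for snap in self.snapshots:
            for ids in snap.values():
                live.update(ids)
        return len(live)


def overwrite_after_snapshot():
    """Write, snapshot, then overwrite with identical content."""
    vol = ToyVolume()
    dump = [f"chunk-{i}" for i in range(4)]  # 4 unique blocks, like an encrypted archive
    vol.write("backup", dump)
    vol.dedupe_active()
    vol.snapshot()
    vol.write("backup", dump)  # same content, but new blocks
    vol.dedupe_active()        # nothing to merge within the AFS
    return vol.used_blocks()


def dedupe_then_snapshot():
    """Keep two identical copies in the AFS, dedupe, then snapshot."""
    vol = ToyVolume()
    dump = [f"chunk-{i}" for i in range(4)]
    vol.write("a", dump)
    vol.write("b", dump)
    vol.dedupe_active()        # both files now share one set of blocks
    vol.snapshot()             # the snapshot inherits the shared pointers
    return vol.used_blocks()


print(overwrite_after_snapshot(), dedupe_then_snapshot())
```

In this model the first scenario ends up holding twice as many blocks as the second, which mirrors the 20% used / 0% saved result in the test above.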
I thought it might be smart enough to compare current data with existing snapshots, but apparently not. A bit of googling turned up the option cifs.snapshot_file_folding.enable, which seems to do this very thing for CIFS clients - is there an NFS counterpart?
I guess I could just get the Caché admin to rotate backups between backup.1, backup.2, etc, keeping versions in different folders on the active filesystem without Netapp snapshots - less elegant, but I suppose it'll work just fine. In this scenario, asis shouldn't have problems deduplicating between database dumps.
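A minimal sketch of that rotation, in Python, to run inside the VM before the nightly dump. The function name rotate_backups, the keep count and the mount point in the usage note are assumptions for illustration only; nothing here comes from Caché or from the filer. It just shifts backup.1 to backup.2 and so on, drops the oldest, and hands back an empty backup.1 for the new dump.

```python
import shutil
from pathlib import Path


def rotate_backups(base: Path, keep: int = 3) -> Path:
    """Shift backup.1 -> backup.2 -> ..., drop the oldest, return a fresh backup.1."""
    oldest = base / f"backup.{keep}"
    if oldest.exists():
        shutil.rmtree(oldest)  # the oldest generation falls off the end

    # Rename from the highest number down so nothing gets overwritten.
    for n in range(keep - 1, 0, -1):
        src = base / f"backup.{n}"
        if src.exists():
            src.rename(base / f"backup.{n + 1}")

    newest = base / "backup.1"
    newest.mkdir(parents=True)  # empty target directory for tonight's dump
    return newest
```

Usage would be something like target = rotate_backups(Path("/mnt/cache_bak"), keep=7) (hypothetical mount point), then pointing the dump at target. Renames only touch directory entries, so the older dumps stay where they are on disk; how much asis then saves between generations still depends on how similar the dumps are at block level.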