ONTAP Discussions

please explain to me the snap list on the source filer in snapmirror

netappmagic

The vol1 volume is 100% full again. My question is not about solving the volume-full issue, but about understanding snapmirror in a more granular way. Please see the following outputs on the source filer "filer2". It seems to me that the size of the snapshot caused by the snapmirror is 1911GB, and the rest of the space is taken by the volume (FS) itself. Could anybody please explain the output of "snap list vol1" to me in detail?

-  what exactly does the snap include: a complete copy of volume1, plus all snapshots since the first full copy? Why do I have to keep the full copy on the source filer after it has already been copied over to the DR site?

- has this listed snap already been copied to drfiler1, or just the full set of snapshots?

- is there any way to list the data in detail, i.e. what is the full copy of the volume and what are the snapshots, and when was each snapshot taken?

thanks for your help!

filer2> df -rg vol1
Filesystem               total       used      avail   reserved  Mounted on
/vol/vol1/       8193GB     8148GB        0GB        0GB  /vol/vol1/
/vol/vol1/.snapshot        0GB     1911GB        0GB        0GB  /vol/vol1/.snapshot
filer2> snap list vol1
Volume vol1
working...

  %/used       %/total  date          name
----------  ----------  ------------  --------
23% (23%)   23% (23%)  Oct 09 06:00  drfiler(0151735037)_vol1.806 (snapmirror)


netappmagic

Hi Bill,

I need to turn to you again for your help.

To continue our conversation about the size of the snapshot produced by snapmirror: the issue is that the size of the snapshot gradually increases as more data gets removed from this 8TB volume. If I run "df -rg" on the volume repeatedly, the size just keeps getting larger and larger after I removed 1TB of data. Why? Even though I broke off the snapmirror on the DR site, the size of the snapshot (on the source) is still increasing, and if I run "snapmirror status" on the volume, the status shows "transferring". Why? I thought that if I broke off the snapmirror, the transfer should stop.

billshaffer

The source snapshot is used by snapmirror, but it is still a snapshot - a list of pointers to unchanged blocks, and a bunch of blocks that have changed since the snapshot was taken.  The source snapshot will continue to track changes to the volume whether or not snapmirror is active.  In addition, when you delete a large amount of data, the process that "transfers" that data to the snapshots (block reclamation) is not instantaneous - so it's expected to see the snapshot continue to grow for a while after a delete.
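If you want to watch that happen, "snap delta" (7-Mode) reports how much data has changed between each snapshot and the active filesystem; re-running it after a big delete should show the changed-block count climbing as reclamation catches up.  The output below is only an illustration - the numbers are made up, not from your filer:

filer2> snap delta vol1

Volume vol1
working...

From Snapshot                 To                    KB changed   Time         Rate (KB/hour)
----------------------------- --------------------- ------------ ------------ ---------------
drfiler(0151735037)_vol1.806  Active File System    2003615744   0d 18:23     108991023.5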

If the relationship is broken, transferring is stopped.  However, if the relationship is still defined in snapmirror.conf, it will continue to attempt to resync and error out.  It could be during this attempt-error cycle that you see the "transferring" state.  It's also possible that, if you did a break without doing a quiesce, you broke it in the middle of a transfer and its status is "stuck" on the source.  I've seen a couple of different scenarios where "snapmirror status" said "transferring" when it clearly couldn't be.  Bottom line is that if the destination "snapmirror status" says Broken-off, then no transferring is going on.
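For reference, the clean way to stop a relationship is on the destination, and you can confirm the state there afterwards.  Something like this (output abbreviated, and the lag value is just a placeholder):

drfiler1> snapmirror quiesce vol1
drfiler1> snapmirror break vol1
snapmirror break: Destination vol1 is now writable.
drfiler1> snapmirror status vol1
Snapmirror is on.
Source         Destination      State        Lag        Status
filer2:vol1    drfiler1:vol1    Broken-off   02:10:45   Idle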

Bill

netappmagic

>In addition, when you delete a large amount of data, the process that "transfers" that data to the snapshots (block reclamation) is not instantaneous -

Can I understand this sentence as follows: a snapshot not only keeps track of data that was just added, but also of data that was just removed, and therefore the just-removed data will be "transferred" to the snapshot area, and that is why I see the snapshot space growing?

I did quiesce the snapmirror before breaking it off. However, the relationship is still defined in the snapmirror.conf file. As you said, that's why the "transferring" would still be going on, but it will eventually error out?

Thank you very much for staying with me for so long.

billshaffer

If the snapmirror is still scheduled, it will try to kick off on schedule, but will error out pretty quickly with something like "not in a snapmirrored relationship."  You should be able to piece the sequence together from logs, but I think for the full picture you need /etc/messages and /etc/log/snapmirror from both the source and the destination.  Which side are you seeing the "transferring" state on?  If you comment out the entry in snapmirror.conf, it will stop trying to run.
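For example, a destination /etc/snapmirror.conf entry might look something like the line below (the schedule fields here are made up, not yours); putting a "#" in front of it stops the scheduled updates:

#filer2:vol1    drfiler1:vol1    -    0 1,13 * *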

Snapshots don't really track added data.  When you take a snapshot of a volume, you're creating a point-in-time image of that volume.  This works by creating a bunch of pointers in the snapshot to all the allocated blocks in the volume - not really taking any space at this point, because it's all pointers.  As new data is added to the volume, new blocks are allocated.  The snapshot is unaware of this.  When data gets changed/deleted, the new data still gets allocated to new blocks (and the snapshot is still unaware of this new data), but the pointers in the snapshot that pointed to the changed data still exist - now those blocks get deallocated from the volume (since they are no longer valid in the live filesystem) and allocated to the snapshot.  The data doesn't move - it's still on the same physical block - but now it "belongs" to the snapshot, not the volume.

The way snapmirror uses snapshots to know what new/changed data needs to be replicated is by taking a second snapshot and comparing the two.
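You can actually see this on the source with "snap list" while an update is running: the older snapshot is the common base from the last transfer, and the newer one (marked busy) is the point-in-time image currently being replicated.  The output below is illustrative only - the newer snapshot name and the percentages are made up:

filer2> snap list vol1
Volume vol1
working...

  %/used       %/total  date          name
----------  ----------  ------------  --------
  2% ( 2%)    1% ( 1%)  Oct 10 06:00  drfiler(0151735037)_vol1.807 (busy,snapmirror)
 24% (23%)   23% (22%)  Oct 09 06:00  drfiler(0151735037)_vol1.806 (snapmirror)

Once the transfer completes, the older snapshot is deleted on the source and the new one becomes the base for the next update.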

Does that make sense?

Bill

netappmagic

Hi Bill,

I have read your message a few times. Based on my understanding, I honestly still don't understand why the snapshot space keeps getting bigger and bigger. The following number, 2320GB, kept climbing for about half an hour, and reached as much as 3219GB before I had to delete the snapshot using the snap delete command.

source> df -rg vol1
Filesystem               total       used      avail   reserved  Mounted on
/vol/vol1/       8193GB     8148GB        0GB        0GB  /vol/vol1/
/vol/vol1/.snapshot        0GB     2320GB        0GB        0GB  /vol/vol1/.snapshot

So, it did not error out quickly, and I am not sure if it is the result of the scheduled resync in /etc/snapmirror.conf, because the number started to climb immediately, as soon as we deleted the 1TB of data.

That "transferring" status is on the source side, when I run "snapmirror status" on the source volume.

There are "vol1 is full" messages in the /etc/messages file, and also messages that the destination volume is full and the transfer could not be made. The only type of message in /etc/log/snapmirror is that the DR volume is full and the transfer could not be made. So, both volumes were full.

Another basic question, please forgive me: 2320GB here is really the total amount of data that all of the snapshot pointers point to, not the amount of space that the pointers themselves occupy, right? Because pointers would never occupy that much space. If that is right, then it means that once the 1TB of data was removed, the snapshot pointers started to point to those removed blocks, and therefore the amount of data the pointers point to keeps getting bigger and bigger?

billshaffer

Pointers in snapshots take almost no space - you can see this by creating a snapshot of a large volume, and seeing with df and snap list that it has no real size.  In this case, 2320G is the total space of the CHANGED blocks pointed to by the snapshot pointers - data that has not changed in the live filesystem is still just pointed to, and still takes no space in the snapshot.  Yes - when you delete/change data in the live filesystem, the snapshots grow in size.  This is normal.
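If you want to see how much space a particular snapshot is holding, "snap reclaimable" reports approximately how many kilobytes would be freed by deleting it, for example:

filer2> snap reclaimable vol1 drfiler(0151735037)_vol1.806

(It can take a little while to run on a large volume, and the result is an estimate.)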

I'm not sure what you mean by "So, it did not error out quickly, and I am not sure if it is the result of the scheduled resync in /etc/snapmirror.conf, because the number started to climb immediately, as soon as we deleted the 1TB of data."  You need to decouple (in your mind) the snapshot growth from the snapmirror - they are really unrelated.  Snapmirror will use snapshots, taking new ones and removing old ones, to determine what needs to be replicated, but the snapshots are really an independent entity.

So, "it did not error out quickly" - if, in fact, you have broken your snapmirror, the scheduled sync will fail.  It will try again on the same schedule, and fail again.  If the volumes are full, it may error out on that before discovering that the relationship is broken.

"The number started to climb immediately, as soon as we deleted the 1TB of data" - as I said, this is expected snapshot behavior.  Your original snapshot was 1911G.  You deleted 1000G.  Thus, I would expect the snapshot to grow to 2911G, or more if there has been more change to the live filesystem (which we can assume, given the change rate observed earlier).

Bill

netappmagic

So this number, 2320G, or whatever it later grows to, does indeed contain the data that was just deleted? And that is why you expect the size to eventually grow to 2911G, adding the same amount that we just removed? And therefore we could say that snapshots keep track of deleted data as well?

Sorry, Bill, I am a slow man...

billshaffer

Snapshots provide a point-in-time image of the live filesystem - so yes, any data that is changed or deleted from the live filesystem is essentially tracked in the snapshots.
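If you ever need that space back, deleting the snapshot releases those blocks to the volume.  Just keep in mind that the snapmirror-created snapshot is the common base for the relationship, so deleting it on the source (as you did with snap delete) means the mirror can no longer do incremental updates and will need another common snapshot to resync, or a full re-initialize.  For example:

filer2> snap delete vol1 drfiler(0151735037)_vol1.806
filer2> df -rg vol1

The df output should then show the .snapshot usage dropping back toward zero.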

Bill
