VSM (TDP) fails consistently at 342.5GB

arizonavol16 · ‎2017-06-17

Has anyone ever experienced a Volume SnapMirror fail at the exact same spot consistently?

This is the very last mirror as part of a migration project. All other 9 VSM mirrors transferred from the same source to the same destination just fine.

During the initialization phase the transfer gets to 342.5GB and then the source reports that the snapmirror failed, with only a generic message. The destination continues to say transferring for another 30 minutes before it finally stops with a failed message (also generic). The source volume is only using 5% of inodes and 80% of storage space. It is not deduped or compressed.

I have tried multiple things to troubleshoot. I have deleted the snapmirror and volume on the destination and created them again. I have created the destination volume at twice the size and started the mirror. Everything I have tried stops at the same 342.5GB.

On the source I created three QSMs to a bogus volume and those QSMs finished just fine.

The Source = NetApp Release 8.1.4P9D18

Destination = NetApp Release 9.1P2

shamz · ‎2017-06-18

Hi,

You haven't included much information about the source volume. Do you, by chance, have active QSM sessions going to this volume?

S.

arizonavol16 · ‎2017-06-20

All volumes on the source have VSMs going to another DR filer (7-mode to 7-mode) with no issues, and have been replicating like this for years.

I completed 9 of the 10 volumes with TDP mirrors to cDOT and they completed without issue. The only one I am having an issue with transfers 342.6GB and then fails. I can destroy the destination and recreate it, and it still fails at the same 342.6GB mark.

I can do a VSM of this volume10 to a bogus_volume10 on the same filer, then do a TDP mirror of bogus_volume10 to cDOT just fine.

AlexDawson · ‎2017-06-20

Thank you for confirming functionality. Sometimes MPLS links won't run actual 1500 byte packets, but if it is just LAN (where TDP runs most of the time anyway), that rules that out.

If the VSM'ed copy works as a source, I would just use that and put it down to "weird issue with no longer supported version of 7DOT".

My thoughts include some form of missed volume corruption - but it's rare to begin with, and VSM should just copy the corruption over..

shamz · ‎2017-06-20

If you really want help...

How about some details?

snapmirror status / snapmirror show for the source and destination volumes.

df -h for source and destination volumes.

Error messages from event log

AlexDawson · ‎2017-06-19

I've seen it happen when networks have incompatible MTUs.. what does the network look like between the two systems?
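A common way to check whether a path really carries full-size packets is to ping with the don't-fragment bit set and the largest payload that fits in one packet. Below is a minimal sketch of that arithmetic, assuming IPv4 with no header options. The function name `max_icmp_payload` is hypothetical, and the exact ping syntax for setting payload size and disabling fragmentation depends on the platform, so it is not shown here.

```python
# Sketch only: assumes IPv4 without options and a standard ICMP echo header.
IPV4_HEADER_BYTES = 20  # minimum IPv4 header size
ICMP_HEADER_BYTES = 8   # ICMP echo request header size


def max_icmp_payload(mtu: int) -> int:
    """Return the largest ICMP echo payload that fits in one unfragmented packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small to hold IPv4 and ICMP headers")
    return payload


if __name__ == "__main__":
    # Print the payload size to use for a standard and a jumbo-frame MTU.
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: ping payload {max_icmp_payload(mtu)} bytes")
```

If a ping with that payload and fragmentation disabled fails while a smaller one succeeds, something along the path is running a lower effective MTU than the endpoints advertise.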

arizonavol16 · ‎2017-06-20

The intercluster LIFs are set to an MTU of 1500, and that matches the source.

Not network related as 9 other volumes completed their VSMs just fine. This one volume is the only one that I have issues with.

I may simply create a new empty volume on the source, do three QSMs into this new volume, and then VSM that over to cDOT. That works.

GR · ‎2018-02-28

Hello tech gurus,

I have a similar problem: the VSM transfer stops at exactly the same point every time, and the alert only says the transfer failed. I have tried several things, including different network interfaces on the source and a different volume on the destination. Nothing helps. I would appreciate your help in fixing this.

Appreciate your expertise here.

-RK