Transfer failed. (Checksum mismatch (Replication engine error))

cterrero

Hi all,

I'm getting the following error on my production SnapMirror volumes:

Transfer failed. (Checksum mismatch (Replication engine error))

However, if I create a new volume (with data) and start a SnapMirror transfer for it, the new volume is SnapMirrored OK.

 

vserverA:ProductionA
            XDP  vserverA_mirror:ProductionA_mirror
                              Uninitialized
                                      Idle           -         false   -
vserverA:TestA
            XDP  vserverA_mirror:TestA_mirror
                              Snapmirrored
                                      Idle           -         true    -

 

8/27/2018 12:54:40  nodeA
                                     ERROR         smc.snapmir.init.fail: Initialize from source volume 'vserverA:ProductionA' to destination volume 'vserverA_mirror:ProductionA_mirror' failed with error 'Transfer failed.(Checksum mismatch(Replication engine error))'. Relationship UUID '517405a5-a6dc-11e8-860d-00a098c339d7'.

 

Any ideas?

 

Thanks a lot.
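
For anyone collecting these events across several volumes, here is a minimal Python sketch that pulls the source, destination, error text, and relationship UUID out of an smc.snapmir.init.fail line. The pattern is built only from the single event quoted above, so treat it as an assumption that other ONTAP releases word the message the same way; the function name parse_init_failure is made up for illustration.

```python
import re

# Pattern derived from the one event line quoted in this thread.
# Assumption: other ONTAP releases use the same wording.
EVENT_RE = re.compile(
    r"(?P<event>smc\.snapmir\.init\.fail): Initialize from source volume "
    r"'(?P<source>[^']+)' to destination volume '(?P<destination>[^']+)' "
    r"failed with error '(?P<error>.+)'\. "
    r"Relationship UUID '(?P<uuid>[0-9a-f-]+)'"
)


def parse_init_failure(line):
    """Return the fields of an init-failure event as a dict, or None."""
    match = EVENT_RE.search(line)
    return match.groupdict() if match else None


# Event text copied from the post above, joined into one string.
sample = (
    "ERROR         smc.snapmir.init.fail: Initialize from source volume "
    "'vserverA:ProductionA' to destination volume "
    "'vserverA_mirror:ProductionA_mirror' failed with error "
    "'Transfer failed.(Checksum mismatch(Replication engine error))'. "
    "Relationship UUID '517405a5-a6dc-11e8-860d-00a098c339d7'."
)

fields = parse_init_failure(sample)
print(fields["source"], "->", fields["destination"], ":", fields["error"])
```

Grouping the parsed events by source volume or UUID makes it easier to see whether the failures are confined to particular volumes, as they appear to be here.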


marcusgross

Hi,

It could be IPS related. Check the output of event log show and snapmirror show -instance for further details.

 

M.

cterrero

Hi marcusgross,

Thanks for your feedback.

IPS is set up correctly, the event log shows only that error message, and the snapmirror show -instance output and the cluster peer look healthy.

BTW, this is ONTAP 9.3.

 

 

Node       Cluster-Name                 Node-Name
             Ping-Status               RDB-Health Cluster-Health Availability
---------- --------------------------- --------- --------------- ------------
NODEA
           SOURCEA                 SOURCEA
             Data: interface_reachable
             ICMP: interface_reachable true      true            true
                                       SOURCEB
             Data: interface_reachable
             ICMP: interface_reachable true      true            true
NODEB
           SOURCEB                 SOURCEA
             Data: interface_reachable
             ICMP: interface_reachable true      true            true
                                       SOURCEB
             Data: interface_reachable
             ICMP: interface_reachable true      true            true
4 entries were displayed.

 

 


                            Source Path: vserverA:ProductionA
                       Destination Path: vserverA_mirror:ProductionA_mirror
                      Relationship Type: XDP
                Relationship Group Type: none
                    SnapMirror Schedule: -
                 SnapMirror Policy Type: async-mirror
                      SnapMirror Policy: MirrorAllSnapshots
                            Tries Limit: -
                      Throttle (KB/sec): 8192
                           Mirror State: Uninitialized
                    Relationship Status: Transferring
                File Restore File Count: -
                 File Restore File List: -
                      Transfer Snapshot: weekly.2018-07-29_0015
                      Snapshot Progress: 49.10GB
                         Total Progress: 49.10GB
              Network Compression Ratio: 1:1
                    Snapshot Checkpoint: 48.63GB
                        Newest Snapshot: -
              Newest Snapshot Timestamp: -
                      Exported Snapshot: -
            Exported Snapshot Timestamp: -
                                Healthy: false
                       Unhealthy Reason: Transfer failed.
               Constituent Relationship: false
                Destination Volume Node: NODEA
                        Relationship ID: 517405a5-a6dc-11e8-860d-00a098c339d7
                   Current Operation ID: 1283f052-a9f9-11e8-860d-00a098c339d7
                          Transfer Type: initialize
                         Transfer Error: -
                       Current Throttle: 8192
              Current Transfer Priority: normal
                     Last Transfer Type: initialize
                    Last Transfer Error: Transfer failed. (Checksum mismatch (Replication engine error))
                     Last Transfer Size: -
Last Transfer Network Compression Ratio: -
                 Last Transfer Duration: -
                     Last Transfer From: vserverA:ProductionA
            Last Transfer End Timestamp: 08/27 12:54:40
                  Progress Last Updated: 08/27 15:03:44
                Relationship Capability: 8.2 and above
                               Lag Time: -
           Identity Preserve Vserver DR: -
                 Volume MSIDs Preserved: -
                 Is Auto Expand Enabled: -
           Number of Successful Updates: 0
               Number of Failed Updates: 0
           Number of Successful Resyncs: 0
               Number of Failed Resyncs: 0
            Number of Successful Breaks: 0
                Number of Failed Breaks: 0
                   Total Transfer Bytes: 0
         Total Transfer Time in Seconds: 0


                            Source Path: vserverA:TestA
                       Destination Path: vserverA_mirror:TestA_mirror
                      Relationship Type: XDP
                Relationship Group Type: none
                    SnapMirror Schedule: 5min
                 SnapMirror Policy Type: async-mirror
                      SnapMirror Policy: MirrorAllSnapshots
                            Tries Limit: -
                      Throttle (KB/sec): 8192
                           Mirror State: Snapmirrored
                    Relationship Status: Idle
                File Restore File Count: -
                 File Restore File List: -
                      Transfer Snapshot: -
                      Snapshot Progress: -
                         Total Progress: -
              Network Compression Ratio: -
                    Snapshot Checkpoint: -
                        Newest Snapshot: snapmirror.118127c5-3fb3-11e7-bef5-00a098c33757_2147524595.2018-08-27_150000
              Newest Snapshot Timestamp: 08/27 14:59:59
                      Exported Snapshot: snapmirror.118127c5-3fb3-11e7-bef5-00a098c33757_2147524595.2018-08-27_150000
            Exported Snapshot Timestamp: 08/27 14:59:59
                                Healthy: true
                       Unhealthy Reason: -
               Constituent Relationship: false
                Destination Volume Node: NODEA
                        Relationship ID: ebdef187-a79d-11e8-860d-00a098c339d7
                   Current Operation ID: -
                          Transfer Type: -
                         Transfer Error: -
                       Current Throttle: -
              Current Transfer Priority: -
                     Last Transfer Type: update
                    Last Transfer Error: -
                     Last Transfer Size: 2.15KB
Last Transfer Network Compression Ratio: 1:1
                 Last Transfer Duration: 0:0:7
                     Last Transfer From: vserverA:TestA
            Last Transfer End Timestamp: 08/27 15:00:07
                  Progress Last Updated: -
                Relationship Capability: 8.2 and above
                               Lag Time: 0:3:53
           Identity Preserve Vserver DR: -
                 Volume MSIDs Preserved: -
                 Is Auto Expand Enabled: -
           Number of Successful Updates: 861
               Number of Failed Updates: 0
           Number of Successful Resyncs: 0
               Number of Failed Resyncs: 0
            Number of Successful Breaks: 0
                Number of Failed Breaks: 0
                   Total Transfer Bytes: 5364756232
         Total Transfer Time in Seconds: 3097

 

 

8/27/2018 12:54:40  nodeA
                                     ERROR         smc.snapmir.init.fail: Initialize from source volume 'vserverA:ProductionA' to destination volume 'vserverA_mirror:ProductionA_mirror' failed with error 'Transfer failed.(Checksum mismatch(Replication engine error))'. Relationship UUID '517405a5-a6dc-11e8-860d-00a098c339d7'
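
With more than a couple of relationships, scanning the snapmirror show -instance output by eye gets tedious. The following is a rough Python sketch, not an official tool, that splits saved output into one record per relationship and lists the ones reporting Healthy: false. It assumes every record starts with a "Source Path" field and uses the "Field: value" layout shown above; the sample text is abbreviated from that output, and the helper names are hypothetical.

```python
def parse_instance_output(text):
    """Split 'Field: value' lines into one dict per relationship.

    Assumption: each relationship record begins with 'Source Path'.
    """
    records, current = [], {}
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(": ")
        if not sep:
            continue  # blank line or anything without 'Field: value'
        if key == "Source Path" and current:
            records.append(current)
            current = {}
        current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def unhealthy(records):
    """Return the relationships whose Healthy field is 'false'."""
    return [r for r in records if r.get("Healthy") == "false"]


# Abbreviated from the output posted above (most fields omitted).
SAMPLE = """\
                            Source Path: vserverA:ProductionA
                       Destination Path: vserverA_mirror:ProductionA_mirror
                           Mirror State: Uninitialized
                                Healthy: false
                    Last Transfer Error: Transfer failed. (Checksum mismatch (Replication engine error))

                            Source Path: vserverA:TestA
                       Destination Path: vserverA_mirror:TestA_mirror
                           Mirror State: Snapmirrored
                                Healthy: true
                    Last Transfer Error: -
"""

records = parse_instance_output(SAMPLE)
failing = unhealthy(records)
for rel in failing:
    print(f'{rel["Source Path"]} -> {rel["Destination Path"]}: '
          f'{rel["Last Transfer Error"]}')
```

Partitioning on the first ": " keeps colons inside values intact, such as the vserver:volume paths and the timestamps.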

mrahul

Hi,

This can be due to a WAFL inconsistency or a hardware problem.

I recommend opening a support case with the NetApp Support team for further resolution.

A similar issue, "checksum error due to wafl context mismatch on a snapshot on snapmirror source", is described at https://mysupport.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=724468

cterrero

Hi mrahul,

Thanks for your suggestion, I will keep that in mind.

I opened a support case a month ago, but it's taking too long to get solved.

Thanks for your feedback.

cterrero

I've upgraded to version 9.4 as NetApp recommended, but the transfer failed again.

I've also moved the production volume to another aggregate, and it failed again.

Despite that, if I create a brand-new volume (smaller, bigger, or the same size), it gets SnapMirrored OK, so at this point I don't think this is a network-related issue.

Any ideas?

Thanks.

tduran12165

...Been a few years, but did you ever figure out what the issue was from support?

Sanaman

I'm now seeing the same error message on my system. I have multiple SnapMirror relationships running, and every day one volume complains about "Checksum mismatch". No answer from NetApp. In my environment both clusters are on the same network, so there is no firewall or WAN accelerator in between. 🤔
