ONTAP Discussions

Snapmirror lag issue

arghya
13,742 Views

Hi,

Somebody just asked me what could be the reason behind a lag time of 58 hours while snapmirroring between the source and destination.

I asked him to run snapmirror status -l, and this is the output:

---------------------------------------------------------------------------------------------------

snapmirror status -l Pre_openlab_service

Snapmirror is on.

Source:                 sekinas01.seki.xxxx.net:vol001  ( I deliberately mentioned it as xxxx here by removing the actual name)

Destination:            ussbnsrs8322:Pre_openlab_service

Status:                 Pending

Progress:               -

State:                  Snapmirrored

Lag:                    58:35:30

Mirror Timestamp:       Wed Aug 10 03:50:03 EDT 2011

Base Snapshot:          ussbnsrs8322(0118056294)_Pre_openlab_service.6650

Current Transfer Type:  Retry

Current Transfer Error: incremental update not possible; a resync or initialize is necessary

Contents:               Replica

Last Transfer Type:     Scheduled

Last Transfer Size:     2016 KB

Last Transfer Duration: 00:00:21

Last Transfer From:     sekinas01.seki.xxxx.net:vol001

-------------------------------------------------------------------------------------------------------
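
For anyone who wants to check output like this from a script, here is a minimal Python sketch. It assumes the "Key: value" layout shown above; the names parse_status and lag_hours are hypothetical helpers and not part of any NetApp tool.

```python
from datetime import timedelta

# A few lines copied from the status output posted above.
SAMPLE = """\
Status:                 Pending
State:                  Snapmirrored
Lag:                    58:35:30
Current Transfer Type:  Retry
Current Transfer Error: incremental update not possible; a resync or initialize is necessary
"""

def parse_status(text):
    """Split 'Key: value' lines into a dict (layout assumed from the post)."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as 58:35:30 stay whole.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields

def lag_hours(lag):
    """Convert an HH:MM:SS lag string into hours as a float."""
    hours, minutes, seconds = (int(part) for part in lag.split(":"))
    delta = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return delta.total_seconds() / 3600

status = parse_status(SAMPLE)
print("lag in hours:", round(lag_hours(status["Lag"]), 2))
print("transfer error:", status.get("Current Transfer Error", "-"))
```

A non-empty Current Transfer Error next to a growing lag is the combination worth alerting on, since it means scheduled updates are being retried but not completing.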

My research suggests that any of the following could be the reason:

  1. The ONTAP version on both the source and the destination.
  2. The /etc/snapmirror.conf file: check what lag, if any, is defined there.
  3. The minute field of the snapmirror.conf schedule (a * in that field means the update request is triggered every minute). If the business requires a synchronized backup for critical data, Sync SnapMirror is a more suitable service than Async SnapMirror scheduled every minute.
  4. For traditional-volume SnapMirror, ensure that disk size/type and RAID group size are identical between the source and destination volumes.
  5. Ensure that snapshot creation schedules for all active mirror/backup services do not overlap. When possible, schedule transfers at different times than the regular scheduled volume snapshot copies.
  6. Is there enough space on the source and destination volumes? (Use the df command to display the free space per volume.)
  7. High system resource utilization (CPU % util, disk I/O, CIFS/NFS connections/transactions, etc.) may slow down transfer throughput.

     Collect and analyze the output of the following:
         perfstat output from the source and destination (this includes statit and sysstat output as well).
         statit and sysstat output taken while the transfer is running, on both the source and the destination.
         Network details (other jobs, bandwidth, failures, expected throughput, throttling in place).
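
The schedule check in point 3 can be done mechanically. The sketch below assumes the 7-Mode /etc/snapmirror.conf layout of source, destination, arguments, and then four schedule fields (minute, hour, day of month, day of week). The sample entries and the function names are made up for illustration and are not taken from the real configuration.

```python
def schedule_fields(conf_line):
    """Return the four schedule fields of one snapmirror.conf entry.

    Assumed layout: source destination arguments minute hour dom dow
    """
    parts = conf_line.split()
    if len(parts) != 7:
        raise ValueError("expected 7 whitespace-separated fields")
    minute, hour, day_of_month, day_of_week = parts[3:]
    return {
        "minute": minute,
        "hour": hour,
        "day_of_month": day_of_month,
        "day_of_week": day_of_week,
    }

def minute_field_is_wildcard(conf_line):
    """True when the minute field is '*', i.e. an update is requested every minute."""
    return schedule_fields(conf_line)["minute"] == "*"

# Hypothetical entries, for illustration only.
every_minute = "src_filer:vol001 dst_filer:vol001_mirror - * * * *"
hourly = "src_filer:vol001 dst_filer:vol001_mirror kbs=2000 0 * * *"

for entry in (every_minute, hourly):
    print(entry, "->", minute_field_is_wildcard(entry))
```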

 

It seems any of the above could be the reason. Does anybody have a particular clue, or anything to add, about what the cause might be?

 

Kindly advise.

Thanks.


columbus_admin

The telling part of the output is this:

"Current Transfer Error: incremental update not possible; a resync or initialize is necessary"  <--  This typically means the snapshot that was used for the replication has been deleted on one side or the other.

Run snap list 'vol_name' on both sides and look for a common snapshot:

'destination_filer_name'(system_id)_vol_or_dataset_name.xxxx (snapmirror)

    destination_filer_name   From the source you will see the destination filer's name first
    system_id                The destination filer's system ID, as shown in sysconfig output
    vol_or_dataset_name      The volume or dataset name, depending on how you mirror the data
    xxxx                     A number that increments with each SnapMirror transfer that has occurred over time

So on both ends you should see something like the snap listed in the output.  If you do not have one, the relationship has been lost and you will need to re-initialize the replication.

For example, from what you posted: ussbnsrs8322(0118056294)_Pre_openlab_service.6650 should be on both the source and destination. If both sides instead have a snapshot with a nearby increment, such as ussbnsrs8322(0118056294)_Pre_openlab_service.6651 or .6652, you can resync by specifying that snapshot explicitly.
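
If the snap list output is long, the comparison can be scripted. This is only a rough sketch: it assumes you have already pulled the snapshot names out of snap list on each side into plain lists, and that SnapMirror snapshots follow the naming pattern described above. The function names and the sample lists are hypothetical.

```python
import re

# Pattern: destination_filer(system_id)_vol_or_dataset_name.increment
SNAP_RE = re.compile(
    r"^(?P<dest>[^()\s]+)\((?P<sysid>\d+)\)_(?P<name>.+)\.(?P<seq>\d+)$"
)

def parse_snap(snapshot):
    """Break a SnapMirror snapshot name into its parts, or return None."""
    match = SNAP_RE.match(snapshot)
    if match is None:
        return None
    parts = match.groupdict()
    parts["seq"] = int(parts["seq"])
    return parts

def common_snapmirror_snaps(source_snaps, dest_snaps):
    """SnapMirror snapshots present on both sides, highest increment first."""
    shared = set(source_snaps) & set(dest_snaps)
    named = [s for s in shared if parse_snap(s) is not None]
    return sorted(named, key=lambda s: parse_snap(s)["seq"], reverse=True)

# Hypothetical snapshot lists, for illustration only.
source = ["hourly.0", "ussbnsrs8322(0118056294)_Pre_openlab_service.6650"]
dest = [
    "ussbnsrs8322(0118056294)_Pre_openlab_service.6650",
    "ussbnsrs8322(0118056294)_Pre_openlab_service.6651",
]

print(common_snapmirror_snaps(source, dest))
```

An empty result means no common SnapMirror snapshot was found, which is the case where the relationship has to be re-initialized.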

- Scott

arghya

Appreciate your prompt response Scott. Thank you.
