Microsoft Virtualization Discussions

SRM Fails at steps 1 and 4

CSTOCKDA

SRM issues

SRM seems to fail at the following steps:

1. Synchronize storage
4. Create writable storage snapshot

So this could be two separate issues.

We see the following message in VMware:

VMware SRM Support:

<_type>dr.storage.fault.CommandDeviceFailed</_type>

We see in EMS that they get the following message about the SnapMirror update:

EMS:
***********
****************
*********************************
error* does not have Snapshot copy vserverdr.2.0b567cee-9747-11e6-910a-00a09863b8f7.2018-11-12_012500.)"


It appears they get this error message in EMS when this happens:

error="Failed to create Snapshot copy ***************does not have Snapshot copy vserverdr.2.0b567cee-9747-11e6-910a-00a09863b8f7.2018-11-12_012500.)"

VMware was involved in investigating the reason for the failures.

Looking through the logs, the NetApp storage appears to be the reason the SRA commands fail during test recovery: the LUN snapshots are not created.

We checked this on the recovery report:

1. Synchronize storage Skipped
1.1. Protection Group G********** Skipped
2. Restore recovery site hosts from standby Success
3. Suspend non-critical VMs at recovery site Inactive
4. Create writable storage snapshot

Error - Failed to create snapshots of replica devices. Failed to create snapshot of replica device <VOLUME>> Unable to create the temporary writeable copy.
Ensure that the aggregate or parent volume, if using QTrees, has at least as much free space as the size of the volume being cloned. Also, ensure that the aggregate is added to the aggr-list of the SVM.. SRA command 'testFailoverStart' failed for device

We have increased the space on the destination volumes; however, the issue remains the same.

I am wondering if anyone has seen errors like this before and could help with an action plan to fix this. I have searched the KBs and checked the logs, but with no luck.

ChanceBingen

There are a couple of things going on here, but without knowing the specific versions of ONTAP, SRM, and SRA, it's hard to say.

 

At first glance, I would suggest these three things.

1. Log into ONTAP and create a regular volume snapshot. If it succeeds, you should have enough space for any other snapshot that gets created. While you are looking at the volumes, make sure both source and destination volumes look ok.

2. Check and confirm that the LUN has space reservation disabled. Usually (depending on your ONTAP version) when you clone a LUN, it inherits all of the attributes of its parent. This would include the thin provisioning attribute. So it's possible you may have enough room to create a snapshot, but not enough room to create a snapshot and then create a new space reserved LUN using it.

3. The error message you posted doesn't exactly say so, but it could be due to not having a FlexClone license. You will need that license on both the source and destination. In ONTAP 9 (and Clustered ONTAP before it), we use FlexClone exclusively, unlike 7-mode, where we created snapshot-backed LUN clones.

Oh, and I guess a fourth thing. Do make sure that the aggregates are added to the aggregate list for the SVM. I've seen that trip up a lot of people. It's very easy to miss.
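
To make the four checks above easier to walk through, here is a minimal sketch in Python that just encodes them as a checklist. It does not talk to ONTAP or SRM at all: the `CloneTarget` fields and the `precheck` function are hypothetical names made up for illustration, and the values have to be gathered by hand from the destination cluster. The free-space comparison follows the wording of the SRA error message (aggregate free space at least as large as the volume being cloned).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CloneTarget:
    """Hypothetical record of values collected by hand from the destination cluster."""
    volume_size_gib: float       # size of the replica volume to be cloned
    aggregate_free_gib: float    # free space in the aggregate that hosts it
    aggregate: str               # name of that aggregate
    svm_aggr_list: List[str] = field(default_factory=list)  # aggregates assigned to the SVM
    lun_space_reserved: bool = False  # True if the LUN has space reservation enabled
    flexclone_licensed: bool = True   # True if a FlexClone license is installed


def precheck(t: CloneTarget) -> List[str]:
    """Return a list of likely problems; an empty list means none of the checks tripped."""
    problems = []
    if not t.flexclone_licensed:
        problems.append("FlexClone license is missing")
    if t.aggregate not in t.svm_aggr_list:
        problems.append(f"aggregate {t.aggregate} is not in the SVM aggr-list")
    if t.aggregate_free_gib < t.volume_size_gib:
        problems.append("aggregate has less free space than the volume being cloned")
    if t.lun_space_reserved:
        problems.append("LUN is space reserved; a clone may inherit the reservation")
    return problems


if __name__ == "__main__":
    # Example values are invented for illustration only.
    target = CloneTarget(
        volume_size_gib=500,
        aggregate_free_gib=200,
        aggregate="aggr1_dr",
        svm_aggr_list=["aggr2_dr"],
    )
    for problem in precheck(target):
        print(problem)
```

Run the same check against both the source and destination clusters, since the license needs to be present on both.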
