
CDOT 8.2.3P5 SFO partial giveback error



Could someone shed some light on this error during giveback? The root aggr giveback succeeds, but the data aggr giveback fails.


Aggr show on node2 (via SP console)
Warning: WAFL-side aggregate name "aggr0_cluster_node02" does not match VLDB-side aggregate name "aggr0_clusternode02"

Info: Node dkcluster-01 that hosts aggregate aggr1_cluster_node02 is offline

SFO Giveback (from node01)
Error: command failed: Failed to initiate giveback. Reason: Partner has not fully booted for giveback of data aggregates. Retry
giveback later or retry with require-partner-waiting parameter set to false

cluster::cluster ha> storage failover show
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
cluster-01     cluster-02     true     Connected to cluster-02, Partial giveback
cluster-02     cluster-01     -        Waiting for cluster applications to come online on the local node

It appears the VLDB has not captured the aggr rename commands that were issued a few days ago, when both nodes were healthy. No changes were made on the cluster while one node was in the failover state.


How do I get my cluster to update the VLDB with the changed vol number?


This is only a test cluster so no impact to prod.



Looks like the D-Blade and VLDB are out of sync. Halt the unhealthy node, use the debug vreport show/fix commands on the healthy node to fix the VLDB, then see if the partner will boot and take a giveback.






Keep in mind that command is at the diag level. No need to halt to run that command, btw.


I'd suggest opening a case if debug vreport doesn't fix the issue.
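
For anyone following along, the sequence suggested in the replies above might look roughly like this on the healthy node. This is only a sketch, not a verified procedure for 8.2.3P5: the cluster::> prompt is shortened for readability, the -object value is a placeholder to be copied from whatever debug vreport show actually lists, and the parameter syntax is an assumption that should be checked against the documentation for your release (or with support) before running anything at diag level. Lines starting with # are annotations, not input.

```text
# Switch to the diag privilege level (vreport is a diag-level command)
cluster::> set -privilege diag

# List the discrepancies between the VLDB and WAFL
cluster::*> debug vreport show

# Fix one reported aggregate entry; the object string below is a placeholder
cluster::*> debug vreport fix -type aggregate -object <object as listed by vreport show>

# Confirm nothing is still reported, then return to admin level
cluster::*> debug vreport show
cluster::*> set -privilege admin

# Check the failover state and retry the giveback
cluster::> storage failover show
cluster::> storage failover giveback -ofnode cluster-02
```

If vreport still shows the mismatch afterwards, or the giveback keeps failing, that is the point to open a case rather than forcing the giveback.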


@parisi and @SeanHatfield
Thanks very much for suggesting a possible fix. I will try this and let you know how it goes.

I have a quick question. The NetApp documentation says the RDB is transactional: it guarantees that when data is written to a DB, either everything is written successfully or the write is rolled back. No partial or inconsistent DB writes are committed.
So I wonder what could have caused an inconsistent DB between the two nodes?


This isn't considered a partial or inconsistent RDB. It's a difference between what is in WAFL and what is in RDB.


This can happen for a variety of reasons, so I'd suggest opening up a support case if you want root cause.