
resyncing aggr plex

What happens when you lose the loop to the good plex (the connection between the controller and the shelf that holds the latest, up-to-date WAFL file system) during an aggr resync?

resyncing aggr plex

The same as if you lose access to the disks in a configuration without SyncMirror: downtime.

Exactly what the damage will be depends on the scenario.

resyncing aggr plex

I'm not convinced about the downtime. But maybe I didn't explain the circumstances very well:

1. Both plexes are fine, and SyncMirror between both shelf loops works fine (we have only one loop to each plex).

2. One loop is interrupted, so there is no SyncMirror between the plexes. But there is no downtime, since the other plex is fine.

3. The loop recovers, and SyncMirror resyncs the latest data to the plex that is not up to date.

4. The other loop is interrupted, so there is no SyncMirror between the plexes anymore (the resync is interrupted).

=> We still have one plex, so you would expect no downtime, but this plex doesn't have an up-to-date WAFL (since the resync wasn't finished yet).

Question: what would happen here?

Re: resyncing aggr plex

I would expect Data ONTAP to panic due to multiple disk failures in the aggregate. The second plex cannot be used because it is stale. So you suddenly lose your aggregate.

P.S. I just tested this in the simulator and it does indeed panic. I would be greatly surprised if anything else happened.
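
To make the reasoning concrete, here is a toy Python model of the four steps from the question. It is only a sketch of the rule described in this thread (a stale plex is never used to serve data), not Data ONTAP's actual logic. The names `Plex`, `PlexState`, `aggregate_available` and `replay_scenario` are invented for illustration, and the model simply reports "unavailable" where the real system panics.

```python
from dataclasses import dataclass
from enum import Enum


class PlexState(Enum):
    CURRENT = "current"  # holds the latest WAFL file system
    STALE = "stale"      # out of date, or only partially resynced


@dataclass
class Plex:
    name: str
    state: PlexState
    reachable: bool = True  # False when the loop to its shelf is down


def aggregate_available(plexes):
    """Assumed rule: the aggregate is usable only if a reachable plex is current."""
    return any(p.reachable and p.state is PlexState.CURRENT for p in plexes)


def replay_scenario():
    """Replay the four steps and record availability after each one."""
    good = Plex("plex0", PlexState.CURRENT)
    other = Plex("plex1", PlexState.CURRENT)
    plexes = [good, other]
    history = []

    # Step 1: both plexes are mirrored and reachable.
    history.append(aggregate_available(plexes))

    # Step 2: the loop to plex1 fails; writes continue on plex0, plex1 goes stale.
    other.reachable = False
    other.state = PlexState.STALE
    history.append(aggregate_available(plexes))

    # Step 3: the loop recovers; resync starts, but plex1 stays stale until it completes.
    other.reachable = True
    history.append(aggregate_available(plexes))

    # Step 4: the loop to plex0 fails mid-resync; only the stale plex is reachable.
    good.reachable = False
    history.append(aggregate_available(plexes))

    return history


if __name__ == "__main__":
    for step, ok in enumerate(replay_scenario(), start=1):
        print(f"step {step}: aggregate available = {ok}")
```

The point of the model is step 4: a reachable plex exists, but because it never finished resyncing it does not count, so the aggregate is gone.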

Re: resyncing aggr plex

aborzenkov wrote:

I would expect Data ONTAP to panic due to multiple disk failures in the aggregate. The second plex cannot be used because it is stale. So you suddenly lose your aggregate.


I can confirm this behaviour. The filer won't use a plex that is out of sync.

Re: resyncing aggr plex

Thanks for the feedback so far!

Taking it a little further: in this scenario, if you have an HA config or MetroCluster, would it fail over to the other node and then continue the resync operation from there?

Re: resyncing aggr plex

Yes, as long as the partner has access to both plexes.