Re: CVO failover/giveback: what happens to the I/O in flight?

DMWNTAP23 · 2019-05-30

During a Windows client "copy/paste" operation on a 10 GB file, the user purposely failed over one CVO node to the other as a test. The copy was about halfway done when the failover started, and the user got a pop-up saying the copy had been aborted and asking "Do you want to restart? Y/N".

1. If they say "Yes", does the copy restart where it got interrupted (checkpoint/restart function), or does it start over from the beginning once node2 takes over?

2. They actually said "No" to the restart question when they tested this. Would they expect:

a. The file to be "wiped" from the target client (no data copied)?

b. The file to be partially copied over, but not completely.

c. The Windows Explorer on the target client shows the file completely copied over, but is unreadable?

mandrews · 2019-05-31

Which hyperscaler? Is this in AWS multi-AZ by chance?

If so, are you using the floating IP rather than the static IPs?

DMWNTAP23 · 2019-05-31

It's AWS, single AZ with one CVO HA pair. User copied (used Windows "copy/paste") file(s) from an external Windows machine (not a CVO client) TO a Windows client of the CVO instance. No floating IP required.

JFMartin_ESI · 2019-06-11

Mmmm, interesting use-case testing... would love to have an official answer on this. Anyone from NetApp?

elementx · 2019-09-29

I expect for SMB they would have to restart from scratch (we aren't discussing a Fault Tolerant cluster with RAM mirroring, are we?)

I expect 2a after restart.

Why would this work any differently (or better) than it does on prem?

Ontapforrum · 2019-09-29

Just sharing my thoughts...

I am not familiar with AWS cloud HA, but I would guess CIFS/SMB behavior should be the same whether on premises or in the cloud.

During a NAS failover, the LIF (which holds the IP) is moved to a physical port on the surviving node. It may drop one or two pings, but the copy will go through. However, if the LIF cannot be failed over, CIFS will error out and ask the user to cancel or retry. If you choose retry, the copy starts over from scratch.
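To make "start from scratch" concrete, here is a minimal Python sketch of a client-side copy loop that reopens the target and begins again at byte 0 after an I/O error. It is only an illustration, not how Windows Explorer or ONTAP actually implement this; the function name `copy_from_scratch` and the `fail_once_at` fault-injection parameter are made up for the example.

```python
CHUNK = 64 * 1024  # bytes per read/write


def copy_from_scratch(src, dst, max_attempts=3, fail_once_at=None):
    """Copy src to dst; after an I/O error, restart at byte 0.

    fail_once_at is a hypothetical test hook: it raises one simulated
    error once that many bytes have been written in the first attempt.
    Returns (attempts, total_bytes_written_across_all_attempts).
    """
    attempts = 0
    total_written = 0
    injected = False
    while attempts < max_attempts:
        attempts += 1
        try:
            # Mode "wb" truncates dst, so earlier progress is discarded.
            with open(src, "rb") as fin, open(dst, "wb") as fout:
                copied = 0
                while chunk := fin.read(CHUNK):
                    if (fail_once_at is not None and not injected
                            and copied >= fail_once_at):
                        injected = True
                        raise OSError("simulated failover")
                    fout.write(chunk)
                    copied += len(chunk)
                    total_written += len(chunk)
            return attempts, total_written
        except OSError:
            if attempts == max_attempts:
                raise
    raise ValueError("max_attempts must be at least 1")
```

With a five-chunk source file and a simulated fault after two chunks, the call takes two attempts and writes seven chunks in total: the two from the first attempt are thrown away and all five are copied again. A checkpoint/restart copy would instead reopen the target without truncating and seek to the last written offset.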

1. If they say "Yes", does the copy restart where it got interrupted (checkpoint/restart function), or does it start over from the beginning once node2 takes over?
Ans: Start over from the beginning.

2. They actually said "No" to the restart question when they tested this. Would they expect:

a. The file to be "wiped" from the target client (no data copied)?

b. The file to be partially copied over, but not completely.

c. The Windows Explorer on the target client shows the file completely copied over, but is unreadable?
Ans: a. The file to be "wiped" from the target client (no data copied). This is purely the nature of CIFS/SMB: it will either commit everything or nothing (unlike NFS).
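
As a rough illustration of the "everything or nothing" outcome, the Python sketch below shows one way a copying program can produce it: if the copy fails partway, it deletes the partial target. This is an assumption-laden sketch, not a statement of where that cleanup actually happens in the Windows/SMB stack; `copy_all_or_nothing` and its `copier` parameter are hypothetical names.

```python
import os
import shutil


def copy_all_or_nothing(src, dst, copier=shutil.copyfile):
    """Copy src to dst; if the copy fails, remove any partial dst.

    copier is a hypothetical hook so a failing copy can be swapped in
    for testing. Returns True on success, False if the copy was aborted.
    """
    try:
        copier(src, dst)
    except OSError:
        # Clean up whatever was written before the interruption.
        try:
            os.remove(dst)
        except FileNotFoundError:
            pass  # nothing had been created yet
        return False
    return True
```

Passing a `copier` that writes part of the file and then raises `OSError` leaves no target file behind, which matches option (a) above.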