Re: AFF A800 - Failover

sandeepthots · 2021-09-28

Hi Team,

We recently responded to a banking customer's RFP, and they have come back with the queries below.

Proposed Solution -

AFF A800 with 15.3TB NVMe SSD [ 48 x 15.3TB for DC and 24 x 15.3TB for DR ]

32Gbps FC ports

Queries

1. What is the maximum time the storage takes to resume operations after a failover? Any test results would help.

2. During the failover period, will requests coming in from hosts be queued?

3. What is the recommended practice for configuring multipathing at the VM level?

4. Do we have any performance numbers at various cache hit and cache miss levels?

aladd · 2021-09-28

1. During a successful takeover, operations should experience no disruption, so there is no time taken to resume operations.

2. During failover, writes are sent to the HA partner and flushed to their respective aggregates.

3. Not sure what the question is here. Are you referring to a VMware host? A VMware guest? What protocol are we using, and what is the operating system?

4. What specifics are you looking for in terms of cache hit/miss? We track cache hits and misses in statistical data, and a few tools are available to track that data.

paul_stejskal · 2021-09-30

3. Multipathing isn't done at the guest VM level but at the host OS level, on the SAN client side.

ZaphodBbx · 2021-12-22

Regarding #2 and #3: 32Gb FC is specified, so (unless you sell them an All SAN Array) the multipathing will be ALUA. Each host will have paths to each node, and on a node or path failure, the host will retry the failed I/O on alternate paths. The retry happens at the driver level, so the application doesn't see it, only some additional latency for the (in theory, single) I/O operation that hits the failure. Subsequent I/O goes over the alternate path(s) until the node or path recovers. Nothing is queued.
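To make the retry behaviour above concrete, here is a rough toy model in Python. It is only a sketch of the idea, not how any real multipath driver or ONTAP is implemented: the `Path` class, the `submit_io` function, and the node names are all made up for illustration. It assumes the host prefers ALUA-optimized paths, marks a path as failed the first time an I/O fails on it, and skips that path afterwards.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One host-to-node path (hypothetical model, not a real driver object)."""
    name: str
    optimized: bool              # ALUA active/optimized vs. non-optimized
    healthy: bool = True         # actual state of the node or link
    marked_failed: bool = False  # the host's view after a failed I/O


class AllPathsDown(Exception):
    """Raised when no remaining path can service the I/O."""


def submit_io(paths):
    """Send one I/O, retrying on alternate paths if a path fails.

    Returns (name of the path that served the I/O, number of attempts).
    """
    # Skip paths the host already knows are bad; try optimized paths first
    # (False sorts before True, so optimized=True comes first).
    candidates = sorted(
        (p for p in paths if not p.marked_failed),
        key=lambda p: not p.optimized,
    )
    attempts = 0
    for path in candidates:
        attempts += 1
        if path.healthy:
            return path.name, attempts
        # The I/O failed here: remember that and retry on the next path.
        path.marked_failed = True
    raise AllPathsDown("no usable path left")


paths = [
    Path("node-a", optimized=True),
    Path("node-b", optimized=False),
]

print(submit_io(paths))    # normal operation: served by node-a
paths[0].healthy = False   # simulate node-a going down
print(submit_io(paths))    # this I/O pays for one extra attempt
print(submit_io(paths))    # later I/O goes straight to node-b
```

In this model only the I/O that is in flight when the node fails takes the extra attempt. Everything after it goes directly to the surviving path, which is the "additional latency for one operation, nothing queued" behaviour described above. Path recovery is left out to keep the sketch short.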

But these questions, along with the DC and DR notations, make me suspect that they are thinking of a pair of systems, as in a MetroCluster.