VMware Solutions Discussions

Takeover Issues with 3240 HA Pair

WICKEDSHARK
3,354 Views

Hi, the last two times we performed a takeover on one of the controllers of our busiest filers, we had issues with VMware Linux guests.

Here is the high-level system configuration:

3240 controllers in HA Pair

4 SAS disk shelves (600GB)

2 SATA disk shelves (2TB)

2 stacks (1 for SAS and 1 for SATA)

2 SAS aggregates (1 per controller)

2 SATA aggregates (1 per controller)

The issues are usually with VMware datastores during the takeover. These datastores are served over 8Gb FC. When we perform a takeover, the Linux guests in the datastores put their root file systems into read-only mode. I assume this is due to latency, which does get rather high, and Linux attempts to protect itself. The Windows guests recover on giveback.

I am not sure why this happens, or whether it is purely due to latency.

Do any of the experts here have ideas about what the potential issue or issues could be?

Thanks

3 REPLIES

DILIDOLO2
Did you set the disk I/O timeout to 190s in the VMs? We find Linux is a lot more sensitive to this.

WICKEDSHARK
OK, I see this KB article now: for ESX 4 and higher, 180 seconds is recommended. I thought VMware Tools did this already. I know it sets Windows to 60 seconds, but I guess Linux needs longer to respond. I will look into this further. For more information, here is the link to the ESX KB article:

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1009465&sliceId=1&docTypeID=DT_KB_1_1&dialogID=122102300&sta...
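
For anyone landing here later, this is a minimal sketch of checking and raising the disk timeout inside a Linux guest, assuming the virtual disks show up as sd* devices and expose their timeout at /sys/block/sdX/device/timeout. The function name and the dry_run switch are my own for illustration, not something from the KB, and 180 is just the value mentioned above. Writing to sysfs needs root and only changes the running system; it does not survive a reboot, so treat it as a way to verify the setting rather than a permanent fix.

```python
from pathlib import Path


def set_scsi_timeouts(seconds=180, sysfs_root="/sys/block", dry_run=True):
    """Report (and optionally set) the SCSI command timeout of each sd* disk.

    Returns {device_name: (old_timeout, requested_timeout)}.
    Nothing is written unless dry_run is False.
    """
    changes = {}
    for timeout_file in sorted(Path(sysfs_root).glob("sd*/device/timeout")):
        device = timeout_file.parent.parent.name  # e.g. "sda"
        old = int(timeout_file.read_text().strip())
        if old != seconds and not dry_run:
            # Runtime-only change; the value resets on reboot.
            timeout_file.write_text(f"{seconds}\n")
        changes[device] = (old, seconds)
    return changes


if __name__ == "__main__":
    # Dry run by default: only shows what would change.
    for dev, (old, new) in set_scsi_timeouts().items():
        print(f"{dev}: current={old}s requested={new}s")
```

Run it as is to see the current values, then call set_scsi_timeouts(dry_run=False) as root if they are lower than what the KB recommends.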

christin
This is a late reply but I just wanted to share a KB article that documents this:

https://kb.netapp.com/support/index?page=content&id=2010823

If your issue persists, please contact NetApp Technical Support.

Regards,

Christine
