VMware Solutions Discussions

VMware datastore disconnections

deygaurab

Hello Folks,

This morning many of my VMware datastores lost access to the NetApp array. One such error is shown below.

Lost access to volume 4fad3686-2f38ba84-fc46-0025b502008f (Deployment-Conversion-PDT3-01) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.

I checked /etc/messages on the storage and the SAN switches (my setup is entirely FCP) but could not find anything. Neither the SAN switches nor the filer logged any port fluctuations. Any idea what I could have missed checking?

Cheers

Rahul

12 REPLIES

deygaurab

To add to this, all the ESX hosts in the farm behaved this way. About 70% of their datastores lost connection and then reconnected on their own after a while.

deygaurab

Any suggestions?

martin_fisher

Is there anything in the syslog from the NetApp filer, or in the log files from an affected ESX host?

deygaurab

Hello Martin,

There is nothing related to any port logout or LUN reset in the /etc/messages of the filer. I have yet to check whether the ESX hosts logged anything.

Cheers

Rahul

cedric_renauld

Hi Rahul,

Did you receive any AutoSupport messages during this outage?

Could you send us the messages?

deygaurab

Hi Cedric,

There was no AutoSupport generated, and nothing showed up in the /etc/messages log or on the filer console. The messages below are from the vCenter logs.

Lost access to volume 4e9ee3d1-3ede5480-c3ae-0025b501006f (PrePROD-PAR3-002) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.

Successfully restored access to volume 4e9ee3d1-3ede5480-c3ae-0025b501006f (PrePROD-PDT) following connectivity issues.

Successfully restored access to volume 4fad3686-2f38ba84-fc46-0025b502008f (Deployment-Conversion-PDT) following connectivity issues.

Lost access to volume 4e9ee3d1-3ede5480-c3ae-0025b501006f (PrePROD-PDT) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.

Lost access to volume 4f704a90-1f470f5c-4df9-0025b501004f (PrePROD-PDT) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.

Lost access to volume 4fad3686-2f38ba84-fc46-0025b502008f (Deployment-Conversion-PDT) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.

Not all of the datastores on each ESX host disconnected. Around 50% of the datastores on every ESX host in the farm disconnected and then reconnected on their own, while the other 50% stayed intact without any issues. I am wondering whether something on the storage side caused this without showing up in the NetApp message logs.
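To see which volumes flapped and whether every loss was followed by a restore, the events can be tallied per volume UUID. The following is only a minimal Python sketch, assuming the exported event text has the same wording as the messages quoted above. The function name `summarize_events` and the regular expression are illustrative and not part of any VMware or NetApp tool.

```python
import re
from collections import defaultdict

# Pattern assumed from the event wording quoted in this thread:
# "<Lost access to | Successfully restored access to> volume <uuid> (<name>) ..."
EVENT_RE = re.compile(
    r"(?P<kind>Lost access to|Successfully restored access to) volume "
    r"(?P<uuid>[0-9a-f]{8}-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{12}) "
    r"\((?P<name>[^)]+)\)"
)


def summarize_events(text):
    """Count 'lost' and 'restored' events per volume UUID."""
    # Collapse all whitespace so messages wrapped across lines still match.
    flat = " ".join(text.split())
    summary = defaultdict(lambda: {"names": set(), "lost": 0, "restored": 0})
    for match in EVENT_RE.finditer(flat):
        entry = summary[match.group("uuid")]
        entry["names"].add(match.group("name"))
        key = "lost" if match.group("kind").startswith("Lost") else "restored"
        entry[key] += 1
    return dict(summary)


if __name__ == "__main__":
    # Two events copied from the messages above, used here as sample input.
    sample = (
        "Lost access to volume 4e9ee3d1-3ede5480-c3ae-0025b501006f "
        "(PrePROD-PDT) due to connectivity issues.\n"
        "Successfully restored access to volume "
        "4e9ee3d1-3ede5480-c3ae-0025b501006f (PrePROD-PDT) "
        "following connectivity issues.\n"
    )
    for uuid, info in summarize_events(sample).items():
        print(uuid, sorted(info["names"]), info["lost"], info["restored"])
```

Grouping by UUID rather than by datastore name is deliberate: in the messages above the same UUID appears with more than one name, so the UUID is the safer key. Comparing the affected UUIDs against the LUN-to-path mapping may show whether the disconnected half of the datastores shares a fabric, switch port, or target port.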

Cheers

Rahul

martin_fisher

The last place you could look is the logs on your switches, if any are kept. Unfortunately, it could be hard to identify what caused the issue now. Have you also logged it with NetApp Support?

deygaurab

I haven't logged this with NetApp Support yet, as I do not have any error logs from the storage. I checked the switches but could not find any errors.

Cheers

Rahul

CASTROJSEC

Did you ever resolve this or find out what was going wrong? I just ran into the same issue.

BJBAARSSEN

Hello James,

I found the solution.

This is caused by the wrong fill word setting on the 8 Gb (Brocade) switch port.

It must be set to "3".

Use the portCfgFillWord command (portCfgFillWord <Port#> <Mode>) to configure this setting.

Regards, Barend

CASTROJSEC

Thanks. We use Cisco MDS, so I will see what the equivalent is.

SHAILESHVAIDYA

Hello Barend,

The fill word is set to 3 and we still have the same issue. This is not the solution.
