ONTAP Discussions
Hello everyone!
We've been working on this issue for a month now and have tickets open with NetApp and Microsoft, but we have not been able to resolve it yet. We are seeing problems in our Windows clusters after migrating data from EMC storage to NetApp storage. We migrated the data with a FalconStor NSS device. Two Windows clusters (2008 and 2003) are affected; standalone hosts are working OK.
After mapping the NetApp LUNs to the hosts, they do not restart properly and hang on a black screen. When Windows fails to start and sits at the black screen, the server is still pingable. If we unplug the FC cables from the HBAs, the hosts restart OK and we can then plug the cables back in. Another workaround is to unmap the LUNs on the NetApp side, boot the host, and map the LUNs back to the server; after that the hosts see all of the NetApp LUNs normally. Sometimes the hosts also boot normally with more than half of the LUNs mapped (we had unmapped some of them). The NetApp software (DSM, Host Utilities) and HBA driver versions are a supported configuration, and we have removed all EMC software from these servers.
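For reference, the unmap/boot/remap workaround looks roughly like this on the controller (Data ONTAP 7-Mode syntax; the volume, LUN, and igroup names below are placeholders, not our real ones):

    lun show -m                                      # list current LUN-to-igroup mappings
    lun unmap /vol/winclus_vol/lun0 win_cluster_ig   # unmap before booting the host
    lun map /vol/winclus_vol/lun0 win_cluster_ig 0   # after boot, re-map with the same LUN ID

We re-map with the original LUN ID so the cluster nodes see the disks under the same IDs as before.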
Does anybody have any idea or hint as to why this is happening? We want to find the reason for this strange behaviour.
Could the issue be somehow related to the EMC software that was installed on these hosts before?
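In case it helps, these are the kinds of checks that can reveal PowerPath leftovers on the 2008 nodes (pnputil is not available on 2003; the oem12.inf package number is just an example taken from the enumeration output):

    REM Enumerate staged third-party driver packages; look for EMC/PowerPath entries
    pnputil -e
    REM Delete a stale package found above (the oemNN.inf number comes from the -e output)
    pnputil -d oem12.inf
    REM Show ghost (non-present) devices so old PowerPath pseudo-devices can be
    REM removed in Device Manager via View > Show hidden devices
    set devmgr_show_nonpresent_devices=1
    devmgmt.msc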
Thank you!
With regards,
Klemen
Please provide the type of controllers and the version of ONTAP.
Also, are you running NetApp MPIO software on your clusters?
I'm not familiar with FalconStor, but perhaps someone else can assist.
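If you are unsure, mpclaim (installed with the Windows MPIO feature) shows which DSM is actually claiming each disk; the disk number below is an example:

    REM List MPIO-managed disks and the DSM that owns each one
    mpclaim -s -d
    REM Show the individual paths and their states for disk 0
    mpclaim -s -d 0

If the NetApp LUNs are claimed by msdsm or a leftover PowerPath module rather than the Data ONTAP DSM, that points at the multipathing stack.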
Hi!
The system is:
FAS3250 MetroCluster
Data ONTAP 8.1.3
We have installed this NetApp software:
Software | Version
Host Utilities |
Data ONTAP DSM |
Previously, this EMC software was installed:
MPIO Software:
Software | Version | Description
LAMINV | | EMC PowerPath LAM for Invista
LAMGEN | | EMC PowerPath LAM for Generic
LAMSYMM | | EMC PowerPath LAM for Symmetrix
LAMCLAR | | EMC PowerPath LAM for CLARiiON
MSDSM | | Microsoft Multi-Path Device Specific Module
MPIO | | Microsoft Multi-Path Bus Driver
With regards,
Klemen
Given this is a MetroCluster, this opens up all sorts of possible scenarios.
Are the LUNs being mounted on the Local or Remote Controller?
Is the connectivity between Controllers and Shelves working correctly?
Are you presenting via SnapDrive? (Highly recommended)
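A few quick checks from the controller side will answer the first two questions (7-Mode syntax, run on each node):

    cf status              # confirm the failover / interconnect state between the controllers
    fcp show initiators    # verify each host HBA WWPN is logged in on the expected target ports
    lun show -m            # see which igroup, and therefore which host, each LUN is mapped to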
LUNs are being mounted from both locations. The reason is that one MetroCluster site has bigger SATA disks and the other site has faster SAS disks. The LUNs from the Windows cluster that need better performance were placed on the faster SAS disks and the rest on the SATA disks; there is also a lot more space at the SATA site. It's a matter of design: the customer insisted on this configuration.
We do not use SnapDrive. Connectivity is OK; this was confirmed by NetApp support.
It also now happens that a server will boot normally a few times, then fail to boot several times in a row, and so on.
With regards,
Klemen
Lots of dependencies on the inter-site communication for normal production then.
My best bet would be that latency is causing the problem. I'd be looking at the HBA logs and Windows logs to see what's being posted.
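On the 2008 nodes, something like this pulls the recent storage-related entries from the System log (the provider names are examples; substitute your actual HBA driver, e.g. ql2300 for QLogic or elxstor for Emulex). On the 2003 cluster you would use Event Viewer instead, since wevtutil isn't available there:

    wevtutil qe System /c:50 /rd:true /f:text /q:"*[System[Provider[@Name='mpio' or @Name='msdsm' or @Name='disk']]]"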
It might be worth ripping out the Microsoft and NetApp MPIO stacks, rebooting, and reinstalling them.
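On Server 2008 the Microsoft MPIO feature can be removed and re-added from the command line (this is the component name as used by ocsetup; on 2003 MPIO ships with the DSM installer, so uninstalling the DSM removes it):

    REM Remove the Microsoft MPIO feature, then reboot
    start /w ocsetup MultipathIo /uninstall
    REM After the reboot, add it back before reinstalling the Data ONTAP DSM
    start /w ocsetup MultipathIo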
Also, unless you are using the likes of CommVault/SnapProtect, I'd present the LUNs via SnapDrive.
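SnapDrive also ships a command line (sdcli), which makes it easy to sanity-check what the host sees once the LUNs are presented through it:

    REM List the LUNs SnapDrive currently manages on this host
    sdcli disk list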
Another good option for this kind of migration is the DTA2800.