2008-08-06 07:36 AM - edited 2015-12-18 01:43 AM
We have a production IBM blade center with blades attached to a NetApp SAN over fiber.
We have the same setup in DR. Production luns are snapmirrored to DR.
To test that the DR blade servers launch successfully, we break the snapmirrors, disconnect the DR blade center from the LAN and then fire up the blades (production and DR are on the same network, so we do not want the same NetBIOS names on the network twice).
There are 24 blades and 22 of them come up fine, find all their luns and map drive letters to those luns.
Two of the blades do not. One is a development machine, so I'm more concerned with the other one, an Exchange server.
When this blade is fired up in DR, it always finds its OS lun (C drive) and snap lun (J drive). The other 4 luns are seen but do not get drive letters.
After breaking snapmirror and booting up the blade the OS takes a while to respond to keyboard and mouse. Once I log in, I see the C and J drive.
I can then go to disk management and assign drive letters to the other luns. I also have to assign drive letter D to the internal drive that is used for pagefile.
And I have to reset the paging properties as a pagefile is created on the C drive. I need to get rid of that and just do paging on the D drive.
Once all drive letters are manually assigned I reboot and the blade comes up, responds immediately to keyboard and mouse and all luns are correctly mapped and drive letters assigned. Also drive labels are now assigned without me doing anything.
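For anyone who has to repeat this manual step after every resync, here is a minimal sketch of generating a diskpart script for the drive-letter assignments instead of clicking through Disk Management. It only automates the workaround and does not address why the letters are lost. The volume numbers and letters in the example are hypothetical (the post only names C, J and D); check the real numbers with `list volume` in diskpart first, since they may differ from one boot to the next. The helper name `build_diskpart_script` is made up for illustration.

```python
def build_diskpart_script(assignments):
    """Return diskpart script text that assigns a drive letter to each volume.

    assignments: dict mapping a diskpart volume number (int) to a drive
    letter (str), e.g. {1: "D"}. Both values are supplied by the caller;
    nothing here discovers volumes on its own.
    """
    lines = []
    for volume, letter in sorted(assignments.items()):
        # Accept "f", "F" or "F:" and normalise to a single upper-case letter.
        letter = letter.strip().rstrip(":").upper()
        if len(letter) != 1 or not letter.isalpha():
            raise ValueError(f"invalid drive letter: {letter!r}")
        lines.append(f"select volume {volume}")
        lines.append(f"assign letter={letter}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical mapping: volume numbers and letters are placeholders,
    # not values taken from the blade described in this thread.
    example = {1: "D", 2: "E", 3: "F"}
    print(build_diskpart_script(example), end="")
```

The output can be saved to a text file and run with `diskpart /s <file>` on the DR blade. The pagefile settings would still need to be corrected separately.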
After doing a resync and then breaking snapmirror again to boot up the DR blade, I have to go through the above procedure all over again to get the luns mapped. I want them automatically mapped, as they are on the other 22 blades.
I talked to NetApp and sent them snapdriveDC logs. They saw Event ID 257 logged in the System event log. In logs from a blade that correctly mapped all luns there were no 257 errors. They suggested the hotfix at http://support.microsoft.com/kb/924390 . Before adding the hotfix to production (which would then get replicated to DR), we wanted to boot the DR blade, apply the hotfix and then reboot. If this worked, we would then apply it to production. I applied the hotfix to DR, but on reboot again only C and J mapped.
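To compare a failing blade with a working one, as support did with Event ID 257, a rough sketch like the following could filter an exported System event log. It assumes the log was saved from Event Viewer as CSV with a header row, and that the ID column is named `Event ID`; both the export format and the column name are assumptions and may need adjusting to whatever the export actually contains. The function name `find_events` is hypothetical.

```python
import csv
import io


def find_events(csv_text, event_id, id_column="Event ID"):
    """Return the rows of a CSV event-log export whose ID equals event_id.

    csv_text:  full text of the exported log, header row included.
    id_column: name of the column holding the event ID (an assumption).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted = str(event_id)
    # Rows that are too short yield None for the column, hence the "or".
    return [row for row in reader if (row.get(id_column) or "").strip() == wanted]


if __name__ == "__main__":
    # Made-up rows for illustration only, not taken from the poster's logs.
    sample = "Source,Event ID\nExampleSourceA,257\nExampleSourceB,100\n"
    hits = find_events(sample, 257)
    print(len(hits), "matching event(s)")
```

Running the same filter over exports from a good blade and from the problem blade shows quickly whether the 257 events still appear after the hotfix.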
I also tried putting the HBA card into another blade. Same deal: only the C and J luns get mapped, and all the others are seen but get no drive letter.
Lastly, on the filer I remapped all luns to another blade and again got just the C and the J.
So I can see all the luns and can assign drive letters using disk management. I can then reboot and all is well. But once I resync the snapmirror and then try to boot the DR blade, I'm back at square one with just the C and J mapped.
Any ideas or known bugs?
2008-08-12 05:31 PM
I've encountered similar types of problems before, which leads me to ask a few questions.
Have you run the Host Utility Kit for your particular platform/protocol in order to have the appropriate timeouts and dependencies set?
I've had situations where I would either time out while mapping drives, or my system would boot much faster than my drivers loaded, resulting in things coming up only halfway. Running the HUKs should help set some of those dependencies for you as well.
Let us know if that helps at all, as it sounds like it works fine and consistently across most of your hosts, except for these final few.
Thanks and look forward to hearing from you!