
Errors after swapping disk paths

Recently we tracked down an issue where the MPHA cabling for some of our shelves did not follow best practices (on a 3000 series, ports 0a & 0b and ports 0c & 0d each share an ASIC, so the two ports of a pair shouldn't be connected to the same stack). After talking with support, we performed a takeover, and then on the down filer we swapped the cables for ports 0a and 0c.
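
For anyone wanting to sanity-check a cabling plan against that rule, here is a minimal sketch. It assumes only the port pairing stated above (0a/0b and 0c/0d on a 3000 series); the `stack_ports` mapping and the stack names are hypothetical and would have to be filled in by hand from your own cabling.

```python
# Port pairs that share an ASIC on a 3000 series, as stated in the post.
# Treat this table as an assumption for any other platform.
SHARED_ASIC_PAIRS = [{"0a", "0b"}, {"0c", "0d"}]


def stacks_sharing_asic(stack_ports):
    """Return the stacks whose two controller ports sit on the same ASIC.

    stack_ports is a hypothetical mapping such as {"stack1": ("0a", "0c")},
    giving the two ports on one controller that connect to each stack.
    """
    flagged = []
    for stack, ports in stack_ports.items():
        if set(ports) in SHARED_ASIC_PAIRS:
            flagged.append(stack)
    return flagged


if __name__ == "__main__":
    # Cabling before the swap: both stacks use two ports on one ASIC.
    before = {"stack1": ("0a", "0b"), "stack2": ("0c", "0d")}
    # Cabling after swapping the cables on 0a and 0c.
    after = {"stack1": ("0c", "0b"), "stack2": ("0a", "0d")}
    print("before:", stacks_sharing_asic(before))
    print("after:", stacks_sharing_asic(after))
```

This only checks the ASIC pairing on one controller; it says nothing about shelf module (A/B) pathing or the partner's cabling.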

When the filer came back up, we noticed these errors (which are still persisting). 

1. I see periodic messages saying "DBG: ses_incorporate_path(): Enclosure Logical Identifier Mismatch."  These are always in big clusters when they happen (12 messages or so).

2. If I run sysconfig -a now, I see the following at the end of some adapter lists:

    Shelf 1: ESH4  Firmware rev. ESH A: 14  ESH B: 14

shelf: XX:XX:XX:XX:XX:XX:XX:XX channel 2b not found!

                Shelf 2: ESH4  Firmware rev. ESH A: 14  ESH B: 14

shelf: XX:XX:XX:XX:XX:XX:XX:XX channel 2b not found!

                Shelf 3: ESH4  Firmware rev. ESH A: 14  ESH B: 14

shelf: XX:XX:XX:XX:XX:XX:XX:XX channel 2b not found!

                Shelf 4: ESH4  Firmware rev. ESH A: 14  ESH B: 14

shelf: XX:XX:XX:XX:XX:XX:XX:XX channel 2b not found!

                Shelf 5: ESH4  Firmware rev. ESH A: 14  ESH B: 14

shelf: XX:XX:XX:XX:XX:XX:XX:XX channel 2b not found!

                Shelf 6: ESH4  Firmware rev. ESH A: 14  ESH B: 14

shelf: XX:XX:XX:XX:XX:XX:XX:XX channel 2b not found!

How can we correct this issue?  As far as I can tell, the disk pathing shown is correct (MPHA issue is good) and the data is still available.
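
To track whether these messages clear after a reboot or giveback, a small script can pull the affected shelf and channel out of saved `sysconfig -a` output. The sketch below assumes the line format is exactly as pasted above ("shelf: <id> channel <name> not found!"); the function name and the way the text is captured are hypothetical.

```python
import re

# Matches lines like: "shelf: XX:XX:XX:XX:XX:XX:XX:XX channel 2b not found!"
# The format is taken from the output pasted in this post.
NOT_FOUND = re.compile(r"^\s*shelf:\s+(\S+)\s+channel\s+(\S+)\s+not found!")


def missing_channels(sysconfig_text):
    """Return (shelf_id, channel) pairs for every 'not found!' line."""
    hits = []
    for line in sysconfig_text.splitlines():
        match = NOT_FOUND.match(line)
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    sample = (
        "    Shelf 1: ESH4  Firmware rev. ESH A: 14  ESH B: 14\n"
        "shelf: XX:XX:XX:XX:XX:XX:XX:XX channel 2b not found!\n"
    )
    for shelf_id, channel in missing_channels(sample):
        print(f"shelf {shelf_id}: channel {channel} not found")
```

Comparing the result before and after a maintenance window gives a quick yes/no on whether the stale entries are gone.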

Re: Errors after swapping disk paths

I have seen this when both MPHA paths go up and down the same side of the stack. Even though it is cabled for MPHA, if you check the cable paths, do the two paths go up and down the same top or bottom modules?

Re: Errors after swapping disk paths

Also, support or your NetApp or VAR SE can run Config Advisor (WireGauge) and check all pathing to confirm the issue.

Re: Errors after swapping disk paths

Thanks for the reply Scott.  I double checked and that isn't the case here.  We are Afiler_in->Bfiler_out or vice versa in all cases. 

Re: Errors after swapping disk paths

Last time I checked, recabling an MPHA system online wasn't supported. If you do so, you need to properly reboot (or takeover/giveback) BOTH systems to clean up the disk inventory on both machines.

Usually we take the machines offline to recable them.

Re: Errors after swapping disk paths

Are any of the shelf-to-shelf cables crossed? Probably not, but worth checking. If it is cabled OK, a power cycle of the controller will likely fix it, but support will help. Likely a maintenance mode boot to check each node while it is taken over.


Re: Errors after swapping disk paths

Changing the source path while the partner is down is supported. Not while a node is up, but while takeover is in effect. If you change it while online, a CP event or a core dump occurs, since the label has the source port on it. But during a cf takeover the down node can be recabled, and its paths should be checked in maintenance mode. On giveback it is all supported, since the change was made offline on that node.


Re: Errors after swapping disk paths

Darn autocorrect. Ha

Re: Errors after swapping disk paths

We resolved this, but were required to take the entire system down fully (both heads and all shelves) and bring them back up.  Support still claims that you can change pathing during takeover, but it didn't work for us. 

Re: Errors after swapping disk paths

I have seen some weird anomalies that have only been cleared by a power cycle. One recently: one path of MPHA brought down both paths when connected to a DS14 FC loop, so one node wouldn't come up. A power cycle of both nodes cleared it. All cabling was correct. 99% of the time all is good, but this is the 1% of frustration. It seems a bit better now with SAS technology instead of FC-AL on the older shelves.