ONTAP Hardware
I have an issue with my HA pair configuration on a FAS2220.
After a shelf fault, node 1 took over node 2, and I now have these issues:
s0u1sanb(takeover)> environment shelf
Environment for channel 0a
Number of shelves monitored: 1 enabled: yes
Environmental failure on shelves on this channel? yes
s0u1sanb(takeover)> sysconfig -a
*** This system has taken over s0u1sana
System Storage Configuration: Single-Path HA
System ACP Connectivity: Partial Connectivity
slot 0: SAS Host Adapter 0a (PMC-Sierra PM8001 rev. C, SAS, <UP>)
slot 0: SAS Host Adapter 0b (PMC-Sierra PM8001 rev. C, SAS, <OFFLINE (hard)>)
The PCM LED is also on.
Is there any solution for this, or should I replace the failed node?
Thanks a lot for your advice!
Hi. What was the shelf fault, and are we sure it has been resolved?
Currently, the system seems to only see the following disks in the built-in shelf:
Shelf mapping (shelf-assigned addresses) for channel 0a: Shelf 0: XXX XXX XXX XXX XXX XXX XXX XXX XXX XXX XXX XXX 11 10 9 8 7 6 5 4 3 2 1 0
It's aware of a cable connected to the 0b port of controller B - but no link at the end (hence offline-hard)
SAS Host Adapter 0b (PMC-Sierra PM8001 rev. C, SAS, <OFFLINE (hard)>)
Firmware rev: 01.11.07.00
Base WWN: 5:00a098:001c4aa:74
Phy State: [4] Enabled, Rate unknown
[5] Enabled, Rate unknown
[6] Enabled, Rate unknown
[7] Enabled, Rate unknown
QSFP Vendor: Molex Inc.
QSFP Part Number: 112-00176+A0
QSFP Type: Passive Copper 0.5m ID:00
It's also aware of a cable connected to the second node's SAS port (0a, I believe), but I can't tell whether it's up or not:
[4] Vendor: Molex Inc. Type: QSFP passive copper 0.5-1.0m ID: 00 Swaps: 1
So again, it's unclear what happened to the shelf and to the other node (maybe the other node panicked because it lost its root aggregate disks? You need to log in via SP/console and see).
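If it helps to re-check these two items after each physical change, here is a rough Python sketch that pulls the adapter states and the visible bays out of saved `sysconfig -a` text. It assumes the output looks exactly like the lines pasted in this thread; the function names `adapter_states` and `visible_bays` are made up for illustration and are not part of any NetApp tool.

```python
import re

# Matches adapter header lines as they appear in the output pasted in this thread,
# e.g. "slot 0: SAS Host Adapter 0b (PMC-Sierra PM8001 rev. C, SAS, <OFFLINE (hard)>)".
ADAPTER_RE = re.compile(r"SAS Host Adapter (\w+) \(.*?<([^>]+)>\)")


def adapter_states(sysconfig_text):
    """Return {adapter_name: state} for every SAS host adapter line found."""
    return {m.group(1): m.group(2) for m in ADAPTER_RE.finditer(sysconfig_text)}


def visible_bays(mapping_line):
    """Return the sorted bay numbers listed in a 'Shelf N: ...' mapping line.

    Entries shown as XXX are treated as bays with no visible disk (assumption
    based on the mapping line quoted above).
    """
    tokens = mapping_line.split(":")[-1].split()
    return sorted(int(t) for t in tokens if t.isdigit())


sample = """\
slot 0: SAS Host Adapter 0a (PMC-Sierra PM8001 rev. C, SAS, <UP>)
slot 0: SAS Host Adapter 0b (PMC-Sierra PM8001 rev. C, SAS, <OFFLINE (hard)>)
"""
mapping = "Shelf 0: XXX XXX XXX XXX XXX XXX XXX XXX XXX XXX XXX XXX 11 10 9 8 7 6 5 4 3 2 1 0"

states = adapter_states(sample)
down = [name for name, state in states.items() if state != "UP"]
bays = visible_bays(mapping)

print(states)
print("adapters not UP:", down)
print("visible bays:", bays)
```

Any adapter that shows up in `down` is one whose cabling and far-end link are worth checking first.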
Hi Marcus,
The problem is still not resolved.
I cannot get a console on the sana controller.
I can only get a console on the sanb controller, which is working.
As you can see in the attached file:
s0u1sanb(takeover)> environment shelf
...
Environment for channel 0a
Number of shelves monitored: 1 enabled: yes
Environmental failure on shelves on this channel? yes
s0u1sanb(takeover)> sysconfig -a
This system has taken over s0u1sana
...
System Storage Configuration: Single-Path HA
System ACP Connectivity: Partial Connectivity
slot 0: SAS Host Adapter 0a (PMC-Sierra PM8001 rev. C, SAS, <UP>)
slot 0: SAS Host Adapter 0b (PMC-Sierra PM8001 rev. C, SAS, <OFFLINE (hard)>)
So, should I restart the sana controller in order to get a console on it and then boot it?
The PCN LED is on; maybe it's a boot problem?
I'm really a beginner with SAN and I need your help.
Thanks
You can run the command "cf status" to see whether the other node may be up and ready for giveback.
If not, the system has a physical COM console port, which you can connect to with a standard RJ-45 console cable.
It also has a Service Processor (SP), an IP-based remote-control port (a bit like iLO/iDRAC/BMC on servers). You can find the IP of the working controller's SP with the command "sp status", then perhaps guess the non-working controller's SP IP and try to connect to it via telnet/ssh (the user is "naroot", with the same password as the "root" user on the controller itself).
Once you are connected, you can run the command "system console" to jump into the "physical" console port of the controller and troubleshoot any boot/disk issues.
The status of the external shelf is still unclear from your reply. If the SAS HBA on the second controller still does not recognize it, some pictures of the cables/LEDs and physical troubleshooting will be needed. (We may be able to do some resets/power-cycles via ACP if it is cabled, but it seems that either it is not or the shelf is down, given the output "System ACP Connectivity: Partial Connectivity".)
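As a small illustration of the "cf status" check, here is a minimal Python sketch that classifies saved "cf status" text. It only looks for the phrase "has taken over" (which appears in the output in this thread) and the phrase "ready for giveback"; the exact wording of the ready message on your release is an assumption, so verify it against real output. The function name `takeover_state` is hypothetical.

```python
def takeover_state(cf_status_text):
    """Classify `cf status` output captured from the surviving node."""
    text = cf_status_text.lower()
    if "has taken over" not in text:
        return "no takeover reported"
    # ASSUMPTION: a partner that has booted far enough is reported with the
    # phrase "ready for giveback"; confirm the wording on your ONTAP release.
    if "ready for giveback" in text:
        return "partner ready for giveback"
    return "takeover active, partner not ready"


print(takeover_state("s0u1sanb has taken over s0u1sana."))
```

With the output pasted so far, this reports that the takeover is active and the partner is not ready, which matches the need to get onto the failed controller's console through the SP first.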
Hi Marcus,
You can read my attached file for more information.
The console connection doesn't work with the failed controller.
How do I connect to the SP?
It may be bad. Your SP should be pulling a DHCP IP address, so you can check your DHCP server to see if you have a DHCP address for it, then SSH in, if it wasn't previously configured.
If you aren't getting a console response using the standard console properties (just like Cisco), and CTRL+G or CTRL+D don't do anything, it's likely dead.
Have you tried a reseat?
Hi,
I'm getting a console response using CTRL+G or CTRL+D, and I can get the SP prompt.
So what should I do to enable my 0a adapter, which is down?
And how do I resolve my shelf fault and reboot my controller from the SP?
thanks
So, to sum up:
I have an HA FAS2220 with two controllers, s0u1sana and s0u1sanb.
sanb has taken over sana.
s0u1sanb(takeover)>
s0u1sanb(takeover)> environment shelf
Environment for channel 0a
Number of shelves monitored: 1 enabled: yes
Environmental failure on shelves on this channel? yes
no power fault
s0u1sanb(takeover)> sysconfig -a
*** This system has taken over s0u1sana
NetApp Release 8.2.1 7-Mode: Fri
System ID: xxxxxxxx (s0u1sanb); partner ID: yyyyyyy(s0u1sana)
System Storage Configuration: Single-Path HA
System ACP Connectivity: Partial Connectivity
Interconnect Port: port not active
memory mapped I/O base 0xdf400000, size 0x100000
prefetchable memory base 0xde800000, size 0x800000
slot 0: SAS Host Adapter 0a (PMC-Sierra PM8001 rev. C, SAS, <UP>)
Firmware rev: 01.11.07.00
Base WWN: 5:00a098:001c4aa:70
Phy State: [0] Enabled, 6.0 Gb/s
[1] Enabled, 6.0 Gb/s
[2] Enabled, 6.0 Gb/s
[3] Enabled, 6.0 Gb/s
00.0 : NETAPP X487_HCOBE600A10 NA00 560.0GB 520B/sect (KSG6V0MJ)
00.1 : NETAPP X487_HCOBE600A10 NA00 560.0GB 520B/sect (KSG6U2RJ)
00.2 : NETAPP X487_HCOBE600A10 NA00 560.0GB 520B/sect (KSG6V3HJ)
00.3 : NETAPP X487_HCOBE600A10 NA00 560.0GB 520B/sect (KSG6V82J)
00.4 : NETAPP X487_HCOBE600A10 NA00 560.0GB 520B/sect (KSG7TSLJ)
00.5 : NETAPP X487_HCOBE600A10 NA00 560.0GB 520B/sect (KSG6VKVJ)
00.6 : NETAPP X487_HCOBE600A10 NA00 560.0GB 520B/sect (KSG6A8MJ)
00.7 : NETAPP X487_HCOBE600A10 NA00 560.0GB 520B/sect (KSG7TZ0J)
00.8 : NETAPP X487_HCOBE600A10 NA00 560.0GB 520B/sect (KSG6V1DJ)
00.9 : NETAPP X487_HCOBE600A10 NA00 560.0GB 520B/sect (KSG6U8EJ)
00.10: NETAPP X487_HCOBE600A10 NA00 560.0GB 520B/sect (KSG6V0HJ)
00.11: NETAPP X487_HCOBE600A10 NA00 560.0GB 520B/sect (KSG6U9HJ)
Shelf 0: DS2126E Firmware rev. IOM6E A: ---- IOM6E B: 0142
slot 0: SAS Host Adapter 0b (PMC-Sierra PM8001 rev. C, SAS, <OFFLINE (hard)>)
Firmware rev: 01.11.07.00
Base WWN: 5:00a098:001c4aa:74
Phy State: [4] Enabled, Rate unknown
[5] Enabled, Rate unknown
[6] Enabled, Rate unknown
[7] Enabled, Rate unknown
QSFP Vendor: Molex Inc.
QSFP Part Number: 112-00176+A0
QSFP Type: Passive Copper 0.5m ID:00
QSFP Serial Number: 213820027
s0u1sanb(takeover)> cf status
s0u1sanb has taken over s0u1sana.
So, I can connect only to the SP prompt.
Which SP commands should I use to correct these issues:
- Environmental failure on shelves on this channel? yes
- Interconnect Port: port not active
- System Storage Configuration: Single-Path HA
- System ACP Connectivity: Partial Connectivity
- slot 0: SAS Host Adapter 0b (PMC-Sierra PM8001 rev. C, SAS, <OFFLINE (hard)>)
Thanks a lot
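For anyone re-checking after a reseat or recabling, a rough Python sketch like the one below can scan saved output from "environment shelf" and "sysconfig -a" for the symptom strings listed above. It does not fix anything and does not run any SP command; the marker strings are copied from the output in this thread, and the labels and the function name `remaining_symptoms` are made up for illustration.

```python
# Marker strings copied from the output pasted in this thread; labels are illustrative.
SYMPTOMS = {
    "shelf environmental failure": "Environmental failure on shelves on this channel? yes",
    "HA interconnect down": "Interconnect Port: port not active",
    "single-path storage": "System Storage Configuration: Single-Path HA",
    "partial ACP connectivity": "System ACP Connectivity: Partial Connectivity",
    "SAS adapter offline": "<OFFLINE (hard)>",
}


def remaining_symptoms(captured_output):
    """Return the labels of all known symptom strings still present in the text."""
    return [label for label, marker in SYMPTOMS.items() if marker in captured_output]


captured = """\
Environmental failure on shelves on this channel? yes
System Storage Configuration: Single-Path HA
System ACP Connectivity: Partial Connectivity
slot 0: SAS Host Adapter 0a (PMC-Sierra PM8001 rev. C, SAS, <UP>)
slot 0: SAS Host Adapter 0b (PMC-Sierra PM8001 rev. C, SAS, <OFFLINE (hard)>)
"""

for label in remaining_symptoms(captured):
    print("still present:", label)
```

An empty result after a change would suggest the symptoms have cleared, but the real commands on the controller remain the source of truth.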