EF & E-Series, SANtricity, and Related Plug-ins
Hello everyone,
We have a 2540-M2 array with dual controllers, and one of them shows the code EO L7 on the 7-segment display.
We've tried to clear the 'held in reset' status by running each of these commands from the active controller, without success:
setControllerToActive_MT 1
resetController_MT 1
altResetRelease
cmgrSetAltToOptimal
As you can see below, the alternate controller's state is Lockdown at the beginning:
Primary Ctlr 0x2c84e50
State : Optimal
Serial : SV23501785
ActualApp : 07844710
CurrentApp : 07844710
ExpectedApp : 07844710
BootVersion : 07844710
ProductId/Rev : LCSM100_F /0784
ModelName : 2680
BoardId : 2660
HostBoardId1 : 0801
HostBoardId2 :
Vendor : SUN
PartNumber : 45234-06
OEMPartNumber : 7053352
Manufactured : 50396700
Cache/Proc : 1696/352
PhyCacheSize : 1696
FRCMemSize : 209715200
ForeignState : 1
IsInitialized : 1
FailedChannels : 0
IocFaulted : 0
FailedMirChan : 0
SubModel : 191
SubModelSupport: 1
LockdownReason : 255
NumNetIfaces : 2
Alternate Ctlr 0x2c86630
State : Lockdown <---------------------------
Serial : SV30301683
ActualApp : 07844710
CurrentApp : 07844710
ExpectedApp : 07844710
BootVersion : 07844710
ProductId/Rev : LCSM100_F /0784
ModelName : 2680
BoardId : 2660
HostBoardId1 : 0801
HostBoardId2 :
Vendor : SUN
PartNumber : 45234-06
OEMPartNumber : 7053352
Manufactured : 50f34a80
Cache/Proc : 1696/352
PhyCacheSize : 1696
FRCMemSize : 209715200
ForeignState : 0
IsInitialized : 0
FailedChannels : 0
IocFaulted : 0
FailedMirChan : 0
SubModel : 191
SubModelSupport: 1
LockdownReason : 255
NumNetIfaces : 2
But after we ran altResetRelease, Ctrl A (the alternate) appeared to reach Optimal status:
Primary Ctlr 0x2c84e50
State : Optimal
Serial : SV23501785
ActualApp : 07844710
CurrentApp : 07844710
ExpectedApp : 07844710
BootVersion : 07844710
ProductId/Rev : LCSM100_F /0784
ModelName : 2680
BoardId : 2660
HostBoardId1 : 0801
HostBoardId2 :
Vendor : SUN
PartNumber : 45234-06
OEMPartNumber : 7053352
Manufactured : 50396700
Cache/Proc : 1696/352
PhyCacheSize : 1696
FRCMemSize : 209715200
ForeignState : 1
IsInitialized : 1
FailedChannels : 0
IocFaulted : 0
FailedMirChan : 0
SubModel : 191
SubModelSupport: 1
LockdownReason : 255
NumNetIfaces : 2
Alternate Ctlr 0x2c86630
State : Optimal <----------------------
Serial : SV30301683
ActualApp : 07844710
CurrentApp : 07844710
ExpectedApp : 07844710
BootVersion : 07844710
ProductId/Rev : LCSM100_F /0784
ModelName : 2680
BoardId : 2660
HostBoardId1 : 0801
HostBoardId2 :
Vendor : SUN
PartNumber : 45234-06
OEMPartNumber : 7053352
Manufactured : 50f34a80
Cache/Proc : 1696/352
PhyCacheSize : 1696
FRCMemSize : 209715200
ForeignState : 0
IsInitialized : 0
FailedChannels : 0
IocFaulted : 0
FailedMirChan : 0
SubModel : 191
SubModelSupport: 1
LockdownReason : 255
NumNetIfaces : 2
But after a few minutes Ctrl A shows Lockdown status again, and the 7-segment display goes back to EO L7.
We've also replaced this controller twice, without success.
As far as I know, the code EO L7 means "Sub-Model Identifier Not Set or Mismatched", but as you can see in the cmgrShow output above,
the sub-model ID is the same for both controllers (SubModel : 191).
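One way to check this kind of claim mechanically is to diff the two controller blocks field by field. The following is only a minimal sketch in Python, assuming the cmgrShow output has been captured as plain text in the layout shown above; the function names `parse_cmgr_show` and `diff_controllers` are hypothetical helpers, not part of any NetApp or Oracle tool.

```python
def parse_cmgr_show(text):
    """Split cmgrShow-style text into {role: {field: value}}.

    Assumes each controller block starts with a line beginning with
    'Primary Ctlr' or 'Alternate Ctlr', followed by 'Field : value' lines.
    """
    controllers = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(("Primary Ctlr", "Alternate Ctlr")):
            role = line.split()[0]          # 'Primary' or 'Alternate'
            current = controllers.setdefault(role, {})
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        # Drop hand-drawn markers such as '<-----' added when posting.
        current[key.strip()] = value.split("<")[0].strip()
    return controllers


def diff_controllers(controllers):
    """Return {field: (primary_value, alternate_value)} for differing fields."""
    primary = controllers.get("Primary", {})
    alternate = controllers.get("Alternate", {})
    fields = sorted(set(primary) | set(alternate))
    return {
        f: (primary.get(f), alternate.get(f))
        for f in fields
        if primary.get(f) != alternate.get(f)
    }


if __name__ == "__main__":
    sample = (
        "Primary Ctlr 0x2c84e50\n"
        "State : Optimal\n"
        "SubModel : 191\n"
        "Alternate Ctlr 0x2c86630\n"
        "State : Lockdown <-----\n"
        "SubModel : 191\n"
    )
    for field, values in diff_controllers(parse_cmgr_show(sample)).items():
        print(field, values)
```

Fields that differ by design, such as Serial and Manufactured, will always show up, so the interesting entries are the ones that are expected to match on a healthy pair.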
Please, any advice or ideas?
Thanks in advance,
Regards.
Cristian
Hi there,
E-Series systems enter lockdown mode primarily to protect user data - therefore the best idea is to raise a support case if the system is under support. Is this applicable in this scenario?
Hi Alex,
Unfortunately there is no valid service contract with NetApp for this array.
So, given this scenario, do you have any idea why the alternate controller shows the EO L7 code on the 7-segment display after we apply
the commands to clear the held-in-reset condition? Especially when the cmgrShow command shows the sub-model ID is the same for both controllers (SubModel : 191).
OE+L7 normally means the controllers' sub-models do not match, which is the main reason I don't understand this behaviour: the sub-model is the same here.
The 7-segment display only shows one error at a time, but there can be more. In fact, we have replaced the controller twice and rebooted the active controller with the faulty one removed, but without success.
As far as I know, the 'held in reset' status is cleared by running one of these commands from the active controller:
setControllerToActive_MT 1
resetController_MT 1
altResetRelease
cmgrSetAltToOptimal
But none of these commands has solved the issue so far. Please, any ideas?
Thank you,
Regards.
Cristian
What is the history of this system?
Have you tried "lemClearLockdown"?
Please run shell command "getMfgSubModelId" and post response
Hi Alex,
Basically, one controller in this array went into lockdown status, and we tried to release it with the shell commands I mentioned previously.
The customer wanted to replace hardware in order to isolate the fault, so the controller was replaced twice and recently the midplane was replaced as well.
Obviously the situation is the same: alternate Ctrl A is currently showing EO LU, so we are thinking of applying this action plan:
-- Log into the serial port of controller A that has the "LU" status.
When logged into the serial port, you will see...
Press within 5 seconds: <S> for Service Interface, <BREAK> for baud rate
----> Press ESC to get into the serial interface
login: shellUsr
Password:
->
Serial Port shell started.
-> loadDebug
-> lemClearLockdown
-> cmgrShow
Do you agree? Any other ideas or advice?
Thank you,
Regards.
Cristian
Sounds like a good start - when you get into the shell please run the "getMfgSubModelId" command - you may have received a functionally similar part from another subvendor, which is not modifiable in the field 😞
Hi Alex,
First of all, thank you for your replies and support. I'll collect this command output tonight when we apply
the action plan mentioned before.
So, do you think the sub-model IDs are still different? Is the getMfgSubModelId command used to get information about the sub-model ID?
As far as I can see, the sub-model IDs seem equal:
Primary Ctlr 0x3198700
State : Optimal
Serial : SV23501785
ActualApp : 07844710
CurrentApp : 07844710
ExpectedApp : 07844710
BootVersion : 07844710
ProductId/Rev : LCSM100_F /0784
ModelName : 2680
BoardId : 2660
HostBoardId1 : 0801
HostBoardId2 :
Vendor : SUN
PartNumber : 45234-06
OEMPartNumber : 7053352
Manufactured : 50396700
Cache/Proc : 1696/352
PhyCacheSize : 1696
FRCMemSize : 209715200
ForeignState : 1
IsInitialized : 1
FailedChannels : 0
IocFaulted : 0
FailedMirChan : 0
SubModel : 191 <<-------------------------------
SubModelSupport: 1
LockdownReason : 255
NumNetIfaces : 2
Alternate Ctlr 0x3199ee0
State : Optimal
Serial : SV30301683
ActualApp : 07844710
CurrentApp : 07844710
ExpectedApp : 07844710
BootVersion : 07844710
ProductId/Rev : LCSM100_F /0784
ModelName : 2680
BoardId : 2660
HostBoardId1 : 0801
HostBoardId2 :
Vendor : SUN
PartNumber : 45234-06
OEMPartNumber : 7053352
Manufactured : 50f34a80
Cache/Proc : 1696/352
PhyCacheSize : 1696
FRCMemSize : 209715200
ForeignState : 0
IsInitialized : 0
FailedChannels : 0
IocFaulted : 0
FailedMirChan : 0
SubModel : 191 <<-------------------------------
SubModelSupport: 1
LockdownReason : 255
NumNetIfaces : 2
Thank you,
Regards.
Thanks for clarifying - that was my guess, but it seems not to be. Unfortunately I have no further suggestions. Our documentation does not cover any scenario where this is not the cause of this error.
It is better to use these commands in shell mode:
getMfgSubModelId
getCurrentSubModelId
Value  Controller State  Description
L0     Suspended         Mismatched controller types
L1     Suspended         Missing interconnect canister
L2     Suspended         Persistent memory errors
L3     Suspended         Persistent hardware errors
L4     Suspended         Persistent data protection errors
L5     Suspended         ACS failure
L6     Suspended         Unsupported host card
L7     Suspended         Sub-model identifier not set or mismatched
L8     Suspended         Memory configuration error
L9     Suspended         Link Speed Mismatch
LA     Suspended         Reserved
Lb     Suspended         Host Card configuration error
LC     Suspended         Persistent cache backup configuration error
Ld     Suspended         Mixed cache memory DIMMs
LE     Suspended         Uncertified cache memory DIMM Sizes
LF     Suspended         Lockdown with limited SYMbol support
LH     Suspended         Controller Firmware Mismatch
LL     Suspended         Unable to Access Either Midplane SBB
Ln     Suspended         Canister not valid for enclosure
LP     Suspended         Drive port mapping tables not found
LU     Suspended         SOD reboot limit exceeded
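For quick reference while reading the display, the table above can be kept as a small lookup. This is just a sketch in Python that copies the values listed in this post; the names `LOCKDOWN_CODES` and `describe_lockdown` are made up for illustration, and the match is case-sensitive because the display uses mixed-case codes such as Lb, Ld and Ln.

```python
# Lockdown codes as listed in the table above (all with state "Suspended").
LOCKDOWN_CODES = {
    "L0": "Mismatched controller types",
    "L1": "Missing interconnect canister",
    "L2": "Persistent memory errors",
    "L3": "Persistent hardware errors",
    "L4": "Persistent data protection errors",
    "L5": "ACS failure",
    "L6": "Unsupported host card",
    "L7": "Sub-model identifier not set or mismatched",
    "L8": "Memory configuration error",
    "L9": "Link Speed Mismatch",
    "LA": "Reserved",
    "Lb": "Host Card configuration error",
    "LC": "Persistent cache backup configuration error",
    "Ld": "Mixed cache memory DIMMs",
    "LE": "Uncertified cache memory DIMM Sizes",
    "LF": "Lockdown with limited SYMbol support",
    "LH": "Controller Firmware Mismatch",
    "LL": "Unable to Access Either Midplane SBB",
    "Ln": "Canister not valid for enclosure",
    "LP": "Drive port mapping tables not found",
    "LU": "SOD reboot limit exceeded",
}


def describe_lockdown(code):
    """Return the description for a code such as 'L7', or None if unknown."""
    return LOCKDOWN_CODES.get(code.strip())


if __name__ == "__main__":
    for code in ("L7", "LU"):
        print(code, "->", describe_lockdown(code))
```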
Hi All,
Sorry for the delay, I was on vacation. First of all, thank you for all the replies.
To sum up, it was a weird situation. We were able to solve the problem
once both controllers had the same sub-model ID, as reported correctly by the getCurrentSubModelId command.
We had ordered the same part number as the replacement (7053343 2GB FC-AL Controller Module), but it seems
the OEMPartNumber differed between units supplied under that same part number, as you can see below:
-> getCurrentSubModelId
value = 149 = 0x95
OEMPartNumber : 7011128
-> getCurrentSubModelId
value = 191 = 0xbf
OEMPartNumber : 7053352
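The two readings above can be compared in a few lines once the shell output is captured as text. This is a rough sketch in Python, assuming the shell prints results in the "value = <decimal> = 0x<hex>" form shown above; `parse_shell_value` and `sub_models_match` are hypothetical helper names.

```python
import re

# Matches shell result lines such as "value = 149 = 0x95".
VALUE_RE = re.compile(r"value\s*=\s*(\d+)\s*=\s*0x([0-9a-fA-F]+)")


def parse_shell_value(line):
    """Return the integer from a 'value = N = 0xH' line.

    Raises ValueError if the line does not match or if the decimal and
    hex parts disagree (which would suggest a garbled capture).
    """
    match = VALUE_RE.search(line)
    if match is None:
        raise ValueError(f"not a shell value line: {line!r}")
    decimal = int(match.group(1))
    from_hex = int(match.group(2), 16)
    if decimal != from_hex:
        raise ValueError(f"decimal/hex mismatch in: {line!r}")
    return decimal


def sub_models_match(line_a, line_b):
    """True if both controllers report the same sub-model ID."""
    return parse_shell_value(line_a) == parse_shell_value(line_b)


if __name__ == "__main__":
    first = "value = 149 = 0x95"
    second = "value = 191 = 0xbf"
    print(parse_shell_value(first), parse_shell_value(second))
    print("match:", sub_models_match(first, second))
```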
Once this was resolved and the lockdown status was cleared with the lemClearLockdown command, everything was fine:
Primary Ctlr 0x3198700
State : Optimal
Serial : SV20922442
ActualApp : 07844710
CurrentApp : 07844710
ExpectedApp : 07844710
BootVersion : 07844710
ProductId/Rev : LCSM100_F /0784
ModelName : 2680
BoardId : 2660
HostBoardId1 : 0801
HostBoardId2 :
Vendor : SUN
PartNumber : 45233-06
OEMPartNumber : 7011128
Manufactured : 4f52b080
Cache/Proc : 1696/352
PhyCacheSize : 1696
FRCMemSize : 209715200
ForeignState : 1
IsInitialized : 1
FailedChannels : 0
IocFaulted : 0
FailedMirChan : 0
SubModel : 149
SubModelSupport: 1
LockdownReason : 255
NumNetIfaces : 2
Alternate Ctlr 0x3199ee0
State : Optimal
Serial : SV22808449
ActualApp : 07844710
CurrentApp : 07844710
ExpectedApp : 07844710
BootVersion : 07844710
ProductId/Rev : LCSM100_F /0784
ModelName : 2680
BoardId : 2660
HostBoardId1 : 0801
HostBoardId2 :
Vendor : SUN
PartNumber : 45233-06
OEMPartNumber : 7011128
Manufactured : 4ffb7080
Cache/Proc : 1696/352
PhyCacheSize : 1696
FRCMemSize : 209715200
ForeignState : 1
IsInitialized : 1
FailedChannels : 0
IocFaulted : 0
FailedMirChan : 0
SubModel : 149
SubModelSupport: 1
LockdownReason : 255
NumNetIfaces : 2
I should also mention that CAM reports part numbers 7053352 and 7011128 in the supportdata, but the correct part number is 7053343.
Do you have any idea about this behavior?
Thank you,
Regards.
Cristian
Hi Cristian,
I don't have any information available about that, sorry.
Glad to hear you're back up and running!