ONTAP Hardware

Powering down a failed controller

strider78

Hello all!

 

We have a V6220 single-chassis dual-controller filer (the controllers are named Filer-A and Filer-B). Some time ago Filer-A had a hardware failure (possibly the voltage regulators, but I'm not 100% sure), was taken over by Filer-B, and was rebooted by the watchdog, but it didn't come back up. Now it's in a zombie state: effectively dead but still powered on. The whole system is running on the surviving Filer-B.

 

Is it safe to issue the "system power off" command from the Filer-A Service Processor CLI? Sorry for such a stupid question, but I'm not a storage expert, and we have a mission-critical database on this system.

1 ACCEPTED SOLUTION

aborzenkov
What is the reason to power it off? It is not required for replacement. I am not sure what "safe" means in this case. It should not affect another node in the same chassis if that was the question.


strider78

The reason is simple: it's a last resort. I'm hoping a power cycle may heal it. Yes, I wanted to make sure that powering it off won't harm the working filer. Thank you very much for the answer.

cedric_renauld

Hello,

I think your system is in the "Waiting for giveback" state, or maybe at the LOADER prompt.

Before powering it off, can you type:

system console

and check the state of your controller?

And on your surviving controller, what is the result of:

cf status

Thanks

strider78

Filer-A was completely dead: "system console" showed nothing, and "cf status" on the surviving Filer-B showed only that it had taken over its partner. The interconnect link was down too.

Then I issued "system power off" on the dead filer's SP and powered it back on after 5 minutes.

Bingo!

All hardware (voltage) issues were automatically deasserted, ONTAP booted normally, and the node is now ready for a giveback.

Now I'm sure that my system was hit by the 500-day uptime bug. I upgraded my SP firmware to the latest compatible version.

Thanks, all, for the help.
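
For anyone landing on this thread later, the sequence described above can be sketched roughly as follows. This is only a recap of the posts here, not a verified procedure. "system console", "system power off" and "cf status" are quoted from the thread; "system power on" and "cf giveback" are assumed to be the commands behind "powered it on" and "giveback", and the giveback itself was not reported as run in this thread.

```
# Text after '#' is annotation only, not part of the commands.

# On the SP of the failed controller (Filer-A):
system console        # check whether the node shows any output at all
system power off      # power off only this controller
                      # (the poster waited about 5 minutes here)
system power on       # assumed command for "powered it on"

# On the surviving controller (Filer-B):
cf status             # check the takeover state of the partner
cf giveback           # assumed command; not run in this thread
```

As noted in the accepted answer, powering off one node should not affect the other node in the same chassis.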
