2010-08-11 09:29 AM
Now that the subject of UPSs is up, I want to ask how to get the Netapp to restart on its own after the UPS has shut it down (gracefully) for a power outage and the power has been restored. At that point the head needs to boot, but there is no documentation for how that is going to happen. I have had a long correspondence with Netapp tech support, in which they say that there is no way of getting around the need to visit the computer room to restart the shelves, and then the head. I specifically asked about using RMON and was told it wouldn't help. This I find quite surprising, but because we have only one Netapp 3020, and it would inconvenience users to spend too much time experimenting, we haven't been able to test those claims.
The 3020 has no on/off switch, and I can find no indication of what would cause the head to restart after being halted. In our experience it does sometimes restart, but we haven't identified what those circumstances are, and the UPS man page doesn't elaborate. I believe it restarts when there is a transition from no power to power available, or when it receives a boot command from RMON. So as long as the UPS keeps supplying some power, there is no signal from the UPS that will cause a restart. Is that right?
I have a theory that we could put the shelves on one UPS, and the head on another. Then by setting the head UPS to shut down the head sooner than the shelf UPS shuts down the disks, and setting the head UPS to restart later (when power is restored) than the shelf UPS, it seems like we could meet the requirement laid out by tech support that the shelves must be turned on before the head. However, I have no idea what should turn on the head. After shutting down the head, the UPS continues to deliver power, and because the shut-down head uses very little power, the UPS will never feel the need to turn itself off. Hence it will never have occasion to restore power to the head, which seems to be the only way to boot the head (unless RMON could be used for this purpose, Netapp tech support advice to the contrary).
I do find it hard to believe there isn't a standard, practical way of getting this done - it would seem to be pretty basic.
2010-08-12 12:00 AM
I never understood why anyone would wish to shut down a NetApp on power failure in the first place. This raises all sorts of issues with synchronizing the NetApp shutdown with server shutdown, and this becomes really unmanageable as the number of servers grows.
Back to your question – if power was actually turned off, the NetApp would just boot automatically when power is turned back on. The only problem is if the NetApp was shut down and sits at the CFE prompt and power was not switched off. But then it is exactly the same problem you face with any other server connected to a UPS – the server was shut down but power returned before the UPS had actually switched off. How do you manage this with a normal server?
2010-08-12 05:22 AM
We need to shut down the Netapp when the power outage outlasts the batteries. The Netapp draws over 1,000 watts, and since our staff don't do much work when the power is out, it seems unnecessary to keep the fileserver up during an extended outage. Your circumstances may be different, but I don't think we are completely out of line in this thinking. After all, the Netapp supports the UPS shutdown instructions; if there were always enough power for every outage, that would not be necessary.
The servers depending upon the Netapp can be turned on with WOL (Wake-on-LAN) after power is restored. The Netapp is the only device we have that does not respond to a WOL packet. The desktops are turned on manually when they are needed, so this is not a problem.
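For reference, the WOL step for the dependent servers can be sketched as below. This is only a minimal sketch, not the setup used here: the function names, the example MAC address, the broadcast address and UDP port 9 are all assumptions for illustration. The packet layout itself (6 bytes of 0xFF followed by the target MAC repeated 16 times) is the standard magic packet format.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected 12 hex digits in MAC, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Magic packet: 6 bytes of 0xFF, then the MAC address repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_wol(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

A call such as send_wol("00:11:22:33:44:55") would be run from a machine that is already up on the same subnet once power has returned; the MAC shown is a placeholder.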
Currently I am thinking of the following: Use 2 UPSs, one for the shelves and one for the head. Let either UPS shut down the head when batteries run low. Program the UPS for the head to turn back on only after a delay, and program the UPS for the shelves to turn on as soon as power is restored. Then the shelves will always be on before the head. But getting the head to restart at all seems to require it losing power entirely during the outage, which I don't know how to arrange.
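The ordering requirement in the two-UPS idea can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: it assumes both UPS outputs really were off during the outage (the unresolved part above), and the delay values and the shelf spin-up allowance are hypothetical numbers, not figures from Netapp or any UPS vendor.

```python
def head_start_margin(shelf_delay_s: float,
                      head_delay_s: float,
                      shelf_spinup_s: float) -> float:
    """Seconds between the shelves being ready and the head getting power.

    All times are measured from the moment mains power returns.
    A negative result means the head would power on before the shelves
    are ready, which violates the shelves-before-head requirement.
    """
    shelves_ready_at = shelf_delay_s + shelf_spinup_s
    head_powered_at = head_delay_s
    return head_powered_at - shelves_ready_at


def order_is_safe(shelf_delay_s: float,
                  head_delay_s: float,
                  shelf_spinup_s: float) -> bool:
    """True if the head is powered only after the shelves are ready."""
    return head_start_margin(shelf_delay_s, head_delay_s, shelf_spinup_s) >= 0
```

For example, with the shelf UPS restoring output immediately, a hypothetical 60 second spin-up allowance and a 120 second restart delay on the head UPS, order_is_safe(0, 120, 60) holds with a 60 second margin.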
2010-08-12 06:43 AM
OK, I see; I did not think about additional load.
As for the server-NetApp dependency: I am not thinking about switching on. But you must ensure the NetApp is not switched off as long as the servers are not yet completely shut down. I am just curious – how do you ensure this?
2010-08-12 09:29 AM
We shut down the servers quite quickly in the event of a power outage - 5 minutes into the outage for most of them. They power off in a minute or two after that - they have only /tmp storage anyway. The Netapp can stay up for 15 minutes, but we shut it down after about 10 minutes of time on battery. In our situation it isn't a problem if one or two servers have trouble booting, but I don't think that long shutdown times have been a problem, other than occasions when configuration problems totally prevented a shutdown. In that case I wouldn't have wanted the Netapp to wait for the server anyway.
So I wouldn't be interested in a protocol for shutting down systems in sequence, if it made later systems wait for earlier systems to give permission.
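The fixed-time schedule described above (servers shut down at 5 minutes, powered off a minute or two later, Netapp halted at about 10 minutes, roughly 15 minutes of battery) can be sanity-checked without any handshake between systems. The sketch below is illustrative only; the class and field names are made up, and the numbers are the approximate ones from this post.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShutdownPlan:
    """Times in minutes since the outage began (field names are hypothetical)."""
    server_shutdown_at: float    # when servers are told to shut down
    server_poweroff_time: float  # how long a server takes to power off
    filer_halt_at: float         # when the filer is halted
    battery_runtime: float       # how long the UPS can carry the filer

    def problems(self) -> List[str]:
        """Return a list of ordering problems; an empty list means the plan is consistent."""
        issues = []
        servers_off_at = self.server_shutdown_at + self.server_poweroff_time
        if servers_off_at > self.filer_halt_at:
            issues.append("filer halts before servers have powered off")
        if self.filer_halt_at > self.battery_runtime:
            issues.append("filer halt is scheduled after the battery runs out")
        return issues


# The schedule from this thread, taking 2 minutes as the server power-off time.
plan = ShutdownPlan(server_shutdown_at=5, server_poweroff_time=2,
                    filer_halt_at=10, battery_runtime=15)
print(plan.problems())
```

Because every step fires on its own timer, a server that hangs during shutdown does not delay the filer, which matches the preference stated above.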