nut/docs/shutdown.txt
2010-03-26 00:20:59 +01:00

230 lines
7.9 KiB
Text

Desc: Configuring automatic UPS shutdowns
File: shutdown.txt
Date: 24 August 2003
Auth: Russell Kroll <rkroll@exploits.org>
Shutdown design
===============
When your UPS batteries get low, the operating system needs to be brought
down cleanly. Also, the UPS load should be turned off so that all devices
that are attached to it are forcibly rebooted.
Here are the steps that occur when a critical power event happens:
1. The UPS goes on battery
2. The UPS reaches low battery (a "critical" UPS)
3. The upsmon master notices and sets "FSD" - the "forced shutdown"
flag to tell all slave systems that it will soon power down the load.
(If you have no slaves, skip to step 6)
4. upsmon slave systems see "FSD" and:
- generate a NOTIFY_SHUTDOWN event
- wait FINALDELAY seconds - typically 5
- call their SHUTDOWNCMD
- disconnect from upsd
5. The upsmon master system waits up to HOSTSYNC seconds (typically 15)
for the slaves to disconnect from upsd. If any are connected after
this time, upsmon stops waiting and proceeds with the shutdown
process.
6. The upsmon master:
- generates a NOTIFY_SHUTDOWN event
- waits FINALDELAY seconds - typically 5
- creates the POWERDOWNFLAG file - usually /etc/killpower
- calls the SHUTDOWNCMD
7. On most systems, init takes over, kills your processes, syncs and
unmounts some filesystems, and remounts some read-only.
8. init then runs your shutdown script. This checks for the
POWERDOWNFLAG, finds it, and tells the UPS driver(s) to power off
the load.
9. The system loses power.
10. Time passes. The power returns, and the UPS switches back on.
11. All systems reboot and go back to work.
How you set it up
=================
1. Make sure your POWERDOWNFLAG setting in upsmon.conf points somewhere
reasonable. Specifically, that filesystem must be mounted when your
shutdown script runs.
2. Edit your shutdown scripts to check for the POWERDOWNFLAG so they know
when to power off the UPS. You must check for this file, as you don't
want this to happen during normal shutdowns!
You can use upsdrvctl to start the shutdown process in your UPS
hardware. Use this script as an example, but change the paths to
suit your system:
if (test -f /etc/killpower)
then
echo "Killing the power, bye!"
/usr/local/ups/bin/upsdrvctl shutdown
sleep 120
# uh oh... the UPS poweroff failed!
# you probably should reboot here to avoid getting stuck
# *** see the section on power races below ***
fi
Make sure the filesystem containing upsdrvctl, ups.conf and your UPS
driver(s) is mounted when the system gets to this point. Otherwise
it won't be able to figure out what to do.
RAID warning
============
NOTE: If you run any sort of RAID equipment, make sure your arrays
are either halted (if possible) or switched to "read-only" mode.
Otherwise you may suffer a long resync once the system comes back up.
The kernel may not ever run its final shutdown procedure, so you
must take care of all array shutdowns in userspace before upsdrvctl
runs.
If you use software RAID (md) on Linux, get mdadm and try using
'mdadm --readonly' to put your arrays in a safe state. This has to
happen after your shutdown scripts have remounted the filesystems.
On hardware RAID or other kernels, you have to do some detective work.
It may be necessary to contact the vendor or the author of your
driver to find out how to put the array in a state where a power loss
won't leave it "dirty".
My understanding is that 3ware devices on Linux will be fine unless
there are pending writes. Make sure your filesystems are remounted
read-only and you should be covered.
Multiple UPS shutdowns
======================
If you have multiple UPSes connected to your system, chances are that you
need to shut them down in a specific order. The goal is to shut down
everything but the one keeping upsmon alive at first, then you do that one
last.
To set the order in which your UPSes receive the shutdown commands, define
the "sdorder" value in your ups.conf.
[bigone]
driver = apcsmart
port = /dev/ttyS0
sdorder = 2
[littleguy]
driver = bestups
port = /dev/ttyS1
sdorder = 1
[misc]
driver = megatec
port = /dev/ttyS2
sdorder = 0
The order runs from 0 to the highest number available. So, for this
configuration, the order of shutdowns would be misc, littleguy, and then
bigone.
If you have a UPS that shouldn't be shutdown when running "upsdrvctl
shutdown", set the sdorder to -1.
Testing shutdowns
=================
To see how upsdrvctl will behave without actually turning off power, use
the -t argument. It will display the sequence without actually calling
the drivers.
Other issues
============
You may delete the POWERDOWNFLAG in the startup scripts, but it is not
necessary. upsmon will clear that file for you when it starts.
Remember that some operating systems unmount a good number of filesystems
when going into read-only mode. If the UPS software is installed to /usr
and it's not mounted, your shutdowns will fail. If this happens, either
make sure it stays mounted at shutdown, or install to another partition.
Power races
===========
There is a situation where the power may return during the shutdown
process. This is known as a race. Here's how we handle it.
"Smart" UPSes typically handle this by using a command that forces the UPS
to power the load off and back on. This way, you are assured that the
systems will restart even if the power returns at the worst possible
moment.
Contact closure units (ala genericups), on the other hand, have the
potential for a race when feeding multiple systems. This is due to the
design of most contact closure UPSes. Typically, the "kill power" line
only functions when running on battery. As a result, if the line power
returns during the shutdown process, there is no way to power down the
load.
The workaround is to force your systems to reboot after some
interval. This way, they won't be stuck in the halted state with the UPS
running on line power.
Testing power races
===================
The easiest way to see if your configuration will handle a power race
successfully is to do 'upsmon -c fsd'. This will force the UPS software
to shut down as if it had a OB+LB situation, and your shutdown script
should call the UPS driver(s) in shutdown mode.
If everything works correctly, the computer will be forcibly powered off,
may remain off for a few seconds to a few minutes (depending on the
driver and UPS type), then will power on again.
If your UPS just sits there and never resets the load, you are vulnerable
to the above power race and should add the "reboot after timeout" hack
at the very least.
Know your hardware
==================
UPS equipment varies from manufacturer to manufacturer and even within
model lines. You should test the shutdown sequence on your systems before
leaving them unattended. A successful sequence is one where the OS halts
before the battery runs out, and the system restarts when power returns.
One more tip
============
If your UPS powers up immediately after a power failure instead of
waiting for the batteries to recharge, you can rig up a little hack to
handle it in software.
Essentially, you need to test for the POWERDOWNFLAG in your *startup*
scripts while the filesystems are still read-only. If it's there, you
know your last shutdown was caused by a power failure and the UPS
battery is probably still quite weak.
In this situation, your best bet is to sleep it off. Pausing in your
startup script to let the batteries recharge with the filesystems in a
safe state is recommended. This way, if the power goes out again, you
won't face a situation where there's not enough battery capacity left
for upsmon to do its thing.
Exactly how long to wait is a function of your UPS hardware, and will
require careful testing.
If this is too evil for you, buy another kind of UPS that will either
wait for a minimum amount of charge, a minimum amount of time, or both.