190 lines
7.7 KiB
Text
190 lines
7.7 KiB
Text
Desc: How to use upssched
|
|
File: upssched.txt
|
|
Date: 01 October 2005
|
|
Auth: Russell Kroll <rkroll@exploits.org>
|
|
|
|
upssched is a helper for upsmon that will invoke commands for you at some
|
|
interval relative to a UPS event. It can be used to send pages, mail out
|
|
notices about things, or even shut down the box early.
|
|
|
|
There will be examples scattered throughout. Change them to suit your
|
|
pathnames, UPS locations, and so forth.
|
|
|
|
How it works relative to upsmon
|
|
===============================
|
|
|
|
When an event occurs, upsmon will call whatever you specify as a NOTIFYCMD
|
|
in your upsmon.conf, if you also enable the EXEC in your NOTIFYFLAGS. In
|
|
this case, we want upsmon to call upssched as the notifier, since it will
|
|
be doing all the work for us. So, in the upsmon.conf:
|
|
|
|
NOTIFYCMD /usr/local/ups/bin/upssched
|
|
|
|
Then we want upsmon to actually _use_ it for the notify events, so again
|
|
in the upsmon.conf we set the flags:
|
|
|
|
NOTIFYFLAG ONLINE SYSLOG+EXEC
|
|
NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC
|
|
NOTIFYFLAG LOWBATT SYSLOG+WALL+EXEC
|
|
|
|
... and so on. For the purposes of this document I will only use those
|
|
three, but you can set the flags for any of the valid notify types.
|
|
|
|
Setting up your upssched.conf
|
|
=============================
|
|
|
|
Once upsmon has been configured with the NOTIFYCMD and EXEC flags, you're
|
|
ready to deal with the upssched.conf details. In this file, you specify
|
|
just what will happen when a given event occurs on a particular UPS.
|
|
|
|
First you need to define the name of the script or program that will
|
|
handle timers that trigger. This is your CMDSCRIPT, and needs to be above
|
|
any AT defines. There's an example provided with the program, so we'll
|
|
use that here:
|
|
|
|
CMDSCRIPT /usr/local/ups/bin/upssched-cmd
|
|
|
|
Then you have to define the variables PIPEFN and LOCKFN; the former
|
|
sets the file name of the FIFO that will pass communications between
|
|
processes to start and stop timers, while the latter sets the file name
|
|
for a temporary file created by upssched in order to avoid a race condition
|
|
under some circumstances. Please see the relevant comments in upssched.conf
|
|
for additional information and advice about these variables.
|
|
|
|
Now you can tell your CMDSCRIPT what to do when it is called by upsmon.
|
|
|
|
The big picture
|
|
---------------
|
|
|
|
The design in a nutshell is:
|
|
|
|
upsmon ---> calls upssched ---> calls your CMDSCRIPT
|
|
|
|
Ultimately, the CMDSCRIPT does the actual useful work, whether that's
|
|
initiating an early shutdown with 'upsmon -c fsd', sending a page by
|
|
calling sendmail, or opening a subspace channel to V'ger.
|
|
|
|
Establishing timers
|
|
-------------------
|
|
|
|
Let's say that you want to receive a page when any UPS has been running on
|
|
battery for 30 seconds. Create a handler that starts a 30 second timer
|
|
for an ONBATT condition.
|
|
|
|
AT ONBATT * START-TIMER onbattwarn 30
|
|
|
|
This means "when any UPS (the *) goes on battery, start a timer called
|
|
onbattwarn that will trigger in 30 seconds". We'll come back to the
|
|
onbattwarn part in a moment. Right now we need to make sure that we
|
|
don't trigger that timer if the UPS happens to come back before the
|
|
time is up. In essence, if it goes back on line, we need to cancel it.
|
|
So, let's tell upssched that.
|
|
|
|
AT ONLINE * CANCEL-TIMER onbattwarn
|
|
|
|
Executing commands immediately
|
|
------------------------------
|
|
|
|
As an example, consider the scenario where a UPS goes onto battery power.
|
|
However, the users are not informed until 60 seconds later - using a timer as
|
|
described above. Whilst this may let the *logged in* users know that the UPS
|
|
is on battery power, it does not inform any users subsequently logging in. To
|
|
enable this we could, at the same time, create a file which is read and
|
|
displayed to any user trying to login whilst the UPS is on battery power. If
|
|
the UPS comes back onto utility power within 60 seconds, then we can cancel
|
|
the timer and remove the file, as described above. However, if the UPS comes
|
|
back onto utility power say 5 minutes later then we do not want to use any
|
|
timers but we still want to remove the file. To do this we could use:
|
|
|
|
AT ONLINE * EXECUTE ups-back-on-power
|
|
|
|
This means that when upsmon detects that the UPS is back on utility power it
|
|
will signal upssched. Upssched will see the above command and simply pass
|
|
'ups-back-on-power' as an argument directly to CMDSCRIPT. This occurs
|
|
immediately, there are no timers involved.
|
|
|
|
Writing the command script handler
|
|
----------------------------------
|
|
|
|
OK, now that upssched knows how the timers are supposed to work, let's
|
|
give it something to do when one actually triggers. The name of the
|
|
example timer is onbattwarn, so that's the argument that will be passed
|
|
into your CMDSCRIPT when it triggers. This means we need to do some
|
|
shell script writing to deal with that input.
|
|
|
|
#! /bin/sh
|
|
|
|
case $1 in
|
|
onbattwarn)
|
|
echo "The UPS has been on battery for awhile" \
|
|
| mail -s"UPS monitor" bofh@pager.example.com
|
|
ups-back-on-power)
|
|
/bin/rm -f /some/path/ups-on-battery
|
|
*)
|
|
logger -t upssched-cmd "Unrecognized command: $1"
|
|
;;
|
|
esac
|
|
|
|
This is a very simple script example, but it shows how you can test for
|
|
the presence of a given trigger. With multiple ATs creating various timer
|
|
names, you will need to test for each possibility and handle it according
|
|
to your desires.
|
|
|
|
Other possibilities
|
|
===================
|
|
|
|
You can invoke just about anything from inside the CMDSCRIPT. It doesn't
|
|
need to be a shell script, either - that's just an example. If you want
|
|
to write a program that will parse argv[1] and deal with the
|
|
possibilities, that will work too.
|
|
|
|
Early Shutdowns
|
|
===============
|
|
|
|
One thing that gets requested a lot is early shutdowns in upsmon. With
|
|
upssched, you can now have this functionality. Just set a timer for some
|
|
length of time at ONBATT which will invoke a shutdown command if it elapses.
|
|
Just be sure to cancel this timer if you go back ONLINE before then.
|
|
|
|
The best way to do this is to use the upsmon callback feature. You can
|
|
make upsmon set the "forced shutdown" (FSD) flag on the upsd so your
|
|
slave systems shut down early too. Just do something like this in your
|
|
CMDSCRIPT:
|
|
|
|
/usr/local/ups/sbin/upsmon -c fsd
|
|
|
|
It's not a good idea to call your system's shutdown routine directly
|
|
from the CMDSCRIPT, since there's no synchronization with the slave
|
|
systems hooked to the same UPS. FSD is the master's way of saying
|
|
"we're shutting down *now* like it or not, so you'd better get ready".
|
|
|
|
Background
|
|
==========
|
|
|
|
This program was written primarily to fulfill the requests of users for
|
|
the early shutdown scenario. The "outboard" design of the program
|
|
(relative to upsmon) was intended to reduce the load on the average
|
|
system. Most people don't have the requirement of shutting down after n
|
|
seconds on battery, since the usual OB+LB testing is sufficient.
|
|
|
|
This program was created separately so those people don't have to spend
|
|
CPU time and RAM on something that will never be used in their
|
|
environments.
|
|
|
|
The design of the timer handler is also geared towards minimizing impact.
|
|
It will come and go from the process list as necessary. When a new timer
|
|
is started, a process will be forked to actually watch the clock and
|
|
eventually start the CMDSCRIPT. When a timer triggers, it is removed from
|
|
the queue. Cancelling a timer will also remove it from the queue. When
|
|
no timers are present in the queue, the background process exits.
|
|
|
|
This means that you will only see upssched running when one of two things
|
|
is happening:
|
|
|
|
1. There's a timer of some sort currently running
|
|
2. upsmon just called it, and you managed to catch the brief instance
|
|
|
|
The final optimization handles the possibility of trying to cancel a timer
|
|
when there's none running. If there's no process already running, there
|
|
are no timers to cancel, and furthermore there is no need to start a
|
|
clock-watcher. As a result, it skips that step and exits sooner.
|