'\" t
.\" Title: upsmon.conf
.\" Author: [FIXME: author] [see http://docbook.sf.net/el/author]
.\" Generator: DocBook XSL Stylesheets v1.75.2 <http://docbook.sf.net/>
.\" Date: 10/09/2011
.\" Manual: NUT Manual
.\" Source: Network UPS Tools
.\" Language: English
.\"
.TH "UPSMON\&.CONF" "5" "10/09/2011" "Network UPS Tools" "NUT Manual"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
upsmon.conf \- Configuration for Network UPS Tools upsmon
.SH "DESCRIPTION"
.sp
This file\(cqs primary job is to define the systems that \fBupsmon\fR(8) will monitor and to tell it how to shut down the system when necessary\&. It will contain passwords, so keep it secure\&. Ideally, only the upsmon process should be able to read it\&.
.sp
Additionally, other optional configuration values can be set in this file\&.
.SH "CONFIGURATION DIRECTIVES"
.PP
\fBDEADTIME\fR \fIseconds\fR
.RS 4
upsmon allows a UPS to go missing for this many seconds before declaring it "dead"\&. The default is 15 seconds\&.
.sp
upsmon requires a UPS to provide status information every few seconds (see POLLFREQ and POLLFREQALERT) to keep things updated\&. If the status fetch fails, the UPS is marked stale\&. If it stays stale for more than DEADTIME seconds, the UPS is marked dead\&.
.sp
A dead UPS that was last known to be on battery is assumed to have changed to a low battery condition\&. This may force a shutdown if it is providing a critical amount of power to your system\&. This seems disruptive, but the alternative is barreling ahead into oblivion and crashing when you run out of power\&.
.sp
Note: DEADTIME should be a multiple of POLLFREQ and POLLFREQALERT\&. Otherwise, you\(cqll have "dead" UPSes simply because upsmon isn\(cqt polling them quickly enough\&. Rule of thumb: take the larger of the two POLLFREQ values, and multiply by 3\&.
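.sp
As a rough illustration of that rule of thumb (the values are assumptions picked for the arithmetic, not recommendations): with POLLFREQ at 10 and POLLFREQALERT at 5, the larger value is 10, and 10 times 3 gives a DEADTIME of 30, which is also a multiple of both polling intervals\&.
.sp
.if n \{\
.RS 4
.\}
.nf
POLLFREQ 10
POLLFREQALERT 5
# larger of the two poll intervals (10) x 3 = 30
DEADTIME 30
.fi
.if n \{\
.RE
.\}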
.RE
.PP
\fBFINALDELAY\fR \fIseconds\fR
.RS 4
When running in master mode, upsmon waits this long after sending the NOTIFY_SHUTDOWN to warn the users\&. After the timer elapses, it then runs your SHUTDOWNCMD\&. By default this is set to 5 seconds\&.
.sp
If you need to let your users do something in between those events, increase this number\&. Remember, at this point your UPS battery is almost depleted, so don\(cqt make this too big\&.
.sp
Alternatively, you can set this very low so you don\(cqt wait around when it\(cqs time to shut down\&. Some UPSes don\(cqt give much warning for low battery and will require a value of 0 here for a safe shutdown\&.
.if n \{\
.sp
.\}
.RS 4
.it 1 an-trap
.nr an-no-space-flag 1
.nr an-break-flag 1
.br
.ps +1
\fBNote\fR
.ps -1
.br
If FINALDELAY on the slave is greater than HOSTSYNC on the master, the master will give up waiting for the slave to disconnect\&.
.sp .5v
.RE
.RE
.PP
\fBHOSTSYNC\fR \fIseconds\fR
.RS 4
upsmon will wait up to this many seconds in master mode for the slaves to disconnect during a shutdown situation\&. By default, this is 15 seconds\&.
.sp
When a UPS goes critical (on battery + low battery, or "FSD": forced shutdown), the slaves are supposed to disconnect and shut down right away\&. The HOSTSYNC timer keeps the master upsmon from sitting there forever if one of the slaves gets stuck\&.
.sp
This value is also used to keep slave systems from getting stuck if the master fails to respond in time\&. After a UPS becomes critical, the slave will wait up to HOSTSYNC seconds for the master to set the FSD flag\&. If that timer expires, the slave will assume that the master is broken and will shut down anyway\&.
.sp
This keeps the slaves from shutting down during a short\-lived status change to "OB LB" that the slaves see but the master misses\&.
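.sp
A minimal sketch of how this timer relates to FINALDELAY on the slaves, following the note under FINALDELAY\&. The values are simply the defaults described in this page, and the split across two hosts is an assumption made for illustration:
.sp
.if n \{\
.RS 4
.\}
.nf
# upsmon\&.conf on the master
HOSTSYNC 15

# upsmon\&.conf on a slave: no greater than HOSTSYNC on the master
FINALDELAY 5
.fi
.if n \{\
.RE
.\}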
.RE
.PP
\fBMINSUPPLIES\fR \fInum\fR
.RS 4
Set the number of power supplies that must be receiving power to keep this system running\&. Normal computers have just one power supply, so the default value of 1 is acceptable\&.
.sp
Large/expensive server type systems usually have more, and can run with a few missing\&. The HP NetServer LH4 can run with 2 out of 4, for example, so you\(cqd set it to 2\&. The idea is to keep the box running as long as possible, right?
.sp
Obviously you have to put the redundant supplies on different UPS circuits for this to make sense! See big\-servers\&.txt in the docs subdirectory for more information and ideas on how to use this feature\&.
.sp
Also see the section on "power values" in
\fBupsmon\fR(8)\&.
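.sp
For instance, a sketch of the four\-supply case above, assuming two UPSes that each feed two of the supplies\&. The UPS names, host, username and password are placeholders:
.sp
.if n \{\
.RS 4
.\}
.nf
MONITOR ups1@localhost 2 monuser secret master
MONITOR ups2@localhost 2 monuser secret master
MINSUPPLIES 2
.fi
.if n \{\
.RE
.\}
.sp
With this layout, one UPS going critical still leaves an overall power value of 2, which meets MINSUPPLIES, so the system keeps running\&. Only when both are critical does the value drop below 2\&.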
.RE
.PP
\fBMONITOR\fR \fIsystem\fR \fIpowervalue\fR \fIusername\fR \fIpassword\fR \fItype\fR
.RS 4
Each UPS that you need to monitor should have a MONITOR line\&. Not all of these need supply power to the system that is running upsmon\&. You may monitor other systems if you want to be able to send notifications about status changes on them\&.
.RE
.sp
You must have at least one MONITOR directive in upsmon\&.conf\&.
.sp
\fIsystem\fR is a UPS identifier\&. It is in this form:
.sp
<upsname>[@<hostname>[:<port>]]
.sp
The default hostname is "localhost"\&. Some examples:
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
"su700@mybox" means a UPS called "su700" on a system called "mybox"\&. This is the normal form\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
"fenton@bigbox:5678" is a UPS called "fenton" on a system called "bigbox" which runs
\fBupsd\fR(8)
on port "5678"\&.
.RE
.sp
\fIpowervalue\fR is an integer representing the number of power supplies that the UPS feeds on this system\&. Most normal computers have one power supply, and the UPS feeds it, so this value will be 1\&. You need a very large or special system to have anything higher here\&.
.sp
You can set the \fIpowervalue\fR to 0 if you want to monitor a UPS that doesn\(cqt actually supply power to this system\&. This is useful when you want to have upsmon do notifications about status changes on a UPS without shutting down when it goes critical\&.
.sp
The \fIusername\fR and \fIpassword\fR on this line must match an entry in that system\(cqs \fBupsd.users\fR(5)\&. If your username is "monmaster" and your password is "blah", the MONITOR line might look like this:
.sp
MONITOR myups@bigserver 1 monmaster blah master
.sp
Meanwhile, the upsd\&.users on bigserver would look like this:
.sp
.if n \{\
.RS 4
.\}
.nf
[monmaster]
        password = blah
        upsmon master # (or slave)
.fi
.if n \{\
.RE
.\}
.sp
The \fItype\fR refers to the relationship with \fBupsd\fR(8)\&. It can be either "master" or "slave"\&. See \fBupsmon\fR(8) for more information on the meaning of these modes\&. The mode you pick here also goes in the upsd\&.users file, as seen in the example above\&.
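.sp
Putting these pieces together, here is a hedged sketch of a configuration that protects the local machine and also watches a second UPS for notifications only\&. The UPS names, the host "otherbox", the usernames and the passwords are all made up for illustration:
.sp
.if n \{\
.RS 4
.\}
.nf
# feeds the single power supply of this system; this upsmon is the master
MONITOR myups@localhost 1 monmaster blah master
# does not power this system (powervalue 0): notifications only
MONITOR otherups@otherbox 0 monuser pass slave
.fi
.if n \{\
.RE
.\}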
.PP
\fBNOCOMMWARNTIME\fR \fIseconds\fR
.RS 4
upsmon will trigger a NOTIFY_NOCOMM after this many seconds if it can\(cqt reach any of the UPS entries in this configuration file\&. It keeps warning you until the situation is fixed\&. By default this is 300 seconds\&.
.RE
.PP
\fBNOTIFYCMD\fR \fIcommand\fR
.RS 4
upsmon calls this to send messages when things happen\&.
.sp
This command is called with the full text of the message as one argument\&. The environment variable NOTIFYTYPE will contain the type string of whatever caused this event to happen\&.
.sp
If you need to use
\fBupssched\fR(8), then you must make it your NOTIFYCMD by listing it here\&.
.sp
Note that this is only called for NOTIFY events that have EXEC set with NOTIFYFLAG\&. See NOTIFYFLAG below for more details\&.
.sp
Making this some sort of shell script might not be a bad idea\&. For more information and ideas, see pager\&.txt in the docs directory\&.
.sp
Remember, this command also needs to be one element in the configuration file, so if your command has spaces, then wrap it in quotes\&.
.sp
NOTIFYCMD "/path/to/script \-\-foo \-\-bar"
.sp
This script is run in the background\(emthat is, upsmon forks before it calls out to start it\&. This means that your NOTIFYCMD may have multiple instances running simultaneously if a lot of stuff happens all at once\&. Keep this in mind when designing complicated notifiers\&.
.RE
.PP
\fBNOTIFYMSG\fR \fItype\fR \fImessage\fR
.RS 4
upsmon comes with a set of stock messages for various events\&. You can change them if you like\&.
.sp
.if n \{\
.RS 4
.\}
.nf
NOTIFYMSG ONLINE "UPS %s is getting line power"
.fi
.if n \{\
.RE
.\}
.sp
.if n \{\
.RS 4
.\}
.nf
NOTIFYMSG ONBATT "Someone pulled the plug on %s"
.fi
.if n \{\
.RE
.\}
.sp
Note that
%s
is replaced with the identifier of the UPS in question\&.
.sp
The message must be one element in the configuration file, so if it contains spaces, you must wrap it in quotes\&.
.sp
.if n \{\
.RS 4
.\}
.nf
NOTIFYMSG NOCOMM "Someone stole UPS %s"
.fi
.if n \{\
.RE
.\}
.sp
Possible values for
\fItype\fR:
.PP
ONLINE
.RS 4
UPS is back online
.RE
.PP
ONBATT
.RS 4
UPS is on battery
.RE
.PP
LOWBATT
.RS 4
UPS is on battery and has a low battery (is critical)
.RE
.PP
FSD
.RS 4
UPS is being shut down by the master (FSD = "Forced Shutdown")
.RE
.PP
COMMOK
.RS 4
Communications established with the UPS
.RE
.PP
COMMBAD
.RS 4
Communications lost to the UPS
.RE
.PP
SHUTDOWN
.RS 4
The system is being shut down
.RE
.PP
REPLBATT
.RS 4
The UPS battery is bad and needs to be replaced
.RE
.PP
NOCOMM
.RS 4
A UPS is unavailable (can\(cqt be contacted for monitoring)
.RE
.RE
.PP
\fBNOTIFYFLAG\fR \fItype\fR \fIflag\fR[+\fIflag\fR][+\fIflag\fR]\&...
.RS 4
By default, upsmon sends walls (global messages to all logged in users) via /bin/wall and writes to the syslog when things happen\&. You can change this\&.
.sp
Examples:
.sp
.if n \{\
.RS 4
.\}
.nf
NOTIFYFLAG ONLINE SYSLOG
NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC
.fi
.if n \{\
.RE
.\}
.sp
Possible values for the flags:
.PP
SYSLOG
.RS 4
Write the message to the syslog
.RE
.PP
WALL
.RS 4
Write the message to all users with /bin/wall
.RE
.PP
EXEC
.RS 4
Execute NOTIFYCMD (see above) with the message
.RE
.PP
IGNORE
.RS 4
Don\(cqt do anything
.sp
If you use IGNORE, don\(cqt use any other flags on the same line\&.
.RE
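.sp
As a sketch of how the flags combine with NOTIFYCMD, the fragment below logs both power events, broadcasts only the low\-battery one, runs NOTIFYCMD for both, and silences COMMOK\&. The script path is hypothetical:
.sp
.if n \{\
.RS 4
.\}
.nf
NOTIFYCMD /usr/local/bin/upsnotify
NOTIFYFLAG ONBATT SYSLOG+EXEC
NOTIFYFLAG LOWBATT SYSLOG+WALL+EXEC
NOTIFYFLAG COMMOK IGNORE
.fi
.if n \{\
.RE
.\}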
.RE
.PP
\fBPOLLFREQ\fR \fIseconds\fR
.RS 4
Normally upsmon polls the
\fBupsd\fR(8)
server every 5 seconds\&. If this is flooding your network with activity, you can make it higher\&. You can also make it lower to get faster updates in some cases\&.
.sp
There are some catches\&. First, if you set the POLLFREQ too high, you may miss short\-lived power events entirely\&. You also risk triggering the DEADTIME (see above) if you use a very large number\&.
.sp
Second, there is a point of diminishing returns if you set it too low\&. While upsd normally has all of the data available to it instantly, most drivers only refresh the UPS status once every 2 seconds\&. Polling any more often than that usually doesn\(cqt get you the information any faster\&.
.RE
.PP
\fBPOLLFREQALERT\fR \fIseconds\fR
.RS 4
This is the interval that upsmon waits between polls if any of its UPSes are on battery\&. You can use this along with POLLFREQ above to slow down polls during normal behavior, but get quicker updates when something bad happens\&.
.sp
This should always be equal to or lower than the POLLFREQ value\&. By default it is also set to 5 seconds\&.
.sp
The warnings from the POLLFREQ entry about too\-high and too\-low values also apply here\&.
.RE
.PP
\fBPOWERDOWNFLAG\fR \fIfilename\fR
.RS 4
When running in master mode, upsmon creates this file when the UPS needs to be powered off\&. You should check for this file in your shutdown scripts and call
upsdrvctl shutdown
if it exists\&.
.sp
This is done to forcibly reset the slaves, so they don\(cqt get stuck at the "halted" stage even if the power returns during the shutdown process\&. This usually does not work well on contact\-closure UPSes that use the genericups driver\&.
.sp
See the shutdown\&.txt file in the docs subdirectory for more information\&.
.RE
.PP
\fBRBWARNTIME\fR \fIseconds\fR
.RS 4
When a UPS says that it needs to have its battery replaced, upsmon will generate a NOTIFY_REPLBATT event\&. By default, this happens every 43200 seconds (12 hours)\&.
.sp
If you need another value, set it here\&.
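.sp
For example, to be reminded once a day instead (24 hours times 3600 seconds is 86400 seconds; the interval itself is an arbitrary choice for illustration):
.sp
.if n \{\
.RS 4
.\}
.nf
RBWARNTIME 86400
.fi
.if n \{\
.RE
.\}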
.RE
.PP
\fBRUN_AS_USER\fR \fIusername\fR
.RS 4
upsmon normally runs the bulk of the monitoring duties under another user ID after dropping root privileges\&. On most systems this means it runs as "nobody", since that\(cqs the default from compile\-time\&.
.sp
The catch is that "nobody" can\(cqt read your upsmon\&.conf, since by default it is installed so that only root can open it\&. This means you won\(cqt be able to reload the configuration file, since it will be unavailable\&.
.sp
The solution is to create a new user just for upsmon, then make it run as that user\&. I suggest "nutmon", but you can use anything that isn\(cqt already taken on your system\&. Just create a regular user with no special privileges and an impossible password\&.
.sp
Then, tell upsmon to run as that user, and make
upsmon\&.conf
readable by it\&. Your reloads will work, and your config file will stay secure\&.
.sp
This file should not be writable by the upsmon user, as it would be possible to exploit a hole, change the SHUTDOWNCMD to something malicious, then wait for upsmon to be restarted\&.
.RE
.PP
\fBSHUTDOWNCMD\fR \fIcommand\fR
.RS 4
upsmon runs this command when the system needs to be brought down\&. If it is a slave, it will do that immediately whenever the current overall power value drops below the MINSUPPLIES value above\&.
.sp
When upsmon is a master, it will allow any slaves to log out before starting the local shutdown procedure\&.
.sp
Note that the command needs to be one element in the config file\&. If your shutdown command includes spaces, then put it in quotes to keep it together, for example:
.sp
.if n \{\
.RS 4
.\}
.nf
SHUTDOWNCMD "/sbin/shutdown \-h +0"
.fi
.if n \{\
.RE
.\}
.RE
.SH "SEE ALSO"
.sp
\fBupsmon\fR(8), \fBupsd\fR(8), \fBnutupsdrv\fR(8)\&.
.SS "Internet resources:"
.sp
The NUT (Network UPS Tools) home page: http://www\&.networkupstools\&.org/