nut/conf/upsmon.conf.sample.in
2022-06-29 12:37:36 +02:00

453 lines
19 KiB
Text

# Network UPS Tools: example upsmon configuration
#
# This file contains passwords, so keep it secure.
# --------------------------------------------------------------------------
# RUN_AS_USER <userid>
#
# By default, upsmon splits into two processes. One stays as root and
# waits to run the SHUTDOWNCMD. The other one switches to another userid
# and does everything else.
#
# The default unprivileged user is set at compile-time with the option
# 'configure --with-user=...'
#
# You can override it with '-u <user>' when starting upsmon, or just
# define it here for convenience.
#
# Note: if you plan to use the reload feature, this file (upsmon.conf)
# must be readable by this user! Since it contains passwords, DO NOT
# make it world-readable. Also, do not make it writable by the upsmon
# user, since it creates an opportunity for an attack by changing the
# SHUTDOWNCMD to something malicious.
#
# For best results, you should create a new normal user like "nutmon",
# and make it a member of a "nut" group or similar. Then specify it
# here and grant read access to the upsmon.conf for that group.
#
# This user should not have write access to upsmon.conf.
#
# RUN_AS_USER @RUN_AS_USER@
# --------------------------------------------------------------------------
# MONITOR <system> <powervalue> <username> <password> ("primary"|"secondary")
#
# List systems you want to monitor. Not all of these may supply power
# to the system running upsmon, but if you want to watch it, it has to
# be in this section.
#
# You must have at least one of these declared.
#
# <system> is a UPS identifier in the form <upsname>@<hostname>[:<port>]
# like ups@localhost, su700@mybox, etc.
#
# Examples:
#
# - "su700@mybox" means a UPS called "su700" on a system called "mybox"
#
# - "fenton@bigbox:5678" is a UPS called "fenton" on a system called
# "bigbox" which runs upsd on port "5678".
#
# The UPS names like "su700" and "fenton" are set in your ups.conf
# in [brackets] which identify a section for a particular driver.
#
# If the ups.conf on host "doghouse" has a section called "snoopy", the
# identifier for it would be "snoopy@doghouse".
#
# <powervalue> is an integer - the number of power supplies that this UPS
# feeds on this system. Most personal computers only have one power supply,
# so this value is normally set to 1, while most modern servers have at least
# two. You need a pretty big or special box to have any other value here.
#
# You can also set this to 0 for a system that doesn't take any power
# from the MONITORed supply, which you still want to monitor (e.g. for an
# administrative workstation fed from a different circuit than the datacenter
# servers it monitors). Use <powervalue> if 0 when you want to hear about
# changes for a given UPS without shutting down when it goes critical.
#
# <username> and <password> must match an entry in that system's
# upsd.users. If your username is "upsmon" and your password is
# "blah", the upsd.users would look like this:
#
# [upsmon]
# password = blah
# upsmon primary # (or secondary)
#
# "primary" means this system will shutdown last, allowing the secondary
# systems time to shutdown first.
#
# "secondary" means this system shuts down immediately when power goes
# critical and less than MINSUPPLIES power sources have reliable input feeds.
#
# The general assumption is that the "primary" system is the one with direct
# connection to an UPS (such as serial or USB cable), so the primary system
# runs the NUT driver and 'upsd' server locally and can manage the device,
# and it would often tell the UPS to completely power itself off as a step
# in power-race avoidance (see POWERDOWNFLAG for details).
#
# Also, since the primary system stays up the longest, it suffers higher risks
# of ungraceful shutdown if the estimation of remaining runtime (or of the
# time it takes to shut down this system) was guessed wrong. By consequence,
# the "secondary" systems typically monitor the power environment state
# through the 'upsd' processes running on the remote (often "primary") systems
# and do not directly interact with an UPS (no local NUT drivers are running
# on the secondary systems). As such, secondaries typically shut down as
# soon as there is a sufficiently long power outage, or a low-battery alert
# from the UPS, or a loss of connection to the primary while the power was
# last known to be missing.
#
# This assumption and configuration can also make sense for networked UPSes,
# where a rack full of servers might overload the communications capacity
# of the networked management card on the UPS - in this case you might either
# reduce the 'snmp-ups' or 'netxml-ups' driver polling rate, or dedicate a
# "primary" server and set up the rest as "secondary" systems.
#
# In case of such large setups as mentioned above, beware also that shutdown
# times of the rack done all at once can substantially differ from smaller
# scale experiments with single-server shutdowns, since systems can compete
# for shared storage and other limited resources as they go down (and also
# not everyone may safely shut down simultaneously - e.g. a NAS or DB server
# would better go down after all its clients). You would be well served by
# higher-end UPSes with manageable thresholds to declare a critical state.
#
# Examples:
#
# MONITOR myups@bigserver 1 upswired blah primary
# MONITOR su700@server.example.com 1 upsmon secretpass secondary
# MONITOR myups@localhost 1 upsmon pass primary # (or secondary)
# --------------------------------------------------------------------------
# MINSUPPLIES <num>
#
# Give the number of power supplies that must be receiving power to keep
# this system running. Most systems have one power supply, so you would
# put "1" in this field.
#
# Large/expensive server type systems usually have more, and can run with
# a few missing. Some of these can run with 2 out of 4, for example,
# so you'd set that to 2. The idea is to keep the box running as long
# as possible, right?
#
# Obviously you have to put the redundant supplies on different UPS circuits
# for this to make sense! See big-servers.txt in the docs subdirectory
# for more information and ideas on how to use this feature.
MINSUPPLIES 1
# --------------------------------------------------------------------------
# SHUTDOWNCMD "<command>"
#
# upsmon runs this command when the system needs to be brought down.
#
# This should work just about everywhere ... if it doesn't, well, change it,
# perhaps to a more complicated custom script.
#
# Note that while you experiment with the initial setup and want to test how
# your configuration reacts to power state changes and ultimately when power
# is reported to go critical, but do not want your system to actually turn
# off, consider setting the SHUTDOWNCMD temporarily to do something benign -
# such as posting a message with 'logger' or 'wall' or 'mailx'. Do be careful
# to plug the UPS back into the wall in a timely fashion.
SHUTDOWNCMD "/sbin/shutdown -h +0"
# --------------------------------------------------------------------------
# NOTIFYCMD <command>
#
# upsmon calls this to send messages when things happen
#
# This command is called with the full text of the message (from NOTIFYMSG)
# as one argument.
#
# The environment string NOTIFYTYPE will contain the type string of
# whatever caused this event to happen.
#
# The environment string UPSNAME will contain the name of the system/device
# that generated the change.
#
# Note that this is only called for NOTIFY events that have EXEC set with
# NOTIFYFLAG. See NOTIFYFLAG below for more details.
#
# Making this some sort of shell script might not be a bad idea.
# Alternately you can use the upssched program as your NOTIFYCMD for some
# more complex setups (e.g. to ease handling of notification storms).
# For more information and ideas, see docs/scheduling.txt
#
# Example:
# NOTIFYCMD @BINDIR@/notifyme
# --------------------------------------------------------------------------
# POLLFREQ <n>
#
# Polling frequency for normal activities, measured in seconds.
#
# Adjust this to keep upsmon from flooding your network, but don't make
# it too high or it may miss certain short-lived power events.
POLLFREQ 5
# --------------------------------------------------------------------------
# POLLFREQALERT <n>
#
# Polling frequency in seconds while UPS on battery.
#
# You can make this number lower than POLLFREQ, which will make updates
# faster when any UPS is running on battery. This is a good way to tune
# network load if you have a lot of these things running.
#
# The default is 5 seconds for both this and POLLFREQ.
POLLFREQALERT 5
# --------------------------------------------------------------------------
# HOSTSYNC - How long upsmon will wait before giving up on another upsmon
#
# The primary upsmon process uses this number when waiting for secondary
# systems to disconnect once it has set the forced shutdown (FSD) flag.
# If they don't disconnect after this many seconds, it goes on without them.
#
# Similarly, upsmon secondary processes wait up to this interval for the
# primary upsmon to set FSD when an UPS they are monitoring goes critical -
# that is, on battery and low battery. If the primary doesn't do its job,
# the secondaries will shut down anyway to avoid damage to the file systems.
#
# This "wait for FSD" is done to avoid races where the status changes
# to critical and back between polls by the primary.
HOSTSYNC 15
# --------------------------------------------------------------------------
# DEADTIME - Interval to wait before declaring a stale ups "dead"
#
# upsmon requires a UPS to provide status information every few seconds
# (see POLLFREQ and POLLFREQALERT) to keep things updated. If the status
# fetch fails, the UPS is marked stale. If it stays stale for more than
# DEADTIME seconds, the UPS is marked dead.
#
# A dead UPS that was last known to be on battery is assumed to have gone
# to a low battery condition. This may force a shutdown if it is providing
# a critical amount of power to your system.
#
# Note: DEADTIME should be a multiple of POLLFREQ and POLLFREQALERT.
# Otherwise you'll have "dead" UPSes simply because upsmon isn't polling
# them quickly enough. Rule of thumb: take the larger of the two
# POLLFREQ values, and multiply by 3.
DEADTIME 15
# --------------------------------------------------------------------------
# POWERDOWNFLAG - Flag file for forcing UPS shutdown on the primary system
#
# upsmon will create a file with this name in primary mode when it's time
# to shut down the load. You should check for this file's existence in
# your shutdown scripts and run 'upsdrvctl shutdown' if it exists, to tell
# the UPS(es) to power off.
#
# See the config-notes.txt file in the docs subdirectory for more information.
# Refer to the section:
# [[UPS_shutdown]] "Configuring automatic shutdowns for low battery events"
# or refer to the online version.
POWERDOWNFLAG /etc/killpower
# --------------------------------------------------------------------------
# NOTIFYMSG - change messages sent by upsmon when certain events occur
#
# You can change the default messages to something else if you like.
#
# NOTIFYMSG <notify type> "message"
#
# NOTIFYMSG ONLINE "UPS %s on line power"
# NOTIFYMSG ONBATT "UPS %s on battery"
# NOTIFYMSG LOWBATT "UPS %s battery is low"
# NOTIFYMSG FSD "UPS %s: forced shutdown in progress"
# NOTIFYMSG COMMOK "Communications with UPS %s established"
# NOTIFYMSG COMMBAD "Communications with UPS %s lost"
# NOTIFYMSG SHUTDOWN "Auto logout and shutdown proceeding"
# NOTIFYMSG REPLBATT "UPS %s battery needs to be replaced"
# NOTIFYMSG NOCOMM "UPS %s is unavailable"
# NOTIFYMSG NOPARENT "upsmon parent process died - shutdown impossible"
#
# Note that %s is replaced with the identifier of the UPS in question.
#
# Possible values for <notify type>:
#
# ONLINE : UPS is back online
# ONBATT : UPS is on battery
# LOWBATT : UPS has a low battery (if also on battery, it's "critical")
# FSD : UPS is being shutdown by the primary (FSD = "Forced Shutdown")
# COMMOK : Communications established with the UPS
# COMMBAD : Communications lost to the UPS
# SHUTDOWN : The system is being shutdown
# REPLBATT : The UPS battery is bad and needs to be replaced
# NOCOMM : A UPS is unavailable (can't be contacted for monitoring)
# NOPARENT : The process that shuts down the system has died (shutdown impossible)
# --------------------------------------------------------------------------
# NOTIFYFLAG - change behavior of upsmon when NOTIFY events occur
#
# By default, upsmon sends walls (global messages to all logged in users)
# and writes to the syslog when things happen. You can change this.
#
# NOTIFYFLAG <notify type> <flag>[+<flag>][+<flag>] ...
#
# NOTIFYFLAG ONLINE SYSLOG+WALL
# NOTIFYFLAG ONBATT SYSLOG+WALL
# NOTIFYFLAG LOWBATT SYSLOG+WALL
# NOTIFYFLAG FSD SYSLOG+WALL
# NOTIFYFLAG COMMOK SYSLOG+WALL
# NOTIFYFLAG COMMBAD SYSLOG+WALL
# NOTIFYFLAG SHUTDOWN SYSLOG+WALL
# NOTIFYFLAG REPLBATT SYSLOG+WALL
# NOTIFYFLAG NOCOMM SYSLOG+WALL
# NOTIFYFLAG NOPARENT SYSLOG+WALL
#
# Possible values for the flags:
#
# SYSLOG - Write the message in the syslog
# WALL - Write the message to all users on the system
# EXEC - Execute NOTIFYCMD (see above) with the message
# IGNORE - Don't do anything
#
# If you use IGNORE, don't use any other flags on the same line.
# --------------------------------------------------------------------------
# RBWARNTIME - replace battery warning time in seconds
#
# upsmon will normally warn you about a battery that needs to be replaced
# every 43200 seconds, which is 12 hours. It does this by triggering a
# NOTIFY_REPLBATT which is then handled by the usual notify structure
# you've defined above.
#
# If this number is not to your liking, override it here.
RBWARNTIME 43200
# --------------------------------------------------------------------------
# NOCOMMWARNTIME - no communications warning time in seconds
#
# upsmon will let you know through the usual notify system if it can't
# talk to any of the UPS entries that are defined in this file. It will
# trigger a NOTIFY_NOCOMM by default every 300 seconds unless you
# change the interval with this directive.
NOCOMMWARNTIME 300
# --------------------------------------------------------------------------
# FINALDELAY - last sleep interval before shutting down the system
#
# On a primary, upsmon will wait this long after sending the NOTIFY_SHUTDOWN
# before executing your SHUTDOWNCMD. If you need to do something in between
# those events, increase this number. Remember, at this point your UPS is
# almost depleted, so don't make this too high. If needed, on high-end UPS
# devices you can usually configure when the low-battery state is announced
# based on estimated remaining run-time or on charge level of the batteries.
#
# Alternatively, you can set this very low so you don't wait around when
# it's time to shut down. Some UPSes don't give much warning for low
# battery and will require a value of 0 here for a safe shutdown.
#
# Note: If FINALDELAY on the secondary is greater than HOSTSYNC on the
# primary, the primary will give up waiting for that secondary system
# to disconnect.
FINALDELAY 5
# --------------------------------------------------------------------------
# CERTPATH - path to certificates (database directory or directory with CA's)
#
# When compiled with SSL support, you can enter the certificate path here.
#
# With NSS:
# Certificates are stored in a dedicated database (split into 3 files).
# Specify the path of the database directory.
#
# CERTPATH @CONFPATH@/cert/upsmon
#
# With OpenSSL:
# Directory containing CA certificates in PEM format, used to verify
# the server certificate presented by the upsd server. The files each
# contain one CA certificate. The files are looked up by the CA subject
# name hash value, which must hence be available.
#
# CERTPATH /usr/ssl/certs
#
# See 'docs/security.txt' or the Security chapter of NUT user manual
# for more information on the SSL support in NUT.
# --------------------------------------------------------------------------
# CERTIDENT - self certificate name and database password
# CERTIDENT <certificate name> <database password>
#
# When compiled with SSL support with NSS, you can specify the certificate
# name to retrieve from database to authenticate itself and the password
# required to access certificate related private key.
#
# CERTIDENT "my nut monitor" "MyPasSw0rD"
#
# See 'docs/security.txt' or the Security chapter of NUT user manual
# for more information on the SSL support in NUT.
# --------------------------------------------------------------------------
# CERTHOST - security properties for an host
# CERTHOST <hostname> <certificate name> <certverify> <forcessl>
#
# When compiled with SSL support with NSS, you can specify security directive
# for each server you can contact.
# Each entry maps server name with the expected certificate name and flags
# indicating if the server certificate is verified and if the connection
# must be secure.
#
# CERTHOST localhost "My nut server" 1 1
#
# See 'docs/security.txt' or the Security chapter of NUT user manual
# for more information on the SSL support in NUT.
# --------------------------------------------------------------------------
# CERTVERIFY - make upsmon verify all connections with certificates
# CERTVERIFY 1
#
# When compiled with SSL support, make upsmon verify all connections with
# certificates.
# Without this, there is no guarantee that the upsd is the right host.
# Enabling this greatly reduces the risk of man in the middle attacks.
# This effectively forces the use of SSL, so don't use this unless
# all of your upsd hosts are ready for SSL and have their certificates
# in order.
# When compiled with NSS support of SSL, can be overridden for host
# specified with a CERTHOST directive.
# --------------------------------------------------------------------------
# FORCESSL - force upsmon to use SSL
# FORCESSL 1
#
# When compiled with SSL, specify that a secured connection must be used
# to communicate with upsd.
# If you don't use 'CERTVERIFY 1', then this will at least make sure
# that nobody can sniff your sessions without a large effort. Setting
# this will make upsmon drop connections if the remote upsd doesn't
# support SSL, so don't use it unless all of them have it running.
# When compiled with NSS support of SSL, can be overridden for host
# specified with a CERTHOST directive.
# --------------------------------------------------------------------------
# DEBUG_MIN - specify minimal debugging level for upsmon daemon
# e.g. DEBUG_MIN 6
#
# Optionally specify a minimum debug level for `upsmon` daemon, e.g. for
# troubleshooting a deployment, without impacting foreground or background
# running mode directly, and without need to edit init-scripts or service
# unit definitions. Note that command-line option `-D` can only increase
# this verbosity level.
#
# NOTE: if the running daemon receives a `reload` command, presence of the
# `DEBUG_MIN NUMBER` value in the configuration file can be used to tune
# debugging verbosity in the running service daemon (it is recommended to
# comment it away or set the minimum to explicit zero when done, to avoid
# huge journals and I/O system abuse). Keep in mind that for this run-time
# tuning, the `DEBUG_MIN` value *present* in *reloaded* configuration files
# is applied instantly and overrides any previously set value, from file
# or CLI options, regardless of older logging level being higher or lower
# than the newly found number; a missing (or commented away) value however
# does not change the previously active logging verbosity.