In case the config file could not be opened a new but unitialized RSA structure
would be returned, causing a segmentation fault later on. This would only
happen in the case that the config file could be opened before, but not when
read_rsa_public_key() was called. This situation could occur when the --user
option was used, and the config files were not readable by the specified user.
This is mainly important for Windows, where the select() call in the
main thread is not being woken up when the tapreader thread calls
route(), causing a delay of up to 1 second before the output buffer is
flushed. This would cause bad performance when UDP communication is not
possible.
First of all, if there really are two nodes with the same name, much
more than 10 contradicting ADD_EDGE and DEL_EDGE messages will be sent.
Also, we forgot to reset the counters when nothing happened.
In case there is a ADD_EDGE/DEL_EDGE storm, we do not shut down, but
sleep an increasing amount of time, allowing tinc to recover gracefully
from temporary failures.
Instead of UNIX time, the log messages now start with the time in RFC3339
format, which human-readable and still easy for the computer to parse and sort.
The HUP signal will also cause the log file to be closed and reopened, which is
useful when log rotation is used. If there is an error while opening the log
file, this is logged to stderr.
In commit 4a21aabada, code was added to detect
contradicting ADD_EDGE and DEL_EDGE messages being sent, which is an indication
of two nodes with the same Name connected to the same VPN. However, these
contradictory messages can also happen when there is a network partitioning. In
the former case a loop happens which causes many contradictory message, while
in the latter case only a few of those messages will be sent. So, now we
increase the threshold to at least 10 of both ADD_EDGE and DEL_EDGE messages.
Since tinc now handles UDP packets with a different source address and port
than used for TCP connections, the heuristic to treat edges as indirect when
tinc could detect that multiple addresses were used does not make sense
anymore, and can actually reduce performance.
Because we don't want to keep track of that, and this will cause the node
structure from being relinked into the node tree, which results in myself
pointing to an invalid address.
When a UDP packet was received with an unknown source address/port, and if it
failed a HMAC check against known keys, it could still incorrectly assign that
UDP address to another node. This would temporarily cause outgoing UDP packets
to go to the wrong destination address, until packets from the correct address
were received again.
Found by cppcheck, which complained about lenin not being initialized, but the
real problem is that reading packets would fail when using code compiled with
--tunemu on a normal tun device.
Before, if MTU probes failed, tinc would stop sending probes until the next
time keys were regenerated (by default, once every hour). Now it continues to
send them every PingInterval, so it recovers faster from temporary failures.
Although transient errors sometimes happen on the tun/tap device (for example,
if the kernel is temporarily out of buffer space), there are situations where
the tun/tap device becomes permanently broken. Instead of endlessly spamming
the syslog, we now sleep an increasing amount of time between consecutive read
errors, and if reads still fail after 10 attempts (approximately 3 seconds),
tinc will quit.
Treat netname "." in a special way as if there was no netname
specified. Before, f.e. tincd -n. -k didn't work as it tried
to open /var/run/tinc-.pid. Now -n. works as if there was no
-n option is specified.
Signed-Off-By: Michael Tokarev <mjt@tls.msk.ru>
With some exceptions, tinc only accepted host configuration options for the
local node from the corresponding host configuration file. Although this is
documented, many people expect that they can also put those options in
tinc.conf. Tinc now internally merges the contents of both tinc.conf and the
local host configuration file.
Options given on the command line have precedence over configuration from files.
This can be useful, for example, for a roaming node, for which 'ConnectTo' and
<host>.Address depends on its location.
In this situation, the two nodes will start fighting over the edges they announced.
When we have to contradict both ADD_EDGE and DEL_EDGE messages, we log a warning,
and with 25% chance per PingTimeout we quit.
If one uses a symbolic name for the Port option, tinc will send that name
literally to other nodes. However, it is not guaranteed that all nodes have
the same contents in /etc/services, or have such a file at all.
If a node is unreachable, and not connected to an edge anymore, it gets
deleted. When this happens its subnets are also removed, which should
not happen with StrictSubnets=yes.
Solution:
- do not remove subnets in src/net.c::purge(), we know that all subnets
in the list came from our hosts files.
I think here you got the check wrong by looking at the tunnelserver
code below it - with strictsubnets we still inform others but do not
remove the subnet from our data.
- do not remove nodes in net.c::purge() that still have subnets
attached.
When this option is enabled, packets that cannot be sent directly to the destination node,
but which would have to be forwarded by an intermediate node, are dropped instead.
When combined with the IndirectData option,
packets for nodes for which we do not have a meta connection with are also dropped.
This determines if and how incoming packets that are not meant for the local
node are forwarded. It can either be off, internal (tinc forwards them itself,
as in previous versions), or kernel (packets are always sent to the TUN/TAP
device, letting the kernel sort them out).
When this option is enabled, tinc will not accept dynamic updates of Subnets
from other nodes, but will only use Subnets read from local host config files
to build its routing table.
Instead of allocating storage for each line read, we now read into fixed-size
buffers on the stack. This fixes a case where a malformed configuration file
could crash tinc.
Every operating system seems to have its own, slightly different way to disable
packet fragmentation. Emit a compiler warning when no suitable way is found.
On OpenBSD, it seems impossible to do it for IPv4.
To help peers that are behind NAT connect to each other directly via UDP, they
need to know the exact external address and port that they use. Keys exchanged
between NATted peers necessarily go via a third node, which knows this address
and port, and can append this information to the keys, which is in turned used
by the peers.
Since PMTU discovery will immediately trigger UDP communication from both sides
to each other, this should allow direct communication between peers behind
full, address-restricted and port-restricted cone NAT.
When we got a key request for or from a node we don't know, we disconnected the
node that forwarded us that request. However, especially in TunnelServer mode,
disconnecting does not help. We now ignore such requests, but since there is no
way of telling the original sender that the request was dropped, we now retry
sending REQ_KEY requests when we don't get an ANS_KEY back.
Commit 052ff8b2c5 contained a bug that causes
scripts to be called with an empty, or possibly corrupted SUBNET variable when
a Subnet is added or removed while the owner is still online. In router mode,
this normally does not happen, but in switch mode this is normal.
Before, we immediately retried select() if it returned -1 and errno is EAGAIN
or EINTR, and if it returned 0 it would check for network events even if we
know there are none. Now, if -1 or 0 is returned we skip checking network
events, but we do check for timer and signal events.
One reason to send the ALRM signal is to let tinc immediately try to connect to
outgoing nodes, for example when PPP or DHCP configuration of the outgoing
interface finished. Conversely, when the outgoing interface goes down one can
now send this signal to let tinc quickly detect that links are down too.
Some ISPs block the ICMP Fragmentation Needed packets that tinc sends. We
clamp the MSS of IPv4 SYN packets to prevent hosts behind those ISPs from
sending too large packets.
For IPv6, the minimum MTU is 1280 (RFC 2460), for IPv4 the minimum is actually
68, but this is such a low limit that it will probably hurt performance, so we
do as if it is 576 (the minimum packet size hosts should be able to handle, RFC
791). If we detect a path MTU smaller than those minima, and we have to handle
a packet that is bigger than the PMTU but smaller than those minima, we forward
them via TCP instead of fragmenting or returning ICMP packets.
We clear the cached address used for UDP connections when a node becomes
unreachable. This also prevents host-up scripts from passing the old, cached
address from when the host becomes reachable again from a different address.
Before it would check all addresses, and not learn an address if another node
already claimed that address. This caused fast roaming to fail, the code from
commit 6f6f426b35 was never triggered.
In switch mode, if a known MAC address is claimed by a second node before it
expired at the first node, it is likely that this is because a computer has
roamed from the LAN of the first node to that of the second node. To ensure
packets for that computer are routed to the second node, the first node should
delete its corresponding Subnet as soon as possible, without waiting for the
normal expiry timeout.