The Broadcast option can be used to cause tinc to drop all broadcast and
multicast packets. This option might be expanded in the future to selectively
allow only some broadcast packet types.
The code introduced in commit 41a05f59ba is not
needed anymore, since tinc has been able to handle UDP packets from a different
source address than those of the TCP packets since 1.0.10. When using multiple
BindToAddress statements, this code does not make sense anymore, we do want the
kernel to choose the source address on its own.
Tinc will now, by default, decrement the TTL field of incoming IPv4 and IPv6
packets, before forwarding them to the virtual network device or to another
node. Packets with a TTL value of zero will be dropped, and an ICMP Time
Exceeded message will be sent back.
This behaviour can be disabled using the DecrementTTL option.
Scripts called by tinc would inherit its open filedescriptors. This could
be a problem if other long-running daemons are started from those scripts,
if those daemons would not close all filedescriptors before going into the
background.
Problem found and solution suggested by Nick Hibma.
Apart from the platform specific tun/tap driver, link with the dummy and
raw_socket devices, and optionally with support for UML and VDE devices.
At runtime, the DeviceType option can be used to select which driver to
use.
* Exchange nonce and ECDH public key first, calculate the ECDSA signature
over the complete key exchange.
* Make an explicit distinction between client and server in the signatures.
* Add more comments and replace some magic numbers by #defines.
Thanks to Erik Tews for very helpful hints and comments!
In case the config file could not be opened a new but unitialized RSA structure
would be returned, causing a segmentation fault later on. This would only
happen in the case that the config file could be opened before, but not when
read_rsa_public_key() was called. This situation could occur when the --user
option was used, and the config files were not readable by the specified user.
Probably due to a merge, the try_harder() function had duplicated the
rate-limiting code for detecting the sender node based on the HMAC of the
packet. This prevented this detection from running at all. The function is now
identical again to that in the 1.0 branch.
sometimes argv[0] will have directory-less name (when the
command is started by shell searching in $PATH for example).
For tincctl start we want the same rules to run tincd as for
tincctl itself (having full path is better but if shell does
not provide one we've no other choice). Previous code tried
to run ./tincd in this case, which is obviously wrong.
This is a fix for the previous commit.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
For tincctl start, run tincd from dirname($0) not SBINDIR -
this allows painless alternative directory installation and
running from build directory too.
Also while at it, pass the rest of command line to tincd, not
only options before "start" argument. This way it's possible
to pass options to tincd like this:
tincctl -n net start -- -d 1 -R -U tincuser ...
And also add missing newline at the end of error message there.
Signed-Off-By: Michael Tokarev <mjt@tls.msk.ru>
Encryption and authentication of the meta connection is spread out over
meta.c and protocol_auth.c. The new protocol was added there as well,
leading to spaghetti code. To improve things, the new protocol will now
be implemented in sptps.[ch].
The goal is to have a very simplified version of TLS. There is a record
layer, and there are only two record types: application data and
handshake messages. The handshake message contains a random nonce, an
ephemeral ECDH public key, and an ECDSA signature over the former. After
the ECDH public keys are exchanged, a shared secret is calculated, and a
TLS style PRF is used to generate the key material for the cipher and
HMAC algorithm, and further communication is encrypted and authenticated.
A lot of the simplicity comes from the fact that both sides must have
each other's public keys in advance, and there are no options to choose.
There will be one fixed cipher suite, and both peers always authenticate
each other. (Inspiration taken from Ian Grigg's hypotheses[0].)
There might be some compromise in the future, to enable or disable
encryption, authentication and compression, but there will be no choice
of algorithms. This will allow SPTPS to be built with a few embedded
crypto algorithms instead of linking with huge crypto libraries.
The API is also kept simple. There is a start and a stop function. All
data necessary to make the connection work is passed in the start
function. Instead having both send- and receive-record functions, there
is a send-record function and a receive-data function. The latter will
pass protocol data received from the peer to the SPTPS implementation,
which will in turn call a receive-record callback function when
necessary. This hides all the handshaking from the application, and is
completely independent from any event loop or socket characteristics.
[0] http://iang.org/ssl/hn_hypotheses_in_secure_protocol_design.html
This is mainly important for Windows, where the select() call in the
main thread is not being woken up when the tapreader thread calls
route(), causing a delay of up to 1 second before the output buffer is
flushed. This would cause bad performance when UDP communication is not
possible.
First of all, if there really are two nodes with the same name, much
more than 10 contradicting ADD_EDGE and DEL_EDGE messages will be sent.
Also, we forgot to reset the counters when nothing happened.
In case there is a ADD_EDGE/DEL_EDGE storm, we do not shut down, but
sleep an increasing amount of time, allowing tinc to recover gracefully
from temporary failures.
The length parameter for the encoding functions is the length of the
binary input, and for the decoding functions it is the maximum size of
the binary output.
The return value is always the length of the resulting output, excluding
the terminating NULL character for the encoding routines.
All functions can encode and decode in-place. The encoding functions
will always write a terminating NULL character, and the decoding
functions will stop at a NULL character.
If we don't have ECDSA keys for the node we connect to, set protocol_minor
to 1, to indicate this to the other end. This will first complete the
old way of authentication with RSA keys, and will then exchange ECDSA keys.
The connection will be terminated right afterwards, and the next attempt
will use ECDSA keys.
The generate-keys command now generates both an RSA and an ECDSA keypair,
but one can generate-rsa-keys or generate-ecdsa-keys to just generate one type.
It is modelled after the pseudorandom function from RFC4346 (TLS 1.1), the only
significant change is the use of SHA512 and Whirlpool instead of MD5 and SHA1.
REQ_KEY requests have an extra field indicating key exchange version.
If it is present and > 0, the sender supports ECDH. If the receiver also
does, then it will generate a new keypair and sends the public key in a
ANS_KEY request with "ECDH:" prefixed. The ans_key_h() function will
compute the shared secret, which, at the moment,is used as is to set the
cipher and HMAC keys. However, this must be changed to use a proper KDF.
In the future, the ECDH key exchange must also be signed.
The pid is now written first, so that a version 1.0.x tincd can be used to stop
a running version 1.1 tincd. Getsockname() is used to determine the address of
the first listening socket, so that tincctl can connect to the local tincd even
if AddressFamily = ipv6, or if BindToAddress or BindToInterface is used.
Instead of UNIX time, the log messages now start with the time in RFC3339
format, which human-readable and still easy for the computer to parse and sort.
The HUP signal will also cause the log file to be closed and reopened, which is
useful when log rotation is used. If there is an error while opening the log
file, this is logged to stderr.
But we do ignore SIGPIPE, and tinc 1.0.x signals that are no longer used
(SIGUSR1 and SIGUSR2), since the default handler of these signals is to
terminate tincd immediately.
Although we use qsort(), which is not guaranteed to be stable, resorting the
previously sorted array is more stable than recreating and resorting the array
each time.
We live in the 21st century, and we require C99 semantics, so we do not need to
work around buggy libcs. The xmalloc() and related functions are now static
inline functions.
We don't override any signal handlers anymore except those for SIGPIPE and
SIGCHLD. Fatal signals (SIGSEGV, SIGBUS etc.) will terminate tincd and
optionally dump core. The previous behaviour was to terminate gracefully and
try to restart, but that usually failed and made any core dump useless.
This would allow tincctl to connect to a remote tincd, or to a local tincd that
isn't listening on localhost, for example if it is using the BindToInterface or
BindToAddress options.
Also log an error when the input buffer contains more than MAXBUFSIZE bytes
already, instead of silently claiming the other side closed the connection.
Libevent 2.0's buffer code is not completely backward compatible with 1.4's.
In order to not (mis)use it anymore, we implement it ourselves. The buffers
are automatically expanding when necessary. When consuming data from the
buffer, no memmove()s are performed. Only when adding to the buffer would
write past the end do we shift everything back to the start.
In commit 4a21aabada, code was added to detect
contradicting ADD_EDGE and DEL_EDGE messages being sent, which is an indication
of two nodes with the same Name connected to the same VPN. However, these
contradictory messages can also happen when there is a network partitioning. In
the former case a loop happens which causes many contradictory message, while
in the latter case only a few of those messages will be sent. So, now we
increase the threshold to at least 10 of both ADD_EDGE and DEL_EDGE messages.
Since tinc now handles UDP packets with a different source address and port
than used for TCP connections, the heuristic to treat edges as indirect when
tinc could detect that multiple addresses were used does not make sense
anymore, and can actually reduce performance.
Because we don't want to keep track of that, and this will cause the node
structure from being relinked into the node tree, which results in myself
pointing to an invalid address.
When a UDP packet was received with an unknown source address/port, and if it
failed a HMAC check against known keys, it could still incorrectly assign that
UDP address to another node. This would temporarily cause outgoing UDP packets
to go to the wrong destination address, until packets from the correct address
were received again.
Found by cppcheck, which complained about lenin not being initialized, but the
real problem is that reading packets would fail when using code compiled with
--tunemu on a normal tun device.
Before, if MTU probes failed, tinc would stop sending probes until the next
time keys were regenerated (by default, once every hour). Now it continues to
send them every PingInterval, so it recovers faster from temporary failures.
Although transient errors sometimes happen on the tun/tap device (for example,
if the kernel is temporarily out of buffer space), there are situations where
the tun/tap device becomes permanently broken. Instead of endlessly spamming
the syslog, we now sleep an increasing amount of time between consecutive read
errors, and if reads still fail after 10 attempts (approximately 3 seconds),
tinc will quit.
Treat netname "." in a special way as if there was no netname
specified. Before, f.e. tincd -n. -k didn't work as it tried
to open /var/run/tinc-.pid. Now -n. works as if there was no
-n option is specified.
Signed-Off-By: Michael Tokarev <mjt@tls.msk.ru>
With some exceptions, tinc only accepted host configuration options for the
local node from the corresponding host configuration file. Although this is
documented, many people expect that they can also put those options in
tinc.conf. Tinc now internally merges the contents of both tinc.conf and the
local host configuration file.
Options given on the command line have precedence over configuration from files.
This can be useful, for example, for a roaming node, for which 'ConnectTo' and
<host>.Address depends on its location.
In this situation, the two nodes will start fighting over the edges they announced.
When we have to contradict both ADD_EDGE and DEL_EDGE messages, we log a warning,
and with 25% chance per PingTimeout we quit.
If one uses a symbolic name for the Port option, tinc will send that name
literally to other nodes. However, it is not guaranteed that all nodes have
the same contents in /etc/services, or have such a file at all.
When forwarding a metadata request through forward_request() we were
adding the required newline char to our buffer, but then sending the
data without it - this results in the forwarded request and the next one
to be garbled together.
Additionally while at it add a warning comment that request string is
not zero terminated anymore after a call to the forward_request()
function - for now this is ok as it is not used by any caller after this.
If a node is unreachable, and not connected to an edge anymore, it gets
deleted. When this happens its subnets are also removed, which should
not happen with StrictSubnets=yes.
Solution:
- do not remove subnets in src/net.c::purge(), we know that all subnets
in the list came from our hosts files.
I think here you got the check wrong by looking at the tunnelserver
code below it - with strictsubnets we still inform others but do not
remove the subnet from our data.
- do not remove nodes in net.c::purge() that still have subnets
attached.
When this option is enabled, packets that cannot be sent directly to the destination node,
but which would have to be forwarded by an intermediate node, are dropped instead.
When combined with the IndirectData option,
packets for nodes for which we do not have a meta connection with are also dropped.
This determines if and how incoming packets that are not meant for the local
node are forwarded. It can either be off, internal (tinc forwards them itself,
as in previous versions), or kernel (packets are always sent to the TUN/TAP
device, letting the kernel sort them out).
When this option is enabled, tinc will not accept dynamic updates of Subnets
from other nodes, but will only use Subnets read from local host config files
to build its routing table.
Instead of allocating storage for each line read, we now read into fixed-size
buffers on the stack. This fixes a case where a malformed configuration file
could crash tinc.
Every operating system seems to have its own, slightly different way to disable
packet fragmentation. Emit a compiler warning when no suitable way is found.
On OpenBSD, it seems impossible to do it for IPv4.
To help peers that are behind NAT connect to each other directly via UDP, they
need to know the exact external address and port that they use. Keys exchanged
between NATted peers necessarily go via a third node, which knows this address
and port, and can append this information to the keys, which is in turned used
by the peers.
Since PMTU discovery will immediately trigger UDP communication from both sides
to each other, this should allow direct communication between peers behind
full, address-restricted and port-restricted cone NAT.
When we got a key request for or from a node we don't know, we disconnected the
node that forwarded us that request. However, especially in TunnelServer mode,
disconnecting does not help. We now ignore such requests, but since there is no
way of telling the original sender that the request was dropped, we now retry
sending REQ_KEY requests when we don't get an ANS_KEY back.
Commit 052ff8b2c5 contained a bug that causes
scripts to be called with an empty, or possibly corrupted SUBNET variable when
a Subnet is added or removed while the owner is still online. In router mode,
this normally does not happen, but in switch mode this is normal.
Before, we immediately retried select() if it returned -1 and errno is EAGAIN
or EINTR, and if it returned 0 it would check for network events even if we
know there are none. Now, if -1 or 0 is returned we skip checking network
events, but we do check for timer and signal events.
One reason to send the ALRM signal is to let tinc immediately try to connect to
outgoing nodes, for example when PPP or DHCP configuration of the outgoing
interface finished. Conversely, when the outgoing interface goes down one can
now send this signal to let tinc quickly detect that links are down too.
Some ISPs block the ICMP Fragmentation Needed packets that tinc sends. We
clamp the MSS of IPv4 SYN packets to prevent hosts behind those ISPs from
sending too large packets.
The utility functions in the lib/ directory do not really form a library.
Also, now that we build two binaries, tincctl does not need everything that was
in libvpn.a, so it is wasteful to link to it.
For IPv6, the minimum MTU is 1280 (RFC 2460), for IPv4 the minimum is actually
68, but this is such a low limit that it will probably hurt performance, so we
do as if it is 576 (the minimum packet size hosts should be able to handle, RFC
791). If we detect a path MTU smaller than those minima, and we have to handle
a packet that is bigger than the PMTU but smaller than those minima, we forward
them via TCP instead of fragmenting or returning ICMP packets.
If the result of an RSA encryption or decryption operation can be represented
in less bytes than given, gcry_mpi_print() will not add leading zero bytes. Fix
this by adding those ourself.
This wasn't working at all, since we didn't do HMAC but just a plain hash.
Also, verification of packets failed because it was checking the whole packet,
not the packet minus the HMAC.
We clear the cached address used for UDP connections when a node becomes
unreachable. This also prevents host-up scripts from passing the old, cached
address from when the host becomes reachable again from a different address.
Before it would check all addresses, and not learn an address if another node
already claimed that address. This caused fast roaming to fail, the code from
commit 6f6f426b35 was never triggered.
The control socket code was completely different from how meta connections are
handled, resulting in lots of extra code to handle requests. Also, not every
operating system has UNIX sockets, so we have to resort to another type of
sockets or pipes for those anyway. To reduce code duplication and make control
sockets work the same on all platforms, we now just connect to the TCP port
where tincd is already listening on.
To authenticate, the program that wants to control a running tinc daemon must
send the contents of a cookie file. The cookie is a random 256 bits number that
is regenerated every time tincd starts. The cookie file should only be readable
by the same user that can start a tincd.
Instead of the binary-ish protocol previously used, we now use an ASCII
protocol similar to that of the meta connections, but this can still change.
Since event.h is not part of tinc, we include it in have.h were all other
system header files are included. We also ensure -levent comes before -lgdi32
when compiling with MinGW, apparently it doesn't work when the order is
reversed.
UNIX domain sockets, of course, don't exist on Windows. For now, when compiling
tinc in a MinGW environment, try to use a TCP socket bound to localhost as an
alternative.
In switch mode, if a known MAC address is claimed by a second node before it
expired at the first node, it is likely that this is because a computer has
roamed from the LAN of the first node to that of the second node. To ensure
packets for that computer are routed to the second node, the first node should
delete its corresponding Subnet as soon as possible, without waiting for the
normal expiry timeout.
If MTU probing discovered a node was not reachable via UDP, packets for it were
forwarded to the next hop, but always via TCP, even if the next hop was
reachable via UDP. This is now fixed by retrying to send the packet using
send_packet() if the destination is not the same as the nexthop.
Options should have a fixed width anyway, but this also fixes a possible MinGW
compiler bug where %lx tries to print a 64 bit value, even though a long int is
only 32 bits.
We now handle MAC Subnets in exactly the same way as IPv4 and IPv6 Subnets.
This also fixes a problem that causes unncessary broadcasting of unicast
packets in VPNs where some daemons run 1.0.10 and some run other versions.
When the HUP signal is sent while some outgoing connections have not been made
yet, or are being retried, a NULL pointer could be dereferenced resulting in
tinc crashing. We fix this by more careful handling of outgoing_ts, and by
deleting all connections that have not been fully activated yet at the HUP
signal is received.
This device works like /dev/tun on Linux, automatically creating a new tap
interface when a program opens it. We now pass the actual name of the newly
created interface in $INTERFACE.
This keeps NAT mappings for UDP alive, and will also detect when a node is not
reachable via UDP anymore or if the path MTU is decreasing. Tinc will fall back
to TCP if the node has become unreachable.
If UDP communication is impossible, we stop sending probes, but we retry if it
changes its keys.
We also decouple the UDP and TCP ping mechanisms completely, to ensure tinc
properly detects failure of either method.
Although it would be better to have the new defaults, only the most recent
releases of most of the platforms supported by tinc come with a version of
OpenSSL that supports SHA256. To ensure people can compile tinc and that nodes
can interact with each other, we revert the default back to Blowfish and SHA1.
This reverts commit 4bb3793e38.
Git's log and blame tools were used to find out which files had significant
contributions from authors who sent in patches that were applied before we used
git.
This feature is not necessary anymore since we have tools like valgrind today
that can catch stack overflow errors before they make a backtrace in gdb
impossible.
- Update year numbers in copyright headers.
- Add copyright information for Michael Tokarev and Florian Forster to the
copyright headers of files to which they have contributed significantly.
- Mention Michael and Florian in AUTHORS.
- Mention that tinc is GPLv3 or later if compiled with the --enable-tunemu
flag.
During the path MTU discovery phase, we might not know the maximum MTU yet, but
we do know a safe minimum. If we encounter a packet that is larger than that
the minimum, we now send it via TCP instead to ensure it arrives. We also
allow large packets that we cannot fragment or create ICMP replies for to be
sent via TCP.
The TAP-Win32 device is not a socket, and select() under Windows only works
with sockets. Tinc used a separate thread to read from the TAP-Win32 device,
and passed this via a local socket to the main thread which could then select()
from it. We now use a global mutex, which is only unlocked when the main thread
is waiting for select(), to allow the TAP reader thread to process packets
directly.
In light of the recent improvements of attacks on SHA1, the default hash
algorithm in tinc is now SHA256. At the same time, the default symmetric
encryption algorithm has been changed to AES256.
We used both rand() and random() in our code. Since it returns an int, we have
to use %x in our format strings instead of %lx. This fixes a crash under
Windows when cross-compiling tinc with a recent version of MinGW.
If two pointers do not belong to the same array, pointer subtraction gives
nonsensical results, depending on the level of optimisation and the
architecture one is compiling for. It is apparently not just subtracting the
pointer values and dividing by the size of the object, but uses some kind of
higher magic not intended for mere mortals. GCC will not warn about this at
all. Casting to void * is also a no-no, because then GCC does warn that strict
aliasing rules are being broken. The only safe way to query the ordering of two
pointers is to use the (in)equality operators.
The unsafe implementation of connection_compare() has probably caused the "old
connection_t for ... still lingering" messages. Our implementation of AVL trees
is augmented with a doubly linked list, which is normally what is traversed.
Only when deleting an old connection the tree itself is traversed.
If PMTUDiscovery is enabled, and we see a unicast packet that is larger than
the path MTU in switch mode, treat it just like we would do in router mode.
PMTUDiscovery was disabled in commit d5b56bbba5
because tinc did not handle packets larger than the path MTU in switch and hub
modes. We now allow it again in preparation of proper support, but default to
off.
Commit 5674bba5c5 introduced weighted Subnets,
but the weight was included in the SUBNET variable passed to subnet-up/down
scripts. This makes it harder to use in those scripts. The weight is now
stripped from the SUBNET variable and put in the WEIGHT variabel.
Grzegorz Dymarek noted that tinc segfaults at the stat() call in
execute_script() on the iPhone. We can omit the stat() call for the moment,
the subsequent call to system() will fail with just a warning.
This is a slightly modified patch from Grzegorz Dymarek that allows tinc to use
the tunemu device, which allows tinc to be compiled for iPhones and recent
iPods. To enable support for tunemu, the --enable-tunemu option has to be used
when running the configure script.
Valgrind caught tinc reading free'd memory during a purge(). This was caused by
first removing it from the main node tree, which will already call free_node(),
and then removing it from the UDP tree. This might cause spurious segmentation
faults.
Although we select() before we call recvfrom(), it sometimes happens that
select() tells us we can read but a subsequent read fails anyway. This is
harmless.
If there is an outstanding MTU probe event for a node which is not reachable
anymore, a UDP packet would be sent to that node, which caused a key request to
be sent to that node, which triggered a NULL pointer dereference. Probes and
other UDP packets to unreachable nodes are now dropped.
When chrooted, we either need to force-initialize resolver
and/or nsswitch somehow (no clean way) or resolve all the
names we want before entering chroot jail. The latter
looks cleaner, easier and it is actually safe because
we still don't talk with the remote nodes there, only
initiating outgoing connections.
This option can be set to low, normal or high. On UNIX flavours, this changes
the nice value of the process by +10, 0 and -10 respectively. On Windows, it
sets the priority to BELOW_NORMAL_PRIORITY_CLASS, NORMAL_PRIORITY_CLASS and
HIGH_PRIORITY_CLASS respectively.
A high priority might help to reduce latency and packet loss on the VPN.
If a host has multiple addresses on an interface, the source address of the TCP
connection(s) was picked by the operating system while the UDP packets used a
bound socket, i. e. the source address was the address specified by the user.
This caused problems because the receiving code requires the TCP connection and
the UDP connection to originate from the same IP address.
This patch adds support for the `BindToInterface' and `BindToAddress' options
to the setup of outgoing TCP connections.
Tested with Debian Etch on x86 and Debian Lenny on x86_64.
Signed-off-by: Florian Forster <octo@verplant.org>
If running without `--net', the (global) variable `netname' is NULL. This
creates a segmentation fault because this NULL-pointer is passed to strdup:
Program terminated with signal 11, Segmentation fault.
#0 0xb7d30463 in strlen () from /lib/tls/i686/cmov/libc.so.6
(gdb) bt
#0 0xb7d30463 in strlen () from /lib/tls/i686/cmov/libc.so.6
#1 0xb7d30175 in strdup () from /lib/tls/i686/cmov/libc.so.6
#2 0x0805bf47 in xstrdup (s=0x0) at xmalloc.c:118 <---
#3 0x0805be33 in setup_device () at device.c:66
#4 0x0805072e in setup_myself () at net_setup.c:432
#5 0x08050db2 in setup_network () at net_setup.c:536
#6 0x0805b27f in main (argc=Cannot access memory at address 0x0) at tincd.c:580
This patch fixes this by checking `netname' in `setup_device'. An alternative
would be to check for NULL-pointers in `xstrdup' and return NULL in this case.
Signed-off-by: Florian Forster <octo@verplant.org>
Add some logging about refused ADD_SUBNET
(it causes subsequent client disconnect so it's
important to know which subnet was at fault).
Maybe we should just ignore it completely.
First of all, the idea behind the TunnelServer option is to hide all other
nodes from each other, so we shouldn't forward broadcast packets from them
anyway. The other reason is that since edges from other nodes are ignored, the
calculated minimum spanning tree might not be correct, which can result in
routing loops.
Since compression can either grow or shrink a packet, the size of an MTU probe
after decompression might not reflect the real path MTU. Now we use the size
before decompression, which is independent of the compression algorithm, and
substract a safety margin such that the calculated path MTU will be safe even
for packets which grow as much as possible after compression.