This introduces a new type of identifier for nodes, which complements
node names: node IDs. Node IDs are defined as the first 6 bytes of the
SHA-256 hash of the node name. They will be used in future code in lieu
of node names as unique node identifiers in contexts where space is at
a premium (such as VPN packets).
The semantics of node IDs is that they are supposed to be unique in a
tinc graph; i.e. two different nodes that are part of the same graph
should not have the same ID, otherwise things could break. This
solution provides this guarantee based on realistic probabilities:
indeed, according to the birthday problem, with a 48-bit hash, the
probability of at least one collision is 1e-13 with 10 nodes, 1e-11
with 100 nodes, 1e-9 with 1000 nodes and 1e-7 with 10000 nodes. Things
only start getting hairy with more than 1 million nodes, as the
probability gets over 0.2%.
Currently, when tinc receives an UDP packet from an unexpected address
(i.e. an address different from the node's current address), it just
updates its internal UDP address record and carries on like nothing
happened.
This poses two problems:
- It assumes that the PMTU for the new address is the same as the
old address, which is risky. Packets might get dropped if the PMTU
turns out to be smaller (or if UDP communication on the new address
turns out to be impossible).
- Because the source address in the UDP packet itself is not
authenticated (i.e. it can be forged by an attacker), this
introduces a potential vulnerability by which an attacker with
control over one link can trick a tinc node into dumping its network
traffic to an arbitrary IP address.
This commit fixes the issue by invalidating UDP/PMTU state for a node
when its UDP address changes. This will trigger a temporary fallback
to indirect communication until we get confirmation via PMTU discovery
that the node is indeed sitting at the other end of the new UDP address.
Currently tinc only uses type 2 MTU probe replies if the recipient uses
protocol version 17.3. It should of course support any higher minor
protocol version as well.
In this commit, if a node receives a REQ_PUBKEY message from a node it
doesn't have the key for, it will send a REQ_PUBKEY message in return
*before* sending its own key.
The rationale is to prevent delays when establishing communication
between two nodes that see each other for the first time. These delays
are caused by the first SPTPS packet being dropped on the floor, as
shown in the following typical exchange:
node1: No Ed25519 key known for node2
REQ_PUBKEY ->
<- ANS_PUBKEY
node1: Learned Ed25519 public key from node2
REQ_SPTPS_START ->
node2: No Ed25519 key known for zyklos
<- REQ_PUBKEY
ANS_PUBKEY ->
node2: Learned Ed25519 public key from node1
-- 10-second delay --
node1: No key from node2 after 10 seconds, restarting SPTPS
REQ_SPTPS_START ->
<- SPTPS ->
node1: SPTPS key exchange with node2 succesful
node2: SPTPS key exchange with node1 succesful
With this patch, the following happens instead:
node1: No Ed25519 key known for node2
REQ_PUBKEY ->
node2: Preemptively requesting Ed25519 key for node1
<- REQ_PUBKEY
<- ANS_PUBKEY
ANS_PUBKEY ->
node2: Learned Ed25519 public key from node1
node1: Learned Ed25519 public key from node2
REQ_SPTPS_START ->
<- SPTPS ->
node1: SPTPS key exchange with node2 succesful
node2: SPTPS key exchange with node1 succesful
There are platforms on which it is impossible to rename the TUN/TAP
device. An example is Mac OS X (tuntapx). On these platforms,
specifying the Interface option will not rename the interface, but
the specified name will still be passed to tinc-up scripts and the
like, resulting in potential confusion for the user.
A logic bug was introduced in bd451cfe15
in which running graph() several times with zero reachable nodes had
the effect of calling device_enable() (instead of keeping the device
disabled).
This results in weird behavior when DeviceStandby is enabled, especially
on Windows where calling device_enable() several times in a row corrupts
I/O structures for the device, rendering it unusable.
The Windows build was broken by commit
826ad11e41 which introduced a dependency
on the HOST_NAME_MAX macro, which is not defined on Windows. According
to MSDN for gethostname(), the maximum length of the returned string
is 256 bytes (including the terminating null byte), so let's use that
as a fallback.
Successfully getting an existing variable ("tinc get name") should
not result in an error exitcode (1) from the tinc command.
This changes the result of test/commandline.test from FAIL to PASS.
The handling of TAP-Win32 virtual network device reads that complete
immediately (ReadFile() returns TRUE) is incorrect - instead of
starting a new read, tinc will continue listening for the overlapped
read completion event which will never fire. As a result, tinc stops
receiving packets on the interface.
With newer TAP-Win32 versions (such as the experimental
tap-windows6 9.21.0), tinc is unable to read from the virtual network
device:
Error while reading from (null) {23810A13-BCA9-44CE-94C6-9AEDFBF85736}: No such file or directory
This is because these new drivers apparently don't accept reads when
the device is not in the connected state (media status).
This commit fixes the issue by making sure we start reading no sooner
than when the device is enabled, and that we stop reading when the
device is disabled. This also makes the behavior somewhat cleaner,
because it doesn't make much sense to read from a disabled device
anyway.
Some tinc commands, such as "tinc generate-keys", use the terminal to
ask the user for information. This can be bypassed by making sure
there is no terminal, which is trivial on *nix but might require
jumping through some hoops on Windows depending on how the command is
invoked.
This commit adds a --batch option that ensures tinc will never ask the
user for input, even if it is attached to a terminal.
This is a slight optimization for sptps_verify_datagram(), which might
come in handy since this function is called in a loop via try_harder().
It turns out that since sptps_verify_datagram() doesn't update any
state, it doesn't matter in which order verifications are done. However,
it does affect performance since it's much cheaper to check the seqno
than to try to decrypt the packet.
Since this function is called with the wrong node most of the time, it
makes verification vastly faster for the majority of calls because the
seqno will be wrong in most cases.
There are two caveats to be aware of which are documented in this
commit:
- Because the system will likely assign different ports when binding
several times to different address families, it is recommended to
only use a single address family, otherwise other nodes will only
get one port among the several that were assigned, possibly breaking
communication.
- AutoConnect won't work in this scenario, because it relies on the UDP
port being the same as the TCP port, which is not the case when using
system-assigned ports.
When invoking tincd, tinc start currently uses the execvp() function,
which doesn't behave well in a console as the console displays a new
prompt before the subprocess finishes (which makes me suspect the exit
value is not handled at all). This new code uses spawnvp() instead,
which seems like a better fit.
When invoking "tinc start" with spaces in the path, the following
happens:
> "c:\Program Files (x86)\tinc\tinc.exe" start
c:\Program: unrecognized argument 'Files'
Try `c:\Program --help' for more information.
This is caused by inconsistent handling of command line strings between
execvp() and the spawned process' CRT, as documented on MSDN:
http://msdn.microsoft.com/library/431x4c1w.aspx
This commit makes tinc exit cleanly on Windows when hitting CTRL+C at
the console or when the user logs off. This change has no effect when
running tinc as a service.
This fixes the following compiler warning when building for Windows:
In file included from top.c:24:0:
/usr/local/mingw/ncurses/include/curses.h:1478:0: error: "KEY_EVENT" redefined [-Werror]
#define KEY_EVENT 0633 /* We were interrupted by an event */
^
In file included from /usr/share/mingw-w64/include/windows.h:74:0,
from /usr/share/mingw-w64/include/winsock2.h:23,
from have.h:46,
from system.h:26,
from top.c:20:
/usr/share/mingw-w64/include/wincon.h:101:0: note: this is the location of the previous definition
#define KEY_EVENT 0x1
^
This removes a bunch of variables that are never actually used anywhere.
This fixes the following compiler warning when building for Windows:
mingw/device.c:46:17: error: ‘device_total_in’ defined but not used [-Werror=unused-variable]
static uint64_t device_total_in = 0;
^
This fixes the following compiler warning when building for Windows:
mingw/device.c: In function ‘setup_device’:
mingw/device.c:92:9: error: unused variable ‘thread’ [-Werror=unused-variable]
HANDLE thread;
^
This fixes the following compiler warning when building for Windows:
mingw/device.c: In function ‘setup_device’:
mingw/device.c:186:2: error: passing argument 2 of ‘io_add_event’ from incompatible pointer type [-Werror]
io_add_event(&device_read_io, device_handle_read, NULL, CreateEvent(NULL, TRUE, FALSE, NULL));
^
In file included from mingw/../net.h:27:0,
from mingw/../subnet.h:24,
from mingw/../conf.h:34,
from mingw/device.c:26:
mingw/../event.h:61:13: note: expected ‘io_cb_t’ but argument is of type ‘void (*)(void *)’
extern void io_add_event(io_t *io, io_cb_t cb, void* data, WSAEVENT event);
This fixes the following compiler warning when building for Windows:
script.c: In function ‘execute_script’:
script.c:52:5: error: value computed is not used [-Werror=unused-value]
*q++;
^
This fixes the following compiler warning when building for Windows:
net_packet.c: In function ‘send_udppacket’:
net_packet.c:633:6: error: unused variable ‘origpriority’ [-Werror=unused-variable]
int origpriority = origpkt->priority;
^
This is so the positions of the other bits don't change, making it easier to
debug problems with different versions of tinc.
Also fix the padding so connection_status_t is exactly 32 bits.
The only places where connection_t::status.active is modified is in
ack_h() and terminate_connection(). In both cases, connection_t::edge
is added and removed at the same time, and that's the only places
connection_t::edge is set. Therefore, the following is true at all
times:
!c->status.active == !c->edge
This commit removes the redundant state information by getting rid of
connection_t::status.active, and using connection_t::edge instead.
in receive_udppacket(), we initialize outpkt to a default value but the
value is never read anywhere, as every read is preceded by a write.
This issue was found by the clang static analyzer tool:
http://clang-analyzer.llvm.org/
If choose_local_address() is unable to find a local address (e.g.
because of old nodes that don't send their local address information),
then send_sptps_data() ends up using uninitialized variables for the
socket and address.
This regression was introduced in
4159108971. The commit took care of
handling that case in send_udppacket() but was missing the same fix
for send_sptps_data().
This bug was found by the clang static analyzer tool:
http://clang-analyzer.llvm.org/
Based on a patch from Etienne Dechamps. We avoid the use of %hhx, since even
though it is C99, not all compilers support it yet. We use %x instead, since
it's guaranteed that the minimum size of function arguments on the stack or in
registers is that of an int.
On Windows, the event loop io tree uses the Windows Event handle to
differentiate between io_t objects. Unfortunately, there is a bug in
the io_add_event() function (introduced in
2f9a1d4ab5) as it sets the event after
inserting the object into the tree, resulting in objects appearing in
io_tree out of order.
This can lead to crashes on Windows as the event loop is unable to
determine which events fired.