Commit graph

2608 commits

Author SHA1 Message Date
Etienne Dechamps
2bb567c6a3 Add a new optional dependency on the miniupnpc library.
The miniupnpc library is a lightweight UPnP-IGD client.

http://miniupnp.free.fr/

Contrary to other libraries, this dependency is disabled by default.
This is because the library is somewhat obscure and is only tangentially
useful, so enabling it by default would probably annoy most users.
2015-11-21 15:49:25 +00:00
Etienne Dechamps
bdd84660c7 Make sure the packet source MAC address is always set.
When tinc is used in router mode with a TAP device, Ethernet (MAC)
headers are not present in packets flowing over the VPN; it is the
node's responsibility to fill out this header before handing the
packet over to the TAP interface (which expects such headers).

Currently, tinc fills out the destination MAC address of the packet
(otherwise the host would not recognize the packets, and nothing would
work), but it does not fill out the source MAC address. In practice this
doesn't seem to cause any real issues (the host doesn't care about the
source address), but it does look weird when looking at the packets with
a sniffer, and it also result in the following valgrind warning:

    ==13651== Syscall param write(buf) points to uninitialised byte(s)
    ==13651==    at 0x5C4B620: __write_nocancel (syscall-template.S:81)
    ==13651==    by 0x1445AA: write_packet (device.c:183)
    ==13651==    by 0x118C7C: send_packet (net_packet.c:1259)
    ==13651==    by 0x12B70A: route_ipv4 (route.c:443)
    ==13651==    by 0x12D5F8: route (route.c:971)
    ==13651==    by 0x1152BC: receive_packet (net_packet.c:250)
    ==13651==    by 0x117E1B: receive_sptps_record (net_packet.c:904)
    ==13651==    by 0x1309A8: sptps_receive_data_datagram (sptps.c:488)
    ==13651==    by 0x130A90: sptps_receive_data (sptps.c:508)
    ==13651==    by 0x115569: receive_udppacket (net_packet.c:286)
    ==13651==    by 0x119856: handle_incoming_vpn_data (net_packet.c:1499)
    ==13651==    by 0x10F3DA: event_loop (event.c:287)
    ==13651==  Address 0xffeffea3a is on thread 1's stack
    ==13651==  in frame #6, created by receive_sptps_record (net_packet.c:821)
    ==13651==

This commit fixes the issue by filling out the source MAC address. It is
generated by negating the last byte of the device MAC address, which is
consistent with what route_arp() does.

In addition, this commit stops route_arp() from filling out the Ethernet
header of the packet - this is the responsibility of send_packet(), not
route().
2015-11-07 11:59:16 +00:00
Etienne Dechamps
684bd659ae Revert "Cache node IDs in a hash table for faster lookups."
This reverts commit c2319e90b1.

As a general principle, I do not believe it is worthwhile to cache
nodes. Sure, it brings lookup time down from O(log n) to O(1), but
considering that the scalability target of tinc is around 1000 nodes
and log2(1000) is 10, that looks like premature optimization; tree
lookups should already be very fast. Therefore, I believe it makes sense
to remove the cache as a code cleanup initiative.
2015-11-04 19:36:06 +00:00
Etienne Dechamps
eeebff55c0 Use a splay tree for node UDP addresses in order to avoid collisions.
This commit replaces the node UDP address hash table "cache" with a
full-blown splay tree, aligning it with node_tree (name-indexed) and
node_id_tree (ID-indexed).

I'm doing this for two reasons. The first reason is to make sure we
don't suddenly degrade to O(n) performance when two "hot" nodes end up
in the same hash table bucket (collision).

The second, and most important, reason, has to do with the fact that
the hash table that was being used overrides elements that collide.
Indeed, it turns out that there is one scenario in which the contents of
node_udp_cache has *correctness* implications, not just performance
implications. This has to do with the way handle_incoming_vpn_data() is
implemented.

Assume the following topology:

  A <-> B <-> C

Now let's consider the perspective of tincd running on B, and let's
assume the following is true:

 - All nodes are using the 1.1 protocol with node IDs and relaying
   support.
 - Nodes A and C have UDP addresses that hash to the same value.
 - Node C "wins" in the node_udp_cache (i.e. it overwrites A in the
   cache).
 - Node A has a "dynamic" UDP address (i.e. an UDP address that has been
   detected dynamically and cannot be deduced from edge addresses).

Then, before this commit, A would be unable to relay packets through B.

This is because handle_incoming_vpn_data() will fall back to
try_harder(), which won't be able to match any edge addresses, doesn't
check the dynamic UDP addresses, and won't be able to match any keys
because this is a relayed packet which is encrypted with C's key, not
B's. As a result, tinc will fail to match the source of the packet and
will drop the packet with a "Received UDP packet from unknown source"
message.

I have seen this happen in the wild; it is actually quite likely to
occur when there are more than a handful of nodes because node_udp_cache
only has 256 buckets, making collisions quite likely. This problem is
quite severe because it can completely prevent all packet communication
between nodes - indeed, if node A tries to initiate some communication
with C, it will use relaying at first, until C responds and helps A
establish direct communication with it (e.g. hole punching). If relaying
is broken, C will not help establish direct communication, and as a
result no packets can make it through at all.

The bug can be reproduced fairly easily by reproducing the topology
above while changing the (hardcoded) node_udp_cache size to 1 to force a
collision. One will quickly observe various issues when trying to make A
talk to C. Setting IndirectData on B will make the issue even more
severe and prevent all communication.

Arguably, another way to fix this problem is to make try_harder()
compare the packet's source address to each node's dynamic UDP
addresses. However, I do not like this solution because if two "hot"
nodes are contending on the same hash bucket, try_harder() will be
called very often and packet routing performance will degrade closer to
O(N) (where N is the total number of nodes in the graph). Using a more
appropriate data structure fixes the bug without introducing this
performance problem.
2015-11-04 19:36:02 +00:00
Guus Sliepen
7a8515112a Avoid undefined behavior.
Left shifts of negative values is undefined in C. This happens a lot in
the Ed25519 code. Cast to unsigned first, then cast the result back to
signed where necessary.
2015-10-26 13:46:30 +01:00
Guus Sliepen
7306823843 Fix a few memory leaks in the CLI found by AddressSanitizer. 2015-09-25 10:06:18 +02:00
Guus Sliepen
543c0abbd9 Fix struct node_status_t.
Although not a problem for tinc internally, the size of the struct was 12
bytes instead of 4, causing some problems when interpreting the value
received from tincd by the CLI.
2015-09-25 10:05:24 +02:00
Guus Sliepen
706d855e50 Replace bare if statements with AS_IF in configure.ac. 2015-09-24 22:20:00 +02:00
Guus Sliepen
f54a87b800 Optionally install systemd service files.
If --with-systemd is given when running the configure script, two
systemd service files will be installed. There is a template
tinc@.service, which can be used to control individual instances of
tinc. For example:

systemctl enable tinc@foo

Will create an instance for tinc with netname foo. There is also a
tinc.service, which can be used to start and stop all instances at once.
2015-09-24 22:11:16 +02:00
Guus Sliepen
5ad43673ac Add -I m4 back to ACLOCAL_AMFLAGS.
In commit b7b5d51, AC_CONFIG_MACRO_DIRS([m4]) was added to configure.ac,
which is the current proper way of including the m4 directory. However,
old versions of autoconf ignore it and need the -I m4 statement in
Makefile.am. Both the old and new way of indicating that the m4/
directory should be included can coexist.
2015-09-24 17:10:25 +02:00
Nathan Stratton Treadway
ae89a25695 Fix invalid checksum generation.
Use equation 3 given in RFC 1624 and the UpdateTTL() example function given
RFC 1141.

# Conflicts:
#	src/route.c
2015-09-12 16:41:48 +02:00
Guus Sliepen
56a8b90d86 In sssp_bfs(), never try to update myself. 2015-07-22 14:33:56 +02:00
thorkill
f75e6f61f2 Do not access e->to->prevedge if not defined
In some cases - mostly when e->to == myself the prevedge is set to NULL,
causing invalid memory access. In rare cases this may lead to malformed mst
or segfaults.
2015-07-19 22:33:43 +02:00
Guus Sliepen
f92c3446f2 Use AC_CONFIG_MACRO_DIR() instead of _DIRS().
The former is guaranteed to work with autoconf 2.58 and later, and we
don't have multiple m4 directories anyway.
2015-07-15 15:12:53 +02:00
Guus Sliepen
9ca1750245 Fix the PRF function when compiling without OpenSSL. 2015-07-12 16:31:32 +02:00
thorkill
3c54765bcd Prevent tinc from forgeting e->local_address
If ADD_EDGE came from tinc version 1.0.x local_address.sa.sa_family is set to 0.
If it came from tinc version 1.1.x forwarded for older verion it will be 255 - AF_UNKNOWN.
2015-07-12 13:32:38 +02:00
thorkill
1e7ef38198 Make sure we do not allocate new edge when talking to old nodes and the same edge already exists
When tinc gets ADD_EDGE from older versions it will allocate
new edge in protocol_edge.c:189 due to missed case in lines 149-171 where
local_address is not defined.
2015-07-12 13:31:07 +02:00
Guus Sliepen
7b831804aa Make subnet caches static. 2015-07-12 13:08:34 +02:00
thorkill
322ffadac4 Included missing names.h 2015-07-12 13:06:38 +02:00
Guus Sliepen
b7b5d51613 Use AC_CONFIG_MACRO_DIRS([m4]). 2015-07-12 13:05:51 +02:00
Guus Sliepen
97457716d7 Remove unused code that caused warnings about an uninitialized variable. 2015-07-12 12:55:13 +02:00
thorkill
b22b9d4389 Removed double break; 2015-07-12 12:39:36 +02:00
Guus Sliepen
b396585383 Fix undefined behaviour when left-shifting signed integers.
Found by -fsanitize=undefined.
2015-07-12 12:33:07 +02:00
Guus Sliepen
de7d9ee437 Call sockaddrfree(&e->local_address) in free_edge() instead of exit_edges().
The proper place to clean up resources of objects is in their
destructor. This makes sure proper cleanup when edge_del() is called as
well. At exit, free_edge() is called on all edges by free_edge_tree(),
which is called by exit_nodes().
2015-07-04 17:53:11 +02:00
Guus Sliepen
36cec9af88 Coalesce two if statements that check for the same thing. 2015-07-04 17:51:05 +02:00
Jo-Philipp Wich
14ccf50954 fix musl compatibility
Let configure include sys/if_tun.h when testing for netinet/if_ether.h
to detect the Kernel/libc header conflict on musl.

After this patch, configure will correctly detect netinet/if_ether.h as
unusable and the subsequent compilation will not attempt to use it.

Conflicts:
	src/have.h
2015-07-04 17:34:37 +02:00
Guus Sliepen
37588b8d5c Don't #include OpenSSL headers when compiling without OpenSSL. 2015-07-04 17:34:31 +02:00
thorkill
abb24e9d71 Cleanup local_address in protocol_edge.c
In line 131 local_address has been defined,
but the memory was never freed on return.
2015-07-04 03:24:13 +02:00
thorkill
92df36a610 Cleanup edges stored in edge_weight_tree on exit
protocol_edge.c: 131 defines local_address using str2sockaddr

str2sockaddr() allocates memory which has to be freed on exit.
2015-07-04 03:24:05 +02:00
thorkill
1140ca6d30 Fixed 2 leaks in setup_myself() 2015-07-04 03:23:58 +02:00
Florian Klink
0267aef826 setup_outgoing_connection: log to LOG_DEBUG on if no known address
With AutoConnect = yes, tinc tries to establish connections to known hosts.
However, you could have set no Address for this host, which is perfectly fine
(as long as there is at least one bootstrap node with an address or a local
discovered node already part of the network)

So log this to LOG_DEBUG
2015-07-02 21:22:53 +02:00
Florian Klink
91355b9ac5 (read|append)_config_file: log open errors as LOG_DEBUG
In a "decentrally managed vpn" it is very likely that host config
files for some reachable nodes do not exist. Currently, tinc
fills the logs with "Cannot open config file" messages.

This commit changes the log level to LOG_DEBUG so
syslog doesn't get filled by default.
2015-07-02 21:22:47 +02:00
Etienne Dechamps
ebffa40aa7 Protect against callbacks removing items from the io tree.
The definition of the splay_each() macro is somewhat complicated for
syntactic reasons. Here's what it does in a more readable way:

  for (splay_node_t* node = tree->head; node;) {
    type* item = node->data;
    splay_node_t* next = node->next;

    // RUN USER BLOCK with (item)

    node = next;
  }

list_each() works in the same way. Since node->next is saved before the
user block runs, this construct supports removing the current item from
within the user block. However, what it does *not* support is removing
*other items* from within the user block, especially the next item.
Indeed, that will invalide the next pointer in the above loop and
therefore result in an invalid pointer dereference.

Unfortunately, there is at least one code path where that unsupported
operation happens. It is located in ack_h(), where the authentication
protocol code detects a double connection (i.e. being connected to
another node twice). Running in the context of a socket read event, this
code will happily terminate the *other* metaconnection, resulting in its
socket being removed from the io tree. If, by misfortune, this other
metaconnection happened to have the next socket FD number (which is
quite possible due to FD reuse - albeit unlikely), and was part of the
io tree (which is quite likely because if that connection is stuck, it
will most likely have pending writes) then this will result in the next
pending io item being destroyed. Invalid pointer dereference ensues.

I did a quick audit of other uses of splay_each() and list_each() and
I believe this is the only scenario in which this "next pointer
invalidation" problem can occur in practice. While this bug has been
there since at least 6bc5d626a8 (November
2012), if not sooner, it happens quite rarely due to the very specific
set of conditions required to trigger it. Nevertheless, it does manage
to crash my central production nodes every other week or so.
2015-06-20 14:09:00 +01:00
Dato Simó
7f020cf456 Fix typo in tinc.texi. 2015-06-16 20:53:16 -03:00
Guus Sliepen
45a46f068c Fix crash is sptps_logger().
Unfortunately, sptps_logger() cannot know if s->handle is pointing to a
connection_t or a node_t. But it needs to print name and hostname in
both cases. So make sure both types have name and hostname fields at the
start with the same offset.
2015-06-10 23:42:17 +02:00
Guus Sliepen
bfe231b977 Fix alignment of output of sptps_speed. 2015-06-07 23:20:14 +02:00
Guus Sliepen
a797b4a192 Fix receiving SPTPS data in sptps_speed and sptps_test.
The sptps_receive_data() was changed in commit d237efd to only process
one SPTPS record from a stream input. So now we have to put a loop
around it to ensure we process everything.
2015-06-07 23:17:54 +02:00
Guus Sliepen
d8d1ab4ee1 Fix warnings about missing return value checks.
In some harmless places, checks for the return value of ECDSA and RSA
key generation and verification was omitted. Add them to keep the
compiler happy and to warn end users in case something is wrong.
2015-06-07 22:50:05 +02:00
Guus Sliepen
ab0576a203 Fix autoconf check for function attributes.
GCC warns when a function attribute has no effect. The autoconf check
turns warnings about attributes into errors, therefore thinking that
they did not work. The reason was that the test function returned void,
which is not suitable for checking both __malloc__ and
__warn_unused_result__.
2015-06-07 22:25:22 +02:00
Guus Sliepen
84ecc972e5 Fix missing return value caused by the previous commit. 2015-05-31 23:51:39 +02:00
Etienne Dechamps
eca357ed91 Don't try to relay packets to unreachable nodes.
It is not unusual for tinc to receive SPTPS packets to be relayed to
nodes that just became unreachable, due to state propagation delays in
the metagraph.

Unfortunately, the current code doesn't handle that situation correctly,
and still tries to relay the packet to the unreachable node. This
typically ends up segfaulting.

This commit fixes the issue by checking for reachability before relaying
the packet.
2015-05-31 20:19:48 +01:00
Etienne Dechamps
9e3adef5cb Fix invalid pointer use in get_my_hostname().
clang-3.7 warnings surfaced an actual bug:

invitation.c:185:5: error: address of array 'filename' will always evaluate to 'true'
      [-Werror,-Wpointer-bool-conversion]
        if(filename) {
        ~~ ^~~~~~~~

The regression was introduced in 3ccdf50beb.
2015-05-24 09:49:16 +01:00
Etienne Dechamps
7fcfbe2bd2 Fix wrong format string type in send_sptps_tcppacket().
This issue was found through a clang-3.7 warning:

protocol_misc.c:167:46: error: format specifies type 'short' but the argument has type 'int'
      [-Werror,-Wformat]
        if(!send_request(c, "%d %hd", SPTPS_PACKET, len))
                                ~~~                 ^~~
                                %d
2015-05-24 09:45:09 +01:00
Etienne Dechamps
3e61c7233b Don't set up an ongoing connection to myself.
It is entirely possible that the configuration file could contain a
ConnectTo statement refering to its own name; that's a reasonable
scenario when one deploys semi-automatically generated tinc.conf files.

Amusingly, tinc does not like that at all, and actually sets up an
outgoing_t structure to myself (which obviously makes no sense). This is
mostly benign, though it does result in non-sensical "Already connected
to myself" messages every retry interval.

However, that also makes things blow up in close_network_connections(),
because there we delete the entire outgoing list and *then* the myself
node, which still has a reference to the freshly deleted outgoing
structure. Boom.
2015-05-23 17:33:32 +01:00
Etienne Dechamps
8587e8c0d9 Fix crashes when trying unreachable nodes.
timeout_handler() calls try_tx(c->node) when c->edge exists.
Unfortunately, the existence of c->edge is not enough to conclude that
the node is reachable.

In fact, during connection establishment, there is a short period of
time where we create an edge for the node at the other end of the
metaconnection, but we don't have one from the other side yet.
Unfortunately, if timeout_handler() runs during that short time
window, it will call try_tx() on an unreachable node, which makes
things explode because that function is not prepared to handle that
case.

A typical symptom of this race condition is a hard SEGFAULT while trying
to send packets using metaconnections that don't exist, due to
n->nexthop containing garbage.

This patch fixes the issue by making try_tx() check for reachability,
and then making all code paths use try_tx() instead of the more
specialized methods so that they go through the check.

This regression was introduced in
eb7a0db18e.
2015-05-23 10:24:00 +01:00
Guus Sliepen
537a936671 Update copyright notices. 2015-05-21 11:09:01 +02:00
Guus Sliepen
0a786ffbb9 Set the CLOEXEC flag on the umbilical socket. 2015-05-21 11:06:38 +02:00
Guus Sliepen
87e0952773 Use socketpair() instead of pipe() for the umbilical.
This prepares for a possible conversion of the umbilical socket to a
control socket.
2015-05-20 21:28:54 +02:00
Guus Sliepen
19e0d449eb Don't write log messages to the umbilical pipe if we don't detach.
If we run in the foreground and are started by the CLI, this would
otherwise cause the first few log messages to appear twice.
2015-05-20 21:25:06 +02:00
Guus Sliepen
11868b890d Ensure "tinc start" knows if the daemon really started succesfully.
We do this by creating an umbilical between the CLI and the daemon. The
daemon pipes log messages to the CLI until it starts the main loop. The
daemon then cuts the umbilical. The CLI copies all the received log
messages to stderr, and the last byte indicates whether the daemon
started succesfully or not, so the CLI can exit with a useful exit code.
2015-05-20 16:59:43 +02:00