Commit graph

2814 commits

Author SHA1 Message Date
thorkill
08f74b5603 Fix linker flags 2015-11-27 17:51:34 +01:00
thorkill
519f06e281 Fix a segfault in setup_outgoing_connection() on outgoing removal 2015-11-24 17:25:53 +01:00
thorkill
2ec9f1124d Merged with guus/1.1 2015-11-24 17:01:11 +01:00
thorkill
f58e8679e7 Revert "Working on fix "stuck" outgoing connections."
This reverts commit 703ed7fff6.
2015-11-24 16:55:03 +01:00
Guus Sliepen
9fdf4278f8 Don't leave dead outgoing_t's in the outgoing_list.
If an outgoing connection cannot be made because no address is known for
it, it should be removed from the outgoing_list, otherwise it will
prevent it from being re-added later when we do know addresses for it.
2015-11-24 16:48:44 +01:00
Etienne Dechamps
c58eba587d Add upnp.h to tincd SOURCES.
This was missing from 513bffe1fe.
2015-11-22 23:03:03 +01:00
thorkill
703ed7fff6 Working on fix "stuck" outgoing connections.
This problem occurs on "road-warriors" when tincd sets up
outgoing connections while no uplink is active: DNS lookups
will fail, and every following attempt to make outgoing
connections will keep failing forever.
2015-11-22 22:50:51 +01:00
Etienne Dechamps
613d586afd Don't unset validkey when receiving SPTPS handshakes over ANS_KEY.
This fixes a hairy race condition that was introduced in
1e89a63f16, which changed
the underlying transport of handshake packets from REQ_KEY to ANS_KEY.
Unfortunately, what I missed in that commit is, on the receiving side,
there is a slight difference between req_key_h() and ans_key_h():
indeed, the latter resets validkey to false.

The reason why this is not a problem during typical operation is
because the normal SPTPS key regeneration procedure looks like this:

    KEX ->
    <- KEX
    SIG ->
    <- SIG

All these messages are sent over ANS_KEY, therefore the receiving side
will unset validkey. However, that's typically not a problem in practice
because upon reception of the last message (SIG), SPTPS will call
sptps_receive_record(), which will set validkey to true again, and
everything works out fine in the end.

However, that was the *typical* scenario. Now let's assume that the
SPTPS channel is in active use at the same time key regeneration
happens. Specifically, let's assume a normal VPN data packet sneaks in
during the key regeneration procedure:

    KEX ->
    <- KEX
    <- (SPTPS packet, over TCP or UDP)
    <- KEX (wtf?)
    SIG -> (refused with Invalid packet seqno: XXX != 0)

At this point, both nodes are extremely confused and the SPTPS channel
becomes unusable with various errors being thrown on both sides. The
channel will stay down until automatic SPTPS channel restart kicks in
after 10 seconds.

(Note: the above is just an example - the race can occur on either side
whenever a node sends a packet in the window between receiving the KEX
message and receiving the SIG message.)

I've seen this race occur in the wild - it is very likely to occur if
key regeneration occurs on a heavily loaded channel. It can be
reproduced fairly easily by setting KeyExpire to a short value (a few
seconds) and then running something like ping -f foobar -i 0.01.

The reason why this occurs is because tinc's TX code path triggers the
following:

 - send_packet()
 - try_tx()
 - try_tx_sptps()
 - validkey is false because we just received an ANS_KEY message
 - waitingforkey is false because it's not used for key regeneration
 - send_req_key()
 - SPTPS channel restart (sptps_stop(), sptps_start()).

Obviously, it all goes downhill from there and the two nodes get very
confused quickly (for example the seqno gets reset, hence the error
messages).

This commit fixes the issue by keeping validkey set when SPTPS data is
received over ANS_KEY messages.
2015-11-22 17:53:52 +00:00
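The race described in 613d586afd can be illustrated with a toy model. This is only a sketch under stated assumptions: the `toy_*` names and the `keep_validkey` switch are hypothetical and do not exist in tinc. The model tracks nothing but the validkey flag and counts how often the TX path would restart the channel.

```c
#include <stdbool.h>

/* Toy model of one peer's SPTPS state; all names are illustrative only. */
typedef struct {
	bool validkey;  /* true while the SPTPS channel is considered usable */
	int restarts;   /* number of channel restarts triggered by the TX path */
} toy_node_t;

/* Models receiving an SPTPS handshake message carried in ANS_KEY.
   keep_validkey == false mimics the old behaviour (validkey is reset);
   keep_validkey == true mimics the behaviour after the fix. */
static void toy_receive_ans_key(toy_node_t *n, bool keep_validkey) {
	if(!keep_validkey)
		n->validkey = false;
}

/* Models the end of the handshake: the key is valid again. */
static void toy_handshake_done(toy_node_t *n) {
	n->validkey = true;
}

/* Models the TX path: without a valid key, the channel is restarted. */
static void toy_send_packet(toy_node_t *n) {
	if(!n->validkey)
		n->restarts++;
}

/* Replays "KEX received, data packet sent, SIG received" and returns
   the number of restarts caused by the data packet. */
static int toy_replay(bool keep_validkey) {
	toy_node_t n = { .validkey = true, .restarts = 0 };
	toy_receive_ans_key(&n, keep_validkey);  /* <- KEX */
	toy_send_packet(&n);                     /* data packet sneaks in */
	toy_receive_ans_key(&n, keep_validkey);  /* <- SIG */
	toy_handshake_done(&n);
	return n.restarts;
}
```

In this model, `toy_replay(false)` triggers one restart, while `toy_replay(true)` triggers none.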
Guus Sliepen
95935cecb6 Update THANKS file. 2015-11-21 19:41:14 +01:00
Etienne Dechamps
0f6d34dc1b Try to ensure we build correctly against various libminiupnpc versions.
Unfortunately, libminiupnpc has a somewhat... "peculiar" approach to
backwards compatibility for their API, where they reserve the right to
make breaking changes when they feel like it, forcing users to resort
to #ifdefs to ensure they use the correct API. Sigh.

Previously, tinc would only build against API versions <= 13, because I
was doing my initial development using miniupnpc-1.9.20140610 which is
the version that ships with Debian. The changes in this commit are
required for tinc to build against more recent versions, from
1.9.20150730 to the latest one at the time of this commit, 1.9.20151026.
2015-11-21 16:18:01 +00:00
Etienne Dechamps
675e3b497b Allow tinc to be built with miniupnpc on Windows.
Contrary to what I expected, it so happens that modern versions of MinGW
include an implementation of pthread natively by default, so there is no
need to introduce Win32-specific threading code. This means the only
changes required to make UPnP work on Windows are just build parameter
tuning.

This commit forces MinGW builds to be linked statically. This makes linking
against miniupnpc simpler (otherwise we would have to handle the mess
of dllimport & co.) and it also prevents libwinpthread from being linked
dynamically (which it is by default), as this would require additional
DLLs to be distributed. Since static linking is how tinc is
traditionally built on Windows, I don't expect this to be a big deal.
2015-11-21 16:18:01 +00:00
Etienne Dechamps
513bffe1fe Add UPnP support to tincd.
This commit makes tincd capable of discovering UPnP-IGD devices on the
local network, and of adding mappings (port redirects) for its TCP and/or
UDP port.

The goal is to improve reliability and performance of tinc with nodes
sitting behind home routers that support UPnP, by making it less reliant
on UDP Hole Punching, which is prone to failure when "hostile" NATs are
involved.

The way this is implemented is by leveraging the libminiupnpc library,
which we have just added a new dependency on. We use pthread to run the
UPnP client code in a dedicated thread; we can't use the tinc event loop
because libminiupnpc doesn't have a non-blocking API.
2015-11-21 16:17:59 +00:00
Etienne Dechamps
2bb567c6a3 Add a new optional dependency on the miniupnpc library.
The miniupnpc library is a lightweight UPnP-IGD client.

http://miniupnp.free.fr/

Contrary to other libraries, this dependency is disabled by default.
This is because the library is somewhat obscure and is only tangentially
useful, so enabling it by default would probably annoy most users.
2015-11-21 15:49:25 +00:00
thorkill
dcf313cdbf Merge remote-tracking branch 'remotes/guus/1.1' into thkr-1.1-ponyhof 2015-11-07 23:21:18 +01:00
Etienne Dechamps
bdd84660c7 Make sure the packet source MAC address is always set.
When tinc is used in router mode with a TAP device, Ethernet (MAC)
headers are not present in packets flowing over the VPN; it is the
node's responsibility to fill out this header before handing the
packet over to the TAP interface (which expects such headers).

Currently, tinc fills out the destination MAC address of the packet
(otherwise the host would not recognize the packets, and nothing would
work), but it does not fill out the source MAC address. In practice this
doesn't seem to cause any real issues (the host doesn't care about the
source address), but it does look weird when looking at the packets with
a sniffer, and it also results in the following valgrind warning:

    ==13651== Syscall param write(buf) points to uninitialised byte(s)
    ==13651==    at 0x5C4B620: __write_nocancel (syscall-template.S:81)
    ==13651==    by 0x1445AA: write_packet (device.c:183)
    ==13651==    by 0x118C7C: send_packet (net_packet.c:1259)
    ==13651==    by 0x12B70A: route_ipv4 (route.c:443)
    ==13651==    by 0x12D5F8: route (route.c:971)
    ==13651==    by 0x1152BC: receive_packet (net_packet.c:250)
    ==13651==    by 0x117E1B: receive_sptps_record (net_packet.c:904)
    ==13651==    by 0x1309A8: sptps_receive_data_datagram (sptps.c:488)
    ==13651==    by 0x130A90: sptps_receive_data (sptps.c:508)
    ==13651==    by 0x115569: receive_udppacket (net_packet.c:286)
    ==13651==    by 0x119856: handle_incoming_vpn_data (net_packet.c:1499)
    ==13651==    by 0x10F3DA: event_loop (event.c:287)
    ==13651==  Address 0xffeffea3a is on thread 1's stack
    ==13651==  in frame #6, created by receive_sptps_record (net_packet.c:821)
    ==13651==

This commit fixes the issue by filling out the source MAC address. It is
generated by negating the last byte of the device MAC address, which is
consistent with what route_arp() does.

In addition, this commit stops route_arp() from filling out the Ethernet
header of the packet - this is the responsibility of send_packet(), not
route().
2015-11-07 11:59:16 +00:00
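A minimal sketch of the source MAC derivation described in bdd84660c7. It assumes that "negating the last byte" means flipping all of its bits (XOR with 0xFF) and that the destination is the device's own MAC address; the function name and the `TOY_ETH_ALEN` constant are hypothetical, not tinc identifiers.

```c
#include <stdint.h>
#include <string.h>

#define TOY_ETH_ALEN 6  /* length of an Ethernet MAC address in bytes */

/* Fills the 12 address bytes at the start of an Ethernet header:
   destination = device MAC, source = device MAC with its last byte
   bit-flipped (assumed reading of "negating"). */
static void toy_fill_eth_addrs(uint8_t *frame, const uint8_t *device_mac) {
	memcpy(frame, device_mac, TOY_ETH_ALEN);                 /* destination */
	memcpy(frame + TOY_ETH_ALEN, device_mac, TOY_ETH_ALEN);  /* source */
	frame[2 * TOY_ETH_ALEN - 1] ^= 0xFF;                     /* flip last source byte */
}
```

Because every byte of the source field is now written, none of it is left uninitialised when the frame is handed to write().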
thorkill
e95c1a93a7 Merge with guus/1.1 2015-11-06 22:56:46 +01:00
Rafał Leśniak
9b85a5b010 Merge pull request #2 from jan-schreib/malloc-checks
add malloc check
2015-11-06 22:34:40 +01:00
Etienne Dechamps
684bd659ae Revert "Cache node IDs in a hash table for faster lookups."
This reverts commit c2319e90b1.

As a general principle, I do not believe it is worthwhile to cache
nodes. Sure, it brings lookup time down from O(log n) to O(1), but
considering that the scalability target of tinc is around 1000 nodes
and log2(1000) is 10, that looks like premature optimization; tree
lookups should already be very fast. Therefore, I believe it makes sense
to remove the cache as a code cleanup initiative.
2015-11-04 19:36:06 +00:00
Etienne Dechamps
eeebff55c0 Use a splay tree for node UDP addresses in order to avoid collisions.
This commit replaces the node UDP address hash table "cache" with a
full-blown splay tree, aligning it with node_tree (name-indexed) and
node_id_tree (ID-indexed).

I'm doing this for two reasons. The first reason is to make sure we
don't suddenly degrade to O(n) performance when two "hot" nodes end up
in the same hash table bucket (collision).

The second, and most important, reason, has to do with the fact that
the hash table that was being used overrides elements that collide.
Indeed, it turns out that there is one scenario in which the contents of
node_udp_cache has *correctness* implications, not just performance
implications. This has to do with the way handle_incoming_vpn_data() is
implemented.

Assume the following topology:

  A <-> B <-> C

Now let's consider the perspective of tincd running on B, and let's
assume the following is true:

 - All nodes are using the 1.1 protocol with node IDs and relaying
   support.
 - Nodes A and C have UDP addresses that hash to the same value.
 - Node C "wins" in the node_udp_cache (i.e. it overwrites A in the
   cache).
 - Node A has a "dynamic" UDP address (i.e. a UDP address that has been
   detected dynamically and cannot be deduced from edge addresses).

Then, before this commit, A would be unable to relay packets through B.

This is because handle_incoming_vpn_data() will fall back to
try_harder(), which won't be able to match any edge addresses, doesn't
check the dynamic UDP addresses, and won't be able to match any keys
because this is a relayed packet which is encrypted with C's key, not
B's. As a result, tinc will fail to match the source of the packet and
will drop the packet with a "Received UDP packet from unknown source"
message.

I have seen this happen in the wild; it is actually quite likely to
occur when there are more than a handful of nodes because node_udp_cache
only has 256 buckets, making collisions quite likely. This problem is
quite severe because it can completely prevent all packet communication
between nodes - indeed, if node A tries to initiate some communication
with C, it will use relaying at first, until C responds and helps A
establish direct communication with it (e.g. hole punching). If relaying
is broken, C will not help establish direct communication, and as a
result no packets can make it through at all.

The bug can be reproduced fairly easily by reproducing the topology
above while changing the (hardcoded) node_udp_cache size to 1 to force a
collision. One will quickly observe various issues when trying to make A
talk to C. Setting IndirectData on B will make the issue even more
severe and prevent all communication.

Arguably, another way to fix this problem is to make try_harder()
compare the packet's source address to each node's dynamic UDP
addresses. However, I do not like this solution because if two "hot"
nodes are contending on the same hash bucket, try_harder() will be
called very often and packet routing performance will degrade closer to
O(N) (where N is the total number of nodes in the graph). Using a more
appropriate data structure fixes the bug without introducing this
performance problem.
2015-11-04 19:36:02 +00:00
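The overwrite-on-collision behaviour that eeebff55c0 describes can be sketched with a toy one-slot-per-bucket cache. Everything here is an assumption made for illustration: the `toy_*` names, the 32-bit stand-in for a UDP address, and the modulo hash are not taken from tinc. Only the bucket count of 256 comes from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_BUCKETS 256  /* bucket count mentioned in the commit message */

typedef struct {
	uint32_t addr;     /* key: stand-in for a UDP address */
	const char *name;  /* value: node name, NULL if the slot is empty */
} toy_slot_t;

typedef struct {
	toy_slot_t slots[TOY_BUCKETS];
} toy_cache_t;

/* Hypothetical hash: any function mapping addresses to buckets will do. */
static size_t toy_hash(uint32_t addr) {
	return addr % TOY_BUCKETS;
}

/* Inserting overwrites whatever occupied the bucket before. */
static void toy_cache_insert(toy_cache_t *c, uint32_t addr, const char *name) {
	toy_slot_t *s = &c->slots[toy_hash(addr)];
	s->addr = addr;
	s->name = name;
}

/* Returns the node name, or NULL if the address is not (or no longer) cached. */
static const char *toy_cache_lookup(const toy_cache_t *c, uint32_t addr) {
	const toy_slot_t *s = &c->slots[toy_hash(addr)];
	return (s->name && s->addr == addr) ? s->name : NULL;
}
```

With addresses 1 and 257 (which collide under this hash), inserting "A" at 1 and then "C" at 257 makes the lookup for 1 return NULL. An ordered structure such as a tree keeps both entries, which is the property the commit relies on.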
Guus Sliepen
7a8515112a Avoid undefined behavior.
Left-shifting a negative value is undefined behavior in C. This happens
a lot in the Ed25519 code. Cast to unsigned first, then cast the result
back to signed where necessary.
2015-10-26 13:46:30 +01:00
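The cast pattern from 7a8515112a can be sketched as follows. The helper name is hypothetical, and a 32-bit width is chosen only for illustration. Note that converting an out-of-range unsigned value back to a signed type is implementation-defined rather than undefined; on the usual two's complement platforms it yields the expected result.

```c
#include <stdint.h>

/* Shifts a possibly negative value left by n bits (0 <= n < 32) without
   undefined behaviour: the shift happens on the unsigned representation,
   and the result is converted back to signed afterwards. */
static int32_t toy_shl32(int32_t value, unsigned n) {
	return (int32_t)((uint32_t)value << n);
}
```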
Guus Sliepen
7306823843 Fix a few memory leaks in the CLI found by AddressSanitizer. 2015-09-25 10:06:18 +02:00
Guus Sliepen
543c0abbd9 Fix struct node_status_t.
Although not a problem for tinc internally, the size of the struct was 12
bytes instead of 4, causing some problems when interpreting the value
received from tincd by the CLI.
2015-09-25 10:05:24 +02:00
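One way to catch this kind of size mismatch at compile time is a static assertion on a bit-field struct. This is a sketch, not the actual node_status_t: the flag names and widths below are hypothetical, and bit-field layout is implementation-defined, so the assertion documents an expectation about the target compiler rather than a guarantee of the C standard.

```c
#include <stdint.h>

typedef struct {
	unsigned int flag_a: 1;   /* hypothetical status flag */
	unsigned int flag_b: 1;   /* hypothetical status flag */
	unsigned int flag_c: 1;   /* hypothetical status flag */
	unsigned int unused: 29;  /* padding so that the widths add up to 32 */
} toy_status_t;

/* Fails the build if the struct does not fit in exactly 32 bits. */
_Static_assert(sizeof(toy_status_t) == sizeof(uint32_t),
               "toy_status_t must be exactly 32 bits wide");
```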
Guus Sliepen
706d855e50 Replace bare if statements with AS_IF in configure.ac. 2015-09-24 22:20:00 +02:00
Guus Sliepen
f54a87b800 Optionally install systemd service files.
If --with-systemd is given when running the configure script, two
systemd service files will be installed. There is a template
tinc@.service, which can be used to control individual instances of
tinc. For example:

systemctl enable tinc@foo

Will create an instance for tinc with netname foo. There is also a
tinc.service, which can be used to start and stop all instances at once.
2015-09-24 22:11:16 +02:00
Guus Sliepen
5ad43673ac Add -I m4 back to ACLOCAL_AMFLAGS.
In commit b7b5d51, AC_CONFIG_MACRO_DIRS([m4]) was added to configure.ac,
which is the current proper way of including the m4 directory. However,
old versions of autoconf ignore it and need the -I m4 statement in
Makefile.am. Both the old and new way of indicating that the m4/
directory should be included can coexist.
2015-09-24 17:10:25 +02:00
Nathan Stratton Treadway
ae89a25695 Fix invalid checksum generation.
Use equation 3 given in RFC 1624 and the UpdateTTL() example function given
in RFC 1141.

# Conflicts:
#	src/route.c
2015-09-12 16:41:48 +02:00
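Equation 3 of RFC 1624 is HC' = ~(~HC + ~m + m'), where HC is the old checksum, m the old value of a 16-bit field and m' its new value, all in one's complement arithmetic. A minimal sketch follows; the function names are hypothetical, and this is not the code from src/route.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Full Internet checksum over len bytes (len is assumed to be even). */
static uint16_t toy_checksum(const uint8_t *data, size_t len) {
	uint32_t sum = 0;
	for(size_t i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)((data[i] << 8) | data[i + 1]);
	while(sum >> 16)                      /* fold carries back in */
		sum = (sum & 0xFFFF) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Incremental update per RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m'). */
static uint16_t toy_checksum_update(uint16_t hc, uint16_t m_old, uint16_t m_new) {
	uint32_t sum = (uint16_t)~hc;
	sum += (uint16_t)~m_old;
	sum += m_new;
	while(sum >> 16)                      /* fold carries back in */
		sum = (sum & 0xFFFF) + (sum >> 16);
	return (uint16_t)~sum;
}
```

For an IPv4 TTL decrement, m and m' would be the 16-bit word holding TTL and protocol before and after the change. The incremental result should match a full recomputation over the modified header.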
hans
a9fb6db249 add malloc check
malloc can fail. check for errors or use xmalloc.
since this is bsd only, it is safe to use err and err.h.
2015-08-26 16:44:51 +02:00
Rafał Leśniak
569b1dbf15 Merge pull request #1 from jan-schreib/openbsd-build
Changes on Makefile.am and configure.ac to enable stack protection build on OpenBSD
2015-08-25 10:17:33 +02:00
hans
4710de8455 Activate fstack-protector-all on OpenBSD 2015-08-25 09:30:43 +02:00
hans
c9515a79de Make it build on openbsd.
Build on amd64 and sparc64.
2015-08-25 09:30:32 +02:00
thorkill
d9a8344467 Fix for unknown subnets
If a node did not have AutoConnect = yes set but did have StrictSubnet = yes
set, it would discard all ADD_SUBNET messages.
2015-07-26 15:14:40 +02:00
thorkill
af1213a7ae Revert "Do not recompile version if not needed"
This reverts commit 529576dad6.

This feature works only with gmake; BSD systems do not have
it, and we do not want to force users to install it.
2015-07-26 12:22:22 +02:00
thorkill
529576dad6 Do not recompile version if not needed 2015-07-26 12:15:45 +02:00
thorkill
2d38e37168 Make make dist work when /bin/sh != /bin/bash 2015-07-24 19:14:20 +02:00
thorkill
618ddadeab Fixed a segfault when all nodes available for autoconnect have been exhausted
In cases where tinc has all available nodes in outgoing connections and
cannot establish those connections due to a network outage, periodic_handler()
would crash since tmp_node_tree->count is 0.

This commit also adds a new flag, node->status.has_cfg_address, to prevent
update_udp_address() from removing this flag.

Fixed node_status_t->unused - 13 + 19 = 32
2015-07-23 20:46:20 +02:00
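The commit message for 618ddadeab only says that periodic_handler() crashed when the count was 0. One plausible cause, assumed here and not confirmed by the source, is a random pick computed modulo the count. A guard of that kind can be sketched as follows; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Picks a random index in [0, count), or returns -1 when there is nothing
   to pick from. Without the guard, "rand() % 0" would divide by zero. */
static int toy_pick_index(size_t count) {
	if(count == 0)
		return -1;
	return (int)((size_t)rand() % count);
}
```

A caller would skip the connection attempt whenever the function returns -1.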
thorkill
f12d4a3e6d Merged load_all_subnets and load_all_nodes to make autoconnect and strictsubnets work faster
When AutoConnect is on, tinc needs to know whether nodes have an Address defined
in their host files. Previously, tinc parsed nodes' host files if StrictSubnet
was enabled. To reduce the parsing overhead, I have merged load_all_subnets
into load_all_nodes: load_all_subnets has been removed, and load_all_nodes
now has an if-statement that extracts Subnet information from each node's host
file.
2015-07-23 18:34:29 +02:00
thorkill
3c67735720 Make autoconnect faster
When AutoConnect is enabled, tinc tries to connect to other nodes, picking them at random.
This may be a sane default, but it may take ages if only a few nodes have
an Address defined in their config.

Proposed solution to this problem:
- Filter out nodes without a known address in periodic_handler
  (I have added a new node->status.has_known_address bool)
- Update this flag in update_node_udp()
2015-07-23 18:02:30 +02:00
thorkill
d16a43c06c Revert "It seems that this patch is needed. Strange things happens."
This reverts commit 50bf9b5a1a.
2015-07-22 15:32:36 +02:00
Guus Sliepen
24c3bebc5c In sssp_bfs(), never try to update myself. 2015-07-22 15:32:36 +02:00
Guus Sliepen
56a8b90d86 In sssp_bfs(), never try to update myself. 2015-07-22 14:33:56 +02:00
thorkill
0842bc0ca5 Revert "Added missing check to e->to->prevedge"
This reverts commit 4077acd583.
2015-07-21 19:39:08 +02:00
thorkill
512c64980a Merge branch 'thkr-1.1-ponyhof' of github.com:thorkill/tinc into thkr-1.1-ponyhof 2015-07-21 10:11:36 +02:00
thorkill
4077acd583 Added missing check to e->to->prevedge 2015-07-21 10:10:37 +02:00
thorkill
1edf49be14 Reduce logger calls 2015-07-20 11:10:27 +02:00
thorkill
8c4cdfc37c Prevent update_node_udp from changing our udp address
Follow-up to 6dbcd4eb3d

- myself is always reachable
- do not call update_node_udp if e->to == myself
2015-07-20 08:19:37 +02:00
thorkill
f75e6f61f2 Do not access e->to->prevedge if not defined
In some cases - mostly when e->to == myself - the prevedge is set to NULL,
causing invalid memory access. In rare cases this may lead to a malformed MST
or segfaults.
2015-07-19 22:33:43 +02:00
thorkill
6dbcd4eb3d Do not access e->to->prevedge if not defined
In some cases - mostly when e->to == myself - the prevedge is set to NULL,
causing invalid memory access. In rare cases this may lead to a malformed MST
or segfaults.
2015-07-19 18:54:08 +02:00
thorkill
bc747f8146 Merged changes with origin/1.1 2015-07-17 15:36:00 +02:00
thorkill
b68eaa7ce4 merged with origin/1.1 2015-07-17 00:29:46 +02:00
Guus Sliepen
f92c3446f2 Use AC_CONFIG_MACRO_DIR() instead of _DIRS().
The former is guaranteed to work with autoconf 2.58 and later, and we
don't have multiple m4 directories anyway.
2015-07-15 15:12:53 +02:00