j3d1/tinc - Neuland Labor e.V.: Git

j3d1/tinc

Author	SHA1	Message	Date
Guus Sliepen	e4fd81ed2d	Add support for recvmmsg(). Based on a patch from Samuel Thibault and input from Michael Tokarev.	2015-12-10 16:36:10 +01:00
Guus Sliepen	cef40b8b97	list_delete() already free()s the deleted element.	2015-11-26 11:29:54 +01:00
Guus Sliepen	9fdf4278f8	Don't leave dead outgoing_t's in the outgoing_list. If an outgoing connection cannot be made because no address is known for it, it should be removed from the outgoing_list, otherwise it will prevent it from being re-added later when we do know addresses for it.	2015-11-24 16:48:44 +01:00
Etienne Dechamps	c58eba587d	Add upnp.h to tincd SOURCES. This was missing from `513bffe1fe`.	2015-11-22 23:03:03 +01:00
Etienne Dechamps	613d586afd	Don't unset validkey when receiving SPTPS handshakes over ANS_KEY. This fixes a hairy race condition that was introduced in `1e89a63f16`, which changed the underlying transport of handshake packets from REQ_KEY to ANS_KEY. Unfortunately, what I missed in that commit is, on the receiving side, there is a slight difference between req_key_h() and ans_key_h(): indeed, the latter resets validkey to false. The reason why this is not a problem during typical operation is because the normal SPTPS key regeneration procedure looks like this: KEX -> <- KEX SIG -> <- SIG All these messages are sent over ANS_KEY, therefore the receiving side will unset validkey. However, that's typically not a problem in practice because upon reception of the last message (SIG), SPTPS will call sptps_receive_record(), which will set validkey to true again, and everything works out fine in the end. However, that was the typical scenario. Now let's assume that the SPTPS channel is in active use at the same time key regeneration happens. Specifically, let's assume a normal VPN data packet sneaks in during the key regeneration procedure: KEX -> <- KEX <- (SPTPS packet, over TCP or UDP) <- KEX (wtf?) SIG -> (refused with Invalid packet seqno: XXX != 0) At this point, both nodes are extremely confused and the SPTPS channel becomes unusable with various errors being thrown on both sides. The channel will stay down until automatic SPTPS channel restart kicks in after 10 seconds. (Note: the above is just an example - the race can occur on either side whenever a packet is sent during the period of time between KEX and SIG messages are received by the node sending the packet.) I've seen this race occur in the wild - it is very likely to occur if key regeneration occurs on a heavily loaded channel. It can be reproduced fairly easily by setting KeyExpire to a short value (a few seconds) and then running something like ping -f foobar -i 0.01. The reason why this occurs is because tinc's TX code path triggers the following: - send_packet() - try_tx() - try_tx_sptps() - validkey is false because we just received an ANS_KEY message - waitingforkey is false because it's not used for key regeneration - send_req_key() - SPTPS channel restart (sptps_stop(), sptps_start()). Obviously, it all goes downhill from there and the two nodes get very confused quickly (for example the seqno gets reset, hence the error messages). This commit fixes the issue by keeping validkey set when SPTPS data is received over ANS_KEY messages.	2015-11-22 17:53:52 +00:00
Etienne Dechamps	0f6d34dc1b	Try to ensure we build correctly against various libminiupnpc versions. Unfortunately, libminiupnpc has a somewhat... "peculiar" approach to backwards compatibility for their API, where they reserve the right to make breaking changes when they feel like it, forcing users to resort to #ifdefs to ensure they use the correct API. Sigh. Previously, tinc would only build against API versions <= 13, because I was doing my initial development using miniupnpc-1.9.20140610 which is the version that ships with Debian. The changes in this commit are required for tinc to build against more recent versions, from 1.9.20150730 to the latest one at the time of this commit, 1.9.20151026.	2015-11-21 16:18:01 +00:00
Etienne Dechamps	513bffe1fe	Add UPnP support to tincd. This commit makes tincd capable of discovering UPnP-IGD devices on the local network, and add mappings (port redirects) for its TCP and/or UDP port. The goal is to improve reliability and performance of tinc with nodes sitting behind home routers that support UPnP, by making it less reliant on UDP Hole Punching, which is prone to failure when "hostile" NATs are involved. The way this is implemented is by leveraging the libminiupnpc library, which we have just added a new dependency on. We use pthread to run the UPnP client code in a dedicated thread; we can't use the tinc event loop because libminiupnpc doesn't have a non-blocking API.	2015-11-21 16:17:59 +00:00
Etienne Dechamps	bdd84660c7	Make sure the packet source MAC address is always set. When tinc is used in router mode with a TAP device, Ethernet (MAC) headers are not present in packets flowing over the VPN; it is the node's responsibility to fill out this header before handing the packet over to the TAP interface (which expects such headers). Currently, tinc fills out the destination MAC address of the packet (otherwise the host would not recognize the packets, and nothing would work), but it does not fill out the source MAC address. In practice this doesn't seem to cause any real issues (the host doesn't care about the source address), but it does look weird when looking at the packets with a sniffer, and it also result in the following valgrind warning: ==13651== Syscall param write(buf) points to uninitialised byte(s) ==13651== at 0x5C4B620: __write_nocancel (syscall-template.S:81) ==13651== by 0x1445AA: write_packet (device.c:183) ==13651== by 0x118C7C: send_packet (net_packet.c:1259) ==13651== by 0x12B70A: route_ipv4 (route.c:443) ==13651== by 0x12D5F8: route (route.c:971) ==13651== by 0x1152BC: receive_packet (net_packet.c:250) ==13651== by 0x117E1B: receive_sptps_record (net_packet.c:904) ==13651== by 0x1309A8: sptps_receive_data_datagram (sptps.c:488) ==13651== by 0x130A90: sptps_receive_data (sptps.c:508) ==13651== by 0x115569: receive_udppacket (net_packet.c:286) ==13651== by 0x119856: handle_incoming_vpn_data (net_packet.c:1499) ==13651== by 0x10F3DA: event_loop (event.c:287) ==13651== Address 0xffeffea3a is on thread 1's stack ==13651== in frame #6, created by receive_sptps_record (net_packet.c:821) ==13651== This commit fixes the issue by filling out the source MAC address. It is generated by negating the last byte of the device MAC address, which is consistent with what route_arp() does. In addition, this commit stops route_arp() from filling out the Ethernet header of the packet - this is the responsibility of send_packet(), not route().	2015-11-07 11:59:16 +00:00
Etienne Dechamps	684bd659ae	Revert "Cache node IDs in a hash table for faster lookups." This reverts commit `c2319e90b1`. As a general principle, I do not believe it is worthwhile to cache nodes. Sure, it brings lookup time down from O(log n) to O(1), but considering that the scalability target of tinc is around 1000 nodes and log2(1000) is 10, that looks like premature optimization; tree lookups should already be very fast. Therefore, I believe it makes sense to remove the cache as a code cleanup initiative.	2015-11-04 19:36:06 +00:00
Etienne Dechamps	eeebff55c0	Use a splay tree for node UDP addresses in order to avoid collisions. This commit replaces the node UDP address hash table "cache" with a full-blown splay tree, aligning it with node_tree (name-indexed) and node_id_tree (ID-indexed). I'm doing this for two reasons. The first reason is to make sure we don't suddenly degrade to O(n) performance when two "hot" nodes end up in the same hash table bucket (collision). The second, and most important, reason, has to do with the fact that the hash table that was being used overrides elements that collide. Indeed, it turns out that there is one scenario in which the contents of node_udp_cache has correctness implications, not just performance implications. This has to do with the way handle_incoming_vpn_data() is implemented. Assume the following topology: A <-> B <-> C Now let's consider the perspective of tincd running on B, and let's assume the following is true: - All nodes are using the 1.1 protocol with node IDs and relaying support. - Nodes A and C have UDP addresses that hash to the same value. - Node C "wins" in the node_udp_cache (i.e. it overwrites A in the cache). - Node A has a "dynamic" UDP address (i.e. an UDP address that has been detected dynamically and cannot be deduced from edge addresses). Then, before this commit, A would be unable to relay packets through B. This is because handle_incoming_vpn_data() will fall back to try_harder(), which won't be able to match any edge addresses, doesn't check the dynamic UDP addresses, and won't be able to match any keys because this is a relayed packet which is encrypted with C's key, not B's. As a result, tinc will fail to match the source of the packet and will drop the packet with a "Received UDP packet from unknown source" message. I have seen this happen in the wild; it is actually quite likely to occur when there are more than a handful of nodes because node_udp_cache only has 256 buckets, making collisions quite likely. This problem is quite severe because it can completely prevent all packet communication between nodes - indeed, if node A tries to initiate some communication with C, it will use relaying at first, until C responds and helps A establish direct communication with it (e.g. hole punching). If relaying is broken, C will not help establish direct communication, and as a result no packets can make it through at all. The bug can be reproduced fairly easily by reproducing the topology above while changing the (hardcoded) node_udp_cache size to 1 to force a collision. One will quickly observe various issues when trying to make A talk to C. Setting IndirectData on B will make the issue even more severe and prevent all communication. Arguably, another way to fix this problem is to make try_harder() compare the packet's source address to each node's dynamic UDP addresses. However, I do not like this solution because if two "hot" nodes are contending on the same hash bucket, try_harder() will be called very often and packet routing performance will degrade closer to O(N) (where N is the total number of nodes in the graph). Using a more appropriate data structure fixes the bug without introducing this performance problem.	2015-11-04 19:36:02 +00:00
Guus Sliepen	7a8515112a	Avoid undefined behavior. Left shifts of negative values is undefined in C. This happens a lot in the Ed25519 code. Cast to unsigned first, then cast the result back to signed where necessary.	2015-10-26 13:46:30 +01:00
Guus Sliepen	7306823843	Fix a few memory leaks in the CLI found by AddressSanitizer.	2015-09-25 10:06:18 +02:00
Guus Sliepen	543c0abbd9	Fix struct node_status_t. Although not a problem for tinc internally, the size of the struct was 12 bytes instead of 4, causing some problems when interpreting the value received from tincd by the CLI.	2015-09-25 10:05:24 +02:00
Nathan Stratton Treadway	ae89a25695	Fix invalid checksum generation. Use equation 3 given in RFC 1624 and the UpdateTTL() example function given RFC 1141. # Conflicts: # src/route.c	2015-09-12 16:41:48 +02:00
Guus Sliepen	56a8b90d86	In sssp_bfs(), never try to update myself.	2015-07-22 14:33:56 +02:00
thorkill	f75e6f61f2	Do not access e->to->prevedge if not defined In some cases - mostly when e->to == myself the prevedge is set to NULL, causing invalid memory access. In rare cases this may lead to malformed mst or segfaults.	2015-07-19 22:33:43 +02:00
Guus Sliepen	9ca1750245	Fix the PRF function when compiling without OpenSSL.	2015-07-12 16:31:32 +02:00
thorkill	3c54765bcd	Prevent tinc from forgeting e->local_address If ADD_EDGE came from tinc version 1.0.x local_address.sa.sa_family is set to 0. If it came from tinc version 1.1.x forwarded for older verion it will be 255 - AF_UNKNOWN.	2015-07-12 13:32:38 +02:00
thorkill	1e7ef38198	Make sure we do not allocate new edge when talking to old nodes and the same edge already exists When tinc gets ADD_EDGE from older versions it will allocate new edge in protocol_edge.c:189 due to missed case in lines 149-171 where local_address is not defined.	2015-07-12 13:31:07 +02:00
Guus Sliepen	7b831804aa	Make subnet caches static.	2015-07-12 13:08:34 +02:00
thorkill	322ffadac4	Included missing names.h	2015-07-12 13:06:38 +02:00
Guus Sliepen	97457716d7	Remove unused code that caused warnings about an uninitialized variable.	2015-07-12 12:55:13 +02:00
thorkill	b22b9d4389	Removed double break;	2015-07-12 12:39:36 +02:00
Guus Sliepen	b396585383	Fix undefined behaviour when left-shifting signed integers. Found by -fsanitize=undefined.	2015-07-12 12:33:07 +02:00
Guus Sliepen	de7d9ee437	Call sockaddrfree(&e->local_address) in free_edge() instead of exit_edges(). The proper place to clean up resources of objects is in their destructor. This makes sure proper cleanup when edge_del() is called as well. At exit, free_edge() is called on all edges by free_edge_tree(), which is called by exit_nodes().	2015-07-04 17:53:11 +02:00
Guus Sliepen	36cec9af88	Coalesce two if statements that check for the same thing.	2015-07-04 17:51:05 +02:00
Jo-Philipp Wich	14ccf50954	fix musl compatibility Let configure include sys/if_tun.h when testing for netinet/if_ether.h to detect the Kernel/libc header conflict on musl. After this patch, configure will correctly detect netinet/if_ether.h as unusable and the subsequent compilation will not attempt to use it. Conflicts: src/have.h	2015-07-04 17:34:37 +02:00
Guus Sliepen	37588b8d5c	Don't #include OpenSSL headers when compiling without OpenSSL.	2015-07-04 17:34:31 +02:00
thorkill	abb24e9d71	Cleanup local_address in protocol_edge.c In line 131 local_address has been defined, but the memory was never freed on return.	2015-07-04 03:24:13 +02:00
thorkill	92df36a610	Cleanup edges stored in edge_weight_tree on exit protocol_edge.c: 131 defines local_address using str2sockaddr str2sockaddr() allocates memory which has to be freed on exit.	2015-07-04 03:24:05 +02:00
thorkill	1140ca6d30	Fixed 2 leaks in setup_myself()	2015-07-04 03:23:58 +02:00
Florian Klink	0267aef826	setup_outgoing_connection: log to LOG_DEBUG on if no known address With AutoConnect = yes, tinc tries to establish connections to known hosts. However, you could have set no Address for this host, which is perfectly fine (as long as there is at least one bootstrap node with an address or a local discovered node already part of the network) So log this to LOG_DEBUG	2015-07-02 21:22:53 +02:00
Florian Klink	91355b9ac5	(read\|append)_config_file: log open errors as LOG_DEBUG In a "decentrally managed vpn" it is very likely that host config files for some reachable nodes do not exist. Currently, tinc fills the logs with "Cannot open config file" messages. This commit changes the log level to LOG_DEBUG so syslog doesn't get filled by default.	2015-07-02 21:22:47 +02:00
Etienne Dechamps	ebffa40aa7	Protect against callbacks removing items from the io tree. The definition of the splay_each() macro is somewhat complicated for syntactic reasons. Here's what it does in a more readable way: for (splay_node_t* node = tree->head; node;) { type* item = node->data; splay_node_t* next = node->next; // RUN USER BLOCK with (item) node = next; } list_each() works in the same way. Since node->next is saved before the user block runs, this construct supports removing the current item from within the user block. However, what it does not support is removing other items from within the user block, especially the next item. Indeed, that will invalide the next pointer in the above loop and therefore result in an invalid pointer dereference. Unfortunately, there is at least one code path where that unsupported operation happens. It is located in ack_h(), where the authentication protocol code detects a double connection (i.e. being connected to another node twice). Running in the context of a socket read event, this code will happily terminate the other metaconnection, resulting in its socket being removed from the io tree. If, by misfortune, this other metaconnection happened to have the next socket FD number (which is quite possible due to FD reuse - albeit unlikely), and was part of the io tree (which is quite likely because if that connection is stuck, it will most likely have pending writes) then this will result in the next pending io item being destroyed. Invalid pointer dereference ensues. I did a quick audit of other uses of splay_each() and list_each() and I believe this is the only scenario in which this "next pointer invalidation" problem can occur in practice. While this bug has been there since at least `6bc5d626a8` (November 2012), if not sooner, it happens quite rarely due to the very specific set of conditions required to trigger it. Nevertheless, it does manage to crash my central production nodes every other week or so.	2015-06-20 14:09:00 +01:00
Guus Sliepen	45a46f068c	Fix crash is sptps_logger(). Unfortunately, sptps_logger() cannot know if s->handle is pointing to a connection_t or a node_t. But it needs to print name and hostname in both cases. So make sure both types have name and hostname fields at the start with the same offset.	2015-06-10 23:42:17 +02:00
Guus Sliepen	bfe231b977	Fix alignment of output of sptps_speed.	2015-06-07 23:20:14 +02:00
Guus Sliepen	a797b4a192	Fix receiving SPTPS data in sptps_speed and sptps_test. The sptps_receive_data() was changed in commit `d237efd` to only process one SPTPS record from a stream input. So now we have to put a loop around it to ensure we process everything.	2015-06-07 23:17:54 +02:00
Guus Sliepen	d8d1ab4ee1	Fix warnings about missing return value checks. In some harmless places, checks for the return value of ECDSA and RSA key generation and verification was omitted. Add them to keep the compiler happy and to warn end users in case something is wrong.	2015-06-07 22:50:05 +02:00
Guus Sliepen	84ecc972e5	Fix missing return value caused by the previous commit.	2015-05-31 23:51:39 +02:00
Etienne Dechamps	eca357ed91	Don't try to relay packets to unreachable nodes. It is not unusual for tinc to receive SPTPS packets to be relayed to nodes that just became unreachable, due to state propagation delays in the metagraph. Unfortunately, the current code doesn't handle that situation correctly, and still tries to relay the packet to the unreachable node. This typically ends up segfaulting. This commit fixes the issue by checking for reachability before relaying the packet.	2015-05-31 20:19:48 +01:00
Etienne Dechamps	9e3adef5cb	Fix invalid pointer use in get_my_hostname(). clang-3.7 warnings surfaced an actual bug: invitation.c:185:5: error: address of array 'filename' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion] if(filename) { ~~ ^~~~~~~~ The regression was introduced in `3ccdf50beb`.	2015-05-24 09:49:16 +01:00
Etienne Dechamps	7fcfbe2bd2	Fix wrong format string type in send_sptps_tcppacket(). This issue was found through a clang-3.7 warning: protocol_misc.c:167:46: error: format specifies type 'short' but the argument has type 'int' [-Werror,-Wformat] if(!send_request(c, "%d %hd", SPTPS_PACKET, len)) ~~~ ^~~ %d	2015-05-24 09:45:09 +01:00
Etienne Dechamps	3e61c7233b	Don't set up an ongoing connection to myself. It is entirely possible that the configuration file could contain a ConnectTo statement refering to its own name; that's a reasonable scenario when one deploys semi-automatically generated tinc.conf files. Amusingly, tinc does not like that at all, and actually sets up an outgoing_t structure to myself (which obviously makes no sense). This is mostly benign, though it does result in non-sensical "Already connected to myself" messages every retry interval. However, that also makes things blow up in close_network_connections(), because there we delete the entire outgoing list and then the myself node, which still has a reference to the freshly deleted outgoing structure. Boom.	2015-05-23 17:33:32 +01:00
Etienne Dechamps	8587e8c0d9	Fix crashes when trying unreachable nodes. timeout_handler() calls try_tx(c->node) when c->edge exists. Unfortunately, the existence of c->edge is not enough to conclude that the node is reachable. In fact, during connection establishment, there is a short period of time where we create an edge for the node at the other end of the metaconnection, but we don't have one from the other side yet. Unfortunately, if timeout_handler() runs during that short time window, it will call try_tx() on an unreachable node, which makes things explode because that function is not prepared to handle that case. A typical symptom of this race condition is a hard SEGFAULT while trying to send packets using metaconnections that don't exist, due to n->nexthop containing garbage. This patch fixes the issue by making try_tx() check for reachability, and then making all code paths use try_tx() instead of the more specialized methods so that they go through the check. This regression was introduced in `eb7a0db18e`.	2015-05-23 10:24:00 +01:00
Guus Sliepen	537a936671	Update copyright notices.	2015-05-21 11:09:01 +02:00
Guus Sliepen	0a786ffbb9	Set the CLOEXEC flag on the umbilical socket.	2015-05-21 11:06:38 +02:00
Guus Sliepen	87e0952773	Use socketpair() instead of pipe() for the umbilical. This prepares for a possible conversion of the umbilical socket to a control socket.	2015-05-20 21:28:54 +02:00
Guus Sliepen	19e0d449eb	Don't write log messages to the umbilical pipe if we don't detach. If we run in the foreground and are started by the CLI, this would otherwise cause the first few log messages to appear twice.	2015-05-20 21:25:06 +02:00
Guus Sliepen	11868b890d	Ensure "tinc start" knows if the daemon really started succesfully. We do this by creating an umbilical between the CLI and the daemon. The daemon pipes log messages to the CLI until it starts the main loop. The daemon then cuts the umbilical. The CLI copies all the received log messages to stderr, and the last byte indicates whether the daemon started succesfully or not, so the CLI can exit with a useful exit code.	2015-05-20 16:59:43 +02:00
Guus Sliepen	7f96ef081d	Fix check for LOCALSTATEDIR accessibility for the CLI. The CLI does not need write access to the directory where the PID file is stored, it just needs to be able to read the PID file.	2015-05-20 11:11:12 +02:00

1 2 3 4 5 ...

1984 commits