This is a minor cosmetic nit to emphasise the distinction between the
initial MTU discovery phase and the post-initial phase (i.e. maxmtu
checking).
Furthermore, this is an improvement with regard to the DRY (Don't
Repeat Yourself) principle, as the maximum mtuprobes value is only
written once.
If a probe reply is received that makes minmtu equal to maxmtu, we
have to wait until try_mtu() runs to realize that. Since try_mtu()
runs after a packet is sent, this means there is at least one packet
(possibly more, depending on timing) that won't benefit from the
fixed MTU. This also happens when maxmtu is updated from the send()
path.
This commit fixes that by making sure we check whether the MTU can be
fixed every time minmtu or maxmtu is touched.
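A minimal sketch of the idea, with a pared-down stand-in for the
per-node MTU state and a hypothetical helper name:

    #include <stdint.h>

    /* Pared-down stand-in for the per-node PMTU state. */
    typedef struct {
        int mtuprobes;   /* probe counter; < 0 once discovery is done */
        uint16_t minmtu; /* largest probe size known to get through */
        uint16_t maxmtu; /* upper bound on sizes that might still work */
        uint16_t mtu;    /* the MTU actually used for payload packets */
    } mtu_state_t;

    /* Called from every place that touches minmtu or maxmtu, so that
       the MTU is fixed the moment the two bounds meet, instead of
       waiting for the next run of try_mtu() after another packet. */
    static void try_fix_mtu(mtu_state_t *n) {
        if (n->mtuprobes < 0)        /* already fixed */
            return;

        if (n->minmtu >= n->maxmtu) {
            n->mtu = n->minmtu;
            n->mtuprobes = -1;
        }
    }

    /* Example caller: a probe reply raises minmtu, so check right away. */
    static void udp_probe_reply(mtu_state_t *n, uint16_t probelen) {
        if (probelen > n->minmtu)
            n->minmtu = probelen;
        try_fix_mtu(n);
    }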
This moves related functions together, and is a pure cut-and-paste
change. The reason it was not done in the previous commit is because it
would have made the diff harder to review.
Currently, the PMTU discovery code is run by a timeout callback,
independently of tunnel activity. This commit moves it into the TX
path, meaning that send_mtu_probe_handler() is only called if a
packet is about to be sent. Consequently, it has been renamed to
try_mtu() for consistency with try_tx(), try_udp() and try_sptps().
Running PMTU discovery code only as part of the TX path prevents
PMTU discovery from generating unreasonable amounts of traffic when
the "real" traffic is negligible. One extreme example is sending one
real packet and then going silent: in the current code this one little
packet will result in the entire PMTU discovery algorithm being run
from start to finish, resulting in absurd traffic amplification.
With this patch, PMTU discovery stops as soon as "real" packets stop
flowing, and will be no more aggressive than the underlying traffic.
Furthermore, try_mtu() only runs if there is confirmed UDP
connectivity as per the UDP discovery mechanism. This prevents
unnecessary network chatter - previously, the PMTU discovery code
would send bursts of (potentially large) probe packets every second
even if there was nothing on the other side. With this patch, the
PMTU code only does that if something replied to the lightweight UDP
discovery pings.
These inefficiencies are even worse when the node is not a direct
neighbour, as tinc will run PMTU discovery both with the destination
node *and* with the relay. UDP discovery is more lightweight for this
purpose.
As a bonus, this somewhat simplifies the overall code - state is
easier to manage when code runs in predictable contexts rather than in
"surprise callbacks". In addition, there is no longer any need to call
PMTU discovery code from outside of net_packet.c, which simplifies
module boundaries.
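A rough sketch of the resulting control flow; the try_tx(), try_udp()
and try_mtu() names come from the text, while the type, field and stub
bodies are illustrative:

    #include <stdbool.h>

    /* Simplified stand-in for the per-node state involved here. */
    typedef struct {
        bool udp_confirmed;   /* set by UDP discovery on a probe reply */
    } node_state_t;

    /* Lightweight UDP discovery ping, if one is due. */
    static void try_udp(node_state_t *n) { (void)n; }

    /* PMTU probe(s), if due; see the rewrite described below. */
    static void try_mtu(node_state_t *n) { (void)n; }

    /* Called from the TX path, i.e. only when a real packet is about
       to be sent.  When no packets flow, no discovery traffic is
       generated either. */
    static void try_tx(node_state_t *n) {
        try_udp(n);

        /* Only probe the path MTU once UDP connectivity is confirmed;
           otherwise the (potentially large) probes would be wasted. */
        if (n->udp_confirmed)
            try_mtu(n);
    }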
This is a rewrite of the send_mtu_probe_handler() function to make it
focus on the actual discovery of PMTU. In particular, the PMTU
discovery code doesn't care about tunnel state anymore - it only cares
about doing the initial PMTU discovery, and once that's done, making
sure PMTU did not increase by checking it from time to time. All other
duties have already been rewritten in the UDP discovery code.
As a result, send_mtu_probe_handler(), which previously implemented a
nightmarish state machine that was very difficult to follow and
understand, has been massively simplified. We moved from four
persistent states to only two: initial discovery and steady state.
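A condensed sketch of the two remaining phases; the constant, the
midpoint probing and the helper are illustrative, not the literal
implementation:

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
        int probes_sent;    /* how many initial probes we have sent */
        uint16_t minmtu;    /* largest size known to work */
        uint16_t maxmtu;    /* largest size that might still work */
        bool mtu_fixed;     /* set once discovery has converged */
    } mtu_state_t;

    #define INITIAL_PROBES 20   /* illustrative */

    static void send_mtu_probe(mtu_state_t *n, uint16_t len) {
        (void)n; (void)len;     /* low-level probe transmission elided */
    }

    static void try_mtu(mtu_state_t *n) {
        if (!n->mtu_fixed) {
            /* Initial discovery: narrow down the usable MTU by probing
               between the current bounds, e.g. at the midpoint. */
            send_mtu_probe(n, n->minmtu + (n->maxmtu - n->minmtu) / 2);
            if (++n->probes_sent >= INITIAL_PROBES)
                n->mtu_fixed = true;
        } else {
            /* Steady state: from time to time, re-check maxmtu so we
               notice if the path MTU has changed. */
            send_mtu_probe(n, n->maxmtu);
        }
    }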
Furthermore, a side effect is that network chatter is reduced: instead
of sending bursts of three minmtu-sized packets in the steady state,
there is only one such packet that's sent from the UDP discovery code.
However, that introduces a slight regression in the bandwidth estimation
code, which relies on three-packet bursts in order to function.
Considering that this estimation is extremely unreliable (in my
experience) and isn't relied on by anything, this seems like an
acceptable regression.
Since UDP discovery is the place where UDP feasibility is checked, it
makes sense to test for local connectivity as well. This was previously
done as part of PMTU discovery.
This adds a new mechanism by which tinc can determine if a node is
reachable via UDP. The new mechanism is currently redundant with the
PMTU discovery mechanism - that will be fixed in a future commit.
Conceptually, the UDP discovery mechanism works similarly to PMTU
discovery: it sends UDP probes (of minmtu size, to make sure the tunnel
is fully usable), and assumes UDP is usable if it gets replies. It
assumes UDP is broken if too much time has passed since the last reply.
The big difference with the current PMTU discovery mechanism, however,
is that UDP discovery probes are only triggered as part of the
packet TX path (through try_tx()). This is quite interesting, because
it means tinc will never send UDP pings more often than normal packets,
and most importantly, it will automatically stop sending pings as soon
as packets stop flowing, thereby nicely reducing network chatter.
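A simplified sketch of such a TX-driven probe step; the interval,
timeout and field names are illustrative:

    #include <stdbool.h>
    #include <stdint.h>
    #include <time.h>

    typedef struct {
        time_t last_ping;    /* when we last sent a discovery probe */
        time_t last_reply;   /* when we last received a probe reply */
        bool udp_confirmed;  /* do we currently consider UDP usable? */
        uint16_t minmtu;     /* probes use this size so a reply proves
                                the tunnel carries full-sized packets */
    } udp_state_t;

    #define UDP_PING_INTERVAL 10   /* seconds between probes (illustrative) */
    #define UDP_TIMEOUT       30   /* declare UDP broken after this long */

    static void send_udp_probe(udp_state_t *n, uint16_t len) {
        (void)n; (void)len;        /* datagram transmission elided */
    }

    /* Called from try_tx(), i.e. only while real traffic is flowing. */
    static void try_udp(udp_state_t *n) {
        time_t now = time(NULL);

        if (n->udp_confirmed && now - n->last_reply > UDP_TIMEOUT)
            n->udp_confirmed = false;      /* replies stopped coming */

        /* At most one probe per interval - and since we are only
           called from the TX path, never more often than real packets. */
        if (now - n->last_ping >= UDP_PING_INTERVAL) {
            send_udp_probe(n, n->minmtu);
            n->last_ping = now;
        }
    }

    /* Called when the other node answers a probe. */
    static void udp_probe_reply(udp_state_t *n) {
        n->last_reply = time(NULL);
        n->udp_confirmed = true;
    }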
Of course, there are small drawbacks in some edge cases: for example,
if a node only sends one packet every minute to another node, these
packets will only be sent over TCP, because the interval between packets
is too long for tinc to maintain the UDP tunnel. I consider this a
feature, not a bug: I believe it is appropriate to use TCP in scenarios
where traffic is negligible, so that we don't pollute the network with
pings just to maintain a UDP tunnel that's seeing negligible usage.
This moves related functions together. try_tx() is at the right place
since its only caller is send_packet().
This is a pure cut-and-paste change. The reason it was not done in the
previous commit is because it would have made the diff harder to review.
Currently, the TX path (starting from send_packet()) in tinc has three
responsibilities:
- Making sure packets can be sent (e.g. fetching SPTPS keys);
- Making sure they can be sent optimally (e.g. fetching non-SPTPS keys
so that UDP can be used);
- Sending the actual packet, if feasible.
The first two are closely related; the third one, however, can be
cleanly separated from the other two - meaning, we can loosen code
coupling between sending packets and "optimizing" the way packets are
sent. This will become increasingly important as future commits will
move more tunnel establishment and maintenance code into the TX path,
so we will benefit from a cleaner separation of concerns.
This is especially relevant because of the dual nature of the TX path
(SPTPS versus non-SPTPS), which can make things really complicated when
trying to share low-level code between both.
In this commit, code related to establishing or improving tunnels is
moved away from the core TX path by introducing the "try_*()" family
of functions, of which try_sptps() already existed before this commit.
This is a pure refactoring; this commit shouldn't introduce any change
in behavior.
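A high-level sketch of the resulting shape of the TX path; everything
except the send_packet(), try_tx() and try_sptps() names is
illustrative:

    typedef struct node node_t;
    typedef struct packet packet_t;

    /* Tunnel establishment / improvement: best effort, never decides
       whether this particular packet gets sent. */
    static void try_sptps(node_t *n) {
        (void)n;   /* make sure SPTPS key negotiation is under way */
    }

    static void try_tx(node_t *n) {
        try_sptps(n);
        /* later commits hook more try_*() helpers in here */
    }

    /* Core TX path: send this packet as well as we currently can
       (UDP if feasible, TCP otherwise). */
    static void send_packet_now(node_t *n, packet_t *pkt) {
        (void)n; (void)pkt;   /* actual transmission elided */
    }

    static void send_packet(node_t *n, packet_t *pkt) {
        try_tx(n);              /* improve tunnels for future packets */
        send_packet_now(n, pkt);
    }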
This cleans up the PMTU probing function a little bit. It moves the
low-level sending of packets to a separate function, so that the code
reads naturally instead of using a weird for loop with "special
indexes". In addition, comments are moved inside the body of the
function for additional context.
This shouldn't introduce any change of behavior, except for local
discovery which has some minor logic fixes and which now always uses
small packets (16 bytes) because there's no need for a full-length
probe just to try the local network.
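A sketch of the resulting shape; the helper name
send_udp_probe_packet() and the buffer size are hypothetical, only the
idea of a separate low-level send function and the 16-byte local
discovery probe come from the text above:

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    typedef struct node node_t;

    /* Low-level helper: build a probe of the requested length and hand
       it to the UDP send path. */
    static void send_udp_probe_packet(node_t *n, size_t len) {
        uint8_t packet[2048];
        if (len > sizeof(packet))
            len = sizeof(packet);
        memset(packet, 0, len);
        packet[0] = 1;         /* e.g. mark it as a probe request */
        (void)n;               /* actual transmission elided */
    }

    static void send_mtu_probe(node_t *n, size_t probelen, bool local_discovery) {
        /* Regular PMTU probe: full requested length. */
        send_udp_probe_packet(n, probelen);

        /* Local discovery only needs to learn whether the node is
           reachable on the local network at all, so a small 16-byte
           probe is enough - no need for a full-length packet. */
        if (local_discovery)
            send_udp_probe_packet(n, 16);
    }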
The option "--disable-legacy-protocol" was added to the configure
script. The new protocol does not depend on any external crypto
libraries, so when the option is used tinc is no longer linked to
OpenSSL's libcrypto.
Make then thinks there should be a rule to produce the .c file, which
of course does not exist. Luckily, we can declare version.o as .PHONY,
and this will still cause the .o file to be regenerated and linked into
the binaries every time make is called.
On some platforms (Mac OS X for example), the res_init() function requires
linking with libresolv. On others (Linux, OpenBSD for example), res_init()
lives in libc.
Currently, when sending packets over TCP where the final recipient is
a node we have a direct metaconnection to, tinc first establishes a
SPTPS handshake between the two neighbors.
It turns out this SPTPS tunnel is not actually useful, because the
packet is only being sent over one metaconnection with no intermediate
nodes, and the metaconnection itself is already secured using a separate
SPTPS handshake.
Therefore it seems simpler and more efficient to simply send these
packets directly over the metaconnection itself without any additional
layer. This commit implements this solution without any changes to the
metaprotocol, since the appropriate message already exists: it's the
good old "plaintext" PACKET message.
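A simplified sketch of the decision; the helper names and the
connection field are illustrative stand-ins:

    typedef struct connection connection_t;
    typedef struct packet packet_t;

    typedef struct node {
        connection_t *connection;   /* non-NULL if we have a direct
                                       metaconnection with this node */
    } node_t;

    /* The good old plaintext PACKET message, carried over the already
       encrypted metaconnection. */
    static void send_packet_over_meta(connection_t *c, packet_t *pkt) {
        (void)c; (void)pkt;
    }

    /* The previous behaviour: an extra SPTPS layer tunnelled over TCP. */
    static void send_sptps_over_tcp(node_t *to, packet_t *pkt) {
        (void)to; (void)pkt;
    }

    /* TCP fallback: if the final recipient is a direct neighbour, skip
       the redundant SPTPS handshake and reuse the metaconnection. */
    static void send_packet_tcp(node_t *to, packet_t *pkt) {
        if (to->connection)
            send_packet_over_meta(to->connection, pkt);
        else
            send_sptps_over_tcp(to, pkt);
    }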
This change brings two significant benefits:
- Packets to neighbors can be sent immediately - there is no initial
delay and packet loss previously caused by the SPTPS handshake;
- Performance of sending packets to neighbors over TCP is greatly
improved since the data only goes through one round of encryption
instead of two.
Currently, when tinc establishes a metaconnection, it automatically
starts a VPN SPTPS tunnel with the other side of the metaconnection.
It is not clear what this is trying to accomplish. Having a
metaconnection with a node does not necessarily mean we're going to send
packets to that node. This patch removes this behavior, thereby
simplifying code paths and removing unnecessary network chatter.
Naturally, this introduces a slight delay (as well as at least one
initial packet loss) between the moment a metaconnection is established
and the moment VPN packets can be exchanged between the two nodes.
However this is no different to the non-neighbor case, so it makes
things more consistent and therefore easier to reason about.
The offset value indicates where the actual payload starts, so we can
process both legacy and SPTPS UDP packets without having to do casting
tricks and/or moving memory around.
Limit the amount of address/ID lookups to the minimum in all cases:
1) Legacy packets, need an address lookup.
2) Indirect SPTPS packets, need an address lookup + two ID lookups.
3) Direct SPTPS packets, need an ID or an address lookup.
So we start with an address lookup. If the source is a 1.1 node, we
know it's an SPTPS packet, and checking for a direct packet is then a
simple test of whether dstid is zero; if it isn't, we do the srcid and
dstid lookups. If the source is a 1.0 node, nothing more needs to be
done.
If the address is unknown, we first check whether the packet comes from
a 1.1 node by assuming it has a valid srcid and verifying the packet;
if that fails, we fall back to the old try_harder().
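A sketch of that dispatch; try_harder() is the existing function
mentioned above, while the other names, the version check and the stub
lookups are illustrative:

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    typedef struct node {
        int protocol_minor;   /* >= 1 for a 1.1 (SPTPS-capable) node */
    } node_t;

    typedef struct {
        uint8_t dstid[6];
        uint8_t srcid[6];
        /* seqno and SPTPS payload follow */
    } sptps_header_t;

    /* Illustrative stand-ins for the real lookup/verification routines. */
    static node_t *lookup_by_address(const void *addr) { (void)addr; return 0; }
    static node_t *lookup_by_id(const uint8_t id[6]) { (void)id; return 0; }
    static bool verify_as_sptps(const void *pkt) { (void)pkt; return false; }
    static node_t *try_harder(const void *addr, const void *pkt) {
        (void)addr; (void)pkt; return 0;
    }

    static void handle_incoming(const void *addr, const void *pkt) {
        static const uint8_t zero_id[6];
        const sptps_header_t *hdr = pkt;

        node_t *n = lookup_by_address(addr);  /* always one address lookup */

        if (n && n->protocol_minor >= 1) {
            /* Known 1.1 node, so this is an SPTPS packet (cases 2 and 3). */
            if (memcmp(hdr->dstid, zero_id, 6)) {
                /* Non-zero dstid: indirect packet, resolve both IDs. */
                node_t *src = lookup_by_id(hdr->srcid);
                node_t *dst = lookup_by_id(hdr->dstid);
                (void)src; (void)dst;         /* relay or receive */
            }
            /* Zero dstid: direct packet from n itself, nothing more to do. */
        } else if (!n) {
            /* Unknown address: assume a valid srcid and try to verify
               it as an SPTPS packet; otherwise fall back to try_harder(). */
            if (verify_as_sptps(pkt))
                n = lookup_by_id(hdr->srcid);
            else
                n = try_harder(addr, pkt);
        }
        /* Known 1.0 node (case 1): legacy packet, no further lookups. */
        (void)n;
    }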
If the peer presents a different key from the one we already know, log
an error. Otherwise, log an informational message, and terminate in the
same way as we would if we didn't already have that key.
The SPTPS code doesn't know about nodes, so when it logs an error about
a bad packet, it doesn't log which node it came from. So add a log
message with the node's name and hostname in receive_udppacket().
On Linux, tinc doesn't know the MAC address of the TAP device until the
first read. This means that if no packets are sent through the
interface, tinc won't be able to figure out which MAC address to tag
incoming packets with. As a result, it is impossible to receive any
packet until at least one packet has been sent.
When IPv6 is disabled, Linux does not spontaneously send any packets
when the interface comes up. At first users wonder why the node is not
responding to ICMP pings, and then as soon as at least one packet is
sent through the interface, pings mysteriously start working, resulting
in user confusion.
This change fixes that problem by making sure tinc is aware of the
device's MAC address even before the first packet is sent.
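For reference, a minimal standalone example of querying an interface's
MAC address on Linux with the standard SIOCGIFHWADDR ioctl (the
interface name is a placeholder; this is not the literal tinc code):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <sys/socket.h>
    #include <net/if.h>

    int main(void) {
        const char *ifname = "tap0";   /* placeholder interface name */
        struct ifreq ifr;
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        if (fd < 0) {
            perror("socket");
            return 1;
        }

        memset(&ifr, 0, sizeof(ifr));
        strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);

        if (ioctl(fd, SIOCGIFHWADDR, &ifr) < 0) {
            perror("SIOCGIFHWADDR");
            close(fd);
            return 1;
        }

        const unsigned char *mac = (unsigned char *)ifr.ifr_hwaddr.sa_data;
        printf("%02x:%02x:%02x:%02x:%02x:%02x\n",
               mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);

        close(fd);
        return 0;
    }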
Currently, when tinc sends UDP SPTPS datagrams through a relay, it
doesn't automatically start discovering PMTU with the relay. This means
that unless something else triggers PMTU discovery, tinc will keep using
TCP when sending packets through the relay.
This patch fixes the issue by explicitly establishing UDP tunnels with
relays.
Currently, we send MTU probes to each node we receive a key for, even if
we know we will never send UDP packets to that node because of
indirection. This commit disables MTU probing between nodes that have
direct communication disabled, otherwise MTU probes end up getting sent
through relays.
With the legacy protocol this was never a problem because we would never
request the key of a node with indirection enabled; with SPTPS this was
not a problem until we introduced relaying because send_sptps_data()
would simply ignore indirections, but this is not the case anymore.
Note that the fix is implemented in a quick and dirty way, by disabling
the call to send_mtu_probe() in ans_key_h(); this is not a clean fix
because there's no code to resume sending MTU probes in case the
indirection disappears because of a graph change.
This commit changes the layout of UDP datagrams to include a 6-byte
destination node ID at the very beginning of the datagram (i.e. before
the source node ID and the seqno). Note that this only applies to SPTPS.
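A sketch of the resulting on-the-wire layout; the 6-byte IDs and their
ordering come from this and the previous format change, while the
32-bit seqno width is an assumption:

    #include <stdint.h>

    /* SPTPS UDP datagram as seen on the wire after this change:
     *
     *   +-----------------+-----------------+-------+------------------+
     *   | dstid (6 bytes) | srcid (6 bytes) | seqno | SPTPS ciphertext |
     *   +-----------------+-----------------+-------+------------------+
     *
     * dstid is all zeroes when the packet is sent directly rather than
     * through a relay. */
    typedef struct {
        uint8_t  dstid[6];   /* final recipient, or zero for direct packets */
        uint8_t  srcid[6];   /* original sender (introduced in 17.4) */
        uint32_t seqno;      /* SPTPS sequence number (width assumed) */
        /* encrypted and authenticated SPTPS payload follows */
    } sptps_udp_header_t;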
Thanks to this new field, it is now possible to send SPTPS datagrams to
nodes that are not the final recipient of the packets, thereby using
these nodes as relay nodes. Previously SPTPS was unable to relay packets
using UDP, and required a fallback to TCP if the final recipient could
not be contacted directly using UDP. In that sense it fixes a regression
that SPTPS introduced with regard to the legacy protocol.
This change also updates tinc's low-level routing logic (i.e.
send_sptps_data()) to automatically use this relaying facility if at all
possible. Specifically, it will relay packets if we don't have a
confirmed UDP link to the final recipient (but we have one with the next
hop node), or if IndirectData is specified. This is similar to how the
legacy protocol forwards packets.
When sending packets directly without any relaying, the sender node uses
a special value for the destination node ID: instead of setting the
field to the ID of the recipient node, it writes a zero ID instead. This
allows the recipient node to distinguish between a relayed packet and a
direct packet, which is important when determining the UDP address of
the sending node.
On the relay side, relay nodes will happily relay packets that have a
destination ID which is non-zero *and* is different from their own,
provided that the source IP address of the packet is known. This is to
prevent abuse by random strangers, since a node can't authenticate the
packets that are being relayed through it.
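A sketch of that check on the relay side; the helper names and stubs
are illustrative:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    typedef struct node node_t;

    static const uint8_t zero_id[6];
    static uint8_t my_id[6];   /* this node's own ID */

    /* Illustrative stand-ins for the real routines. */
    static node_t *lookup_by_address(const void *addr) { (void)addr; return 0; }
    static node_t *lookup_by_id(const uint8_t id[6]) { (void)id; return 0; }
    static void relay_datagram(node_t *to, const void *pkt, size_t len) {
        (void)to; (void)pkt; (void)len;   /* retransmit the datagram as-is */
    }

    /* Returns true if the datagram was consumed as a relayed packet. */
    static bool maybe_relay(const void *addr, const uint8_t dstid[6],
                            const void *pkt, size_t len) {
        if (!memcmp(dstid, zero_id, 6))
            return false;                 /* direct packet, handle locally */
        if (!memcmp(dstid, my_id, 6))
            return false;                 /* we are the final recipient */
        if (!lookup_by_address(addr))
            return false;                 /* unknown source address: we
                                             can't authenticate relayed
                                             packets, so don't relay for
                                             strangers */

        node_t *to = lookup_by_id(dstid);
        if (to)
            relay_datagram(to, pkt, len); /* otherwise silently drop */
        return true;
    }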
This change keeps the protocol number from the previous datagram format
change (source IDs), 17.4. Compatibility is still preserved with 1.0 and
with pre-1.1 releases. Note, however, that nodes running this code won't
understand datagrams sent from nodes that only use source IDs and
vice-versa (not that we really care).
There is one caveat: in the current state, there is no way for the
original sender to know what the PMTU is beyond the first hop, and
contrary to the legacy protocol, relay nodes can't apply MSS clamping
because they can't decrypt the relayed packets. This leads to
inefficient scenarios where a reduced PMTU over some link that's part of
the relay path will result in relays falling back to TCP to send packets
to their final destinations.
Another caveat is that once a packet gets sent over TCP, it will use
TCP over the entire path, even if it is technically possible to use UDP
beyond the TCP-only link(s).
Arguably, these two caveats can be fixed by improving the
metaconnection protocol, but that's out of scope for this change. TODOs
are added instead. In any case, this is no worse than before.
In addition, this change increases SPTPS datagram overhead by another
6 bytes for the destination ID, on top of the existing 6-byte overhead
from the source ID.