net_packet doesn't actually use send_sptps_data(); it only uses
send_sptps_data_priv(). In addition, the only user of send_sptps_data()
is protocol_key. Therefore it makes sense to expose
send_sptps_data_priv() directly, and move send_sptps_data() (which is
basically just boilerplate) as a local function in protocol_key.
In this commit, nodes use MTU_INFO messages to provide MTU information.
The issue this code is meant to address is the non-trivial problem of
finding the proper MTU when UDP SPTPS relays are involved. Currently,
tinc has no idea what the MTU looks like beyond the first relay, and
will arbitrarily use the first relay's MTU as the limit. This will fail
miserably if the MTU decreases after the first relay, forcing relays to
fall back to TCP. More generally, one should keep in mind that relay
paths can be arbitrarily complex, resulting in packets taking "epic
journeys" through the graph, switching back and forth between UDP (with
variable MTUs) and TCP multiple times along the path.
A solution that was considered consists in sending standard MTU probes
through the relays. This is inefficient (if there are 3 nodes on one
side of relay and 3 nodes on the other side, we end up with 3*3=9 MTU
discoveries taking place at the same time, while technically only
3+3=6 are needed) and would involve eyebrow-raising behaviors such as
probes being sent over TCP.
This commit implements an alternative solution, which consists in
the packet receiver sending MTU_INFO messages to the packet sender.
The message contains an MTU value which is set to maximum when the
message is originally sent. The message gets altered as it travels
through the metagraph, such that when the message arrives to the
destination, the MTU value contained in the message can be used to
send packets while making sure no relays will be forced to fall back to
TCP to deliver them.
The operating principles behind such a protocol message are similar to
how the UDP_INFO message works, but there is a key difference that
prevents us from simply reusing the same message: the UDP_INFO message
only cares about relay-to-relay links (i.e. it is sent between static
relays and the information it contains only makes sense between two
adjacent static relays), while the MTU_INFO cares about the end-to-end
MTU, including the entire relay path. Therefore, UDP_INFO messages stop
when they encounter static relays, while MTU_INFO messages don't stop
until they get to the original packet sender.
Note that, technically, the MTU that is obtained through this mechanism
can be slightly pessimistic, because it can be lowered by an
intermediate node that is not being used as a relay. Since nodes have no
way of knowing whether they'll be used as dynamic relays or not (and
have no say in the matter), this is not a trivial problem. That said,
this is highly unlikely to result in noticeable issues in realistic
scenarios.
In this commit, nodes use UDP_INFO messages to provide UDP address
information. The basic principle is that the node that receives packets
sends UDP_INFO messages to the node that's sending the packets. The
message originally contains no address information, and is (hopefully)
updated with relevant address information as it gets relayed through the
metagraph - specifically, each intermediate node will update the message
with its best guess as to what the address is while forwarding it.
When a node receives an UDP_INFO message, and it doesn't have a
confirmed UDP tunnel with the originator node, it will update its
records with the new address for that node, so that it always has the
best possible guess as to how to reach that node. This applies to the
destination node of course, but also to any intermediate nodes, because
there's no reason they should pass on the free intel, and because it
results in nice behavior in the presence of relay chains (multiple nodes
in a path all trying to reach the same destination).
If, on the other hand, the node does have a confirmed UDP tunnel, it
will ignore the address information contained in the message.
In all cases, if the node that receives the message is not the
destination node specified in the message, it will forward the message
but not before overriding the address information with the one from its
own records. If the node has a confirmed UDP tunnel, that means the
message is updated with the address of the confirmed tunnel; if not,
the message simply reflects the records of the intermediate node, which
just happen to be the contents of the UDP_INFO message it just got, so
it's simply forwarded with no modification.
This is similar to the way ANS_KEY messages are currently
overloaded to provide UDP address information, with two differences:
- UDP_INFO messages are sent way more often than ANS_KEY messages,
thereby keeping the address information fresh. Previously, if the UDP
situation were to change after the ANS_KEY message was sent, the
sender would virtually never get the updated information.
- Once a node puts address information in an ANS_KEY message, it is
never changed again as the message travels through the metagraph; in
contrast, UDP_INFO messages behave the opposite way, as they get
rewritten every time they travel through a node with a confirmed UDP
tunnel. The latter behavior seems more appropriate because UDP tunnel
information becomes more relevant as it moves closer to the
destination node. The ANS_KEY behavior is not satisfactory in some
cases such as multi-layered graphs where the first hop is located
before a NAT.
Ultimately, the rationale behind this whole process is to improve UDP
hole punching capabilities when port translation is in effect, and more
generally, to make tinc more reliable in (very) hostile network
conditions (such as multi-layered NAT).
In tinc 1.0.x, this was tracked in node->inkey, however in tinc 1.1 we have an abstraction layer for
the legacy cipher and digest, and we don't keep an explicit copy of the key around. We cannot use
cipher_active() or digest_active(), since it is possible to set both to the null algorithm. So add a bit to
node_status_t.
Currently, the PMTU discovery code is run by a timeout callback,
independently of tunnel activity. This commit moves it into the TX
path, meaning that send_mtu_probe_handler() is only called if a
packet is about to be sent. Consequently, it has been renamed to
try_mtu() for consistency with try_tx(), try_udp() and try_sptps().
Running PMTU discovery code only as part of the TX path prevents
PMTU discovery from generating unreasonable amounts of traffic when
the "real" traffic is negligible. One extreme example is sending one
real packet and then going silent: in the current code this one little
packet will result in the entire PMTU discovery algorithm being run
from start to finish, resulting in absurd write traffic amplification.
With this patch, PMTU discovery stops as soon as "real" packets stop
flowing, and will be no more aggressive than the underlying traffic.
Furthermore, try_mtu() only runs if there is confirmed UDP
connectivity as per the UDP discovery mechanism. This prevents
unnecessary network chatter - previously, the PMTU discovery code
would send bursts of (potentially large) probe packets every second
even if there was nothing on the other side. With this patch, the
PMTU code only does that if something replied to the lightweight UDP
discovery pings.
These inefficiencies were made even worse when the node is not a
direct neighbour, as tinc will use PMTU discovery both on the
destination node *and* the relay. UDP discovery is more lightweight for
this purpose.
As a bonus, this code simplifies overall code somewhat - state is
easier to manage when code is run in predictable contexts as opposed
to "surprise callbacks". In addition, there is no need to call PMTU
discovery code outside of net_packet.c anymore, thereby simplifying
module boundaries.
The option "--disable-legacy-protocol" was added to the configure
script. The new protocol does not depend on any external crypto
libraries, so when the option is used tinc is no longer linked to
OpenSSL's libcrypto.
Currently, we send MTU probes to each node we receive a key for, even if
we know we will never send UDP packets to that node because of
indirection. This commit disables MTU probing between nodes that have
direct communication disabled, otherwise MTU probes end up getting sent
through relays.
With the legacy protocol this was never a problem because we would never
request the key of a node with indirection enabled; with SPTPS this was
not a problem until we introduced relaying because send_sptps_data()
would simply ignore indirections, but this is not the case anymore.
Note that the fix is implemented in a quick and dirty way, by disabling
the call to send_mtu_probe() in ans_key_h(); this is not a clean fix
because there's no code to resume sending MTU probes in case the
indirection disappears because of a graph change.
This commit changes the layout of UDP datagrams to include a 6-byte
destination node ID at the very beginning of the datagram (i.e. before
the source node ID and the seqno). Note that this only applies to SPTPS.
Thanks to this new field, it is now possible to send SPTPS datagrams to
nodes that are not the final recipient of the packets, thereby using
these nodes as relay nodes. Previously SPTPS was unable to relay packets
using UDP, and required a fallback to TCP if the final recipient could
not be contacted directly using UDP. In that sense it fixes a regression
that SPTPS introduced with regard to the legacy protocol.
This change also updates tinc's low-level routing logic (i.e.
send_sptps_data()) to automatically use this relaying facility if at all
possible. Specifically, it will relay packets if we don't have a
confirmed UDP link to the final recipient (but we have one with the next
hop node), or if IndirectData is specified. This is similar to how the
legacy protocol forwards packets.
When sending packets directly without any relaying, the sender node uses
a special value for the destination node ID: instead of setting the
field to the ID of the recipient node, it writes a zero ID instead. This
allows the recipient node to distinguish between a relayed packet and a
direct packet, which is important when determining the UDP address of
the sending node.
On the relay side, relay nodes will happily relay packets that have a
destination ID which is non-zero *and* is different from their own,
provided that the source IP address of the packet is known. This is to
prevent abuse by random strangers, since a node can't authenticate the
packets that are being relayed through it.
This change keeps the protocol number from the previous datagram format
change (source IDs), 17.4. Compatibility is still preserved with 1.0 and
with pre-1.1 releases. Note, however, that nodes running this code won't
understand datagrams sent from nodes that only use source IDs and
vice-versa (not that we really care).
There is one caveat: in the current state, there is no way for the
original sender to know what the PMTU is beyond the first hop, and
contrary to the legacy protocol, relay nodes can't apply MSS clamping
because they can't decrypt the relayed packets. This leads to
inefficient scenarios where a reduced PMTU over some link that's part of
the relay path will result in relays falling back to TCP to send packets
to their final destinations.
Another caveat is that once a packet gets sent over TCP, it will use
TCP over the entire path, even if it is technically possible to use UDP
beyond the TCP-only link(s).
Arguably, these two caveats can be fixed by improving the
metaconnection protocol, but that's out of scope for this change. TODOs
are added instead. In any case, this is no worse than before.
In addition, this change increases SPTPS datagram overhead by another
6 bytes for the destination ID, on top of the existing 6-byte overhead
from the source ID.
In this commit, if a node receives a REQ_PUBKEY message from a node it
doesn't have the key for, it will send a REQ_PUBKEY message in return
*before* sending its own key.
The rationale is to prevent delays when establishing communication
between two nodes that see each other for the first time. These delays
are caused by the first SPTPS packet being dropped on the floor, as
shown in the following typical exchange:
node1: No Ed25519 key known for node2
REQ_PUBKEY ->
<- ANS_PUBKEY
node1: Learned Ed25519 public key from node2
REQ_SPTPS_START ->
node2: No Ed25519 key known for zyklos
<- REQ_PUBKEY
ANS_PUBKEY ->
node2: Learned Ed25519 public key from node1
-- 10-second delay --
node1: No key from node2 after 10 seconds, restarting SPTPS
REQ_SPTPS_START ->
<- SPTPS ->
node1: SPTPS key exchange with node2 succesful
node2: SPTPS key exchange with node1 succesful
With this patch, the following happens instead:
node1: No Ed25519 key known for node2
REQ_PUBKEY ->
node2: Preemptively requesting Ed25519 key for node1
<- REQ_PUBKEY
<- ANS_PUBKEY
ANS_PUBKEY ->
node2: Learned Ed25519 public key from node1
node1: Learned Ed25519 public key from node2
REQ_SPTPS_START ->
<- SPTPS ->
node1: SPTPS key exchange with node2 succesful
node2: SPTPS key exchange with node1 succesful
The only places where connection_t::status.active is modified is in
ack_h() and terminate_connection(). In both cases, connection_t::edge
is added and removed at the same time, and that's the only places
connection_t::edge is set. Therefore, the following is true at all
times:
!c->status.active == !c->edge
This commit removes the redundant state information by getting rid of
connection_t::status.active, and using connection_t::edge instead.
This gets rid of the rest of the symbolic links. However, as a consequence, the
crypto header files have now moved to src/, and can no longer contain
library-specific declarations. Therefore, cipher_t, digest_t, ecdh_t, ecdsa_t
and rsa_t are now all opaque types, and only pointers to those types can be
used.
Keep track of the number of correct, non-replayed UDP packets that have been
received, regardless of their content. This can be compared to the sequence
number to determine the real packet loss.
Only the very first packet of an SPTPS session should be send with REQ_KEY,
this signals the peer to abort any previous session and start a new one as
well.
The tree functions were never used on the connection_tree, a list is more appropriate.
Also be more paranoid about connections disappearing while traversing the list.
Similar to old style key exchange requests, keep track of whether a key
exchange is already in progress and how long it took. If no key is known yet
or if key exchange takes too long, (re)start a new key exchange.
When two nodes which support SPTPS want to send packets to each other, they now
always use SPTPS. The node initiating the SPTPS session send the first SPTPS
packet via an extended REQ_KEY messages. All other handshake messages are sent
using ANS_KEY messages. This ensures that intermediate nodes using an older
version of tinc can still help with NAT traversal. After the authentication
phase is over, SPTPS packets are sent via UDP, or are encapsulated in extended
REQ_KEY messages instead of PACKET messages.
This allows tincctl to receive log messages from a running tincd,
independent of what is logged to syslog or to file. Tincctl can receive
debug messages with an arbitrary level.