Currently, we send MTU probes to each node we receive a key for, even if
we know we will never send UDP packets to that node because of
indirection. This commit disables MTU probing between nodes that have
direct communication disabled, otherwise MTU probes end up getting sent
through relays.
With the legacy protocol this was never a problem because we would never
request the key of a node with indirection enabled; with SPTPS this was
not a problem until we introduced relaying because send_sptps_data()
would simply ignore indirections, but this is not the case anymore.
Note that the fix is implemented in a quick and dirty way, by disabling
the call to send_mtu_probe() in ans_key_h(); this is not a clean fix
because there's no code to resume sending MTU probes in case the
indirection disappears because of a graph change.
This commit changes the layout of UDP datagrams to include a 6-byte
destination node ID at the very beginning of the datagram (i.e. before
the source node ID and the seqno). Note that this only applies to SPTPS.
Thanks to this new field, it is now possible to send SPTPS datagrams to
nodes that are not the final recipient of the packets, thereby using
these nodes as relay nodes. Previously SPTPS was unable to relay packets
using UDP, and required a fallback to TCP if the final recipient could
not be contacted directly using UDP. In that sense it fixes a regression
that SPTPS introduced with regard to the legacy protocol.
This change also updates tinc's low-level routing logic (i.e.
send_sptps_data()) to automatically use this relaying facility if at all
possible. Specifically, it will relay packets if we don't have a
confirmed UDP link to the final recipient (but we have one with the next
hop node), or if IndirectData is specified. This is similar to how the
legacy protocol forwards packets.
When sending packets directly without any relaying, the sender node uses
a special value for the destination node ID: instead of setting the
field to the ID of the recipient node, it writes a zero ID instead. This
allows the recipient node to distinguish between a relayed packet and a
direct packet, which is important when determining the UDP address of
the sending node.
On the relay side, relay nodes will happily relay packets that have a
destination ID which is non-zero *and* is different from their own,
provided that the source IP address of the packet is known. This is to
prevent abuse by random strangers, since a node can't authenticate the
packets that are being relayed through it.
This change keeps the protocol number from the previous datagram format
change (source IDs), 17.4. Compatibility is still preserved with 1.0 and
with pre-1.1 releases. Note, however, that nodes running this code won't
understand datagrams sent from nodes that only use source IDs and
vice-versa (not that we really care).
There is one caveat: in the current state, there is no way for the
original sender to know what the PMTU is beyond the first hop, and
contrary to the legacy protocol, relay nodes can't apply MSS clamping
because they can't decrypt the relayed packets. This leads to
inefficient scenarios where a reduced PMTU over some link that's part of
the relay path will result in relays falling back to TCP to send packets
to their final destinations.
Another caveat is that once a packet gets sent over TCP, it will use
TCP over the entire path, even if it is technically possible to use UDP
beyond the TCP-only link(s).
Arguably, these two caveats can be fixed by improving the
metaconnection protocol, but that's out of scope for this change. TODOs
are added instead. In any case, this is no worse than before.
In addition, this change increases SPTPS datagram overhead by another
6 bytes for the destination ID, on top of the existing 6-byte overhead
from the source ID.
In this commit, if a node receives a REQ_PUBKEY message from a node it
doesn't have the key for, it will send a REQ_PUBKEY message in return
*before* sending its own key.
The rationale is to prevent delays when establishing communication
between two nodes that see each other for the first time. These delays
are caused by the first SPTPS packet being dropped on the floor, as
shown in the following typical exchange:
node1: No Ed25519 key known for node2
REQ_PUBKEY ->
<- ANS_PUBKEY
node1: Learned Ed25519 public key from node2
REQ_SPTPS_START ->
node2: No Ed25519 key known for zyklos
<- REQ_PUBKEY
ANS_PUBKEY ->
node2: Learned Ed25519 public key from node1
-- 10-second delay --
node1: No key from node2 after 10 seconds, restarting SPTPS
REQ_SPTPS_START ->
<- SPTPS ->
node1: SPTPS key exchange with node2 succesful
node2: SPTPS key exchange with node1 succesful
With this patch, the following happens instead:
node1: No Ed25519 key known for node2
REQ_PUBKEY ->
node2: Preemptively requesting Ed25519 key for node1
<- REQ_PUBKEY
<- ANS_PUBKEY
ANS_PUBKEY ->
node2: Learned Ed25519 public key from node1
node1: Learned Ed25519 public key from node2
REQ_SPTPS_START ->
<- SPTPS ->
node1: SPTPS key exchange with node2 succesful
node2: SPTPS key exchange with node1 succesful
The only places where connection_t::status.active is modified is in
ack_h() and terminate_connection(). In both cases, connection_t::edge
is added and removed at the same time, and that's the only places
connection_t::edge is set. Therefore, the following is true at all
times:
!c->status.active == !c->edge
This commit removes the redundant state information by getting rid of
connection_t::status.active, and using connection_t::edge instead.
This gets rid of the rest of the symbolic links. However, as a consequence, the
crypto header files have now moved to src/, and can no longer contain
library-specific declarations. Therefore, cipher_t, digest_t, ecdh_t, ecdsa_t
and rsa_t are now all opaque types, and only pointers to those types can be
used.
Keep track of the number of correct, non-replayed UDP packets that have been
received, regardless of their content. This can be compared to the sequence
number to determine the real packet loss.
Only the very first packet of an SPTPS session should be send with REQ_KEY,
this signals the peer to abort any previous session and start a new one as
well.
The tree functions were never used on the connection_tree, a list is more appropriate.
Also be more paranoid about connections disappearing while traversing the list.
Similar to old style key exchange requests, keep track of whether a key
exchange is already in progress and how long it took. If no key is known yet
or if key exchange takes too long, (re)start a new key exchange.
When two nodes which support SPTPS want to send packets to each other, they now
always use SPTPS. The node initiating the SPTPS session send the first SPTPS
packet via an extended REQ_KEY messages. All other handshake messages are sent
using ANS_KEY messages. This ensures that intermediate nodes using an older
version of tinc can still help with NAT traversal. After the authentication
phase is over, SPTPS packets are sent via UDP, or are encapsulated in extended
REQ_KEY messages instead of PACKET messages.
This allows tincctl to receive log messages from a running tincd,
independent of what is logged to syslog or to file. Tincctl can receive
debug messages with an arbitrary level.
REQ_KEY requests have an extra field indicating key exchange version.
If it is present and > 0, the sender supports ECDH. If the receiver also
does, then it will generate a new keypair and sends the public key in a
ANS_KEY request with "ECDH:" prefixed. The ans_key_h() function will
compute the shared secret, which, at the moment,is used as is to set the
cipher and HMAC keys. However, this must be changed to use a proper KDF.
In the future, the ECDH key exchange must also be signed.