Encryption and authentication of the meta connection is spread out over
meta.c and protocol_auth.c. The new protocol was added there as well,
leading to spaghetti code. To improve things, the new protocol will now
be implemented in sptps.[ch].
The goal is to have a very simplified version of TLS. There is a record
layer, and there are only two record types: application data and
handshake messages. The handshake message contains a random nonce, an
ephemeral ECDH public key, and an ECDSA signature over the former. After
the ECDH public keys are exchanged, a shared secret is calculated, and a
TLS style PRF is used to generate the key material for the cipher and
HMAC algorithm, and further communication is encrypted and authenticated.
A lot of the simplicity comes from the fact that both sides must have
each other's public keys in advance, and there are no options to choose.
There will be one fixed cipher suite, and both peers always authenticate
each other. (Inspiration taken from Ian Grigg's hypotheses[0].)
There might be some compromise in the future, to enable or disable
encryption, authentication and compression, but there will be no choice
of algorithms. This will allow SPTPS to be built with a few embedded
crypto algorithms instead of linking with huge crypto libraries.
The API is also kept simple. There is a start and a stop function. All
data necessary to make the connection work is passed in the start
function. Instead having both send- and receive-record functions, there
is a send-record function and a receive-data function. The latter will
pass protocol data received from the peer to the SPTPS implementation,
which will in turn call a receive-record callback function when
necessary. This hides all the handshaking from the application, and is
completely independent from any event loop or socket characteristics.
[0] http://iang.org/ssl/hn_hypotheses_in_secure_protocol_design.html
This is mainly important for Windows, where the select() call in the
main thread is not being woken up when the tapreader thread calls
route(), causing a delay of up to 1 second before the output buffer is
flushed. This would cause bad performance when UDP communication is not
possible.
First of all, if there really are two nodes with the same name, much
more than 10 contradicting ADD_EDGE and DEL_EDGE messages will be sent.
Also, we forgot to reset the counters when nothing happened.
In case there is a ADD_EDGE/DEL_EDGE storm, we do not shut down, but
sleep an increasing amount of time, allowing tinc to recover gracefully
from temporary failures.
The length parameter for the encoding functions is the length of the
binary input, and for the decoding functions it is the maximum size of
the binary output.
The return value is always the length of the resulting output, excluding
the terminating NULL character for the encoding routines.
All functions can encode and decode in-place. The encoding functions
will always write a terminating NULL character, and the decoding
functions will stop at a NULL character.
If we don't have ECDSA keys for the node we connect to, set protocol_minor
to 1, to indicate this to the other end. This will first complete the
old way of authentication with RSA keys, and will then exchange ECDSA keys.
The connection will be terminated right afterwards, and the next attempt
will use ECDSA keys.
The generate-keys command now generates both an RSA and an ECDSA keypair,
but one can generate-rsa-keys or generate-ecdsa-keys to just generate one type.
It is modelled after the pseudorandom function from RFC4346 (TLS 1.1), the only
significant change is the use of SHA512 and Whirlpool instead of MD5 and SHA1.
REQ_KEY requests have an extra field indicating key exchange version.
If it is present and > 0, the sender supports ECDH. If the receiver also
does, then it will generate a new keypair and sends the public key in a
ANS_KEY request with "ECDH:" prefixed. The ans_key_h() function will
compute the shared secret, which, at the moment,is used as is to set the
cipher and HMAC keys. However, this must be changed to use a proper KDF.
In the future, the ECDH key exchange must also be signed.
The pid is now written first, so that a version 1.0.x tincd can be used to stop
a running version 1.1 tincd. Getsockname() is used to determine the address of
the first listening socket, so that tincctl can connect to the local tincd even
if AddressFamily = ipv6, or if BindToAddress or BindToInterface is used.
Instead of UNIX time, the log messages now start with the time in RFC3339
format, which human-readable and still easy for the computer to parse and sort.
The HUP signal will also cause the log file to be closed and reopened, which is
useful when log rotation is used. If there is an error while opening the log
file, this is logged to stderr.
But we do ignore SIGPIPE, and tinc 1.0.x signals that are no longer used
(SIGUSR1 and SIGUSR2), since the default handler of these signals is to
terminate tincd immediately.
Although we use qsort(), which is not guaranteed to be stable, resorting the
previously sorted array is more stable than recreating and resorting the array
each time.
We live in the 21st century, and we require C99 semantics, so we do not need to
work around buggy libcs. The xmalloc() and related functions are now static
inline functions.
We don't override any signal handlers anymore except those for SIGPIPE and
SIGCHLD. Fatal signals (SIGSEGV, SIGBUS etc.) will terminate tincd and
optionally dump core. The previous behaviour was to terminate gracefully and
try to restart, but that usually failed and made any core dump useless.
This would allow tincctl to connect to a remote tincd, or to a local tincd that
isn't listening on localhost, for example if it is using the BindToInterface or
BindToAddress options.