Testing has revealed that the newer series of Windows TAP drivers (i.e.
9.0.0.21 and later, also known as NDIS6, tap-windows6) suffer from
serious performance issues in the write path. Write operations seems to
take a very long time to complete, resulting in massive packet loss even
for throughputs as low as 10 Mbit/s.
I've made some attempts to alleviate the problem using parellelism. By
using custom code that allows up to 256 write operations at the same
time the results are much better, but it's still about 2 times worse
than the traditional 9.0.0.9 driver.
We need to investigate more and file a bug against tap-windows6, but in
the mean time, let's inform the user that he might not want to use the
latest drivers.
This is generally useful. We've seen issues that are specific to some
version of these drivers (especially the newer 9.0.0.21 version), so
it's relevant to log it, especially since that means it will be
copy-pasted by people posting their logs asking for help.
Write operations to the Windows device do not necessarily complete
immediately; in fact, with the latest TAP-Win32 drivers, this never
seems to be the case.
write_packet() does not handle that case correctly, because the
OVERLAPPED structure and the packet data go out of scope before the
write operation completes, resulting in race conditions.
This commit fixes the issue by making sure these data structures are
kept in global scope, and by dropping any packets that may arrive while
the previous write operation is still pending.
On Windows, when disabling the device, tinc uses the CancelIo() to
cancel the pending read operation, and then proceeds to delete the event
handle immediately.
This assumes that CancelIo() blocks until the pending read request is
completely torn down and no references to it remain. While MSDN is not
completely clear on that subject, it does suggest that this is not the
case:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa363791.aspx
If the function succeeds [...] the cancel operation for all pending
I/O operations issued by the calling thread for the specified file
handle was successfully requested.
This implies that cancellation was merely "requested", and that there
are no guarantees as to the state of the operation when CancelIo()
returns. Therefore, care must be taken not to close event handles
prematurely.
While I'm no aware of this potential race condition causing any problems
in practice, I don't want to take any chances.
The offset value indicates where the actual payload starts, so we can
process both legacy and SPTPS UDP packets without having to do casting
tricks and/or moving memory around.
The handling of TAP-Win32 virtual network device reads that complete
immediately (ReadFile() returns TRUE) is incorrect - instead of
starting a new read, tinc will continue listening for the overlapped
read completion event which will never fire. As a result, tinc stops
receiving packets on the interface.
With newer TAP-Win32 versions (such as the experimental
tap-windows6 9.21.0), tinc is unable to read from the virtual network
device:
Error while reading from (null) {23810A13-BCA9-44CE-94C6-9AEDFBF85736}: No such file or directory
This is because these new drivers apparently don't accept reads when
the device is not in the connected state (media status).
This commit fixes the issue by making sure we start reading no sooner
than when the device is enabled, and that we stop reading when the
device is disabled. This also makes the behavior somewhat cleaner,
because it doesn't make much sense to read from a disabled device
anyway.
This removes a bunch of variables that are never actually used anywhere.
This fixes the following compiler warning when building for Windows:
mingw/device.c:46:17: error: ‘device_total_in’ defined but not used [-Werror=unused-variable]
static uint64_t device_total_in = 0;
^
This fixes the following compiler warning when building for Windows:
mingw/device.c: In function ‘setup_device’:
mingw/device.c:92:9: error: unused variable ‘thread’ [-Werror=unused-variable]
HANDLE thread;
^
This fixes the following compiler warning when building for Windows:
mingw/device.c: In function ‘setup_device’:
mingw/device.c:186:2: error: passing argument 2 of ‘io_add_event’ from incompatible pointer type [-Werror]
io_add_event(&device_read_io, device_handle_read, NULL, CreateEvent(NULL, TRUE, FALSE, NULL));
^
In file included from mingw/../net.h:27:0,
from mingw/../subnet.h:24,
from mingw/../conf.h:34,
from mingw/device.c:26:
mingw/../event.h:61:13: note: expected ‘io_cb_t’ but argument is of type ‘void (*)(void *)’
extern void io_add_event(io_t *io, io_cb_t cb, void* data, WSAEVENT event);
tinc is using a separate thread to read from the TAP device on Windows.
The rationale was that the notification mechanism for packets arriving
on the virtual network device is based on Win32 events, and the event
loop did not support listening to these events.
Thanks to recent improvements, this event loop limitation has been
lifted. Therefore we can get rid of the separate thread and simply add
the Win32 "incoming packet" event to the event loop, just like a socket.
The result is cleaner code that's easier to reason about.
Besides controlling when tinc-up and tinc-down get called, this commit makes
DeviceStandby control when the virtual network interface "cable" is "plugged"
on Windows. This is more user-friendly as the status of the tinc network can
be seen just by looking at the state of the network interface, and it makes
Windows behave better when isolated.
Before, the tapreader thread would just exit immediately after encountering the
first error, without notifying the main thread. Now, the tapreader thead never
exits itself, but tells the main thread to stop when more than ten errors are
encountered in a row.
This allows tincctl to receive log messages from a running tincd,
independent of what is logged to syslog or to file. Tincctl can receive
debug messages with an arbitrary level.
Apart from the platform specific tun/tap driver, link with the dummy and
raw_socket devices, and optionally with support for UML and VDE devices.
At runtime, the DeviceType option can be used to select which driver to
use.
This feature is not necessary anymore since we have tools like valgrind today
that can catch stack overflow errors before they make a backtrace in gdb
impossible.
The TAP-Win32 device is not a socket, and select() under Windows only works
with sockets. Tinc used a separate thread to read from the TAP-Win32 device,
and passed this via a local socket to the main thread which could then select()
from it. We now use a global mutex, which is only unlocked when the main thread
is waiting for select(), to allow the TAP reader thread to process packets
directly.