Syncthing, HAProxy, and TCP buffer inspection

I use Syncthing for sharing files between my devices, primarily for backing up pictures on my phone to my laptop, and copying music on my laptop to my phone. I'm quite pleased with it as a solution overall; it can be somewhat inscrutable to debug, but once it's working it's very nice to use.

The basic premise with Syncthing is that two or more appropriately configured devices can locate each other on a network, authenticate to each other, and then synchronise the contents of shared folders between themselves. Devices can discover each other through IP multicast, if they're on the same layer 2 network, or using an external discovery server. Similarly, data can be transferred directly between devices if they're on the same network, through a direct peer-to-peer TCP connection, or using an intermediate relay server if the two devices can't directly connect to each other. All communication between two devices is end-to-end encrypted, so all the data transferred via the intermediate components should remain confidential.

The Syncthing developers run both a public discovery server cluster and a public pool of relay servers, and Syncthing makes use of both of these out of the box. The default settings are to locate peer devices using both local multicast and the public discovery servers, and to use the public relay server pool when other devices are not directly reachable. It's worth mentioning that it's also possible to configure Syncthing to listen on a fixed address, and then tell other peers to connect directly to that address, though the automatic discovery is really convenient.

The discovery and relay servers are open source, like the main Syncthing client, so you can run your own if you want, which is something that I decided to do. It's all documented on Syncthing's documentation website.

The relay server uses a custom protocol and listens on TCP port 22067 by default; a Syncthing client can be pointed at it by adding an entry like relay://relay.example.com:22067 to the client's list of listen addresses. The discovery server uses HTTP(S) as a transport and listens on TCP port 8443 by default, though the documentation also details how to run it behind an HTTP reverse proxy like nginx, with the proxy listening on the standard TCP port for HTTPS, 443. The discovery server can then be enabled on a client by adding the server's URL to the list of global discovery servers in the Syncthing client's settings.

The potential for using a reverse proxy caught my attention, as it's conceivable that this may be desirable if one or more client devices are behind a firewall which does not permit outbound TCP connections to port 8443, but does allow port 443 (like the free WiFi available on some trains I've been on in the past).

This then got me thinking about the relay server — it would be convenient if that could also be on a common port like 443. There are multiple ways to achieve this, such as choosing another common port which isn't in use by the discovery server, or running the relay server on port 443 on a different IP address. The first of those is a perhaps boring but otherwise perfectly adequate solution; the second has the issue of needing another IP address — IPv4 addresses are expensive, and free train WiFi tends not to have any IPv6 connectivity, which negates the ubiquity of IPv6 addresses.

Alternatively, one could try and find some way to multiplex both the relay and discovery servers over a single port, which seemed like a lot more fun (and a much more interesting way to waste a few evenings in making it work and then writing about it).

Enter HAProxy!

HAProxy is a TCP and HTTP reverse proxy and load balancer. It's very configurable (which makes it wonderfully arcane if you're not familiar with it), and it can be used in a broad variety of applications.

In HTTP mode, HAProxy can make forwarding decisions based on components of an HTTP request as one might expect, such as the contents of headers or the path given in a request. In TCP mode however, it's possible to sample the raw data received from a client, which ranges from simple matching of static byte sequences to deep inspection of incoming TLS client handshakes (for which HAProxy has quite a bit of built-in functionality).

One thing which I've seen HAProxy's TCP mode used for, which I find particularly cool, is selecting a backend server based on the Server Name Indication (SNI) sent with a TLS client handshake. The SNI is sent in cleartext, so HAProxy doesn't need private key material for any of the backend servers to which it's forwarding connections, and the connection stays encrypted end-to-end between the client and the backends.

So I decided to see if I could use HAProxy for muxing Syncthing's discovery and relay protocols over a single socket. While my discovery and relay servers are running on the same machine, I refer to them using distinct fully-qualified domain names (which are both aliases to the host machine's own domain name), so that any cleartext server names sent by clients uniquely identify the intended destination. This makes handling the discovery protocol quite straightforward, as it runs on top of HTTPS, so this can be identified using the SNI matching technique described above.

The relay protocol is a little more complicated, as it's a custom protocol (as mentioned above), but it's all documented in the relay protocol specification. It has two different sub-protocols, or "modes": "protocol" and "session" mode.

The protocol mode is for control-plane communication between clients and the relay server. This mode operates over a TLS session established between a client and the relay, which can be identified using HAProxy's TLS detection/inspection functionality. The session mode is for data exchange between two clients connected to a common relay. The relay provides a data channel between the two clients, blindly copying data between them; the two clients then establish a TLS connection over this channel for securing end-to-end data transfer. The handshake for setting up a session mode connection starts with a static four-byte magic number, so this can be identified by inspecting the first four bytes of data received from a client.
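To make the distinction between the two modes concrete, here is a minimal Python sketch of how the first few bytes of a connection could be classified. It is only an illustration of the idea, not how HAProxy or the relay server implement it; the function name is made up, and the magic value is the one used in the HAProxy configuration later in this post.

```python
# First four bytes of a relay "session mode" connection (value taken from
# the HAProxy ACL in this post).
RELAY_SESSION_MAGIC = bytes.fromhex("9e79bc40")

TLS_HANDSHAKE_RECORD = 0x16  # TLS record content type: handshake
TLS_CLIENT_HELLO = 0x01      # TLS handshake message type: ClientHello


def classify_first_bytes(data: bytes) -> str:
    """Return 'relay-session', 'tls-client-hello', or 'unknown'.

    Hypothetical helper for illustration only.
    """
    if data[:4] == RELAY_SESSION_MAGIC:
        return "relay-session"
    # A TLS record starts with: content type (1 byte), version (2 bytes),
    # length (2 bytes). The handshake message type follows at offset 5.
    if (
        len(data) >= 6
        and data[0] == TLS_HANDSHAKE_RECORD
        and data[1] == 0x03
        and data[5] == TLS_CLIENT_HELLO
    ):
        return "tls-client-hello"
    return "unknown"
```

A protocol mode connection would fall into the second case, since it begins with an ordinary TLS handshake, while a session mode connection is recognisable from the magic number alone.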

I started by installing HAProxy on the server I'm using for this (which is running Debian Buster), and removed all of the HTTP-related options from the default configuration file which ships with the package. I also raised some of the preset connection-idle timeouts from fifty seconds to five minutes, as the relay server sends keepalive messages every minute by default.

I then wrote the following frontend configuration stanza, based on a couple of the example snippets given in the HAProxy documentation:

frontend default-in
  bind 192.0.2.42:443
  bind [2001:db8:1234::42]:443
  mode tcp
  
  acl client_hello          req_ssl_hello_type 1
  acl relay_session_mode    req.payload(0,4) -m bin 9e79bc40
  acl relay_protocol_mode   req_ssl_sni -i relay.example.com
  acl discovery_protocol    req_ssl_sni -i discovery.example.com
  
  tcp-request inspect-delay 5s
  tcp-request content accept if client_hello || relay_session_mode
  
  use_backend strelay if relay_session_mode || relay_protocol_mode
  use_backend stdisco if discovery_protocol

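This tells HAProxy to bind on the given IP addresses and port numbers, in TCP mode. Then, some access control lists are defined, which match: a TLS ClientHello message (client_hello); the four-byte magic number at the start of a relay session mode connection (relay_session_mode); a TLS SNI of relay.example.com, for the relay's protocol mode (relay_protocol_mode); and a TLS SNI of discovery.example.com, for the discovery server (discovery_protocol).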

HAProxy is then configured to wait up to five seconds for data to be received from the remote client, and to only continue processing the connection if either a TLS ClientHello message is received or the first four bytes match the session mode magic number. The backend to use is then selected based on whether the remote client is attempting to reach the discovery server, or whether either of the relay protocol modes is detected.
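The backend selection can be modelled in a few lines of Python. This is only a sketch that mirrors the two use_backend rules in the frontend above, assuming the SNI has already been extracted from the ClientHello (the sni parameter and the function name are hypothetical); it says nothing about how HAProxy evaluates ACLs internally.

```python
from typing import Optional

RELAY_SESSION_MAGIC = bytes.fromhex("9e79bc40")


def choose_backend(first_bytes: bytes, sni: Optional[str]) -> Optional[str]:
    """Mirror the frontend's use_backend rules; None means no rule matched."""
    relay_session_mode = first_bytes[:4] == RELAY_SESSION_MAGIC
    # The SNI ACLs use -i, so the comparison is case-insensitive.
    name = (sni or "").lower()
    relay_protocol_mode = name == "relay.example.com"
    discovery_protocol = name == "discovery.example.com"

    if relay_session_mode or relay_protocol_mode:
        return "strelay"
    if discovery_protocol:
        return "stdisco"
    return None
```

For example, choose_backend(b"\x16\x03\x01", "discovery.example.com") selects stdisco. A result of None stands for a connection that matches neither rule, for which the frontend as written has no backend.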

The corresponding backend definitions are quite straightforward:

backend strelay
  mode tcp
  server relay 127.0.0.1:22067

backend stdisco
  mode tcp
  server disco 127.0.0.1:8443

I've configured my relay server to listen on TCP port 22067 on localhost, and my discovery server is configured to listen on port 8443 on localhost.

The good news is that this is enough to make the discovery server work! The bad news is that neither the Syncthing instance on my phone nor the one on my laptop could connect to the relay server with this HAProxy configuration.

The console logs from Syncthing on my laptop showed that it was receiving unexpected disconnects from the relay server. I started the relay server in a shell on the server with the -debug flag to try and get some further information.

I found that trying to connect to HAProxy with openssl s_client (emulating the relay protocol mode) would cause the relay server to print a debug message about receiving a connection, as would sending the session mode magic byte sequence (using printf and netcat). However, when I restarted the Syncthing client on my laptop to force it to re-establish a protocol mode connection to the relay server, the server process didn't print any messages, which indicated that for some reason the connections from the client weren't being forwarded correctly.
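The session mode probe can also be written as a small Python script rather than printf and netcat. This is a rough sketch, not the commands I actually ran: the function names are made up, and it only sends the four magic bytes and reports whatever comes back first.

```python
import socket

RELAY_SESSION_MAGIC = bytes.fromhex("9e79bc40")


def probe(sock: socket.socket, payload: bytes, timeout: float = 2.0) -> bytes:
    """Send payload on a connected socket and return the first reply bytes.

    Returns b"" if the peer closes the connection or nothing arrives in time.
    """
    sock.settimeout(timeout)
    sock.sendall(payload)
    try:
        return sock.recv(4096)
    except socket.timeout:
        return b""


def probe_relay(host: str, port: int) -> bytes:
    """Connect to host:port over TCP and send the session mode magic number."""
    with socket.create_connection((host, port), timeout=5.0) as sock:
        return probe(sock, RELAY_SESSION_MAGIC)
```

Calling probe_relay("relay.example.com", 443) while watching the relay server's debug output should then show whether the connection made it through the proxy.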

After a little searching around the Syncthing GitHub repository, I found that the repository contains a little test relay client for testing connections with a relay server. I decided to see what this would do:

molly on flywheel ~> # syncthing needs the most recent version of go to build
molly on flywheel ~> sudo apt-get install -t buster-backports golang

[time passes...]

molly on flywheel ~> go get -v github.com/syncthing/syncthing/cmd/strelaysrv/testutil

[more time passes...]

molly on flywheel ~> mkdir st-test
molly on flywheel ~> cd st-test
molly on flywheel ~/st-test> # need some certificates for the test client
molly on flywheel ~/st-test> syncthing -generate=.
00:50:47 INFO: Device ID: TNATBG6-ATWFWWZ-CEHSYRH-W42JT64-25DYTLL-RVNU4G6-D5GMS62-JKRNXAA
00:50:47 INFO: Default folder created and/or linked to new config
molly on flywheel ~/st-test> ~/go/bin/testutil -test -relay relay://relay.example.com:443 -keys .
2020/12/05 00:51:18 main.go:47: ID: TNATBG6-ATWFWWZ-CEHSYRH-W42JT64-25DYTLL-RVNU4G6-D5GMS62-JKRNXAA
2020/12/05 00:51:18 main.go:114: FAIL: getting invitation: EOF

So no luck with the test client either.

I then ran the test client under strace to try and capture what data it was sending to the network. I identified the lines in the trace file written by strace corresponding to the TLS handshake sent to the relay server, expecting to find the relay server's domain name in these lines, as this would be sent in plaintext as part of the SNI.

However, the relay server's domain name wasn't present in the handshake! I hence inferred that the client side of the relay protocol doesn't send any SNI as part of the TLS connection setup, which would explain why my HAProxy configuration wasn't working, as there was no SNI to match against!

This meant that I couldn't detect the relay protocol mode by matching on SNI, so I went back to the relay protocol specification to try and find some other way to identify the initial client TLS handshake. As it happens, the specification says that the "protocol name defined by the TLS header should be bep-relay", which I (correctly) assumed meant that the Application-Layer Protocol Negotiation (ALPN) field in the TLS handshake should contain the string bep-relay.

I then took a second look at the trace from the test client, and sure enough, the handshake sent to the relay server contains "bep-relay" as a substring. This seemed like it might be a usable heuristic for detecting the relay protocol with HAProxy.
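Rather than searching the raw trace by eye, the SNI and ALPN values can be pulled out of a captured ClientHello with a short parser. The following Python sketch is a simplified illustration: it assumes the whole ClientHello fits in a single TLS record, does little validation of truncated input, and has nothing to do with how HAProxy or Syncthing parse the handshake. The function name is my own invention.

```python
import struct
from typing import List, Optional, Tuple

EXT_SERVER_NAME = 0  # Server Name Indication
EXT_ALPN = 16        # Application-Layer Protocol Negotiation


def parse_client_hello(record: bytes) -> Tuple[Optional[str], List[str]]:
    """Return (sni, alpn_protocols) from a single-record TLS ClientHello."""
    if len(record) < 9 or record[0] != 0x16 or record[5] != 0x01:
        raise ValueError("not a TLS ClientHello")
    pos = 5 + 4    # skip the record header (5) and handshake header (4)
    pos += 2 + 32  # skip the client version and random
    pos += 1 + record[pos]                               # session ID
    pos += 2 + struct.unpack_from("!H", record, pos)[0]  # cipher suites
    pos += 1 + record[pos]                               # compression methods

    sni: Optional[str] = None
    alpn: List[str] = []
    if pos + 2 > len(record):
        return sni, alpn  # no extensions present
    (ext_total,) = struct.unpack_from("!H", record, pos)
    pos += 2
    end = min(pos + ext_total, len(record))
    while pos + 4 <= end:
        ext_type, ext_len = struct.unpack_from("!HH", record, pos)
        pos += 4
        body = record[pos:pos + ext_len]
        pos += ext_len
        if ext_type == EXT_SERVER_NAME and len(body) >= 5:
            # list length (2), name type (1), name length (2), host name
            (name_len,) = struct.unpack_from("!H", body, 3)
            sni = body[5:5 + name_len].decode("ascii", "replace")
        elif ext_type == EXT_ALPN and len(body) >= 2:
            i = 2  # skip the protocol list length
            while i < len(body):
                n = body[i]
                alpn.append(body[i + 1:i + 1 + n].decode("ascii", "replace"))
                i += 1 + n
    return sni, alpn
```

Fed the handshake bytes from the test client, a parser like this would be expected to return no SNI and a protocol list containing bep-relay, which matches what the trace showed.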

I had to upgrade HAProxy to the version in the Debian backports repository, as the ability to inspect the ALPN in a TLS handshake isn't supported by the version in Buster. I then changed the HAProxy configuration for detecting the relay protocol in protocol mode to this line:

  acl relay_protocol_mode   req.ssl_alpn bep-relay

I restarted HAProxy and then tried the relay protocol test utility again:

molly on flywheel ~/st-test> ~/go/bin/testutil -test -relay relay://relay.example.com:443 -keys .
2020/12/05 01:19:59 main.go:47: ID: TNATBG6-ATWFWWZ-CEHSYRH-W42JT64-25DYTLL-RVNU4G6-D5GMS62-JKRNXAA
01:19:59 INFO: Joined relay relay://relay.example.com:443
2020/12/05 01:20:00 main.go:112: OK

Success at last!

Even better, when I restarted my Syncthing clients on my phone and my laptop, they successfully connected to both the discovery and relay servers, and with a little fiddling (to turn off direct peer-to-peer data transfer) I was able to convince them to pass data over the relay as well.

Syncthing and HAProxy are both pieces of software I find really useful. It might take a bit of effort to get set up, but when it works, Syncthing just works. The convenience of (reliable) drag-and-drop file sharing between devices is a small but significant quality-of-life improvement for me. HAProxy has a bewildering array of functionality (if the length of the documentation page for the configuration file is anything to go by), with many, many different ways to manage and mangle HTTP(S) requests (or arbitrary TCP socket buffers, if that's what's required...).