The challenge of VPN over satellite and other high latency links
I was working on a case recently where a Peplink customer was trying to get the most out of a satellite internet link they had connected to a MAX HD2 cellular router.
The first step, of course, was to run some speed tests:
Speedtest.net Speeds directly over SAT (no PepVPN)
- This is naturally variable but generally we see 10-14Mbps Down and 3.5Mbps Up @ 710-760ms latency
- Depending on the time of day this can be much lower, e.g. I have recorded 2.4Mbps Down and 0.88Mbps Up @ 749ms latency
MTU 1402 Direct to the internet no PepVPN
12.34Mbps Down 3.16Mbps Up 751ms
16.98Mbps Down 3.15Mbps Up 759ms
10.74Mbps Down 3.11Mbps Up 753ms
10.37Mbps Down 2.54Mbps Up 755ms
11.40Mbps Down 3.40Mbps Up 758ms
11.39Mbps Down 2.69Mbps Up 720ms
11.43Mbps Down 3.22Mbps Up 759ms
Speedtest.net Speeds over a PepVPN Tunnel to a UK hosted 580 (with 100Mbps Fiber)
MTU 1402 SF No Encryption
3.99Mbps Down 0.39Mbps Up 719ms
4.41Mbps Down 0.39Mbps Up 718ms
3.69Mbps Down 0.38Mbps Up 719ms
3.92Mbps Down 0.38Mbps Up 719ms
4.05Mbps Down 0.38Mbps Up 718ms
3.70Mbps Down 0.39Mbps Up 718ms
3.67Mbps Down 0.38Mbps Up 719ms
Then I performed a series of tests using the built-in PepVPN test tool which, since it's point to point and uses the shortest possible path, should be more accurate:
PepVPN Test (Run on the 580) :
TCP Download (10secs – testing HD2 upload)
These results are as expected, at around the 0.6Mbps seen against speedtest.net.
TCP Upload (10secs – testing HD2 download)
These results appear much lower than expected (we saw around 3.5Mbps to 4Mbps in the SpeedTest.net results).
Then I ran the PepVPN tests again using UDP with the following results.
UDP Download (10000Kbps / 10secs)
UDP Upload (10000Kbps / 10sec)
These were obviously more encouraging, as they show line speed for the satellite link.
My focus then was on why the TCP results were so different to the UDP results within the PepVPN tool.
There are times when I get writer’s envy, and this is one of those times. Stop reading this post and go and read this excellent article at Bentley Walker.
Done that? Good, wasn’t it? The pertinent paragraphs say:
‘TCP/IP sessions start out sending data slowly. Speed builds as the rate of the acknowledgements verifies the network’s capacity to carry more traffic. This is known as slow-start, followed by a ramp-up in speed. The speed of the connection builds until the sender detects packet loss from a lack of an acknowledgement.’
‘Current two-way satellite networks employ a technique referred to as TCP spoofing to compensate for the extra time required to pass through the space segment. Special software on the satellite modem appears to terminate the TCP session, so it appears to the sender as the remote location. In reality the satellite modem is acting as a forwarder between the originating PC or host and the remote site. When the modem receives Internet traffic destined for a location, it immediately acknowledges receipt of the packet to the sender so more data packets will follow quickly. This way the sender never experiences the actual higher satellite latency to the remote site because acknowledgements return to the sender at LAN speed. As a result, TCP moves out of slow-start mode quickly and builds to the highest link send speed.’
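To get a feel for what slow-start means at satellite latencies, here is a minimal Python sketch of an idealised ramp-up. It is not taken from the quoted article: it assumes a lossless link, a congestion window that doubles every round trip, an initial window of 10 segments and a 1362 byte segment size (the 1402 MTU used in the tests above minus 40 bytes of IPv4 and TCP headers). The function name is my own and purely illustrative.

```python
def time_to_reach(target_mbps, rtt_s, mss_bytes=1362, initial_segments=10):
    """Seconds of idealised slow start before the send rate reaches target_mbps.

    Assumes no packet loss and a congestion window that doubles once per
    round trip. Real TCP stacks are more nuanced than this.
    """
    cwnd = initial_segments  # congestion window, in segments (assumed start)
    elapsed = 0.0
    # The rate for a given window is (bits in flight per round trip) / RTT.
    while cwnd * mss_bytes * 8 / rtt_s / 1e6 < target_mbps:
        cwnd *= 2
        elapsed += rtt_s
    return elapsed


if __name__ == "__main__":
    for label, rtt in (("ACKs answered locally by the modem (~40ms)", 0.040),
                       ("ACKs crossing the satellite link (~750ms)", 0.750)):
        print(f"{label}: {time_to_reach(10, rtt):.2f}s to reach 10Mbps")
```

Under these assumptions the ramp to 10Mbps takes a fraction of a second with 40ms ACKs but several seconds at 750ms, which is a large slice of a 10 second test.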
So when we look at a Speedtest over a satellite link (no VPN) we could easily see something like this capture:
As you can see from this time sequence graph there are very fast ACK responses (underlined in red – circa 40ms). This proves that the modem in this case is doing TCP optimisation (Spoofing) as described in the linked article above. This (it seems) is a fairly standard approach for satellite internet modems.
The effect on SpeedFusion/PepVPN is that, because the modem can’t read the encrypted headers in the PepVPN traffic, it can’t optimise those packets, so TCP throughput over the VPN is throttled by the high-latency ACK responses. This also explains the disparity between the PepVPN throughput test for the HD2 download (0.6Mbps) and the speedtest.net tests over PepVPN (around 4Mbps): the PepVPN test tool doesn’t send a continuous stream of traffic but rather pulses traffic across the VPN. These pulses aren’t long enough (relative to the satellite link latency) to benefit from the TCP slow-start ramp-up, and so they give slower readings.
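The reason un-spoofed ACKs hurt so much can be sketched with the usual window/RTT relationship: a sender can have at most one window of data in flight per round trip. The Python below is a rough illustration only. The 64KB window is an assumed example value, not something I measured on the HD2 or the 580, and both function names are hypothetical.

```python
def window_limited_mbps(window_bytes, rtt_s):
    """Upper bound on TCP throughput when one window is sent per round trip."""
    return window_bytes * 8 / rtt_s / 1e6


def window_needed_bytes(target_mbps, rtt_s):
    """Bytes that must be in flight to sustain target_mbps (bandwidth-delay product)."""
    return target_mbps * 1e6 * rtt_s / 8


if __name__ == "__main__":
    # 40ms is roughly the spoofed ACK time seen in the capture,
    # 720ms is roughly the latency measured over PepVPN.
    for rtt in (0.040, 0.720):
        print(f"RTT {rtt * 1000:.0f}ms: a 64KB window caps at "
              f"{window_limited_mbps(65535, rtt):.2f}Mbps, "
              f"12Mbps needs {window_needed_bytes(12, rtt) / 1024:.0f}KB in flight")
```

With the same window, the throughput ceiling falls in direct proportion to the round trip time, so an 18-fold increase in ACK latency means an 18-fold lower ceiling.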
So, since latency is killing the TCP throughput here, how do we lower the latency over the satellite link?
Well, the short answer of course is that, unless you can change the geostationary orbit of the satellite to reduce the round trip time of the data in transit, you can’t reduce the latency. However, there are ways to mitigate it.
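As a back-of-envelope check on why the orbit sets a hard floor, here is a small Python sketch of the best-case propagation delay. It assumes the ground stations sit directly beneath a satellite at the standard geostationary altitude of about 35,786km and that signals travel at the speed of light; real paths are longer, and the function name is just illustrative.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
GEO_ALTITUDE_KM = 35_786  # geostationary altitude above the equator


def min_geo_rtt_s(altitude_km=GEO_ALTITUDE_KM):
    """Best-case round trip time via a geostationary satellite, in seconds.

    A round trip crosses the ground-to-satellite distance four times:
    up and down for the request, then up and down for the reply.
    """
    return 4 * altitude_km / SPEED_OF_LIGHT_KM_S


if __name__ == "__main__":
    print(f"Minimum RTT: {min_geo_rtt_s() * 1000:.0f}ms")
```

That floor of roughly 477ms is pure physics. The 710-760ms measured above also includes slant range, modem processing and the terrestrial path to the test server.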
Introducing Asymmetrically Routed VPN*
(*Needs a much better name)
A new approach we have developed in response to this scenario is Asymmetrically Routed VPN. The idea here is that we use a high latency link unidirectionally in combination with another lower latency link.
Most of the time satellite internet is deployed to bandwidth-hungry home, business and education customers who have very limited options for internet connectivity. Think of a rural school miles from the exchange with 0.5Mbps DSL connections. Bonding multiple DSL lines in this instance doesn’t always make sense: you would need so many of them, and a 1350 (with its 13 WAN ports) as a client device, with a hosted SpeedFusion Hub as a core device elsewhere, is not always commercially viable for home and education customers.
However, if we could use the download bandwidth of a satellite link for internet access and a DSL line for the ACK response packets in the upload direction, we could almost halve the apparent latency (satellite round trip time (RTT)/2 + DSL RTT/2).
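To put numbers on that formula, here is a tiny Python sketch. The 750ms satellite RTT is in line with the tests above, but the 40ms DSL RTT is purely an assumed figure for illustration, and the calculation assumes each link's one-way delay is exactly half its round trip time.

```python
def apparent_rtt_ms(sat_rtt_ms, dsl_rtt_ms):
    """Round trip time when data comes down the satellite and ACKs go up the DSL.

    Assumes each link's one-way delay is half its round trip time.
    """
    return sat_rtt_ms / 2 + dsl_rtt_ms / 2


if __name__ == "__main__":
    sat, dsl = 750, 40  # 40ms DSL RTT is an assumed example value
    print(f"Satellite in both directions: {sat}ms")
    print(f"Satellite down, DSL up:       {apparent_rtt_ms(sat, dsl):.0f}ms")
```

With these example figures the apparent RTT drops from 750ms to 395ms. The slower the DSL line responds, the smaller the gain.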
How to configure Asymmetrically Routed VPN
Set up your device-to-device PepVPN/SpeedFusion VPN as normal. Then, on the device with the satellite link connected, edit the VPN profile and scroll to the WAN Connection Priority section.
Click the little blue question mark in the top right of this section to show the tool-tip and then click to enable asymmetric connection:
A new ‘Direction’ column will be displayed where you can choose which direction a WAN link should be used for VPN.
As you can see here, I have set WAN1 as Up only and the VSAT connection as Down only. Depending on the latency of the DSL connection, this could nearly halve the apparent latency of the satellite link.
Also worthy of a footnote is the fact that, outside of the Peplink proprietary VPN technology, one way to get as much VPN throughput as possible over a high latency link is to just use UDP (as demonstrated in the UDP speed tests at the beginning of this article). When it comes to normal IPsec VPN, you can therefore encapsulate the IPsec in UDP and get some really good results.
On Peplink devices using IPsec instead of PepVPN/SpeedFusion you can edit the IPsec profile and tick the UDP encapsulation check box.
However, in my experience, managing high latency links is only one challenge in low bandwidth environments. Normally customers want to use as much bandwidth as they can lay their hands on in combination, whether that be multiple DSL lines, multiple cellular connections or even multiple satellite links, all at the same time on a single router.
In my opinion no other vendor does multi-technology WAN load balancing and link bonding better than Peplink. Given the ease of use and simple configuration that come with Peplink devices, there are many advantages to using the PepVPN/SpeedFusion technology over a bog-standard IPsec VPN (e.g. intelligent load balancing, bandwidth bonding, ease of management).
This looks like what I have been googling for the last few weeks. Having a lousy yet terrestrial latency dsl link and a speedy yet significantly delayed geostationary one, I expected this problem to be elegantly solved by now. Which peplink devices support the described configuration? I’ve been looking at the balance30 as I’d like to work my AT&T LTE into the mix as well. There seem to be a small but steady stream of used ones available on eBay