Privacy is a major concern of VPN users, yet today’s centralized VPNs have one potential flaw: the user must inherently trust the VPN provider not to interfere with or log any of their personal traffic. Note that VPN providers are commercial entities that might themselves rely on other commercial entities, e.g., they could use multiple cloud services to obtain a worldwide footprint. It follows that even trusted and respectable vendors might unknowingly run into issues with a specific provider, ranging from surveillance to misconfiguration and even hacking. Any of these issues can compromise user privacy.

In [4], the authors actively investigate 62 commercial VPN providers and find unclear no-logging policies, some evidence of tampering with customer traffic, and a mismatch between advertised VPN node locations and actual network locations. In many cases, this misbehavior was not intentional on the part of the VPN provider but was caused by misconfigurations. When contacted by the authors, all providers quickly reacted to fix the reported misconfigurations.

The above issues have motivated a fairly new trend: decentralized Virtual Private Networks (dVPNs). In a dVPN, users are both clients and servers, in the sense that when they join a dVPN they also offer a portion of their upload bandwidth to carry traffic for other users. For example, assuming Alice (France) wants to access some content only available in the US, she can piggyback on Bob’s residential IP address (US) and avoid being geoblocked. A client discovers available dVPN nodes either via a central repository or via a distributed repository [15].

To the best of our knowledge, Hola [5] was the first dVPN. Hola is a freemium web and mobile application that offers a dVPN service through a peer-to-peer (P2P) network. When installing Hola, users agree either to pay a monthly premium or to offer part of their upload bandwidth to other Hola users. Hola has been quite successful, reaching tens of millions of nodes. At the same time, multiple incidents have been reported when people realized they were indeed carrying other people’s traffic. In addition, Hola’s organization spun off a new company (Luminati [22]) whose commercial proxy service was indeed piggybacking on Hola users.

VPN Gate [7] is another interesting dVPN. Born as a research project, it has solid foundations that are summarized in a research paper [1]. The main motivation behind VPN Gate was to achieve blocking resistance against censorship firewalls such as the Great Firewall of China. Classic VPNs easily fail at this task because their limited and static network footprint can be easily blocked (IP blacklisting). The rationale of VPN Gate is to build a dVPN atop volunteer machines and thus obtain a large set of dynamic IP addresses. The authors further inject innocent IP addresses into their public IP lists, which makes large-scale IP blacklisting harder to perform. In addition, they allow their VPN nodes to cooperate in order to quickly identify a list of spies, i.e., computers used by censorship authorities to probe the volunteer dVPN nodes. VPN Gate was launched on March 8, 2013, and it currently counts 5,529 dVPN nodes, which carry more than 1 TB of traffic daily.
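The cooperative spy identification described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a source IP contacting many distinct volunteer nodes is suspicious; this heuristic, the function name `find_suspected_spies`, and the threshold are hypothetical and are not taken from the VPN Gate paper [1].

```python
from collections import defaultdict


def find_suspected_spies(connection_logs, min_distinct_nodes=3):
    """Flag source IPs that contacted at least `min_distinct_nodes` nodes.

    connection_logs: iterable of (node_id, source_ip) pairs, as they might
    be pooled from cooperating nodes (hypothetical log format).
    """
    nodes_by_source = defaultdict(set)
    for node_id, source_ip in connection_logs:
        nodes_by_source[source_ip].add(node_id)
    # Assumption: a regular user sticks to a few nodes, a prober sweeps many.
    return {
        source_ip
        for source_ip, nodes in nodes_by_source.items()
        if len(nodes) >= min_distinct_nodes
    }


logs = [
    ("n1", "198.51.100.7"), ("n2", "198.51.100.7"), ("n3", "198.51.100.7"),
    ("n1", "203.0.113.5"), ("n1", "203.0.113.5"),
]
suspects = find_suspected_spies(logs)
```

A real deployment would also have to bound the time window and tolerate legitimate users who hop between nodes; both are left out of this sketch.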

With the recent rise of blockchain, a new form of dVPN has surfaced. Here, the rationale is to share a user’s upload bandwidth in exchange for some crypto tokens. Popular examples of such crypto dVPNs are Mysterium [2] and Sentinel [12]. Nymtech [19] and Substratum [17] are two broader approaches that are somewhat related to a dVPN.

Mysterium is an open source dVPN completely built upon a P2P architecture. An immutable smart contract running on Ethereum will be used to make sure that the VPN service is paid adequately. It is currently in alpha test, and incentives will be available soon.

Sentinel is a larger project of which a dVPN is just one of the use cases. The main idea here is to use the blockchain to store a ledger of data transactions with a ‘Proof of Traffic’. A working version of the client can be downloaded for testing [18] and it clearly states that the “liability of traffic at the exit node is also upon the host”.

Nymtech is a decentralized authentication and payment protocol founded on Mixnet, a privacy-preserving network which improves upon Tor. The mixnet sends all network traffic through layers of mix-nodes using Sphinx [20], so that all packets of data are the same size and routing information is kept private. Each mix-node in the network delays messages and generates fake “dummy messages” to create a uniform traffic pattern, which hides the real traffic patterns from adversaries observing the network.
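The three ingredients mentioned above (uniform packet size, per-hop delay, and dummy messages) can be sketched as follows. This is not the Sphinx packet format, which additionally relies on layered encryption; the 1024-byte packet size, the 2-byte length prefix, and the exponentially distributed delay are all assumptions made for illustration.

```python
import random

PACKET_SIZE = 1024  # hypothetical fixed packet size, in bytes


def pad_to_fixed_size(payload: bytes, size: int = PACKET_SIZE) -> bytes:
    """Prefix the payload with its length and zero-pad to exactly `size` bytes."""
    if len(payload) > size - 2:
        raise ValueError("payload too large for a single packet")
    header = len(payload).to_bytes(2, "big")
    return header + payload + bytes(size - 2 - len(payload))


def unpad(packet: bytes) -> bytes:
    """Recover the original payload from a padded packet."""
    length = int.from_bytes(packet[:2], "big")
    return packet[2:2 + length]


def make_dummy(rng: random.Random, size: int = PACKET_SIZE) -> bytes:
    """Generate cover traffic with the same size as a real packet."""
    return bytes(rng.getrandbits(8) for _ in range(size))


def sample_delay(rng: random.Random, mean_seconds: float = 0.5) -> float:
    """Draw a random per-hop delay (assumed exponential, mean `mean_seconds`)."""
    return rng.expovariate(1.0 / mean_seconds)


rng = random.Random(42)
packet = pad_to_fixed_size(b"hello")
dummy = make_dummy(rng)
delay = sample_delay(rng)
```

Without encryption, a zero-padded packet is trivially distinguishable from a random dummy; in an actual mixnet both are encrypted, so that an observer sees only packets of identical size.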

In a similar spirit, Substratum [17] aims at rewarding its users for sharing resources (bandwidth, CPU, etc.). Substratum aims at building a decentralized Web with no central entity, implying that anyone can host and serve content and be paid for it. While not directly a dVPN network, it is worth mentioning since it also promises privacy and censorship circumvention.

Requirement Analysis

Open Source

The client/server code of a dVPN is a very critical piece of software since it can potentially gain access to very sensitive data. Although popular VPN tunneling protocols (OpenVPN and PPTP) are inherently secure, it is important to note that misconfigurations and/or malicious code are still potential threats. It follows that a first requirement for dVPNs is to be open source, so that the community can monitor the evolution of the code and report suspicious activities, bugs, or misconfigurations that can jeopardize a user’s privacy.

Code Execution Guarantees

While open sourcing is a good first step, a dVPN should offer stronger guarantees with respect to code execution. A Trusted Execution Environment (TEE) is a secure area inside the main processor which guarantees the confidentiality and integrity of the code and data loaded into it. In [9], the authors show that it is indeed possible to run a VPN vantage point within SGX, a popular TEE from Intel [10]. We are not aware of any centralized VPN offering such a service, likely due to the extra cost required by such technology. However, as demonstrated in [9], this is not impossible. The same does not hold for a dVPN, due to the strict hardware requirements of SGX: volunteer nodes cannot be assumed to run SGX-capable machines.

IP Blacklisting

In order to be usable, a VPN (both centralized and distributed) needs to publish at least a portion of its vantage point list. It follows that it is relatively easy for a censorship entity or a geoblocking content provider to access such a list and simply blacklist all the vantage points of a VPN. For centralized VPNs, this is an issue they constantly face and can hardly solve. For example, content providers applying intensive geoblocking (such as Netflix) currently deny access to all major VPNs.

For dVPNs, IP blacklisting becomes a more serious problem since the IPs being banned are assigned to real users rather than to machines in a data center. At the same time, due to the potential sheer size of a dVPN, it can be hard for a censorship entity or a geoblocking content provider to identify such a dynamic set of IPs. This is because VPN nodes are regular Internet users who frequently change network locations and connect from behind Network Address Translators (NATs). In this case, blocking a NATed VPN node implies blocking the whole subnet, with a potentially massive service disruption. VPN Gate exploits this feature to its advantage, and it further implements defensive mechanisms to protect its volunteer IPs from being blocked. In [13], the authors propose a distributed HTTP(S) proxying system that leverages the same feature to protect against IP blacklisting.

QoS Guarantees

There are multiple ways to benchmark the Quality of Service (QoS) offered by a VPN service.

Networking performance: These are metrics like low latency, limited losses, and high bandwidth. While not always the case for centralized VPNs [4], there is no intrinsic reason why QoS guarantees cannot be offered with respect to these metrics. For example, Cloudflare just announced Warp [8], a large-scale VPN-like system which promises both security and a faster web experience. Cloudflare’s approach is to route traffic through their overlay network composed of extremely fast and reliable links. This implies a fast and reliable lane for traffic where, for example, UDP can be used safely and effectively. The rationale behind Warp is the same as that of startups like Networknext [14] which, for instance, promises to improve its clients’ online gaming experience through its fast overlay network.

Offering high networking performance is much harder for dVPNs. This is because client churn and heterogeneous network conditions make it hard to guarantee any level of performance. This problem is not specific to dVPNs but is a generic issue in distributed systems. In his seminal work [16], BitTorrent’s creator (Bram Cohen) discusses the famous tit-for-tat incentive mechanism used by BitTorrent to achieve a high level of robustness and resource utilization. While effective, this is still far from any sort of QoS guarantee.
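As an illustration of the tit-for-tat idea, the sketch below selects which peers to serve ("unchoke") based on how much they recently uploaded to us, plus one randomly chosen peer so that newcomers get a chance. This is a simplified reading of the mechanism in [16]; the function name, the slot count, and the rate values are hypothetical.

```python
import random


def choose_unchoked(upload_rates, slots=3, rng=None):
    """Return the set of peers to serve in the next round.

    upload_rates: dict mapping peer id -> rate at which that peer recently
    uploaded to us (any consistent unit).
    """
    rng = rng or random.Random()
    # Reciprocate: serve the peers that served us best.
    ranked = sorted(upload_rates, key=upload_rates.get, reverse=True)
    unchoked = set(ranked[:slots])
    # Optimistic unchoke: give one other peer a chance to prove itself.
    remaining = ranked[slots:]
    if remaining:
        unchoked.add(rng.choice(remaining))
    return unchoked


rates = {"a": 50.0, "b": 10.0, "c": 30.0, "d": 0.0, "e": 5.0}
selected = choose_unchoked(rates, slots=2, rng=random.Random(0))
```

The selection rewards contribution but promises nothing about the throughput a given peer will receive, which is why such incentives fall short of a QoS guarantee.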

Network footprint: This is another important QoS metric referring to how many unique locations a VPN can offer. As discussed in [1], VPN providers constantly battle to offer more vantage points, either by deploying new physical nodes or by playing tricks, e.g., introducing “virtual locations” based on the information available from geo-IP databases about the physical locations of their vantage points. One shared limitation among centralized VPNs is the lack of residential IP addresses, since they mostly rely on data-centers to deploy their nodes. By definition, dVPNs consist instead of a large network footprint of residential IP addresses. This is indeed one of the most attractive assets of a dVPN today.

Service availability: This refers to the percentage of time that a service is up and running correctly, e.g., the famous five nines availability (99.999%). On paper, the distributed design of a dVPN offers higher availability than a centralized VPN, which has either one or N points of failure. For example, an outage in one of the cloud providers used by a centralized VPN would damage the whole service. The large and heterogeneous footprint of dVPNs makes such an outage less likely. Nevertheless, serious VPN providers deploy DDoS protection, and we are not aware of any major reported downtime for centralized VPNs.
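To make the availability figures concrete, the short sketch below converts an availability percentage into yearly downtime and estimates the availability of a service that stays up as long as at least one of N nodes is up. The independence of node failures is an assumption made here for simplicity; correlated failures (e.g., a shared cloud provider) would reduce the benefit.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability: float) -> float:
    """Expected yearly downtime for a given availability in [0, 1]."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def redundant_availability(node_availability: float, n: int) -> float:
    """Availability when at least one of n independent nodes must be up."""
    return 1.0 - (1.0 - node_availability) ** n


five_nines = downtime_minutes_per_year(0.99999)  # about 5.26 minutes per year
three_flaky_nodes = redundant_availability(0.9, 3)  # about 0.999
```

Under these assumptions, even individually unreliable volunteer nodes can yield a highly available service, although a given user still experiences disruption when the specific node they are connected to disappears.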

No Logging

Privacy is a main service that should be offered by a VPN. This implies that, at no time, should a VPN node be able to log user traffic. This covers both very sensitive data (e.g., accessed URLs, or the actual content exchanged when no HTTPS is used) and less sensitive data like the number of bytes exchanged, the domain names contacted, etc. By definition, a VPN node needs visibility into the original traffic in order to forward it either to the client or to the target service, e.g., Netflix. The amount of data being visible then depends on the protocol being used, e.g., in the case of HTTPS the actual content is not visible since it is encrypted.

Under these conditions, how does a centralized VPN offer a “no-logs” policy? In [4], the authors investigate the usage policies offered by several commercial VPNs on their websites. They find that 25% (50) of the VPN services they studied do not have a link to their privacy policy, and that 42% (85) of the VPN providers did not provide terms of service. When a privacy policy was available, only 45 VPN services explicitly claimed a “no-logs” policy. This analysis suggests that VPN providers today should do a better job in terms of transparency of their actions. However, it is important to note that some of these no-logging policies have proven to hold even during an investigation by the FBI [21].

Clearly, for a dVPN we cannot rely on any sort of usage policy. Further, in such a heterogeneous environment an even stricter no-logs requirement is needed. For the reasons above, this is hard to achieve and Hola, for instance, has been previously shamed for this issue [6]. Logging might actually be needed by a dVPN to offer protection against IP blacklisting. This is the case for VPN Gate [1][7], where each VPN node keeps connection logs (and shares them with a central repository) in order to inform the other VPN servers of a potential censorship authority attempting to discover (and block) the current dVPN footprint.

Traffic Accounting

The founding idea of a dVPN is that users share their resources, i.e., they get credited (e.g., via crypto tokens) for the traffic they carry for other dVPN users. The dVPN needs a system to account for such traffic and grant tokens accordingly. Crypto dVPNs tackle this issue by leveraging the blockchain to keep track of proof of traffic. This can be challenging depending on which logging level is allowed or required, e.g., whether only byte counts or also the actual visited domains are recorded (see the no-logging requirement above).
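At the least invasive logging level, accounting could rely on byte counts alone. The sketch below is an in-memory stand-in for such a ledger; it is not a blockchain and does not reproduce the proof-of-traffic scheme of any project mentioned here. The class name and the reward rate of one token per GiB are hypothetical.

```python
from collections import defaultdict

BYTES_PER_GIB = 2 ** 30


class ByteCountLedger:
    """Tracks only how many bytes each node carried, never what was carried."""

    def __init__(self, tokens_per_gib: float = 1.0):
        self.tokens_per_gib = tokens_per_gib  # hypothetical reward rate
        self._bytes_carried = defaultdict(int)

    def record(self, provider_id: str, num_bytes: int) -> None:
        """Credit `num_bytes` of carried traffic to a provider node."""
        if num_bytes < 0:
            raise ValueError("num_bytes must be non-negative")
        self._bytes_carried[provider_id] += num_bytes

    def tokens_owed(self, provider_id: str) -> float:
        """Tokens earned so far by a provider node."""
        carried = self._bytes_carried.get(provider_id, 0)
        return carried / BYTES_PER_GIB * self.tokens_per_gib


ledger = ByteCountLedger()
ledger.record("bob", BYTES_PER_GIB)
ledger.record("bob", BYTES_PER_GIB)
bob_tokens = ledger.tokens_owed("bob")  # 2.0 tokens for 2 GiB carried
```

The hard part, which this sketch leaves out, is making the byte counts trustworthy when both the reporting node and the client have an incentive to misreport them.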

Traffic Blame

From a networking perspective, VPN nodes are the entities originating the traffic they carry. This means that serious offenses (child pornography, hate speech, drug smuggling), when investigated, will point the authorities to the entity running the VPN service. At this point, the above no-logs policy comes into play, since the VPN might (or might not) offer extra information about who was indeed originating such traffic. In a dVPN context, there is no legal entity the authorities can reach out to. Instead, they would reach a victim dVPN user whose network was used to carry such traffic. In such a situation, for which again Hola has been publicly shamed [6], it can be hard for private users to defend themselves against the authorities.

It is thus paramount that a dVPN implements a mechanism to avoid this kind of hairy situation. At the same time, this should be achieved while guaranteeing a no-logs policy. This is challenging because, by definition, in order to block some undesired traffic, the system needs to have a sense of what this traffic is. For example, [13] implements selective proxying, a mechanism which gives clients full control and transparency over what they proxy.
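One simple way to picture selective proxying is an allowlist that the node owner controls. The sketch below is only an assumption about how such a check might look and does not reproduce the mechanism of [13]; the function name and the example domains are hypothetical.

```python
def should_proxy(hostname: str, allowed_suffixes: set) -> bool:
    """Return True if the node owner agreed to carry traffic for `hostname`.

    A hostname matches when it equals an allowed domain or is a subdomain
    of it. The decision is made in memory and nothing is logged.
    """
    host = hostname.lower().rstrip(".")
    return any(
        host == suffix or host.endswith("." + suffix)
        for suffix in allowed_suffixes
    )


allowed = {"example.com", "news.example.org"}
decisions = {
    name: should_proxy(name, allowed)
    for name in ["www.example.com", "badexample.com", "example.org"]
}
```

Note that this check needs the destination hostname to be visible to the node, which illustrates the tension with the no-logs requirement: the node must inspect the destination to filter it, even if it never stores it.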

The table below benchmarks the existing dVPN solutions with respect to the requirements above. In addition, the last column reports on classic centralized systems as a baseline. Note that this benchmarking was derived from the public information available about existing dVPNs.