How the Great Firewall of China is Blocking Tor

This study investigated how the Great Firewall of China (GFC) is blocking the Tor anonymity network. Tor is an overlay network which provides its users with anonymity on the Internet. A more detailed explanation is available on the project website or in the design paper.

A large number of so-called entry guards and bridge relays serve as the entry points to the network. If these entry points are not reachable, a user finds herself unable to connect to the Tor network. Since Tor is used more and more to circumvent censorship systems, countries such as China are trying to block these very entry points.

According to recent reports, China's firewall is now able to dynamically recognise Tor usage and block the respective relays and bridges. The diagram to the right illustrates how the block works. In a nutshell, 1) the firewall searches network traffic for a byte pattern which identifies a connection as Tor. If this pattern is found, 2) the firewall initiates a scan of the host which is believed to be a bridge. In particular, 3) the scan is run by seemingly arbitrary Chinese computers which connect to the bridge and try to “speak Tor” to it. If this succeeds, the bridge is blocked.
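The three steps can be pictured as a small pipeline. The following Python sketch is a toy model under stated assumptions, not a description of the GFC's actual implementation: the fingerprint bytes, the class name `ToyFirewall` and the `speaks_tor` callback are hypothetical and exist only to make the control flow concrete.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder; the real identifying bytes are not given in the text.
TOR_FINGERPRINT = b"\x00TOR-LIKE-CIPHER-LIST\x00"


@dataclass
class ToyFirewall:
    """Toy model of the fingerprint -> scan -> block sequence."""
    scan_queue: list = field(default_factory=list)
    blocked: set = field(default_factory=set)

    def inspect(self, dst, payload):
        """Step 1: look for the identifying bytes in a passing payload."""
        if TOR_FINGERPRINT in payload and dst not in self.blocked:
            # Step 2: schedule a scan of the suspected bridge (once).
            if dst not in self.scan_queue:
                self.scan_queue.append(dst)

    def run_scans(self, speaks_tor):
        """Step 3: a scanner connects and tries to "speak Tor".

        speaks_tor is a callback that returns True if the probed
        (ip, port) pair completes the handshake.
        """
        while self.scan_queue:
            dst = self.scan_queue.pop(0)
            if speaks_tor(dst):
                self.blocked.add(dst)
```

In this model a host is blocked only after a probe succeeds, so a host that matches the fingerprint but does not answer the scanner stays reachable.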

Effective countermeasures build on a sound understanding of the filtering in place. For this reason, this study was conducted with the goal of revealing and understanding the inner workings of the blocking infrastructure. The contributions of this study are threefold:

1) We reveal how Chinese users are hindered from accessing the Tor network.
2) We conjecture how China's Tor blocking infrastructure is designed.
3) We discuss evasion strategies.

This web site contains the published papers, the developed software and the gathered data. For a short summary of our work, you can have a look at the media articles in MIT Technology Review, V3.co.uk and The Verge.

Papers

Software

brdgrd: The name brdgrd is short for “bridge guard”: a program which is meant to protect Tor bridges from being scanned (and as a result blocked) by Chinese probes. The tool runs independently of Tor and implements a number of strategies to evade the GFC's deep packet inspection (DPI) boxes as well as to block connecting scanners. Update 2013-11-17: brdgrd no longer seems to be effective as the GFC appears to be doing TCP stream reassembly now.

tcis: The tool tcis simulates the initiation of a Tor connection by sending a TLS client hello to a given IP address. The client hello's cipher list is identified by Chinese DPI boxes as belonging to Tor. The tool can be used to trigger scanning.

Update 2013-11-17: We are developing a network protocol, ScrambleSuit, which is immune to the GFC's active probing attacks. Feel free to give it a try!
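The brdgrd update attributes the tool's loss of effectiveness to TCP stream reassembly. The sketch below illustrates why reassembly matters, assuming (purely for illustration) an evasion strategy that causes the identifying bytes to be split across several TCP segments. The fingerprint value and all function names are hypothetical; this is not brdgrd's code.

```python
# Hypothetical stand-in for the bytes a DPI box searches for.
FINGERPRINT = b"TOR-CIPHER-LIST"


def split_into_segments(payload, max_size):
    """Split a payload into TCP-like segments of at most max_size bytes."""
    return [payload[i:i + max_size] for i in range(0, len(payload), max_size)]


def matches_per_segment(segments, fingerprint=FINGERPRINT):
    """Model a DPI box that inspects every segment in isolation."""
    return any(fingerprint in segment for segment in segments)


def matches_reassembled(segments, fingerprint=FINGERPRINT):
    """Model a DPI box that reassembles the stream before matching."""
    return fingerprint in b"".join(segments)
```

If every segment is shorter than the fingerprint, `matches_per_segment` cannot succeed, while `matches_reassembled` still finds the pattern. Under this assumption, fragmentation only helps against a firewall that does not reassemble streams.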

Data

We first published a technical report about our findings, which was then followed by a peer-reviewed workshop paper. We recommend reading the workshop paper, listed under Papers, as it contains several updates and corrections over the original technical report. The ;login: article contains fewer technical details and is easier to read.

All of the code listed under Software is licensed under the GPLv3.

A large part of our study consisted of the analysis of data we gathered by attracting numerous Chinese scanners. We configured all our bridges to be private (i.e., only known to ourselves) and to listen on randomly chosen high TCP ports. That way, we can be sure that our data only contains automated scanners and no real users. The raw data is available below.

Filename: Description

scanning_connections.csv: Connection data of all scanners which were found connecting to our bridge. The file contains four columns: IP address, source port, UNIX timestamp when the scanner connected and the sent data.

scanners_asn.txt: The autonomous system numbers of the scanning IP addresses. The ASN was resolved using Team Cymru's IP to ASN Lookup tool.

scanners_reverse_lookup.txt: All valid reverse DNS lookups of the observed scanners. The lookups were done using Google's open DNS server 8.8.8.8.
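As a starting point for working with scanning_connections.csv, the following Python sketch parses the four columns described above and counts connections per scanner IP address. It assumes the file is comma-separated and that a row whose port or timestamp is not numeric (for example a header row) can be skipped; the exact file layout is not specified here, so adjust the parser if it differs. The names `ScannerConnection`, `parse_connections` and `connections_per_ip` are hypothetical.

```python
import csv
from collections import Counter
from datetime import datetime, timezone
from typing import NamedTuple


class ScannerConnection(NamedTuple):
    ip: str
    source_port: int
    connected_at: datetime
    sent_data: str


def parse_connections(lines):
    """Parse rows of: IP address, source port, UNIX timestamp, sent data."""
    records = []
    for row in csv.reader(lines):
        if len(row) != 4:
            continue  # skip blank or malformed rows
        ip, port, timestamp, data = row
        try:
            source_port = int(port)
            connected_at = datetime.fromtimestamp(float(timestamp), tz=timezone.utc)
        except ValueError:
            continue  # skip a header row or non-numeric fields
        records.append(ScannerConnection(ip, source_port, connected_at, data))
    return records


def connections_per_ip(records):
    """Count how many connections each scanner IP address made."""
    return Counter(record.ip for record in records)
```

To use it on the real file, open it with `open("scanning_connections.csv", newline="")` and pass the file object to `parse_connections`.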

Contact

Feel free to contact Philipp using phw at nymity dot ch. You can encrypt e-mails by using this OpenPGP public key.