At their peak they reached >10% of the Tor network’s guard capacity. A guard relay is the first of the 3 relays forming a Tor circuit. It is the only relay that sees the Tor user’s real IP address, but it does not see the destination the user accesses.

To give you a feeling for their size in relation to other known operators:

The biggest known guard relay operator as of 2019–12–08 is below 2% guard capacity.

After I reported them to the Tor Project, the relays I initially knew about were removed, but it did not take them long to set up new ones.

To this day (2019–12–08) they are actively running high-bandwidth relays on the Tor network. Due to the sheer size of this particular adversary I had some hope that this discovery would act as a wake-up call and finally spark some improvements. Unfortunately, it has not done so yet.

Why didn’t we detect them earlier?

Initially their capacity was somewhat limited, and most of it was added over the course of the past year. But a year is still a very long time to go undetected.

To avoid detection they spread their relays across multiple hosting providers and added them relatively slowly over a long period of time.

They make use of the biggest Tor hosters (OVH and Hetzner) to blend in with the rest, but they also make use of hosters rarely seen before they joined (e.g. AS20860). In fact their relays made the autonomous system “Iomart Cloud Services” (AS20860) so big that it is now the 6th biggest ASN by guard capacity on the Tor network:

Whenever I retroactively find out about malicious relays, I look into the OrNetRadar archives to see if it triggered on them when they initially joined the Tor network. It actually did in multiple cases. Here are a few examples:

2017–09–15 OVH

2018–03–05 OVH

2018–12–19 Hetzner (the last 4 entries in the table)

2018–12–30 Iomart Cloud Services
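The kind of check that produces such entries can be sketched roughly as follows. This is a minimal illustration and not OrNetRadar's actual logic: it assumes relay records shaped like Onionoo details documents (fields `nickname`, `as`, `first_seen`), and the function name, the threshold, and the sample relays are all hypothetical.

```python
from collections import defaultdict


def new_relay_groups(relays, min_size=3):
    """Group relays by (join date, AS) and keep groups with at least
    `min_size` members. `min_size` is an arbitrary illustrative threshold."""
    groups = defaultdict(list)
    for relay in relays:
        day = relay["first_seen"][:10]  # "YYYY-MM-DD hh:mm:ss" -> "YYYY-MM-DD"
        groups[(day, relay["as"])].append(relay["nickname"])
    return {key: names for key, names in groups.items() if len(names) >= min_size}


# Made-up sample data using AS numbers reserved for documentation.
sample = [
    {"nickname": "relayA", "as": "AS64500", "first_seen": "2018-12-30 10:00:00"},
    {"nickname": "relayB", "as": "AS64500", "first_seen": "2018-12-30 11:00:00"},
    {"nickname": "relayC", "as": "AS64500", "first_seen": "2018-12-30 12:00:00"},
    {"nickname": "relayD", "as": "AS64501", "first_seen": "2018-12-30 09:00:00"},
    {"nickname": "relayE", "as": "AS64500", "first_seen": "2018-12-31 09:00:00"},
]
flagged = new_relay_groups(sample)
for (day, asn), names in flagged.items():
    print(day, asn, names)
```

A grouping this simple is easy to evade by adding relays slowly and across many providers, which is exactly what this operator did.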

Unnatural growth

Everyone can add as many relays as they like; technically there are no restrictions. This helps the Tor network grow and the design of the Tor network even allows for (some) malicious relays without resulting in total loss of Tor’s privacy properties.

The Tor network is basically run under the assumption that no one is actually exploiting the unrestricted possibility to add new relays or that we can at least detect Sybil attacks and remove them before they cause (too much) harm. The past year has shown that it is not only possible to exploit this but it is actually being exploited without sufficient detective countermeasures being in place.

The open mode of operation was fine until probably somewhere around 2017, but after that so much non-attributable guard capacity was added that it no longer looked like natural growth. Two and a half years ago Roger Dingledine gave a talk at DEFCON 25 where he mentioned that he knows 2/3 of the Tor network by capacity. This was probably somewhat true back then, but the Tor network has changed significantly since then.

The graph below shows the significant guard-only capacity growth in Gbit/s during the past years

Between 2017–10–01 and 2019–10–01 the advertised guard-only bandwidth increased from 130 to >250 Gbit/s. (data source https://metrics.torproject.org)

while the number of relays actually decreased:

While the Tor network capacity significantly increased, the possibility to attribute relays to operators significantly decreased: in the last year the amount of guard capacity with no ContactInfo increased from <30% to >45%. Most of this can probably be attributed to the discovered Sybil, since they had no ContactInfo. The graph below shows the amount of guard probability that has no ContactInfo over the past 3 years.
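For readers who want to reproduce this kind of number, here is a minimal sketch of the calculation. It assumes relay records shaped like Onionoo details documents (fields `flags`, `contact`, `guard_probability`); the function name is hypothetical and the sample relays are made up, so the printed share says nothing about the real network.

```python
def no_contact_guard_share(relays):
    """Return the fraction of total guard probability held by guard relays
    whose `contact` field is missing or blank."""
    total = 0.0
    anonymous = 0.0
    for relay in relays:
        if "Guard" not in relay.get("flags", []):
            continue  # only relays with the Guard flag are counted
        prob = relay.get("guard_probability", 0.0)
        total += prob
        if not (relay.get("contact") or "").strip():
            anonymous += prob
    return anonymous / total if total else 0.0


# Made-up sample data, for illustration only.
sample = [
    {"flags": ["Guard", "Running"], "contact": "admin@example.org", "guard_probability": 0.02},
    {"flags": ["Guard", "Running"], "contact": "", "guard_probability": 0.01},
    {"flags": ["Guard", "Running"], "guard_probability": 0.01},
    {"flags": ["Exit", "Running"], "contact": None, "guard_probability": 0.0},
]
share = no_contact_guard_share(sample)
print(f"{share:.0%} of guard probability has no ContactInfo")
```

Weighting by guard probability rather than counting relays matters here, because a handful of high-bandwidth relays can carry far more client traffic than many small ones.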