This article is based on observations in the ARAKIS system, which is built on top of a network of honeypots.



1. Introduction

In recent weeks we have continued to observe a significant increase in uTorrent (uTP-based) network activity. Some of the recorded traffic triggered high-level alerts in the ARAKIS system, warning of possible node infections. What is more, according to the traffic data, two of the ARAKIS honeypot sensors appeared to be talking to each other, which is very unlikely because honeypot sensors do not initiate transmissions. This means that the IP addresses in those packets were incorrect (or forged). In this report we summarize the findings from our analysis of this activity.

The OSSTMM 3 methodology gives the following definition of an anomaly:



Anomaly is any unidentifiable or unknown element which has not been controlled and cannot be accounted for in normal operations. […] An anomaly may be an early sign of a security problem. Since unknowns are elements which cannot be controlled, a proper audit requires noting any and all anomalies. […] In PHYSSEC, an anomaly can be dead birds discovered on the roof of a building around communications equipment. […] In COMSEC telecommunications, an anomaly can be a modem response from a number that has no modem. In SPECSEC, an anomaly can be a local signal that cannot be properly located nor does it do any known harm.

2. What is uTP exactly?

The uTP protocol uses UDP for transport and complements it with connection-oriented features. It encapsulates standard BitTorrent packets, which means that regular BitTorrent traffic takes place inside a communication channel created in the uTP layer. This channel has some features typical of TCP channels, including connection orientation and congestion control. Standard BitTorrent packets rely on TCP facilities such as acknowledgement of received data and window size regulation; in this case, however, these facilities are provided by uTP.

Why do we need another protocol for that when TCP could be used instead? The UDP/uTP/BT stack is also responsible for bandwidth congestion control. When a user downloads data from an HTTP or FTP server, the download speed is limited on the server side. A distributed BitTorrent network has no such limitation. That is why, when a user downloads a large amount of data, BitTorrent traffic often consumes a large portion (or all) of the network bandwidth and effectively denies access to other networking applications. One possible solution is to apply the built-in rate limits on the client side. These are not very sophisticated, however, and users often forget about them. uTP, on the other hand, allows BitTorrent nodes to adjust their bandwidth usage dynamically at the protocol level. It also provides some additional functions, such as support for clients on low-bandwidth links or clients sharing an ADSL line with a web browser.

3. Our uTP observations (statistical)

According to our data, uTP activity (and, as a consequence, BT activity as well) has increased significantly in recent months compared to, for example, 2011. We selected traffic samples from April 1st 2011 and from the same day in 2012, from the same location. These are the statistical parameters of the analysed traffic:

2011.04.01:

-----------

packets: 103546 (8.7MB)

UDP: 183 (~0.2% of traffic)

anomaly: 0

2012.04.01:

-----------

packets: 2142296 (201MB)

UDP: 957047 (~45% of traffic)

anomaly: 393862 (~18% of whole traffic and ~41% of UDP traffic)

When creating this summary we made the following assumptions:

The analysed traffic (packets forming the traffic anomaly) consists of packets matching the characteristics of the packets that triggered ARAKIS alerts. These characteristics are as follows:

A packet forming the anomaly is a uTP packet with the DATA type set.

This packet encapsulates a BT packet which contains the following data:

BitTorrent protocol hash: 057a315b89b54e53e2ee583dd5cd9ef60648805e

BitTorrent protocol peer: 0000000000000000000000000000000000000000
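The matching rule above can be sketched in a few lines of Python. This is only an illustration, not the ARAKIS implementation: it assumes that the encapsulated BT packet is a standard BitTorrent handshake (length byte 19, the string "BitTorrent protocol", 8 reserved bytes, 20-byte hash, 20-byte peer ID), that the uTP header is the 20-byte version 1 header with no extension headers, and that the packet type sits in the high four bits of the first byte. The function name `matches_anomaly` is hypothetical.

```python
UTP_HEADER_LEN = 20           # fixed uTP v1 header size in bytes (assumed, no extensions)
ST_DATA = 0                   # uTP packet type used for data packets
BT_PSTR = b"BitTorrent protocol"
BT_HANDSHAKE_LEN = 1 + len(BT_PSTR) + 8 + 20 + 20   # 68 bytes

ANOMALY_HASH = bytes.fromhex("057a315b89b54e53e2ee583dd5cd9ef60648805e")
ZERO_PEER_ID = bytes(20)      # twenty zero bytes


def matches_anomaly(udp_payload: bytes) -> bool:
    """Return True if a UDP payload matches the anomaly characteristics."""
    if len(udp_payload) < UTP_HEADER_LEN + BT_HANDSHAKE_LEN:
        return False

    # First byte: packet type in the high nibble, version in the low nibble.
    packet_type = udp_payload[0] >> 4
    version = udp_payload[0] & 0x0F
    extension = udp_payload[1]
    if packet_type != ST_DATA or version != 1 or extension != 0:
        return False

    # The BT handshake is assumed to start right after the uTP header.
    bt = udp_payload[UTP_HEADER_LEN:]
    if bt[0] != len(BT_PSTR) or bt[1:20] != BT_PSTR:
        return False

    info_hash = bt[28:48]     # follows the 8 reserved bytes
    peer_id = bt[48:68]
    return info_hash == ANOMALY_HASH and peer_id == ZERO_PEER_ID
```

A filter like this would be applied to the UDP payload of each captured packet; packets with uTP extension headers are deliberately rejected here to keep the sketch short.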

We can relax these assumptions and also count uTP SYN packets with the same source and destination sockets as the uTP DATA packets as part of the anomaly. The summary for 1 April 2012 then looks like this:

anomaly: 787724 (~37% of whole traffic and ~82% of UDP traffic)

Our sensor replied to every packet of this kind with an ICMP message reporting a closed UDP port. If we add these replies to the summary, the result is striking:

anomaly: 1575448 (~73% of whole traffic)

It is plain to see that the share of uTP and related traffic is disproportionately large compared to the 2011 sample. The overall traffic volume also increased about 23-fold (201 MB versus 8.7 MB). We observed similar symptoms (high-level alerts, statistical disproportions in traffic) in various locations. All this in honeypot traffic!
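The shares quoted above can be recomputed directly from the packet counts. The short Python sketch below only redoes that arithmetic with the numbers from this section; the names `share` and `ANOMALY_COUNTS` are illustrative.

```python
# Packet counts copied from the 2012.04.01 sample in this section.
TOTAL_PACKETS = 2142296
UDP_PACKETS = 957047

ANOMALY_COUNTS = {
    "DATA only": 393862,
    "DATA + SYN": 787724,                   # twice the DATA-only count
    "DATA + SYN + ICMP replies": 1575448,   # one ICMP reply per packet
}


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Growth in traffic volume between the two samples (201 MB versus 8.7 MB).
volume_growth = 201 / 8.7

if __name__ == "__main__":
    for label, count in ANOMALY_COUNTS.items():
        print(f"{label}: {share(count, TOTAL_PACKETS):.1f}% of all packets")
    print(f"UDP share: {share(UDP_PACKETS, TOTAL_PACKETS):.1f}%")
    print(f"volume growth: {volume_growth:.1f}x")
```

Note that the ICMP replies are not UDP packets, so the last count is only meaningful as a share of the whole traffic.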

4. Packet analysis

Below we present a comparison of the TCP/BT and UDP/uTP/BT stacks. We will examine the data found in particular layers.

4.1. IP/UDP layer

Let’s start with the IP/UDP layer. Below is a summary of the source and destination addresses and ports of the analysed transmissions (incoming uTP traffic):

Source hosts (top 20):

count address

-----+-------------

613 xxx.xxx.192.40

463 xxx.xxx.67.237

463 xxx.xxx.17.54

459 xxx.xxx.115.38

373 xxx.xxx.40.233

367 xxx.xxx.158.104

362 xxx.xxx.183.36

360 xxx.xxx.177.102

360 xxx.xxx.102.55

347 xxx.xxx.221.41

347 xxx.xxx.176.105

344 xxx.xxx.122.46

342 xxx.xxx.208.110

338 xxx.xxx.233.116

335 xxx.xxx.231.118

330 xxx.xxx.180.245

328 xxx.xxx.68.104

318 xxx.xxx.132.178

315 xxx.xxx.142.110

310 xxx.xxx.64.228

Destination hosts:

count address

------+--------------

393850 xxx.xxx.xxx.34

4 xxx.xxx.xxx.27

Source ports (top 20):

count port

------+-----

133014 45571

79677 62100

39658 60598

35461 55025

30830 47013

29605 45770

11555 36610

9697 57902

5996 20995

4989 32692

3436 21512

3084 46957

2715 17632

2334 28328

575 1119

228 1926

10 6974

4 9425

4 9405

4 9159

Destination ports (top 20):

count port

------+-----

133007 45571

79672 62100

39657 60598

35461 55025

30829 47013

29604 45770

11554 36610

9697 57902

5996 20995

4989 32692

3436 21512

3084 46957

2715 17632

2334 28328

575 1119

228 1926

4 9425

4 9405

4 9159

4 8786

The source addresses have a rather uniform distribution if we omit a few addresses from the top of the list. These addresses originate from various autonomous systems in different geographic locations (including Russia, Canada, China, Australia and the USA). The distribution looks like the sum of a uniform and a certain non-uniform distribution. This might mean that some of these addresses are used in a uniform way (e.g. in turns) or that they are randomly chosen (forged).

The high geographical diversity makes the second explanation more convincing. Possibly some kind of distributed anonymising network is being employed. Randomly chosen addresses failed our test for TOR nodes; however, there are other possibilities, such as other anonymising networks, VPN services or botnets.

As for source and destination ports, we notice that high ports are preferred and that nearly the same set of ports appears as both source and destination.

4.2. uTP layer

uTP packet header (a detailed description is available on the specification site):

0       4       8               16              24              32
+-------+-------+---------------+---------------+---------------+
| type  | ver   | extension     | connection_id                 |
+-------+-------+---------------+---------------+---------------+
| timestamp_microseconds                                        |
+---------------+---------------+---------------+---------------+
| timestamp_difference_microseconds                             |
+---------------+---------------+---------------+---------------+
| wnd_size                                                      |
+---------------+---------------+---------------+---------------+
| seq_nr                        | ack_nr                        |
+---------------+---------------+---------------+---------------+

Incoming traffic forming the anomaly can be viewed as a series of flows originating from various sources, each consisting of two uTP packets: a uTP SYN and a uTP DATA. These packets are parts of the uTP handshake mechanism used to set up a uTP session. However, the flows do not follow the protocol in several respects:

4.2.1. Transmission source ignores ICMP messages informing about closed UDP ports.

The uTP specification does not explicitly define the proper reaction to ICMP. However, this kind of message should be interpreted by the client as information that the connection it is trying to make is impossible.

4.2.2. Transmission source ignores lack of uTP STATE packet and sends a DATA packet.

Without the STATE packet, the source does not know the proper acknowledgement number and is unable to establish a proper connection. Instead of a correct acknowledgement number it places 0, which violates the protocol. During our research we sent similarly constructed packets to a uTorrent client, and it responded with a FIN packet, that is, it terminated the connection. We suspect this is an effect of not following the protocol.
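To make the header fields and the ack_nr violation concrete, here is a minimal Python sketch that unpacks the 20-byte header and flags DATA packets carrying an acknowledgement number of 0. It assumes the version 1 header layout from the uTP specification, with the packet type in the high four bits of the first byte and all fields in network byte order. The names `UtpHeader`, `parse_utp_header` and `has_zero_ack_data` are hypothetical, and the zero check is only a heuristic, since a legitimate peer could in principle pick 0 as a sequence number.

```python
import struct
from dataclasses import dataclass

# Packet type values as defined in the uTP specification.
UTP_TYPES = {0: "ST_DATA", 1: "ST_FIN", 2: "ST_STATE", 3: "ST_RESET", 4: "ST_SYN"}


@dataclass
class UtpHeader:
    type: int
    version: int
    extension: int
    connection_id: int
    timestamp_us: int
    timestamp_diff_us: int
    wnd_size: int
    seq_nr: int
    ack_nr: int


def parse_utp_header(payload: bytes) -> UtpHeader:
    """Unpack the fixed 20-byte uTP header from the start of a UDP payload."""
    if len(payload) < 20:
        raise ValueError("payload shorter than a uTP header")
    first, ext, conn_id, ts, ts_diff, wnd, seq, ack = struct.unpack(
        "!BBHIIIHH", payload[:20]
    )
    return UtpHeader(first >> 4, first & 0x0F, ext, conn_id, ts, ts_diff, wnd, seq, ack)


def has_zero_ack_data(header: UtpHeader) -> bool:
    """Heuristic: a DATA packet that acknowledges nothing (ack_nr == 0)."""
    return UTP_TYPES.get(header.type) == "ST_DATA" and header.ack_nr == 0
```

For example, a header built with `struct.pack("!BBHIIIHH", 0x01, 0, 1234, 1000, 0, 0x380000, 2, 0)` parses as a version 1 DATA packet with a window size of 3670016 and is flagged by `has_zero_ack_data`.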

4.2.3. Other interesting or otherwise unusual properties of uTP packets:

Most of the packets had the value 0 in the timestamp difference field:

timestamp difference microseconds (top 10):

count value

------+---------

392850 0000 0000

10 6f63 6f6c

4 fd7a 9ff1

4 fb26 16d3

4 fa5b c083

4 f9e9 d86d

4 f946 4a14

4 f937 8df9

4 f83c a31a

4 f69a ecdf

Most of the packets had a value of 0x380000 (3670016 decimal) in the window size field.

window size (top 10):

count value

------+---------

393017 0038 0000

738 0004 0000

50 0000 0000

46 0003 2000

4 0003 9999

4 0003 3333

4 0000 1302

3 0001 937c

3 0000 0e32

2 0002 1dfd

The values of the timestamp field in two consecutive packets of each flow differ by a multiple of 10000 microseconds.

Some of these properties may result from differing protocol implementations, but others make the use of this protocol pointless. With a time resolution this high, it is very unlikely that such round values are legitimate. If these values are forged, the protocol loses its congestion control features and using it for transport makes no sense.
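The round-timestamp observation is easy to test on a pair of packets from one flow. The sketch below is an illustration under the assumption that the timestamp field is a 32-bit microsecond counter that may wrap around; the function names and the default step of 10000 microseconds (taken from the observation above) are illustrative.

```python
def timestamp_delta_us(first: int, second: int) -> int:
    """Difference between two 32-bit uTP timestamps, allowing for wrap-around."""
    return (second - first) % (1 << 32)


def is_round_delta(first: int, second: int, step_us: int = 10_000) -> bool:
    """True if two consecutive timestamps differ by a non-zero multiple of step_us."""
    delta = timestamp_delta_us(first, second)
    return delta != 0 and delta % step_us == 0
```

With a genuine microsecond clock, only about one delta in ten thousand would be expected to land on an exact multiple of 10000 by chance, so a flow where every pair does is suspicious.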

4.3. BitTorrent layer

Encapsulated BT packets contain the hash value 057a315b89b54e53e2ee583dd5cd9ef60648805e. This hash identifies a torrent containing the film “Avgust. Vosmogo”, a Russian action movie which premiered on 21st February 2012. While analysing traffic to other locations we registered similar packets (in the UDP/uTP/BT stack) containing hashes of torrents with this film and with another Russian film, “Shpion” (more on the hash field in BT packets: http://wiki.theory.org/BitTorrentSpecification#Tracker_Request_Parameters).

Below we present interesting time correlations between the publication of a torrent and the registered transmissions containing the corresponding hash:

hash – publication date – transmission date

c11ba392ef3dd57942112641ce8f1d9b96f0ddd5 – 26.02.2012 – 17.03.2012

057a315b89b54e53e2ee583dd5cd9ef60648805e – 17.03.2012 – 01.04.2012

In the analysed traffic from April 1st 2012, 99.99% of uTP DATA packets contained the hash 057a315b89b54e53e2ee583dd5cd9ef60648805e.

These packets also contained BT peer IDs (more on peer ID field in BT: http://wiki.theory.org/BitTorrentSpecification#peer_id).

peer IDs (top 10):

count value

------+-------------------------------------------------

329755 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

399 2d54 5232 3432 302d xxxx xxxx xxxx xxxx xxxx xxxx // -TR2420-xxxxxxxxxxxx

214 2d54 5232 3530 302d xxxx xxxx xxxx xxxx xxxx xxxx // -TR2500-xxxxxxxxxxxx

213 2d54 5232 3432 302d xxxx xxxx xxxx xxxx xxxx xxxx // -TR2420-xxxxxxxxxxxx

100 2d55 5432 3231 302d xxxx xxxx xxxx xxxx xxxx xxxx // -UT2210-xxxxxxxxxxxx

97 2d55 5431 3737 302d xxxx xxxx xxxx xxxx xxxx xxxx // -UT1770-xxxxxxxxxxxx

95 2d54 5231 3933 302d xxxx xxxx xxxx xxxx xxxx xxxx // -TR1930-xxxxxxxxxxxx

93 2d54 5231 3933 302d xxxx xxxx xxxx xxxx xxxx xxxx // -TR1930-xxxxxxxxxxxx

91 2d55 5433 3132 302d xxxx xxxx xxxx xxxx xxxx xxxx // -UT3120-xxxxxxxxxxxx

90 2d4d 4732 3125 302d xxxx xxxx xxxx xxxx xxxx xxxx // -MG21%0-xxxxxxxxxxxx

As you can see, in most of the packets the peer ID contained only zeroes.

We queried popular public trackers about these hashes. These are the results we received:
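The ASCII comments in the table follow the common "-XXnnnn-" peer ID convention, in which two letters name the client and four characters encode its version. The sketch below decodes such a prefix from raw bytes. It is only an illustration: the `KNOWN_CLIENTS` table lists just the two client codes we are sure of (TR for Transmission, UT for uTorrent), and `describe_peer_id` is a hypothetical name.

```python
import re

# Deliberately incomplete mapping of two-letter client codes.
KNOWN_CLIENTS = {"TR": "Transmission", "UT": "uTorrent"}

# "-", two letters, four version characters, "-" at the start of the peer ID.
AZ_STYLE = re.compile(rb"^-([A-Za-z]{2})(.{4})-", re.DOTALL)


def describe_peer_id(peer_id: bytes) -> str:
    """Give a short human-readable description of a 20-byte peer ID."""
    if len(peer_id) != 20:
        raise ValueError("a peer ID is exactly 20 bytes long")
    if peer_id == bytes(20):
        return "all zeroes"
    match = AZ_STYLE.match(peer_id)
    if match is None:
        return "unknown format"
    code = match.group(1).decode("ascii")
    version = match.group(2).decode("ascii", errors="replace")
    name = KNOWN_CLIENTS.get(code, "unknown client")
    return f"{name} ({code}, version field {version})"
```

For instance, the bytes 2d 55 54 32 32 31 30 2d followed by twelve arbitrary bytes are described as a uTorrent peer ID with version field 2210, while twenty zero bytes are reported as "all zeroes".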

tracker – seeders – peers

– 057a315b89b54e53e2ee583dd5cd9ef60648805e – Avgust. Vosmogo

https://torrentproject.com

http://xxxxxxxxxxxx:80/announce – 17 – 0

http://xxxxxxxxxxxx/announce – 16 – 0

http://xxxxxxxxxxxx:6969/announce – 11 – 0

http://xxxxxxxxxxxx/announce – 12 – 0

http://xxxxxxxxxxxx/announce – 9 – 0

http://xxxxxxxxxxxx/announce – 4 – 1

http://torrentz.eu

http://xxxxxxxxxxxx:2710/announce – 172 – 9

udp://xxxxxxxxxxxx:80/announce – 17 – 0

udp://xxxxxxxxxxxx:80/announce – 10

http://xxxxxxxxxxxx/announce.php – 3 – 0

udp://xxxxxxxxxxxx:80/announce – 3 – 0

http://xxxxxxxxxxxx/announce.php – 2 – 0

http://bitsnoop.com

http://xxxxxxxxxxxx/announce.php – 3 – 0

http://xxxxxxxxxxxx/announce.php – 0 – 1

http://xxxxxxxxxxxx/announce – 1 – 0

http://xxxxxxxxxxxx:3310/announce – 0 – 1

http://xxxxxxxxxxxx:6969/announce – 0 – 0

http://xxxxxxxxxxxx/announce.php – 2 – 0

http://xxxxxxxxxxxx/announce – 1 – 0

http://xxxxxxxxxxxx/announce – 1 – 0

http://xxxxxxxxxxxx/announce – 1 – 0

http://xxxxxxxxxxxx/announce – 15 – 0

udp://xxxxxxxxxxxx:80/announce – 13 – 0

– c11ba392ef3dd57942112641ce8f1d9b96f0ddd5 – Avgust. Vosmogo

http://torrentz.eu

udp://xxxxxxxxxxxx:80/announce – 62 – 0

https://torrentproject.com

http://xxxxxxxxxxxx/announce – 22 – 0

http://xxxxxxxxxxxx/announce – 26 – 0

http://xxxxxxxxxxxx:6969/announce – 27 – 0

http://xxxxxxxxxxxx:80/announce – 43 – 0

http://xxxxxxxxxxxx/announce – 42 – 0

udp://xxxxxxxxxxxx/announce – 50 – 1

It is plain to see that the peer counts are very low. For comparison, we present information about other torrents, both very popular (the Lost series) and less popular (Naruto, Nocznoj Dozor):

– ab53cb0d665b34fcdf1939b271660b48297b5a74 – Lost

http://torrentz.eu

http://xxxxxxxxxxxx:6969/announce – 167 – 932

udp://xxxxxxxxxxxx:80/announce – 157 – 763

udp://xxxxxxxxxxxx:80/announce – 144 – 779

http://xxxxxxxxxxxx:8000/announce – 19 – 188

http://xxxxxxxxxxxx:8080/announce – 16 – 176

http://xxxxxxxxxxxx/announce – 16 – 194

http://xxxxxxxxxxxx/announce – 8 – 23

http://xxxxxxxxxxxx:3310/announce – 6 – 91

http://xxxxxxxxxxxx:6969/announce – 4 – 38

http://xxxxxxxxxxxx/announce.php – 1 – 0

http://xxxxxxxxxxxx/announce.php – 1 – 1

udp://xxxxxxxxxxxx:80/announce – 1 – 11

http://xxxxxxxxxxxx:6969/announce – 0 – 1

http://xxxxxxxxxxxx:9090/announce – 0 – 17

http://xxxxxxxxxxxx:6969/announce – 0 – 32

http://xxxxxxxxxxxx/announce – 0 – 1

http://xxxxxxxxxxxx/announce – 0 – 24

https://torrentproject.com

http://xxxxxxxxxxxx/announce – 139 – 829

http://xxxxxxxxxxxx/announce – 142 – 830

http://xxxxxxxxxxxx:6969/announce – 173 – 948

http://xxxxxxxxxxxx:80/announce – 189 – 1012

http://xxxxxxxxxxxx/announce – 192 – 1016

udp://xxxxxxxxxxxx/announce – 182 – 982

– 799db5ad3c746823a8df170bb1a717835c1dccc8 – Naruto

http://torrentz.eu

http://xxxxxxxxxxxx:3310/announce – 169 – 16

http://xxxxxxxxxxxx:6969/announce – 75 – 12

http://xxxxxxxxxxxx:2710/announce – 71 – 11

http://xxxxxxxxxxxx:8000/announce – 68 – 11

http://xxxxxxxxxxxx:8080/announce – 57 – 10

http://xxxxxxxxxxxx:6969/announce – 23 – 5

http://xxxxxxxxxxxx/announce – 6 – 0

http://xxxxxxxxxxxx/announce – 6 – 2

http://xxxxxxxxxxxx:6969/announce – 3 – 5

udp://xxxxxxxxxxxx:80/announce – 2 – 0

http://xxxxxxxxxxxx:6969/announce – 2 – 2

https://torrentproject.com

http://xxxxxxxxxxxx/announce – 64 – 8

http://xxxxxxxxxxxx/announce – 53 – 4

http://xxxxxxxxxxxx:6969/announce – 64 – 5

http://xxxxxxxxxxxx:80/announce – 57 – 4

http://xxxxxxxxxxxx/announce – 57 – 4

udp://xxxxxxxxxxxx/announce – 60 – 5

– db839a0773d32a8def122f5e930b2ccf4a21ead2 – Nocznoj Dozor

http://torrentz.eu

http://xxxxxxxxxxxx:3310/announce – 27 – 7

http://xxxxxxxxxxxx:6969/announce – 20 – 3

udp://xxxxxxxxxxxx:80/announce – 13 – 2

udp://xxxxxxxxxxxx:80/announce – 1 – 0

https://torrentproject.com

http://xxxxxxxxxxxx/announce – 18 – 6

http://xxxxxxxxxxxx/announce – 14 – 7

http://xxxxxxxxxxxx:6969/announce – 13 – 4

http://xxxxxxxxxxxx:80/announce – 15 – 3

http://xxxxxxxxxxxx/announce – 13 – 3

udp://xxxxxxxxxxxx/announce – 17 – 3

When examining the data for the analysed torrents we usually see a large number of so-called seeders and a very small number of peers (usually 0). This is a rather uncommon distribution. In most other cases these proportions are much less clear-cut (if not reversed).
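The difference in proportions can be expressed as a single number, the count of peers per seeder summed over trackers. The Python sketch below does this for two of the torrentproject.com listings above; the (seeders, peers) pairs are copied from those listings, while the function name and the choice of this particular ratio are our illustration only.

```python
def peers_per_seeder(rows):
    """Total peers divided by total seeders over a list of (seeders, peers) pairs."""
    seeders = sum(s for s, _ in rows)
    peers = sum(p for _, p in rows)
    return peers / seeders if seeders else float("inf")


# (seeders, peers) per tracker, from the torrentproject.com listings above.
AVGUST_057A = [(17, 0), (16, 0), (11, 0), (12, 0), (9, 0), (4, 1)]
LOST = [(139, 829), (142, 830), (173, 948), (189, 1012), (192, 1016), (182, 982)]

if __name__ == "__main__":
    print(f"Avgust. Vosmogo (057a...): {peers_per_seeder(AVGUST_057A):.3f}")
    print(f"Lost: {peers_per_seeder(LOST):.2f}")
```

For the analysed torrent this gives roughly 0.01 peers per seeder (1 peer for 69 seeders), against about 5.5 for the Lost torrent.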

5. Hypotheses

Our analysis of the traffic forming the anomaly in our honeynet did not clearly determine its source or purpose. However, we were able to formulate several hypotheses regarding this traffic.

5.1. Sources of anomaly

5.1.1. Addresses are forged

This possibility is supported by the uniform component of the address distribution and, above all, by source addresses belonging to our honeynet nodes, which do not initiate transmissions.

5.1.2. Addresses are legitimate

During detailed analysis we discovered a connection-oriented TCP session transporting BT packets that partially match the anomaly characteristics (they contain the same hash value). Creating a TCP session using a forged source IP address is possible, but very difficult and virtually impossible across a WAN, which is why we suspect that the source address of this transmission is legitimate. It is also very unlikely (mainly due to time correlations) that this session is completely unrelated to the anomaly. It contained a similar BT packet (the same hash value) but with a non-zero peer ID. It is possible that the node using this address is an ordinary uTP or BT client.

5.1.3. The set of source addresses contains forged addresses alongside legitimate ones.

This is the most likely hypothesis.

5.1.4. Source and destination ports are standard ports for uTP conversations.

We have not delved into this aspect. According to our observations during tests with uTorrent, this client listens for connections on port 12144 by default. This port can be changed by the user or chosen randomly. It is possible that the set of ports was created based on information from public trackers.

5.1.5. Source of transmission is not a legitimate uTP client but other program or script whose purpose is different from data sharing.

Properties of the uTP packets indicate this: round differences between timestamps, protocol violations and ignored ICMP messages. The choice of destination addresses from the ARAKIS honeynet also supports this hypothesis.

5.2. Anomaly effects.

5.2.1. Transmissions’ purpose is uTP/BT network poisoning

According to data collected from publicly available trackers, the seeders-to-peers ratio of torrents with the hash 057a315b89b54e53e2ee583dd5cd9ef60648805e and other hashes corresponding to the film “Avgust. Vosmogo” is uncommon and may be a result of poisoning uTP/BT network elements (e.g. by distributing forged peer data consisting of series of zeroes).

5.2.2. Transmissions’ purpose is uTP/BT network mapping

It is possible that these transmissions are used to create some kind of map of uTP/BT network nodes. A mapping program or script would classify a tested node as a uTP client based on its responses:

Correct uTP response – node is a uTP client

Incorrect uTP response – node isn’t a uTP client

But some facts render this hypothesis unlikely:

Despite the ICMP Port Unreachable message the source sends another packet, even though it could already classify the node based on this response.

Some packets contain a forged source address, which makes receiving an answer, and therefore classification, virtually impossible.

5.2.3. Transmissions constitute an attack on IT systems

It is possible that the packets we are observing are exploits, capable of introducing corruption conditions in some applications. Superficial research on uTorrent vulnerabilities does not support this hypothesis, but we do not have enough knowledge of other clients or unpublished vulnerabilities to reject it completely. It is possible that some properties of the uTP packets (e.g. zeroes in the ack_nr field) or BT packets (e.g. series of zeroes in the peer ID field) could create corruption conditions in some applications.

By its nature (a large share of daily network traffic) the anomaly produces visible disruption in IT systems, and the large number of false-positive high-level alerts in our system is good proof of this. Under Polish law, the European Convention on Cybercrime and the U.S. Code (and probably many other sources of domestic law) the legality of the process producing the anomaly is questionable.

Polish Penal Code, ch. XXXIII, Art. 269a (no translation available)

Convention on Cybercrime, Chapter 2, Article 5:

Each Party shall adopt such legislative and other measures as may be necessary to establish as criminal offences under its domestic law, when committed intentionally, the serious hindering without right of the functioning of a computer system by inputting, transmitting, damaging, deleting, deteriorating, altering or suppressing computer data. […]

Article 7:

Each Party shall adopt such legislative and other measures as may be necessary to establish as criminal offences under its domestic law, when committed intentionally and without right, the input, alteration, deletion, or suppression of computer data, resulting in inauthentic data with the intent that it be considered or acted upon for legal purposes as if it were authentic, regardless whether or not the data is directly readable and intelligible. A Party may require an intent to defraud, or similar dishonest intent, before criminal liability attaches.

5.2.4. Observed traffic is – at least partially – an echo

It is possible that parts of the recorded traffic are an echo of remote network events (incidents).

5.3. Purpose of anomaly

5.3.1. Network mapping / poisoning

Data collected from public trackers support this hypothesis. Without delving into the details of torrent client reactions, it is plain to see that trackers register a small number of peers downloading the analysed resources. This may be an effect of the process that produces the anomaly, which we are currently unable to understand fully. At least one interest group that would benefit from uTP poisoning is easy to point to: multimedia companies and their subcontractors. A campaign of this kind conducted by these institutions would not be unprecedented. It is also possible that the generated traffic is used for BitTorrent network mapping and data gathering for later use in other projects.

5.3.2. Broken implementation / experiment

The failure to follow the uTP protocol could be the result of errors in the uTP implementation of a less popular or immature BitTorrent client. It is also possible that the anomaly is the effect of some kind of network experiment.

5.3.3. Camouflaging traffic

Assuming that the incoming traffic contains both legitimate and forged source addresses, we consider the possibility that part of the traffic has a specific purpose while the rest is camouflage. Traffic with legitimate addresses and acknowledgement numbers (which does not terminate the session), despite its small share, could be effective in poisoning or mapping the uTP network.

5.3.4. Echoes from attacks on other networks

It is possible that parts of the traffic we are observing in our honeynets are an echo of attacks conducted against other networks. The parts of the traffic that support this hypothesis are packets with the uTP STATE type set (the equivalent of TCP SYN/ACK flags in DDoS echoes) and connection-oriented transmissions with seemingly legitimate peer IDs.

6. Summary

During our analysis of the abnormal traffic we did not come to a clear conclusion about its source or purpose. That is why, in the context of the OSSTMM methodology, it remains an anomaly observed by the systems protecting our networks. It is not an anomaly that arises spontaneously (i.e. with no human intervention), like for example (in the COMSEC channel) ghost energy on Ethernet media:

Cisco CCNA:

Fluke Networks has coined the term ghost to mean energy (noise) detected on the cable that appears to be a frame, but is lacking a valid SFD.

The recorded packets are complex and were certainly designed. The open question is whether their specific form and volume are the result of intentional actions (e.g. network poisoning).

In section 5 we proposed some hypotheses regarding this anomaly.

It is possible that further research will allow us to fully understand the process responsible for this traffic and remove it from the anomaly category.

Glossary

μTP – (also: uTP) transport protocol used to carry BitTorrent packets

μTorrent – (also: uTorrent) BitTorrent client, uTP-compatible

BitTorrent – (also: BT) distributed P2P network and protocol designed for file sharing

ARAKIS – network incident analysis and aggregation system designed and operated by NASK

OSSTMM – Open Source Security Testing Methodology Manual

ICMP – Internet Control Message Protocol

TCP – Transmission Control Protocol, connection-oriented transport protocol

IP – Internet Protocol, popular networking protocol

UDP – User Datagram Protocol, connectionless transport protocol