What is BitTorrent and how it helps in next-generation data delivery

authpaper · Oct 22, 2018 · 5 min read

BitTorrent is a protocol for distributing data and electronic files between peers. It was invented in 2001 by Bram Cohen.

The biggest advantage of BitTorrent over an ordinary client-server download is that when many downloads of the same file happen concurrently, the downloading peers upload pieces to each other, so the file source can support a very large number of downloads with only a modest increase in its load. BitTorrent is one of the most common protocols for transferring large files, such as digital video files containing TV shows or video clips, or digital audio files containing songs. According to Palo Alto Networks, as of February 2013 BitTorrent was responsible for 3.35% of all worldwide bandwidth, more than half of the 6% of total bandwidth dedicated to file sharing. As of 2013, BitTorrent had 15–27 million concurrent users at any time.

BitTorrent Architecture. Image from http://dukeinstitution.org

The BitTorrent protocol can be used to reduce the server and network impact of distributing large files. Rather than downloading a file from a single source server, the BitTorrent protocol allows users to join a “swarm” of hosts that upload to and download from each other simultaneously. The protocol is an alternative to the older technique of distributing data from a single source or from multiple mirror sources, and it can work effectively over networks with lower bandwidth.

A user who wants to upload a file first creates a small torrent descriptor file that they distribute by conventional means (web, email, etc.). They then make the file itself available through a BitTorrent node acting as a seed. Those with the torrent descriptor file can give it to their own BitTorrent nodes, or peers, which download the file by connecting to the seed and/or other peers.

The file being distributed is divided into segments called pieces. As each peer receives a new piece of the file, it becomes a source (of that piece) for other peers, relieving the original seed from having to send that piece to every computer or user wishing a copy. With BitTorrent, the task of distributing the file is shared by those who want it; it is entirely possible for the seed to send only a single copy of the file itself and still eventually reach an unlimited number of peers.

Each piece is protected by a cryptographic hash contained in the torrent descriptor. This ensures that any modification of the piece can be reliably detected, and thus prevents both accidental and malicious modifications of any of the pieces received at other nodes. If a node starts with an authentic copy of the torrent descriptor, it can verify the authenticity of the entire file it receives. Distributed downloading protocols in general provide redundancy against system problems and reduce dependence on the original distributor.
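The per-piece hash check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses SHA-1, as the original BitTorrent specification does, a deliberately tiny piece length, and hypothetical helper names (`split_pieces`, `piece_hashes`, `verify_piece`) that do not come from any real client.

```python
import hashlib

# Illustrative only: real torrents use much larger pieces (for example 256 KiB).
PIECE_LENGTH = 4


def split_pieces(data: bytes, piece_length: int = PIECE_LENGTH) -> list:
    """Cut the payload into fixed-size pieces; the last one may be shorter."""
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]


def piece_hashes(data: bytes, piece_length: int = PIECE_LENGTH) -> list:
    """Hashes the publisher would place in the torrent descriptor."""
    return [hashlib.sha1(p).digest() for p in split_pieces(data, piece_length)]


def verify_piece(index: int, piece: bytes, expected: list) -> bool:
    """A downloader accepts a piece only if its hash matches the descriptor."""
    return hashlib.sha1(piece).digest() == expected[index]


payload = b"hello swarm!"
expected = piece_hashes(payload)   # published by the seed
pieces = split_pieces(payload)     # what peers exchange

print(all(verify_piece(i, p, expected) for i, p in enumerate(pieces)))
print(verify_piece(0, b"hEll", expected))  # a tampered piece fails the check
```

Because each piece is checked independently, a peer can discard a bad piece and request it again from someone else without restarting the whole download.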

From BitTorrent Wikipedia

In the original design of BitTorrent, every torrent file must include tracker server information so that newly arriving nodes can get the current peer list from the tracker server. Tracker servers also help coordinate efficient transmission and reassembly of the copied file. However, if the tracker server goes down, new peers can no longer find the swarm and the torrent soon dies. Since the creation of the distributed hash table (DHT) method for “trackerless” torrents, BitTorrent trackers have largely become redundant. However, they are still often included with torrents to improve the speed of peer discovery.
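To make the tracker's role concrete, here is a toy in-memory stand-in written in Python. It is a sketch, not a real tracker: actual trackers answer announce requests over HTTP or UDP, and the class name `ToyTracker`, its `announce` method, the addresses, and the placeholder info-hash are all assumptions made for illustration.

```python
class ToyTracker:
    """In-memory stand-in for a tracker: maps an info-hash to the peers in its swarm."""

    def __init__(self):
        self.swarms = {}  # info_hash -> set of (host, port) tuples

    def announce(self, info_hash, peer):
        """Register `peer` in the swarm and return the other peers already known."""
        swarm = self.swarms.setdefault(info_hash, set())
        others = sorted(swarm - {peer})
        swarm.add(peer)
        return others


tracker = ToyTracker()
info_hash = "aa" * 20  # placeholder 40-hex-digit info-hash, not a real torrent

seed = ("10.0.0.1", 6881)
peer_a = ("10.0.0.2", 6881)
peer_b = ("10.0.0.3", 6881)

first = tracker.announce(info_hash, seed)     # nobody else is in the swarm yet
second = tracker.announce(info_hash, peer_a)  # learns about the seed
third = tracker.announce(info_hash, peer_b)   # learns about the seed and peer_a

print(first, second, third)
```

All peer discovery in this sketch goes through the single `tracker` object, which is exactly why a tracker outage cuts new peers off from the swarm.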

A DHT, or distributed hash table, is a class of decentralized distributed system that provides a lookup service similar to a hash table: (key, value) pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key. Responsibility for maintaining the mapping from keys to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows a DHT to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.
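The “minimal disruption” property can be illustrated with a small Python experiment. It assumes the XOR distance metric used by Kademlia, the design that BitTorrent's mainline DHT builds on; the article itself does not name a metric, and the node names, key names, and helper functions below are hypothetical. Each key is assigned to the node whose ID is closest to it, and when one node leaves, only the keys that node was responsible for are reassigned.

```python
import hashlib


def node_id(name: str) -> int:
    """Derive a 160-bit integer ID from a name (illustrative; real nodes choose IDs differently)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def responsible_node(key: int, nodes: dict) -> str:
    """Return the name of the node whose ID is closest to `key` under XOR distance."""
    return min(nodes, key=lambda name: nodes[name] ^ key)


def assign(keys: list, nodes: dict) -> dict:
    """Map every key to the node currently responsible for it."""
    return {k: responsible_node(node_id(k), nodes) for k in keys}


nodes = {f"node-{i}": node_id(f"node-{i}") for i in range(8)}
keys = [f"key-{i}" for i in range(100)]

before = assign(keys, nodes)

# One participant leaves the network.
departed = "node-3"
remaining = {name: ident for name, ident in nodes.items() if name != departed}
after = assign(keys, remaining)

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys changed owner")
```

Every key that was not held by the departed node keeps its owner, so the cost of a departure is proportional to that node's share of the keys rather than to the size of the whole table.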

With “trackerless” torrents, a newly arriving peer first connects to a list of known DHT nodes. Once it has reached one node, it can use the DHT network to find other nodes and populate its own routing table. From the DHT network, the new peer can then retrieve the peer list of the “trackerless” torrent, using the torrent’s info-hash value as the query key. By contrast, in the conventional tracker approach, new peers need to communicate with the tracker to learn about each additional peer added.

In the original design, a torrent file is necessary in order to join the swarm and start connecting to the peers for a file. Magnet links go a step further: they contain only the info-hash value already calculated for a specific torrent file. With the magnet link, newly arriving peers can query the DHT network to get the peer list for a torrent, download the torrent file from one of those peers, and start the BitTorrent process.
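As a rough sketch of how an info-hash ends up in a magnet link, the Python below bencodes a made-up `info` dictionary, hashes it with SHA-1 (the scheme used by the original BitTorrent format), and wraps the result in a `magnet:?xt=urn:btih:` URI. The helper names (`bencode`, `info_hash`, `make_magnet`, `parse_magnet`) and the sample file are assumptions for illustration, and the encoder handles only the value types needed here.

```python
import hashlib
from urllib.parse import parse_qs, quote


def bencode(value) -> bytes:
    """Minimal bencoder for ints, bytes/str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info: dict) -> str:
    """Hex SHA-1 digest of the bencoded info dictionary."""
    return hashlib.sha1(bencode(info)).hexdigest()


def make_magnet(info: dict) -> str:
    """Build a magnet URI carrying the info-hash and a display name."""
    return f"magnet:?xt=urn:btih:{info_hash(info)}&dn={quote(info['name'])}"


def parse_magnet(uri: str) -> str:
    """Extract the hex info-hash from a BitTorrent magnet URI."""
    scheme, _, query = uri.partition("?")
    xt = parse_qs(query).get("xt", [""])[0]
    prefix = "urn:btih:"
    if scheme != "magnet:" or not xt.startswith(prefix):
        raise ValueError("not a BitTorrent magnet link")
    return xt[len(prefix):]


# Made-up single-file torrent with a tiny piece length, for illustration only.
payload = b"hello swarm!"
info = {
    "name": "demo.bin",
    "length": len(payload),
    "piece length": 4,
    "pieces": b"".join(hashlib.sha1(payload[i:i + 4]).digest() for i in range(0, len(payload), 4)),
}

magnet = make_magnet(info)
print(magnet)
print(parse_magnet(magnet) == info_hash(info))
```

The link carries no piece hashes or file sizes; a peer that has only the link must first fetch the `info` dictionary from the swarm and can check it by hashing it and comparing the result with the info-hash in the link.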

On the Authpaper delivery platform, all data is delivered using the BitTorrent protocol. Recipients and helping peers only need a magnet link to download the data. All data is encrypted before it is sent out, so seeds and peers cannot read it. The Authpaper server works only as a peer that joins as many swarms as possible; Authpaper has no special privilege on this platform.

A problem here is that because the peers cannot read the data content, they have no incentive to contribute to the data delivery process. An economic incentive is needed to make them work for others. This is where we need blockchain. Details on blockchain will be discussed in the next article.