To send or receive files, users use a BitTorrent client on their Internet-connected computer. A BitTorrent client is a computer program that implements the BitTorrent protocol. BitTorrent clients are available for a variety of computing platforms and operating systems, including an official client released by Rainberry, Inc. Popular clients include μTorrent, Xunlei Thunder, Transmission, qBittorrent, Vuze, Deluge, BitComet and Tixati. BitTorrent trackers provide a list of files available for transfer and allow the client to find peer users, known as "seeds", who may transfer the files.
Programmer Bram Cohen designed the protocol in April 2001, and released the first available version on 2 July 2001. On 15 May 2017, BitTorrent, Inc. (later renamed Rainberry, Inc.) released BitTorrent v2 protocol specification. libtorrent was updated to support the new version on 6 September 2020.
The first release of the BitTorrent client had no search engine and no peer exchange. Up until 2005, the only way to share files was by creating a small text file called a "torrent", that they would upload to a torrent index site. The first uploader acted as a seed, and downloaders would initially connect as peers. Those who wish to download the file would download the torrent, which their client would use to connect to a tracker which had a list of the IP addresses of other seeds and peers in the swarm. Once a peer completed a download of the complete file, it could in turn function as a seed. These files contain metadata about the files to be shared and the trackers which keep track of the other seeds and peers.
In 2005, first Vuze and then the BitTorrent client introduced distributed tracking using distributed hash tables which allowed clients to exchange data on swarms directly without the need for a torrent file.
BitTorrent v2 is intended to work seamlessly with previous versions of the BitTorrent protocol. The main reason for the update was that the old cryptographic hash function, SHA-1 is no longer considered safe from malicious attacks by the developers, and as such, v2 uses SHA-256. To ensure backwards compatibility, the v2 .torrent file format supports a hybrid mode where the torrents are hashed through both the new method and the old method, with the intent that the files will be shared with peers on both v1 and v2 swarms. Another update to the specification is adding a hash tree to speed up time from adding a torrent to downloading files, and to allow more granular checks for file corruption. In addition, each file is now hashed individually, enabling files in the swarm to be deduplicated, so that if multiple torrents include the same files, but seeders are only seeding the file from some, downloaders of the other torrents can still download the file. Magnet links for v2 also support a hybrid mode to ensure support for legacy clients.
The file being distributed is divided into segments called pieces. As each peer receives a new piece of the file, it becomes a source (of that piece) for other peers, relieving the original seed from having to send that piece to every computer or user wishing a copy. With BitTorrent, the task of distributing the file is shared by those who want it; it is entirely possible for the seed to send only a single copy of the file itself and eventually distribute to an unlimited number of peers. Each piece is protected by a cryptographic hash contained in the torrent descriptor. This ensures that any modification of the piece can be reliably detected, and thus prevents both accidental and malicious modifications of any of the pieces received at other nodes. If a node starts with an authentic copy of the torrent descriptor, it can verify the authenticity of the entire file it receives.
The BitTorrent protocol provides no way to index torrent files. As a result, a comparatively small number of websites have hosted a large majority of torrents, many linking to copyrighted works without the authorization of copyright holders, rendering those sites especially vulnerable to lawsuits. A BitTorrent index is a "list of .torrent files, which typically includes descriptions" and information about the torrent's content. Several types of websites support the discovery and distribution of data on the BitTorrent network. Public torrent-hosting sites such as The Pirate Bay allow users to search and download from their collection of torrent files. Users can typically also upload torrent files for content they wish to distribute. Often, these sites also run BitTorrent trackers for their hosted torrent files, but these two functions are not mutually dependent: a torrent file could be hosted on one site and tracked by another unrelated site. Private host/tracker sites operate like public ones except that they may restrict access to registered users and may also keep track of the amount of data each user uploads and downloads, in an attempt to reduce "leeching".
Web search engines allow the discovery of torrent files that are hosted and tracked on other sites; examples include The Pirate Bay and BTDigg. These sites allow the user to ask for content meeting specific criteria (such as containing a given word or phrase) and retrieve a list of links to torrent files matching those criteria. This list can often be sorted with respect to several criteria, relevance (seeders-leechers ratio) being one of the most popular and useful (due to the way the protocol behaves, the download bandwidth achievable is very sensitive to this value). Metasearch engines allow one to search several BitTorrent indices and search engines at once.
The Tribler BitTorrent client was among the first to incorporate built-in search capabilities. With Tribler, users can find .torrent files held by random peers and taste buddies. It adds such an ability to the BitTorrent protocol using a gossip protocol, somewhat similar to the eXeem network which was shut down in 2005. The software includes the ability to recommend content as well. After a dozen downloads, the Tribler software can roughly estimate the download taste of the user, and recommend additional content.
A somewhat similar facility but with a slightly different approach is provided by the BitComet client through its "Torrent Exchange" feature. Whenever two peers using BitComet (with Torrent Exchange enabled) connect to each other they exchange lists of all the torrents (name and info-hash) they have in the Torrent Share storage (torrent files which were previously downloaded and for which the user chose to enable sharing by Torrent Exchange). Thus each client builds up a list of all the torrents shared by the peers it connected to in the current session (or it can even maintain the list between sessions if instructed).
At any time the user can search into that Torrent Collection list for a certain torrent and sort the list by categories. When the user chooses to download a torrent from that list, the .torrent file is automatically searched for (by info-hash value) in the DHT Network and when found it is downloaded by the querying client which can after that create and initiate a downloading task.
Users find a torrent of interest on a torrent index site or by using a search engine built into the client, download it, and open it with a BitTorrent client. The client connects to the tracker(s) or seeds specified in the torrent file, from which it receives a list of seeds and peers currently transferring pieces of the file(s). The client connects to those peers to obtain the various pieces. If the swarm contains only the initial seeder, the client connects directly to it, and begins to request pieces. Clients incorporate mechanisms to optimize their download and upload rates.
Although "swarming" scales well to tolerate "flash crowds" for popular content, it is less useful for unpopular or niche market content. Peers arriving after the initial rush might find the content unavailable and need to wait for the arrival of a "seed" in order to complete their downloads. The seed arrival, in turn, may take long to happen (this is termed the "seeder promotion problem"). Since maintaining seeds for unpopular content entails high bandwidth and administrative costs, this runs counter to the goals of publishers that value BitTorrent as a cheap alternative to a client-server approach. This occurs on a huge scale; measurements have shown that 38% of all new torrents become unavailable within the first month. A strategy adopted by many publishers which significantly increases availability of unpopular content consists of bundling multiple files in a single swarm. More sophisticated solutions have also been proposed; generally, these use cross-torrent mechanisms through which multiple torrents can cooperate to better share content.
The peer distributing a data file treats the file as a number of identically sized pieces, usually with byte sizes of a power of 2, and typically between 32 kB and 16 MB each. The peer creates a hash for each piece, using the SHA-1 hash function, and records it in the torrent file. Pieces with sizes greater than 512 kB will reduce the size of a torrent file for a very large payload, but is claimed to reduce the efficiency of the protocol. When another peer later receives a particular piece, the hash of the piece is compared to the recorded hash to test that the piece is error-free. Peers that provide a complete file are called seeders, and the peer providing the initial copy is called the initial seeder. The exact information contained in the torrent file depends on the version of the BitTorrent protocol.
By convention, the name of a torrent file has the suffix .torrent. Torrent files use the Bencode file format, and contain an "announce" section, which specifies the URL of the tracker, and an "info" section, containing (suggested) names for the files, their lengths, the piece length used, and a SHA-1 hash code for each piece, all of which are used by clients to verify the integrity of the data they receive. Though SHA-1 has shown signs of cryptographic weakness, Bram Cohen did not initially consider the risk big enough for a backward incompatible change to, for example, SHA-3. As of BitTorrent v2 the hash function has been updated to SHA-256. 2b1af7f3a8