C.I.D. torrent: Stream or download the show that has been running since 1998

dyamisliepienis
Aug 20, 2023
7 min read

BitTorrent uses links based on the magnet: URI scheme, which specify a property that points to the urn:btih (BitTorrent InfoHash) for a given torrent. The standard is also used by several other content networks, like Kazaa and eDonkey, which were popular file sharing networks that aren't as active these days. The link format can contain extra information, like tracker servers that help discover other peers with the data, and metadata that lets you know whether a torrent is "private" without needing to download the torrent itself. These links are very well defined, but being URIs rather than URLs, they don't fit the hostname-and-path structure that browsers expect, and they require extra effort to link to a specific file within a torrent. To address some of these drawbacks, there's been some talk about standardizing on a bittorrent:// URL format, which would integrate better with browsers and be able to link to specific data.
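To make the link format concrete, here is a minimal sketch of pulling the InfoHash, display name, and trackers out of a magnet link using only the Python standard library. The helper name parse_magnet and the example link (hash and tracker address) are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def parse_magnet(link: str) -> dict:
    """Split a magnet link into its info hash, display name, and trackers."""
    parsed = urlparse(link)
    if parsed.scheme != "magnet":
        raise ValueError("not a magnet link")
    params = parse_qs(parsed.query)

    # "xt" (exact topic) carries the urn:btih:<infohash> value.
    info_hash = None
    for topic in params.get("xt", []):
        if topic.startswith("urn:btih:"):
            info_hash = topic[len("urn:btih:"):].lower()

    return {
        "info_hash": info_hash,
        "name": params.get("dn", [None])[0],  # "dn" is the display name
        "trackers": params.get("tr", []),     # "tr" may appear several times
    }


# Hypothetical link: the hash and tracker below are invented for this example.
example = (
    "magnet:?xt=urn:btih:0123456789abcdef0123456789abcdef01234567"
    "&dn=example&tr=udp%3A%2F%2Ftracker.example.org%3A1337"
)
print(parse_magnet(example))
```

Note that nothing in the link identifies a file inside the torrent, which is the gap the proposed bittorrent:// URL format is meant to close.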

BitTorrent is great in that it was very successful at making peer to peer file transfer "just work". Its data model is based on a concept called "Content Addressability", where the data inside a torrent gets put through a hash function which generates a unique value based on the content. If you have the hash of some data, you can verify whether data from another peer is valid by checking if it hashes to the same value. If even a single byte of the data is different from the original, the hash will be different and you can ignore the data from the peer.
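A minimal sketch of that check, using SHA-256 from the Python standard library. The function name verify and the sample bytes are illustrative only, not taken from any particular client.

```python
import hashlib


def verify(data: bytes, expected_hex: str) -> bool:
    """Accept data only if it hashes to the value we already trust."""
    return hashlib.sha256(data).hexdigest() == expected_hex


original = b"hello swarm"
expected = hashlib.sha256(original).hexdigest()

print(verify(original, expected))        # identical bytes are accepted
print(verify(b"hellp swarm", expected))  # a single changed byte is rejected
```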

Instead of hashing the entirety of the data at once, BitTorrent splits up the files and folders in the torrent into a tree of "nodes" that link to each other using hashes for IDs. This is called a Merkle DAG (Directed Acyclic Graph) or Merkle Tree. The files themselves are split into chunks of a few kilobytes or megabytes in size and added as their own subtree within the Merkle DAG. This is what enables a torrent client to download small bits of files from multiple peers at once and verify their data independently, rather than needing to load an entire file before verifying it.
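The idea can be sketched in a few lines of Python. This is a simplified illustration rather than the exact tree layout any client uses: the chunk size is arbitrary, and an odd node is paired with itself here, whereas real implementations have their own padding rules.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def split_chunks(data: bytes, size: int) -> List[bytes]:
    """Cut data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]


def merkle_root(chunks: List[bytes]) -> bytes:
    """Hash every chunk, then hash pairs of hashes until one root remains."""
    level = [sha256(chunk) for chunk in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # simplification: pair an odd node with itself
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


data = bytes(range(256)) * 16
chunks = split_chunks(data, 1024)
print(len(chunks), merkle_root(chunks).hex())
```

Because each leaf is the hash of one chunk, a chunk received from any peer can be checked on its own, and changing any chunk changes the root.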

A torrent is then identified by the top-most hash in the tree, which is called the InfoHash and is stored within the BitTorrent magnet link. A .torrent file will contain some metadata about a torrent along with the Merkle tree for the files and folders (without having the actual file data), which can then be used to verify data. This also means that by default, if two torrents contain the same chunk of data, they won't be able to share peers in order to discover that bit on the network.

One consequence of this is that torrents are immutable: in order to change some data within one, you would be required to create a new torrent with a new InfoHash. This applies not only to the files within the torrent. Metadata stored in the torrent's info section, such as its name, will also affect the final InfoHash (fields kept outside the info section, such as the creation date or comment, do not).
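To see why a rename alone yields a different torrent, recall that a v1 InfoHash is the SHA-1 of the bencoded info dictionary. The sketch below uses a deliberately minimal bencoder written for this example (not a library API) and a made-up single-file info dictionary.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoder for ints, strings/bytes, lists, and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()),
            key=lambda item: item[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info: dict) -> str:
    return hashlib.sha1(bencode(info)).hexdigest()


payload = b"hello swarm"
info = {
    "name": "example.txt",  # hypothetical file name
    "length": len(payload),
    "piece length": 16384,
    "pieces": hashlib.sha1(payload).digest(),
}
renamed = dict(info, name="renamed.txt")

print(info_hash(info))
print(info_hash(renamed))  # same file data, different InfoHash
```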

Another consequence is that a torrent with a very large set of files, or with files that are incredibly large, will have a very large Merkle tree and a very large .torrent file, which might be slower to load and traverse.

Data in BitTorrent is usually saved by storing the torrent metadata somewhere in application memory, and storing the files for a torrent within the filesystem. On boot, a torrent client will typically verify all the existing data on disk to see which chunks are missing or need to be re-loaded.
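A rough sketch of that startup recheck, with an in-memory byte string standing in for the file on disk. The function name missing_pieces, the piece length, and the sample data are assumptions for illustration, and SHA-1 is used as in v1 piece hashes.

```python
import hashlib
from typing import List


def missing_pieces(stored: bytes, piece_length: int, expected: List[bytes]) -> List[int]:
    """Return the indexes of pieces whose stored bytes do not match their hash."""
    missing = []
    for index, want in enumerate(expected):
        piece = stored[index * piece_length:(index + 1) * piece_length]
        if hashlib.sha1(piece).digest() != want:
            missing.append(index)  # absent, truncated, or corrupted
    return missing


piece_length = 256
original = b"".join(bytes([i]) * piece_length for i in range(4))
expected = [
    hashlib.sha1(original[i:i + piece_length]).digest()
    for i in range(0, len(original), piece_length)
]

# Simulate a damaged download: one flipped byte and a truncated tail.
on_disk = bytearray(original[:800])
on_disk[300] ^= 0xFF

print(missing_pieces(bytes(on_disk), piece_length, expected))
```

Only the pieces reported here need to be requested from peers again; everything else is kept as-is.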

Like BitTorrent, IPFS datasets can be referenced using the hash of the root of their Merkle DAG, which they call the CID (Content IDentifier), and in order to change any data, you will need to generate and share a new CID. Unlike BitTorrent, the formats of the hashes used for CIDs are flexible, and the same bit of data can use different hashes. The different hash functions and encodings are defined in the multiformats specification.

However, there's been some work on supporting "mutable" torrents and being able to discover updates to torrents via the BitTorrent network, in the form of the BEP 46 proposal. It works by using public keys to sign DHT entries which contain the InfoHash of your latest version, as well as a sequence number which enables you to discover just the latest entry. Sadly, major clients don't really support mutable torrents, and there are very few minor clients that support them. Agregore (as of March 2022) has been working on making it easier to publish and load mutable torrents, and there have been efforts in the past to build applications on top of this functionality.
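The sequence-number part of this can be sketched as below. This is a loose illustration and not a BEP 46 implementation: signature verification is left out entirely (a real client must check each entry against the public key before trusting it), the entry shape of (sequence number, InfoHash) is an assumption, and the lookup key follows my understanding of the underlying DHT storage scheme, where a mutable item is stored under the SHA-1 of the public key plus an optional salt.

```python
import hashlib
from typing import List, Optional, Tuple

Entry = Tuple[int, str]  # (sequence number, infohash hex); hypothetical shape


def dht_target(public_key: bytes, salt: bytes = b"") -> str:
    """DHT key under which the mutable item for this public key is looked up."""
    return hashlib.sha1(public_key + salt).hexdigest()


def latest_entry(entries: List[Entry]) -> Optional[Entry]:
    """Pick the entry with the highest sequence number.

    Assumes every entry has already passed signature verification,
    which this sketch does not perform.
    """
    return max(entries, key=lambda entry: entry[0], default=None)


fake_public_key = bytes(32)  # placeholder key, not a real one
seen = [(1, "aa" * 20), (3, "cc" * 20), (2, "bb" * 20)]

print(dht_target(fake_public_key))
print(latest_entry(seen))
```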

Initially, peers would be discovered using tracker servers. Torrent files or magnet links would come with a set of servers to use for peer discovery, along with the built-in tracker servers that came with some torrent clients. Peers that were looking for others around a given torrent file would advertise themselves to those trackers.

This, however, made it easy to censor torrents and the network at large. If a given tracker could be blocked or taken down, then peer discovery could be broken. In order to avoid this point of centralization, the Mainline Distributed Hash Table was made to decentralize the peer discovery mechanism. Instead of a central server being used for peer discovery, the load of storing advertisements and serving them is spread across everyone participating. The more popular Mainline becomes, the harder it is to censor all of it.
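The way a DHT decides who stores which advertisement can be sketched as follows. Mainline is a Kademlia-style DHT, where node IDs and InfoHashes share one 160-bit space and closeness is measured with XOR. The node names below are invented, and a real client walks the network iteratively rather than sorting a list it already holds.

```python
import hashlib
from typing import List


def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia distance: XOR of the two IDs, read as a big integer."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def closest_nodes(target: bytes, node_ids: List[bytes], k: int = 8) -> List[bytes]:
    """Return the k known nodes whose IDs are closest to the target."""
    return sorted(node_ids, key=lambda node: xor_distance(node, target))[:k]


# Hypothetical 160-bit node IDs and a made-up InfoHash.
nodes = [hashlib.sha1(f"node-{i}".encode()).digest() for i in range(50)]
target = hashlib.sha1(b"some torrent").digest()

for node in closest_nodes(target, nodes, k=3):
    print(node.hex())
```

Since responsibility for each InfoHash lands on whichever nodes happen to be closest, there is no single server to block or take down.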

One thing to note is that peer discovery happens per-InfoHash, so your app will need to join several "swarms" and perform several rounds of advertisement in order to find peers for each dataset. This can add up if you're loading hundreds of torrents on a single machine, particularly since you'll be establishing connections with a few peers at once in order to do adequate downloading and uploading. The protocol itself also doesn't provide the ability to reuse existing connections, since each replication stream is per-torrent.

Outside of this, IPFS is actually very hungry for network bandwidth, since it will do content discovery for each piece of content in the Merkle tree that you're loading. This means that even if you're traversing a single dataset, you can quickly overtake the amount of traffic a torrent client generates doing peer discovery for several torrents at once.

Libp2p also hasn't had as much time to solve the NAT hole punching issue, so connecting two computers on home networks is typically not as reliable as with BitTorrent. This has actually improved very recently, as you can tell by this announcement in March 2022. Initially, libp2p would try to rely on UPnP being available in order to open ports, same as BitTorrent. More recently, it got first class support for NAT traversal via a combination of AutoNAT, to determine if you are behind a NAT, and a public relay node that lets the two peers talk to each other. This is very similar to what BitTorrent clients do, but using libp2p features rather than extensions over torrent replication streams. At the moment (March 2022), you'll need to explicitly enable it with the Swarm.EnableHolePunching configuration flag. Also note that NAT hole punching will only work with transports that use UDP, so TCP and WebRTC based connections will not benefit from this functionality.

However, even with that in place, BitTorrent's privacy guarantees aren't great. Services and bots periodically scan the DHT for InfoHashes of content, which also enables them to download the metadata and content for any torrent. This means that any data you publish in a torrent is effectively public, and if you want it to be more private, you will need to manually add layers of encryption to the files themselves.

Additionally, BitTorrent v1 used SHA-1 as its hashing algorithm, which has since been shown to be vulnerable to collisions. That meant that all older torrents that used v1 of the protocol would potentially be vulnerable to peers lying about contents by generating SHA-1 hash collisions. This has since been fixed in BitTorrent v2 by switching to the SHA2-256 hash algorithm, which has not yet been broken.

Some people have managed to avoid leaking their IP addresses when using BitTorrent by using the i2p network, but most torrents will not be available there, and it requires connecting to trackers since i2p does not support a BitTorrent DHT.

Since BitTorrent has been around for a while, it can boast stable implementations in different programming languages, as well as stable specifications for loading data between them. For example, if you're running a major operating system, you can be sure that there is a torrent client out there that will work well enough for your use case.

For C++ enthusiasts, libtorrent is the gold standard for building clients, as it's feature rich and performant. This might also be your choice if you want to embed BitTorrent in a different programming language via Foreign Function Interfaces.

Another useful implementation is WebTorrent, which enables web browsers to load torrents (if they have WebRTC-compatible seeders). This implementation also works with Node.js in order to bridge the regular BitTorrent network with the WebRTC/browser based network, by running hybrid nodes that can do both.

There are many different implementations of seedboxes out there, and they can sometimes be as simple as somebody running a torrent client from a command line on a server and connecting to it to add more torrents.

get_torrent_status returns a vector of the torrent_status for every torrent which satisfies pred, which is a predicate function which determines if a torrent should be included in the returned set or not. Returning true means it should be included and false means excluded. The flags argument is the same as to torrent_handle::status(). Since pred is guaranteed to be called for every torrent, it may be used to count the number of torrents of different categories as well.

refresh_torrent_status takes a vector of torrent_status structs (for instance the same vector that was returned by get_torrent_status()) and refreshes the status based on the handle member. It is possible to use this function by first setting up a vector of default constructed torrent_status objects, only initializing the handle member, in order to request the torrent status for multiple torrents in a single call. This can save a significant amount of time if you have a lot of torrents.
