Show HN: Every single torrent is on this website

I don't understand why so many people seem so fascinated by constructions like the library of Babel. Yes it contains the answers to all your questions, but there are some significant drawbacks.

* It has more wrong information than right information, with no way to tell the difference.

* If you had an oracle that could tell you how to get to the book you need, the navigation instructions to get to the book will be at least as long as the book, on average.

bonoboTP · 5 months ago

The Library of Babel made me aware that choosing/finding is not super distinct from making/creating. Or discovery and invention. In math, there is distinction between "there exists" and "we can construct", but "we can construct" is similar to "we can find".

Deleted Comment

matheusmoreira · 5 months ago

I don't think they're equivalent. I think invention and creation aren't actually real. There is no "making" or "creating" when it comes to intellectual work.

All computer files are sequences of bits. All sequences of bits are integers. All integers already exist in the infinite set of natural numbers. I can even calculate how big those numbers are given their bit count.

  digits(bits)   = ceil(bits * log10(2))

  digits(32)     = 10
  digits(64)     = 20
  digits(128)    = 39
  digits(256)    = 78
  digits(512)    = 155
  digits(1024)   = 309

  digits(20 KiB) = 49,321
  digits(2 GiB)  = 5,171,655,946

We are merely discovering numbers through convoluted mental and technological processes. All our mental exertions result in the discovery of a number. This comment is a number.

cryzinger · 5 months ago

To your first bullet, I believe this is one of the central points of the original Borges story :)

cantor_S_drug · 5 months ago

I think Library of Babel by Borges is a static manifestation of Turing complete behaviour via the fact that some L-systems are Turing complete. or put another way. Where in the Library of Babel, does the real Hamlet reside? If we consider finding and replacing names with other names, is it still a Hamlet? And if we bring the full force of edit operations and do these in a reversible manner, then where does the actual Hamlet reside? An equivalence class of Hamlet?

Chinjut · 5 months ago

Everyone is aware of this. Sites like this aren't created to be useful. They are created to be an amusement, a joke.

AnthonyMouse · 5 months ago

> If you had an oracle that could tell you how to get to the book you need, the navigation instructions to get to the book will be at least as long as the book, on average.

This isn't quite true. Natural language text compresses extremely well and you would only need length equivalent to the compressed form, not the original form. And if you wanted to go further, you could use a mapping where extremely short strings map to known popular books and only unknown works have longer encodings.

recursive · 5 months ago

I suppose this would work if the library was arranged such that comprehensible books were closer to the "origin". The workings of the "real" library of babel are supposed to be more inscrutable though.

But if I built one, it would totally work that way.

variadix · 5 months ago

Kolmogorov’s library

Llamamoe · 5 months ago

I wonder if there is some way to create a latent-space Library of Babel in which you only find incoherent gibberish with extremely long keys, with the shortest ones pointing specifically to the most common/likely strings of text, in manageable computational complexity.

recursive · 5 months ago

Reproducing the text of a book in the library is a synonym for identifying the book. So this is really called "text compression", which is a well-studied field.

samsartor · 5 months ago

In a library of all possible strings, this is just text compression (as the other comment observes). But in a finite library it gets even simpler, in a cool way! We can treat each text as a unique symbol and use an entropy encoding (eg Huffman) to assign length-optimized key to each based on likelihood (eg from an LLM). Building the library is something like O(n log n), which isn't terrible. But adding new texts would change the IDs for existing texts (which is annoying). There might be a good way to reserve space for future entries probabilistically? Out of my depth at this point!

lxgr · 5 months ago

That's arguably just a regular library :)

a_shovel · 5 months ago

Another way of looking at it is that the library of Babel would be less useful than an equivalent quantity of blank paper. For example, you could use it to print books in English instead of gibberish. Multiple copies of those books, even.

Deleted Comment

kristianp · 5 months ago

> If you had an oracle that could tell you how to get to the book you need, the navigation instructions to get to the book will be at least as long as the book, on average.

Only if the oracle has all books that could possibly exist. If you're trying to find a book that already exists, that set is infinitely smaller.

recursive · 5 months ago

The oracle doesn't have the books. The library does. And it has all of them. Directions to each book depend only on the layout and contents of the library.

0cf8612b2e1e · 5 months ago

I am reminded of this SMBC comic

https://www.smbc-comics.com/comic/the-library-of-heaven

megablast · 5 months ago

Thank you captain obvious.

recursive · 5 months ago

At your service.

> Many crawlers and indexers continuously pick random or sequential infohashes and announce themselves so they can later detect other announcers

I can't follow the logic here. How does this detect other announcers?

tdjsnelling · 5 months ago

By announcing itself, the indexer makes itself more likely to be handed out as a peer to anyone else interested in that infohash. Every connection attempt it subsequently receives is evidence of another peer announcing or joining that torrent. In effect, it "baits" peers into revealing themselves

aspenmayer · 5 months ago

The way I understand it, these extraneous infohashes are functional honeytokens.

https://en.wikipedia.org/wiki/Honeytoken

> In the field of computer security, honeytokens are honeypots that are not computer systems. Their value lies not in their use, but in their abuse.

avidiax · 5 months ago

So they are basically detecting bots that indiscriminately try to download any detected infohash, right?

That's not detecting "announcers", but maybe more like detecting "indexers".

hackingonempty · 5 months ago

> There is no validation that an infohash corresponds to a real torrent—any client can announce anything. Many crawlers and indexers continuously pick random or sequential infohashes and announce themselves so they can later detect other announcers, and malicious clients or poorly written bots can spam the network with anything they like.

There are also valid clients for completely unrelated protocols using the BitTorrent DHT to find each other.

sneak · 5 months ago

Which? I'm always fascinated by the use of public p2p nets to serve other protocols. The first complete standalone program I wrote was a gnutella p2p client.

1dom · 5 months ago

I have the same fascination. You might find https://github.com/dmotz/trystero quite interesting - it's fun to play around with, also can use torrent DHT for discovery.

aquariusDue · 5 months ago

iroh does too: https://www.iroh.computer/docs/concepts/discovery

pluto_modadic · 5 months ago

https://github.com/pubky/pkarr is another one

gwbas1c · 5 months ago

I think this would be an even better joke if the site was a setup for plausible deniability for piracy.

"I didn't share that! It was on infohash.lol first!"

redsparrow · 5 months ago

The All The Music project is something like that, but for melodies. They created all possible melodies of a 7 note diatonic scale and wrote them to disk as MIDI files, copyrighting them in the process. The melodies were dedicated to the Creative Commons Zero so that people could freely use them without worrying about being sued by someone else who had used that melody previously.

More details here: https://allthemusic.info/faqs/

wongarsu · 5 months ago

For a more practical version (containing only infohashes that are observed on the dht) there is bitmagnet [1]. No public instances though, you have to self-host

1: https://github.com/bitmagnet-io/bitmagnet

skoll43 · 5 months ago

how to go straight to jail 101

You are only downloading metadata, and csam content is filtered. But yes, I would also rate it as a legally risky activity

sorenjan · 5 months ago

Does running an indexer and crawler help make the content available to others, or why would this be legally risky? Why would anyone care about what kind of Docker container I run on my home server?

mk12345 · 5 months ago

Very cool, reminds me of the library of Babel (of which you also made a version! [1]).

I made something similar a while ago, the Hdd of Babel [2], which contains all possible files(*) , and wrote down some thoughts on it [3].

I really like how it makes us think about the nature of information.

[1] https://libraryofbabel.app/

[2] https://mkaandorp.github.io/hdd-of-babel/

[3] https://dev.to/mkaandorp/this-website-contains-pictures-of-y...

freetonik · 5 months ago

Love this idea of generating pages based on some strictly defined enumeration. Reminds me of https://everyuuid.com/

Me too. That's listed as an inspiration on the index page!

zikduruqe · 5 months ago

Or every bitcoin public and private address.

https://keys.lol

bArray · 5 months ago

Does anybody know what they are using in the browser to perform DHT?

In theory this could be used to share torrent links by a different reference (ideally you could also add an anchor too). Somebody else could have a page that takes keywords and points you to pages hosted on the site.

crumpled · 5 months ago

The page is making a WebSocket connection to the server and getting the peer info through the WebSocket connection. I think the magic happens on the server.

This is a sample of the client-side code I found handling that: https://infohash.lol/_next/static/chunks/pages/p/%5Bpage%5D-...

https://www.npmjs.com/package/bittorrent-dht is used on the server.

DHT crawlers/indexers already exist to perform that function; they crawl and store infohashes (+ metadata when they receive it) and allow users to search that metadata to return relevant infohashes