Remember: If OpenAI/Google does it for $$$, it's not illegal. If idealists do it for public access, full force of the law.
Information wants to be free. Oblige it. Fools with temporary power trying to extract from the work of others will be a blip in the history books if we make them.
Those who create information may have families to feed, house and clothe. Until those items (food/housing/clothes) are also free, information cannot be free.
You might be glad to learn that a number of studies (mostly commissioned by the European Union) agree that piracy doesn't hurt sales.
The main consensus is that people who illegally access content wouldn't have bought it otherwise, and that they still advertise it (thus, still driving up sales).
These studies have then been systematically strong-armed into silence by the EU and constituent countries' anti-piracy organizations.
This is probably because the war on piracy, too, is a billion-dollar industry. I'd be glad to blow it all up and give it all to the starving artists and their families.
The last few years have really put copyright's limits and uses to the test. As someone who falls more in line with Copyleft ideals (do whatever you want with my stuff!), I find it very funny.
I just grab the popcorn and watch from the sidelines, see where it all lands.
I'm used to thinking of "copyleft" as "do whatever you want with my stuff, but you must agree that others can do whatever they want with the stuff you made out of mine".
Google had a tailored fair use argument because they never made more than snippets public and searchable. Also, before Hachette, controlled lending with one-to-one digital copies for every physical copy was a status quo that publishers largely accepted, and IA deliberately tried to upset it with the National "Emergency" Library.
I think it's worth fighting back on copyright as a broken institution, and it should be part of the IA's mission, but you have to be responsible in your approach if you're also going to posture as an archival library with stability of information and access. I understand Kahle might lament losing some of the hacker ethos, but the IA is too important to run up against extremes like this without an existential threat.
It's not just that though. They pulled a stupid stunt during covid.
If you're campaigning for fair use, don't give your enemy ammunition to shoot you with by stretching said fair use too far. That was just really dumb.
Besides, for those willing to look outside official channels there's plenty of book library services available already. Just let them do what they do well and don't contaminate an above-board service with that.
You are being intentionally misleading. Public access AI models are not being taken down either. There is a big, transformative, difference with freely giving out books to read compared with using them to train an ML model.
> There is a big, transformative, difference with freely giving out books to read on a small, measured in human reading pace, scale compared with using them at a massive scale at internet and computer memory speeds to train an ML model even if the intellectual property used to train the ML was from unlicensed copies, and which the model regularly and with some frequency regurgitates verbatim.
not wanting you to be intentionally misleading, FTFY.
It was an absolutely bone-headed poke-the-bear move and we should count ourselves lucky that it was only a chunk of the library and not the whole archive that got nuked. IA holds priceless and irreplaceable data, and while the library initiative was a well-intentioned move during the pandemic it was way too radical for the keepers of our shared digital history.
And this is why I respect Anna's Archive. If we want information to be free, I think we should consider intentionally violating copyright as an act of civil disobedience. I'm not sure I'm ready to go that far, but I respect the people at AA who are.
Agreed. It was poking the bear. The other big three grey-circuit services have nothing to lose because they never claimed to be legit.
If you claim the moral high ground you have to be impeccable. It also didn't help anyone during COVID because all the stuff was out there already on less legit services. Some make it super easy with telegram bots etc.
I'm sad that they screwed this up. Because the archive is a very valuable service. They've lost a lot of money, goodwill and reputation now. And gained nothing.
The worst part is, it's unimaginable that this would ever have ended well.
It wasn't a bad idea in principle but they should have worked to get some publishers on board, could have been a PR win for them too.
> Kahle thinks “the world became stupider” when the Open Library was gutted—but he’s moving forward with new ideas
> The lawsuits haven’t dampened Kahle’s resolve to expand IA’s digitization efforts, though. Moving forward, the group will be growing a project called Democracy’s Library
please just stop. let IA be what it is. or rather, nothing wrong in doing new projects but don't tie them to IA, just start them as completely separate things. IA is too important as-is to be a playground for random kooky ideas playing with fire.
And then the IA becomes the same thing it’s fighting against. The people in the wrong here are these massive corporations fighting over scraps, not the IA.
No it doesn't. It's extremely valuable with the scope it already has. These massive corporations do not operate the Wayback Machine nor the various (less controversial) public archives that IA hosts, and makes available at no cost, no login-wall, no cloudflare-infinite-captchas, etc.
As the project matures, the risk tolerance should mature too.
Betting your own time and money on the realization of a crazy ideal can be very noble. Betting a resource millions of people are relying on is destructive hubris.
They should take the untamed idealism to a separate legal entity before they ruin all the good they've done.
What capitalism continues to show us: proof that public libraries, if created in the last 10 years, would be deemed illegal and sued out of existence.
It's only because the late-1800s billionaires wanted to leave legacies: they built pay-to-enter and free libraries, and migrated them to free, or public, libraries. That's why so many of them are (Andrew) Carnegie Libraries.
A lower stakes but still illustrative example I see is that the DVR is an invention that wouldn't be allowed to succeed today. All power is being wielded to its fullest in order to prevent skipping ads.
Cable to streaming took us from skippable to unskippable ads. Search results to LLM results will result in invisible/undisclosed ads. Each successive generation of technology will increase the power of advertising and strip rights we used to have. Another example, physical to digital media ownership, we lost resale rights.
We need to understand that we've passed a threshold after which innovation is hurting us more than helping us. That trumps everything else.
How do you figure libraries would be deemed illegal? They operate today. The Archive, on the other hand, attempted a fair use argument for whole copies of books (the copyrighted form most legible to copyright law) currently for sale as ebooks. I agree with the comment across the thread calling this a spectacularly boneheaded move and expressing gratitude that the entire Archive wasn't compromised over the stunt.
> How do you figure libraries would be deemed illegal? They operate today.
The history of public libraries is extremely messy, and the RIAA almost managed to get secondhand music made illegal in the 90s. Publishers did not ever support the idea of loaning a single copy of a work to dozens of people. While it's a huge stretch to say that every illegal download represents a lost sale (people download 100x more than they read), it's a lot less of a stretch to say that people who would sit down and read an entire book are fairly likely to have bought it.
Also, when books were relatively more expensive for people (19th century), a lot of income from publishers came from renting their books, rather than selling them. Public libraries involved a lot of positive propaganda and promises of societal uplift from wealthy benefactors, along the same lines and around the same time as the introduction of universal free public education. I remember hearing a lot about this history at the Enoch Pratt Library in Baltimore, which iirc was the first. Libraries were at that time normally private membership clubs.
edit: I also agree that the free book thing was stupid and have been very harsh about it. I don't know if it's possible to be too harsh about it, because it was obviously never going to get past a court. It felt almost like intentional sabotage.
Possibly, yeah. Make a "Deal" <spit> with AI companies to have back-end access to all the Archive org's content. Get 'permission' to copy EVERYTHING and have billionaires run interference.
The AI companies already got blank checks to do that. Anthropic is paying what, like $3000 per book? I remember when the fucks at the RIAA were suing 12 year olds for $10000 for Britney Spears albums.
Or better yet, if it's just $3k a book, can we license every book and have that added into Archive.org? Oh wait, deals for thee, not for me.
Fun fact: the only complete copy of the Internet Archive's library is in Alexandria.
How's that for historic irony and "unteachability" of the human species.
Honestly, now's the time to make copies of it, while we still can. Torrents need seeders and people that care, and we are the last generation that cares about knowledge.
We need to prevent the following generations from growing up as mindless clickmonkeys of the digital Orwellian world.
In my opinion redundancy in a single business entity is no redundancy at all, especially if there's legal obligations of a soon-to-be-burning-books-again regime.
A better strategy would have been to found independent entities in other liberal democracies, so they can act as IP backups.
There was a great vpro documentary called "Digital Amnesia" [1] where they also interviewed the lead of the library of Alexandria, who was the only bidder to buy the national KIT library of the Netherlands and its dissolved inventory at the time.
Interviews with archivists, librarians, web archive and others on the topic. It's insane to see that nations don't want to preserve their history, science, and culture anymore.
That sounds right. I checked some listings of books that I thought would be cool to check out, but it still keeps saying that borrowing is unavailable except for patrons with print disabilities. For the books I'm interested in, at least we can see scans of the front and back covers, and also a little bit of the table of contents.
Most of the IA’s ebook collection still supports controlled digital lending, just like every other library that operates an ebook lending system with CDL.
I am sorry to say it, but the copyright protection term should vary by subject. Programming books - 10 years maybe? 10 years is ages in computer science. TV shows? 5-7 years maybe. After that time nobody wants to pay for watching old Big Brother or another Fort Boyard... nor pay for storing it in an archive. And this is the culture other creations are referring to.
In Poland we've run into a very strange situation - Polish Public TV (TVP) paid for the great dubbing of some Disney shows. They recorded it on VHS tapes, which were later overwritten by other shows. Now the translation and the dubbing are lost, found sometimes on people's home-recorded VHS but in poor quality, because they were recorded from the aerial.
Many original episodes of Dr Who were destroyed by the BBC, some have only been recovered because of "pirate" home recorded VHS, or people that "stole" tapes from the dumpster.
Unrelated: I wonder how much the publishing industry spent on lawyers.
Besides, if I was never going to buy it in the first place because you're charging too much, you've lost nothing if I pirate your product.
A victimless crime.
It's a training set not an archive.
For my enemies... the law.
To say otherwise is disingenuous.
The 'goodwill' counterparts of ChatGPT, a.k.a. open weight models, are still well alive online.
What do you think is step 1 of training an LLM?
The main differences are that OpenAI kept their library private and only distribute the digested summaries of it.
Personally, I say they should be free for everyone, the lawyers however think the complete opposite and they have the means of enforcing this.
Just like in Frankenstein, that monster was created by someone, and those who created the monster are the true villains.
IA is the eccentric, untamed idealism. You can’t have the Wayback Machine without the National Emergency Library and the Great 78 Project.
Only legal when billionaires do it.
https://arstechnica.com/gadgets/2025/11/youtube-tvs-disney-b...
And yet I can go to a site right now off the top of my head and watch any TV show or basically any movie made in the last 50 years for free in HD.
It might be shut down tomorrow and it'll be up against 30s later with a different TLD.
They aren't winning but they really are trying hard to.
> proof that public libraries, if created in the last 10 years
Paying the editors is the bigger issue than paying the authors
I could see that due to the sheer size, but I'm sure they have a robust disk pool that would take a lot to lose data.
But here we are.
[1] https://youtube.com/watch?v=NdZxI3nFVJs
https://news.ycombinator.com/item?id=45798283
https://news.ycombinator.com/item?id=45809870
https://news.ycombinator.com/item?id=45806643
Good chance the book you wanted is gone at the least