Posted by u/dom96 3 years ago
Ask HN: Legality of archiving/re-hosting Reddit content
For content hosted publicly on reddit.com, what is the legality of downloading/scraping that content and re-hosting it on a separate website?

I am aware that the Archive Team is currently archiving Reddit[1]. As far as I understand what they do is legal. But I would like some reassurance.

Are there any good articles on this topic? Contact details for lawyers specialising in this area of the law also welcome.

1 - https://news.ycombinator.com/item?id=36254172

neovialogistics · 3 years ago
It depends entirely on what nation you're operating in. (And additionally state/territory, if the answer is the United States)
anonu · 3 years ago
This is probably bad legal advice (above and following...). The EULA clearly states Reddit content is owned by Reddit. If you break their agreement you break it, and the Governing Law section says you would be taken to federal or state court in California, even if you are a foreign national.

Does a California court have jurisdiction over a foreign national? Probably not, which is why it is entirely possible that Reddit would take a lawsuit to whatever court is closest to the person.

As I stated above, this isn't legal advice either, just my layman's view.

ben_w · 3 years ago
I'm also not a lawyer.

My reading of the user agreement[0] is that it distinguishes between "content that belongs to Reddit Inc." (things like the CSS and up/down vote icons) and "content licensed by Reddit" (what users post).

But yeah, lawyering is hard, don't just trust random internet comments like this :)

[0] https://www.redditinc.com/policies/user-agreement

kaba0 · 3 years ago
Are EULAs even enforceable? Also, AFAIK, in the EU you can never completely give up your rights to content you created yourself.
dom96 · 3 years ago
Personally I'm most interested in the United Kingdom (since that is where I reside). But I would guess HN would be mostly interested in California (and I would be too since that is where Reddit is based).
pyeri · 3 years ago
Isn't it true that most implementations of copyright laws (irrespective of country/region) will likely have fair use exceptions for things like parody, journalistic reporting and archiving?
codingdave · 3 years ago
Well... no.

Some jurisdictions will have fair use exceptions for some scenarios. I don't think we can assume "most", "likely", or "irrespective of country/region", although it can feel like that if you happen to live in an area where such things are commonly allowed.

tssva · 3 years ago
When someone posts content to Reddit under the Reddit terms of use, they grant Reddit a license to use and distribute the content, but ownership of the content remains with the poster. If you scrape Reddit and re-post the content, in theory you open yourself up to copyright violation claims from the original poster. The odds that someone will sue you for redistributing, without permission, content they posted to Reddit are likely extremely small, but they are not zero.
BunnyOSteele · 3 years ago
From https://www.redditinc.com/policies/user-agreement

> Except and solely to the extent such a restriction is impermissible under applicable law, you may not, without our written agreement:

> - license, sell, transfer, assign, distribute, host, or otherwise commercially exploit the Services or Content;

Even without being a lawyer, it seems pretty clear that this is not legal, unless you have Reddit's written permission, which I guess the Archive Team has.

TazeTSchnitzel · 3 years ago
I would guess that Archive Team don't ask for permission before acting, and certainly don't wait. The sad truth is that copyright law is not on their side. They preserve cultural memory because they believe in it, and they get away with it only because they're a disorganised collective of volunteers, so there's little to gain from suing them.

dark-star · 3 years ago
On the contrary, I read this as "it is allowed, unless your local laws allow us to restrict you from doing it" ;-)
3np · 3 years ago
That "except and solely" is what makes it very much not clear at all. They're basically saying "we make these demands iff applicable law allows us to do so".
CM30 · 3 years ago
Given the users own the content, it'd presumably be up to them whether it can be rehosted or not. Reddit gets a license to display the content, but they don't really have any control over what third parties can do with it.

Personally I don't care if anyone reuses stuff I've posted on Reddit or other social media (forums like Hacker News included), but there's always the possibility that someone else might object. And if you remove their posts when asked, I doubt most of them will take it any further than that.

linuxftw · 3 years ago
There are separate terms for the API, which seem to indicate to me it's legal to use their API to download user content: https://www.reddit.com/wiki/api-terms

According to these terms, the content is owned by the users, and you're not to modify the content. However, if the content is owned by the users, then IMO Reddit cannot really say what you do or don't do with the content, as long as you're not building an application that acts as a proxy to Reddit.

The license to access their API is revocable, but they're not licensing you the user content, only the ability to download it. What you can do with that content is likely up to the laws in your jurisdiction. I'd say most content would qualify as public domain, though obviously some content will have copyright protection.

I would do the downloading now before they start charging for the API if you're serious about the project.

dom96 · 3 years ago
> as long as you're not building an application that acts as a proxy to Reddit.

Presumably you mean an app which displays Reddit.com as-is (i.e. with their logos/icons/UX). But what about a service which acts as a proxy to the Reddit API?

If content is not owned by Reddit and API design cannot be copyrighted then this seems legal, right?

linuxftw · 3 years ago
IMO, the API terms look like they were written by a teenager, and you can do just about whatever you want. They admit the content is owned by the users, so I'm not sure what they could actually enforce other than barring you from using their API. Once you have the content, it's none of their business.
raldi · 3 years ago
For what it’s worth, Reddit’s TOS reads:

> You may not: Access, search, or collect data from the Services by any means (automated or otherwise) except as permitted in these Terms or in a separate agreement with Reddit (we conditionally grant permission to crawl the Services in accordance with the parameters set forth in our robots.txt file, but scraping the Services without Reddit’s prior written consent is prohibited)
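
For anyone wondering what "in accordance with the parameters set forth in our robots.txt file" could look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example-archiver` user agent, the `example.com` URLs and the `is_crawl_allowed` helper are all made up for illustration; this is not Reddit's actual file. Passing this check only covers the robots.txt part of the quoted clause and says nothing about whether written consent is still needed for scraping.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only (not Reddit's real file).
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /login
Allow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "example-archiver" and the example.com URLs are placeholders.
    for url in ("https://example.com/r/test/", "https://example.com/login"):
        print(url, is_crawl_allowed(SAMPLE_ROBOTS_TXT, "example-archiver", url))
```

A real crawler would fetch the site's live robots.txt (for example with `RobotFileParser.set_url` and `read`) rather than use a hard-coded string.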

dzek69 · 3 years ago
TOS is not a law
chrisshroba · 3 years ago
But in some (not all!) cases it is a lawful contract, so it's worth consideration.
Cloudef · 3 years ago
What's with the bunch of reddit crap on the front page recently?
caturopath · 3 years ago
raldi · 3 years ago
Why are we still posting links to claims they are "raising the price of access to their API from being free to a level that will kill every third party app" when they've made completely clear that over 90% of reddit apps won't have to pay a penny, and neither will the remaining 10% if they fix their inefficiency problems?

https://www.reddit.com/r/reddit/comments/145bram/addressing_...

ben_w · 3 years ago
1. New Reddit API access fees announced

2. Third-party app developers said this is too high, will shut down rather than pay

3. Reddit official response had zero chill

3. a. Repeatedly

4. Significant fraction of subreddits organised a "go private" protest

5. This broke Reddit for everyone else

psychphysic · 3 years ago
You're going to be downvoted to oblivion because the only thing HN hates more than Reddit is people pointing out that HN is infatuated with Reddit.
malermeister · 3 years ago
Which is funny because HN is, let's be real, basically a subreddit.