For content hosted publicly on reddit.com, what is the legality of downloading/scraping that content and re-hosting it on a separate website?
I am aware that the Archive Team is currently archiving Reddit[1]. As far as I understand what they do is legal. But I would like some reassurance.
Are there any good articles on this topic? Contact details for lawyers specialising in this area of the law also welcome.
Does a California court have jurisdiction over a foreign national? Probably not, which is why it is entirely possible that Reddit would take a lawsuit to whatever court is closest to the person.
As I stated above, this isn't legal advice either, just my layman's view.
My reading of the user agreement[0] is that it distinguishes between "content that belongs to Reddit Inc." (things like the CSS and up/down vote icons) and "content licensed by Reddit" (what users post).
But yeah, lawyering is hard, don't just trust random internet comments like this :)
[0] https://www.redditinc.com/policies/user-agreement
Some jurisdictions will have fair use exceptions for some scenarios. I don't think we can assume "most", "likely", or "irrespective of country/region", although it can feel like that if you happen to live in an area where such things are commonly allowed.
> Except and solely to the extent such a restriction is impermissible under applicable law, you may not, without our written agreement:
> - license, sell, transfer, assign, distribute, host, or otherwise commercially exploit the Services or Content;
Even without being a lawyer, it seems pretty clear that this is not legal, unless you have Reddit's written permission, which I guess the Archive Team has.
Personally I don't care if anyone reuses stuff I've posted on Reddit or other forms of social media (forums like Hacker News included), but there's always the possibility that someone else might object. And if you remove their posts when asked, I doubt most of them will take it any further than that.
According to these terms, the content is owned by the users, and you're not to modify the content. However, if the content is owned by the users, then IMO Reddit cannot really say what you do or don't do with the content, as long as you're not building an application that acts as a proxy to Reddit.
The license to access their API is revocable, but they're not licensing you the user content, only the ability to download it. What you can do with that content is likely up to the laws in your jurisdiction. I'd say most content would qualify as public domain, though obviously some content will have copyright protection.
I would do the downloading now before they start charging for the API if you're serious about the project.
Presumably you mean an app which displays Reddit.com as-is (i.e. with their logos/icons/UX). But what about a service which acts as a proxy to the Reddit API?
If content is not owned by Reddit and API design cannot be copyrighted then this seems legal, right?
You may not: Access, search, or collect data from the Services by any means (automated or otherwise) except as permitted in these Terms or in a separate agreement with Reddit (we conditionally grant permission to crawl the Services in accordance with the parameters set forth in our robots.txt file, but scraping the Services without Reddit’s prior written consent is prohibited)
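For anyone who wants to see what "in accordance with the parameters set forth in our robots.txt file" could look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `example-bot` user agent, the `example.com` URLs, and the `is_crawl_allowed` helper are all made up for illustration; they are not Reddit's actual robots.txt, and passing this check says nothing about the separate scraping clause quoted above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, for illustration only.
# This is NOT a copy of Reddit's real robots.txt.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /login
Allow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/r/python/", "https://example.com/login"):
        print(url, is_crawl_allowed(EXAMPLE_ROBOTS_TXT, "example-bot", url))
```

In a real crawler you would presumably load the site's live file with `set_url()` and `read()` instead of a hard-coded string, and check every URL before requesting it.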
https://www.reddit.com/r/reddit/comments/145bram/addressing_...
2. Third-party app developers said this is too high, will shut down rather than pay
3. Reddit official response had zero chill
3. a. Repeatedly
4. Significant fraction of subreddits organised a "go private" protest
5. This broke Reddit for everyone else