Readit News
woodpeck commented on Download responsibly   blog.geofabrik.de/index.p... · Posted by u/marklit
kevincox · 3 months ago
But that raises the complexity of hosting this data immensely. From a file + nginx you now need active authentication, issuing keys, monitoring, rate limiting...

Yes, this is the "right" solution but it is a huge pain and it would be nice if we could have nice things without needing to do all of this work.

This is tragedy of the commons in action.

woodpeck · 3 months ago
Speaking as the person running it - introducing API keys would not be a big deal, we do this for a couple paid services already. But speaking as a person frequently wanting to download free stuff from somewhere, I absolutely hate having to "set up an account" just to download something once. I started that server well over a decade ago (long before I started the business that now houses it); the goal has always been first and foremost to make access to OSM data as straightforward as possible. I fear that having to register would deter many a legitimate user.
woodpeck commented on Download responsibly   blog.geofabrik.de/index.p... · Posted by u/marklit
ranzhh · 3 months ago
Oh hey, it's me, the dude downloading italy-latest every 8 seconds!

Maybe not, but I can't help but wonder if anybody on my team (I work for an Italian startup that leverages GeoFabrik quite a bit) might have been a bit too trigger happy with some containerisation experiments. I think we got banned from geofabrik a while ago, and to this day I have no clue what caused the ban; I'd love to be able to understand what it was in order to avoid it in the future.

I've tried calling and e-mailing the contacts listed on geofabrik.de, to no avail. If anybody knows of another way to talk to them and get the ban sorted out, plus ideally discover what it was from us that triggered it, please let me know.

woodpeck · 3 months ago
Hey there dude downloading italy-latest every 8 seconds, nice to hear from you. I don't think I saw an email from you at info@geofabrik, could you re-try?
woodpeck commented on Download responsibly   blog.geofabrik.de/index.p... · Posted by u/marklit
1vuio0pswjnm7 · 3 months ago
"There have been individual clients downloading the exact same 20-GB file 100s of times per day, for several days in a row. (Just the other day, one user has managed to download almost 10,000 copies of the italy-latest.osm.pbf file in 24 hours!) Others download every single file we have on the server, every day."

This sounds like a problem rate-limiting would easily solve. What am I missing? The page claims almost 10,000 copies of the same file were downloaded by the same user

The server operator is able to count the number of downloads in a 24h period for an individual user but cannot or will not set a rate limit

Why not

Will the users mentioned above (a) read the operator's message on this web page and then (b) change their behaviour

I would bet against (a) and therefore (b) as well

woodpeck · 3 months ago
Geofabrik guy here. You are right - rate limiting is the way to go. It is not trivial though. We use an array of Squid proxies to serve stuff and Squid's built-in rate limiting only does IPv4. While most over-use comes from IPv4 clients it somehow feels stupid to do rate limiting on IPv4 and leave IPv6 wide open. What's more, such rate-limiting would always just be per-server which, again, somehow feels wrong when what one would want to have is limiting the sum of traffic for one client across all proxies... then again, maybe we'll go for the stupid IPv4-per-server-limit only since we're not up against some clever form of attack here but just against carelessness.

u/woodpeck

Karma: 10 · Cake day: September 23, 2025
About
Works on OpenStreetMap related stuff at Geofabrik.