hyperknot (u/hyperknot)

hyperknot commented on Ask HN: How to stop an AWS bot sending 2B requests/month? · Posted by u/lgats

hyperknot · 2 months ago

Use a simple block rule instead of a WAF rule; simple block rules are free.

hyperknot commented on OpenFreeMap survived 100k requests per second blog.hyperknot.com/p/open... · Posted by u/hyperknot

biker142541 · 4 months ago

Curious how this would have compared to a static pmtiles file being read directly by maplibre. I’ve had good luck with virtually equal latency to served tiles when consuming pmtiles via range requests on Bunnycdn.

hyperknot · 4 months ago

Yes, wplace could cover their whole need with a single, custom-built static PMTiles file. No need to serve 150 GB of OSM data for their use case.

hyperknot commented on OpenFreeMap survived 100k requests per second blog.hyperknot.com/p/open... · Posted by u/hyperknot

Sesse__ · 4 months ago

Well, “bandwidth is expensive” is a true claim, but it's also a very different claim from “a [normal] caching server couldn't handle 56 Gbit/sec”…?

hyperknot · 4 months ago

You are correct. I was putting "a caching server on their side" in the context of their side being a single-dev hobby project running on a VPS, exploding over the weekend. I agree that these servers do exist and some companies do pay for this bandwidth as part of their normal operations.

hyperknot commented on OpenFreeMap survived 100k requests per second blog.hyperknot.com/p/open... · Posted by u/hyperknot

Sesse__ · 4 months ago

It really depends on the size of the active set. If it fits into RAM of whatever server you are using, then it's not a problem at all, even with completely off-the-shelf hardware and software. Slap two 40gig NICs in it, install Varnish or whatever and you're good to go. (This is, of course, assuming that you have someone willing to pay for the bandwidth out to your users!)

If you need to go to disk to serve large parts of it, it's a different beast. But then again, Netflix was doing 800gig already three years ago (in large part from disk) and they are handicapping themselves by choosing an OS where they need to do significant amounts of the scaling work themselves.

hyperknot · 4 months ago

I'm sure the server hardware is not a problem. The full dataset is 150 GB, most of which will never be requested, and the server has 64 GB RAM. So I'm sure that the tiles actually in use would get served from the OS cache. If not, it's on a RAID 0 NVMe SSD, connected locally.

What I've been referring to is the fact that even unlimited 1 Gbps connections can be quite expensive; now try to find a 2x40 gig connection for reasonable money. That one user generated 200 TB in 24 hours! I have no idea about bandwidth pricing, but I bet it ain't cheap to serve that.
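For a sense of scale, here is a rough back-of-the-envelope conversion between the two figures mentioned in this thread (200 TB in 24 hours, and the 56 Gbit/s quoted elsewhere). It is only a sketch: it assumes decimal units (1 TB = 10^12 bytes, 1 Gbit = 10^9 bits) and a perfectly even transfer over the day, and the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_gbit_per_s(terabytes: float, seconds: float = SECONDS_PER_DAY) -> float:
    """Average rate in Gbit/s needed to move `terabytes` in `seconds`."""
    bits = terabytes * 1e12 * 8  # decimal terabytes -> bits
    return bits / seconds / 1e9


def terabytes_transferred(gbit_per_s: float, seconds: float = SECONDS_PER_DAY) -> float:
    """Volume in TB moved by a link sustained at `gbit_per_s` for `seconds`."""
    bytes_total = gbit_per_s * 1e9 / 8 * seconds
    return bytes_total / 1e12


if __name__ == "__main__":
    print(f"200 TB/day is about {average_gbit_per_s(200):.1f} Gbit/s on average")
    print(f"56 Gbit/s sustained for a day is about {terabytes_transferred(56):.1f} TB")
```

Under these assumptions, 200 TB in a day averages out to roughly 18.5 Gbit/s, so the 56 Gbit/s figure reads as a peak rather than a daily average.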

hyperknot commented on OpenFreeMap survived 100k requests per second blog.hyperknot.com/p/open... · Posted by u/hyperknot

Sesse__ · 4 months ago

> We are talking about an insane amount of data here. It was 56 Gbit/s. This is not something a "caching server" could handle.

You are not talking about an insane amount of data if it's 56 Gbit/s. Of course a caching server could handle that.

Source: Has written servers that saturated 40gig (with TLS) on an old quadcore.

hyperknot · 4 months ago

OK, technically such a server might exist; I guess Netflix and friends are using those. But we are talking about a community-supported, free service here. Hetzner servers are my only option, because of their unmetered bandwidth.

hyperknot commented on OpenFreeMap survived 100k requests per second blog.hyperknot.com/p/open... · Posted by u/hyperknot

internetter · 4 months ago

The tiles need to be rendered. Yes frequent tiles can be cached but you already have a cache… it’s Cloudflare. Theoretically you could port the tileserver to Cloudflare pages but then you’d need to… port it… and it probably wouldn’t be cheaper

hyperknot · 4 months ago

They are actually static files. There are just too many of them, about 300 million. You cannot put that in Pages.

hyperknot commented on OpenFreeMap survived 100k requests per second blog.hyperknot.com/p/open... · Posted by u/hyperknot

ndriscoll · 4 months ago

One thing that might work for you is to actually make the empty tile file, and hard link it everywhere it needs to be. Then you don't need to special case it at runtime, but instead at generation time.

NVMe disks are incredibly fast and 1k rps is not a lot (IIRC my n100 seems to be capable of ~40k if not for the 1 Gbit NIC bottlenecking). I'd try benchmarking without the tuning options you've got. Like do you actually get 40k concurrent connections from cloudflare? If you have connections to your upstream kept alive (so no constant slow starts), ideally you have numCores workers and they each do one thing at a time, and that's enough to max out your NIC. You only add concurrency if latency prevents you from maxing bandwidth.

hyperknot · 4 months ago

Yes, that's a good idea. But we are talking about 90+% of the tiles being empty (I might be wrong on that), and that's a lot of hard links. I think the nginx config just needs to be fixed; I hope I'll receive some help on their forum.
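The hard-link suggestion above could look something like this at generation time. This is a minimal sketch, not OpenFreeMap's actual pipeline: the function name, the directory layout and the list of missing tile paths are all hypothetical, and it assumes the empty tile and the tile tree live on the same filesystem (hard links cannot cross filesystems).

```python
import os
from pathlib import Path
from typing import Iterable


def link_empty_tiles(root: Path, empty_tile: Path, missing: Iterable[str]) -> int:
    """Hard-link `empty_tile` at every relative path in `missing` under `root`.

    Returns the number of links created. Paths that already exist are skipped,
    so real tiles are never overwritten.
    """
    created = 0
    for rel in missing:
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():
            os.link(empty_tile, target)  # new directory entry, same inode
            created += 1
    return created


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        empty = root / "empty.pbf"
        empty.write_bytes(b"")  # placeholder content for the empty tile
        n = link_empty_tiles(root, empty, ["14/1/2.pbf", "14/1/3.pbf"])
        print(n, "links created; link count:", empty.stat().st_nlink)
```

Each link shares one inode with the original, so the file data is stored once, but every link is still its own directory entry. Whether that many entries is acceptable is exactly the open question raised in the reply above.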

hyperknot commented on OpenFreeMap survived 100k requests per second blog.hyperknot.com/p/open... · Posted by u/hyperknot

sour-taste · 4 months ago

Since the limit you ran into was number of open files could you just raise that limit? I get blocking the spammy traffic but theoretically could you have handled more if that limit was upped?

hyperknot · 4 months ago

I've just written my question to the nginx community forum, after a lengthy debugging session with multiple LLMs. Right now, I believe it was the combination of multi_accept + open_file_cache > worker_rlimit_nofile.

https://community.nginx.org/t/too-many-open-files-at-1000-re...

Also, the servers were doing 200 Mbps, so I couldn't have kept up _much_ longer, no matter the limits.
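To make the hypothesis above concrete, here is a rough per-worker file descriptor budget. It is a sketch under loud assumptions, not a description of how nginx accounts for descriptors internally: it assumes each connection serving a static file holds one socket plus one open file, that open_file_cache can keep up to its `max` descriptors open per worker, and that with multi_accept a single worker can end up holding many of the connections. All numbers below are hypothetical, not taken from the real config.

```python
def worst_case_fds_per_worker(
    worker_connections: int,
    open_file_cache_max: int,
    fds_per_connection: int = 2,  # assumption: one socket + one open file
) -> int:
    """Upper-bound estimate of descriptors one worker could hold at once."""
    return worker_connections * fds_per_connection + open_file_cache_max


def fits_within_limit(
    worker_rlimit_nofile: int,
    worker_connections: int,
    open_file_cache_max: int,
) -> bool:
    """True if the estimated worst case stays within worker_rlimit_nofile."""
    needed = worst_case_fds_per_worker(worker_connections, open_file_cache_max)
    return needed <= worker_rlimit_nofile


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    conns, cache_max = 40_000, 1_000_000
    for limit in (1_000_000, 1_300_000):
        ok = fits_within_limit(limit, conns, cache_max)
        print(f"worker_rlimit_nofile={limit}: {'ok' if ok else 'too many open files'}")
```

If the estimate is in the right ballpark, a cache `max` close to the descriptor limit leaves almost no room for the connections themselves.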

hyperknot commented on OpenFreeMap survived 100k requests per second blog.hyperknot.com/p/open... · Posted by u/hyperknot

rtaylorgarlock · 4 months ago

Is it always/only 'laziness' (derogatory, i know) when caching isn't implemented by a site like wplace.live ? Why wouldn't they save openfreemap all the traffic when a caching server on their side presumably could serve tiles almost as fast or faster than openfreemap?

hyperknot · 4 months ago

We are talking about an insane amount of data here. It was 56 Gbit/s (or 56 x 1 Gbit servers 100% saturated!). This is not something a "caching server" could handle. It takes something on the scale of a CDN, like Cloudflare, to handle this.