> Since the state is stored in a multi-level Bloom Filter style data structure, the memory needed is constant and does not scale with the number of clients.
Constant memory, but those hashes will take up CPU cycles. If you're running a workload that completes sub 20 milliseconds, these cycles spent hashing may not be worth it over, say, a constant-time admission control like token bucket.
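For contrast, a token-bucket admission check is a couple of arithmetic operations per request. This is a generic single-bucket sketch (names and parameters are mine, not FAIR's or any particular library's API):

```go
package main

import (
	"fmt"
	"time"
)

// TokenBucket is a minimal token bucket: at most `capacity` tokens,
// refilled continuously at `rate` tokens per second.
// A generic O(1) admission check, not FAIR's implementation.
type TokenBucket struct {
	capacity float64
	tokens   float64
	rate     float64   // tokens added per second
	last     time.Time // last refill timestamp
}

func NewTokenBucket(capacity, rate float64) *TokenBucket {
	return &TokenBucket{capacity: capacity, tokens: capacity, rate: rate, last: time.Now()}
}

// Allow refills the bucket based on elapsed time, then spends one token
// if available. No hashing, no per-request allocation.
func (b *TokenBucket) Allow() bool {
	now := time.Now()
	b.tokens += now.Sub(b.last).Seconds() * b.rate
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

func main() {
	b := NewTokenBucket(2, 1) // burst of 2, refill 1 token/sec
	fmt.Println(b.Allow())    // true
	fmt.Println(b.Allow())    // true
	fmt.Println(b.Allow())    // false: bucket drained
}
```

The tradeoff is memory: you need one of these structs per client, which is exactly what the bloom-filter approach avoids.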
Absolutely my thought as well. During my thesis I found token buckets to be better and fairer than the other algorithms [1]. This was in libvirt. My finding was that it scaled well up to around 1k buckets, if memory serves; after that the results were weird, to say the least (of course, the servers I tested on were quite old and didn't have much memory). It would be nice to run my thesis tests again and see what has changed in the last decade.
Not just CPU-bound, but IO-bound workloads can also complete within milliseconds.
> have they claimed that?
The mention of "token bucket" in the project readme is why I wrote the comment I did.
... FAIR [only throttles] when there's a genuine shortage of resources as opposed to the approaches like token bucket or leaky bucket which may reject requests ...
That's the typical design for a token-bucket rate limiter. A token-bucket limiter uses more memory but is a simpler design. I believe this bloom-filter based implementation is designed to be more memory efficient at the cost of being less CPU efficient.
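To illustrate the general idea (this is a count-min sketch, a relative of the multi-level Bloom filter described in the readme, not FAIR's actual data structure): per-client counts are hashed into a few fixed-size arrays, so memory stays constant no matter how many clients appear, at the price of several hashes per operation and some overcounting on collisions.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// countMinSketch approximates per-key counts in fixed memory:
// depth hash rows of width counters, so memory is depth*width
// regardless of how many distinct keys are seen.
// A generic sketch for illustration, not FAIR's structure.
type countMinSketch struct {
	rows [][]uint32
}

func newSketch(depth, width int) *countMinSketch {
	rows := make([][]uint32, depth)
	for i := range rows {
		rows[i] = make([]uint32, width)
	}
	return &countMinSketch{rows: rows}
}

func (s *countMinSketch) index(row int, key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	h.Write([]byte{byte(row)}) // cheap per-row salt
	return int(h.Sum32()) % len(s.rows[row])
}

// Add increments one counter per row: depth hashes per call is the
// CPU cost paid for the bounded memory footprint.
func (s *countMinSketch) Add(key string) {
	for i := range s.rows {
		s.rows[i][s.index(i, key)]++
	}
}

// Count takes the minimum across rows; collisions can only inflate
// counts, so the estimate is never below the true count.
func (s *countMinSketch) Count(key string) uint32 {
	min := s.rows[0][s.index(0, key)]
	for i := 1; i < len(s.rows); i++ {
		if c := s.rows[i][s.index(i, key)]; c < min {
			min = c
		}
	}
	return min
}

func main() {
	s := newSketch(4, 1024)
	for i := 0; i < 5; i++ {
		s.Add("tenant-a")
	}
	fmt.Println(s.Count("tenant-a") >= 5) // true: never underestimates
}
```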
Thinking this could be useful in a multi-tenant service where you need to fairly allocate job-processing capacity across tenants over a pool of background workers (data export API requests, encoding requests, etc.).
That was my first thought as well. However, in a lot of real world cases, what matters is not the frequency of requests, but the duration of the jobs. For instance, one client might request a job that takes minutes or hours to complete, while another may only have requests that take a couple of seconds to complete. I don't think this library handles such cases.
The text suggests a method for managing GPU or rate-limited resources across multiple clients. It highlights the problem of spiky workloads, where a client might generate a large number of events (e.g., from a CSV upload), causing resource starvation. The text advises against naive solutions like FIFO, which could disadvantage clients with steady live traffic.
Rate limiters are used to protect servers from overload and to prevent attackers--or even legitimate but unintentionally greedy tenants--from starving other tenants of resources. They are a key component of a resilient distributed system.
Very interesting and looking forward to trying this. I am a big fan of SFB for this type of stuff but haven't seen anything in distributed space that's beyond a science project. Would be great if I can use it on my 10k+ user web site.
I coded up something like Fair 1-2 years ago: https://github.com/JensRantil/conc It's alpha software, but maybe it's interesting to someone.
Shouldn’t this be built into a queue somehow? I’d love to see a queuing solution like SQS that has built-in fairness, where you can fully utilize capacity, but as soon as customers compete for resources, some fairness kicks in.
Is there anything like that?
They are complementary solutions, not substitutes. Load balancers distribute traffic among servers and are a key component to enable horizontal scalability. Throttling is a prophylactic feature that prevents attackers or greedy consumers from overutilizing capacity and depriving it from other legitimate users. This library relates to the latter.
Unfortunately the title of the GitHub repo ("A Go library for serving resources fairly") is misleading. This is not a server; it's a library that a server can utilize to determine whether a request has exceeded fairness bounds and should be rejected with an HTTP 429 (too many requests) response.
In a multi-tenant system, you might have one customer who drops a big job that creates a huge number of tasks. We'd like to process this as fast as possible, but not block the tasks of small jobs from other customers, which should normally be completed very quickly.
We had to build something similar. We have both huge tenants (200k users in one tenant) and small tenants with 10 users. Sometimes there are spikes when a large tenant generates thousands of jobs. In a naive implementation, one tenant would be able to completely block job processing for all other tenants. We have to make sure the next job is picked from a different tenant each time, so that all tenants are served fairly. However, a large tenant may then end up waiting too long for its jobs to complete. In that case, we move such a tenant to different infrastructure (sometimes fully dedicated).
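The "pick the next job from a different tenant each time" scheme can be sketched as round-robin over per-tenant queues. This is my own illustrative version of the idea described above, not the commenter's actual code:

```go
package main

import "fmt"

// tenantQueues round-robins job selection across tenants so one large
// tenant cannot starve the rest. A sketch, not production code: no
// locking, no persistence, tenants are never retired from rotation.
type tenantQueues struct {
	order  []string            // tenants in rotation order
	queues map[string][]string // tenant -> pending jobs
	next   int                 // rotation cursor
}

func newTenantQueues() *tenantQueues {
	return &tenantQueues{queues: map[string][]string{}}
}

func (t *tenantQueues) Enqueue(tenant, job string) {
	if _, ok := t.queues[tenant]; !ok {
		t.order = append(t.order, tenant)
	}
	t.queues[tenant] = append(t.queues[tenant], job)
}

// Dequeue pops one job from the next tenant in rotation that has work,
// then advances the cursor past that tenant.
func (t *tenantQueues) Dequeue() (string, bool) {
	for i := 0; i < len(t.order); i++ {
		tenant := t.order[(t.next+i)%len(t.order)]
		if q := t.queues[tenant]; len(q) > 0 {
			t.queues[tenant] = q[1:]
			t.next = (t.next + i + 1) % len(t.order)
			return q[0], true
		}
	}
	return "", false
}

func main() {
	q := newTenantQueues()
	q.Enqueue("big", "b1")
	q.Enqueue("big", "b2")
	q.Enqueue("big", "b3")
	q.Enqueue("small", "s1")
	for {
		job, ok := q.Dequeue()
		if !ok {
			break
		}
		fmt.Println(job) // b1, s1, b2, b3: small tenant is not starved
	}
}
```

Note this interleaving is exactly why the large tenant's total latency grows, which is what motivates moving it to dedicated infrastructure.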
Isn't this exactly the problem that preemptive multitasking was built for? For example, any program built on the BEAM [1] presumably wouldn't have this problem. Do most languages not have a standard solution for this?
Those are completely unrelated. K8s does not provide client rate-control and FAIR does not do load-balancing. It appears you misunderstood what this is.
I tried writing another algorithm for network splitting but didn't get any better results. [1]: https://www.csg.uzh.ch/publications/details.php?id=1007
As usual, tradeoffs are everywhere.
I am very intrigued to find out how this would fit in, if at all.
See, e.g., https://docs.aws.amazon.com/wellarchitected/latest/framework...
This project, however, looks like a concurrency limiter, not a rate limiter. I'm also not sure how it works across a load-balanced cluster.
[1]: https://en.m.wikipedia.org/wiki/BEAM_(Erlang_virtual_machine...