https://www.gomomento.com/blog/we-built-a-serverless-cache-y...
It’s truly interesting to see a new generation of applications not only doing away with bare metal hardware, but also being built on increasingly higher-level abstractions.
Of course, the key is for those abstractions to be dependable in production as well as easy to use. I think Momento actually takes runtime predictability quite seriously.
I love Redis, but managing HA is a pain and requires a good bit of engineering on its own. I think this is how RedisLabs stays in business.
This seems to separate the backend and the front end, maybe so you can use a more appropriate storage backend for your use case?
But what would a production-ready deployment look like? How would you handle failover for patching or… failure?
Adding a front end can sometimes double the problem: you need one failover setup for it and one for the backend. If I were to use the slab storage (it looks really ideal for most of our workloads), how would that work?
Too much to answer here, but stuff I’d like to see as it matures. It’s the unsexy stuff, I know. Way more fun to get into the bits.
Yes, swappable storage backends are a big driver of the design. Segcache for TTL-centric workloads, maybe some simple allocator-backed data structures for many popular Redis features, tiered storage for large time series... these were all internally discussed and sketched out to various extents.
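To make "swappable backend" concrete, here is a minimal Rust sketch of what such an interface could look like. The trait name, methods, and error type are all hypothetical illustrations, not Pelikan's actual API:

    use std::time::Duration;

    /// Hypothetical interface a cache front end could program against,
    /// so Segcache, slab storage, etc. can be swapped per workload.
    /// (Sketch only; not Pelikan's actual trait.)
    pub trait Storage {
        /// Look up a value by key; None on miss or after expiry.
        fn get(&mut self, key: &[u8]) -> Option<&[u8]>;
        /// Insert or overwrite a value with a per-item TTL.
        fn set(&mut self, key: &[u8], value: &[u8], ttl: Duration) -> Result<(), StorageError>;
        /// Remove a key; returns true if the key was present.
        fn delete(&mut self, key: &[u8]) -> bool;
    }

    #[derive(Debug)]
    pub enum StorageError {
        /// Eviction could not free enough space of the right size class.
        OutOfMemory,
    }

With an interface along these lines, the front end (protocol parsing, connection handling) stays fixed while the storage module underneath is chosen to fit the workload.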
Failure handling is very, very context dependent, both in terms of what the product does (which drives the ROI analyses) and where it runs (which determines the ways things can fail). Still figuring out how to talk about the fishing rather than the fish. Will give this more thought.
> an over-engineered...
Ever since I saw the Juicero teardown by AvE [1], I'll never see the term "over-engineered" in a positive light. Usually it means something is sub-optimal and not engineered well, built in a hurry with a broad-brush approach to safety margins. Juicero was so over-engineered, it was embarrassing.
CacheLib, in terms of the role it plays, is closer to the modules under src/storage, but it is obviously a lot more sophisticated in design, since it handles tiered memory/SSD storage.
We actually have been in touch with CacheLib team since before their public release. And we remain interested in integrating CacheLib into Pelikan as a storage backend as soon as their Rust bindings become functional :P
We touched upon CLOCK-Pro a bit in the related work section of the paper (https://www.usenix.org/system/files/nsdi21-yang.pdf, Section 6.1). I haven't done a deep dive, but I can see two general issues.

1. CLOCK-Pro is designed for fixed-size elements and doesn't work naturally for mixed-size workloads. Naively evicting "cold keys" may not even satisfy the memory size and contiguity requirements of the new key. This is why memcached's slab design focuses primarily on size, using a secondary data structure (the LRU queues) to manage recency. Something like CLOCK-Pro could be modified to handle mixed sizes, but I suspect that would significantly complicate the design.

2. An algorithm without the notion of a TTL obviously can't take advantage of the strong hint a TTL provides. Given the workloads we have, we concluded that TTL is an excellent signal for memory-related decisions, and hence went for a TTL-centric design.
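To illustrate the TTL-centric point, here is a toy Rust sketch in the spirit of Segcache's design: items are grouped into segments by approximate TTL, so entire segments expire together and can be reclaimed in one step. The names and the uniform bucket width are illustrative assumptions, not the paper's actual implementation:

    use std::time::{Duration, Instant};

    /// All items in one segment share (approximately) one expiry time,
    /// so the whole segment can be reclaimed in a single step.
    struct Segment {
        expires_at: Instant,
        items: Vec<(Vec<u8>, Vec<u8>)>, // (key, value) pairs, appended in write order
    }

    struct TtlBuckets {
        /// Bucket i holds segments for TTLs in [i * width, (i + 1) * width).
        /// Real designs use non-uniform bucket widths.
        buckets: Vec<Vec<Segment>>,
        bucket_width: Duration,
    }

    impl TtlBuckets {
        /// Map a TTL to its bucket; TTLs past the last bucket are clamped.
        fn bucket_index(&self, ttl: Duration) -> usize {
            let i = (ttl.as_secs() / self.bucket_width.as_secs()) as usize;
            i.min(self.buckets.len() - 1)
        }

        /// Expire by dropping whole segments whose deadlines have passed;
        /// no per-item scanning or recency bookkeeping is needed.
        fn expire(&mut self, now: Instant) {
            for bucket in &mut self.buckets {
                bucket.retain(|seg| seg.expires_at > now);
            }
        }
    }

The point of the design is that expiration becomes a cheap bulk operation driven by the TTL signal, rather than something discovered incidentally while hunting for cold keys.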