Thanks antirez for all the work you've done on redis. Personally I can say that no one had such an impact on my work as you did. Reading through your source code is educating and inspiring.
Dude... I do not think you know truly how many people love Redis. For every contrarian armchair quarterback here on HN there are 1000 people who are out using/coding/hacking w/ Redis and not caring about the noise. Myself included. Thank you and all the contributors for the hard work on an absolutely killer tool.
We use Redis in production for millions of customers and have never had an issue. It's super solid, and the code looks great even to someone with only a poor knowledge of C. I'd definitely investigate the code further if I had the time as I'm sure it would be very educational. Thanks too for your engagement with the community, I think everyone appreciates that a lot.
I don't use redis, but the code quality is so high and it helped me understand how to write good looking C APIs which are also performant. Congratulations on the release.
I would also like to extend my sincere thank you to @antirez for building such a wonderful piece of software (lean & mean) that has not only helped me personally but has also played an instrumental part in the growth of several large tech entities over the past decade. Hats off to you sir and thank you again!
I'll also take the opportunity to say thank you, for creating Redis. I tried Redis when it first came out, mostly just experimenting with it. I had not used it in production for many years and have recently made aggressive use of Redis 5 as a diverse caching layer, replacing Memcached among other things. It has simplified my service infrastructure and it has been super reliable. It's delightfully easy to use. Probably my favorite software to work with right now.
"Redis 6 is the biggest release of Redis ever, so even if it is stable, handle it with care, test it for your workload before putting it in production. We never saw big issues so far, but make sure to be careful. "
I have a bit of a tangent question for more experienced back-end developers: where do you fit Redis (or other caching mechanisms) in the traditional MVC model? I haven't had a use-case for Redis yet, but I'd like to know how should I approach the architecture of an app using Redis.
We use it to crash our servers for larger customers: we cache all our user entities in it, pull all of them out at runtime, filter them in PHP, then stampede ourselves when we clear the cache.
Well, the use for a cache is caching expensive operations. Sorry if this is just stating the obvious, but I'm not sure how else to answer how it fits into traditional MVC operations. It could be a front-end HTTP cache (although you'd probably use a CDN for that instead). It could be caching something expensive to look up or calculate, for which it's fine to not have up-to-the-second-current value.
Many people at least in Rails also use Redis to hold a queue for background processing, where it ends up working well, although hypothetically many other things could too.
You can also use redis for all sorts of other stuff that isn't caching or a queue.
I use it for caching, temporary content, pubsub, and distributed locking.
It's been particularly useful in load balancers / proxies for authorization, managing csrf, and tracking session activity to auto-log out users. I do this with OpenResty.
In async job or internal rpc systems, I use pubsub and streams for distributing work, ensuring exactly once processing, and getting results back to the caller.
Redis is a flexible tool. Some things you could potentially do:
- This DB query is pretty slow. Cache the result in Redis.
- I'm using server-side sessions, but I have multiple servers. Where should I store session data where it'll be super fast, but all my servers can access it? You could use Redis.
- I need to do distributed locking. Use Redis
- Simple job queue? Redis.
- I'm processing a lot of inbound webhooks, and I want to avoid trying to process the same one twice. I'd like to keep a list of all webhooks I've seen in the last week, and quickly check if a new webhook is in the list, but it needs to be really fast. Redis.
Basically, Redis has support for some nifty data structures, and it's very fast. There are a lot of places in most apps where being able to share a very fast list, set, map, stream, whatever between servers can be useful. Of course, all the above use cases can be solved by other more specialised tools too, and in some cases better.
(That being said, it's so useful and generally applicable that you should be careful not to ignore fundamental issues. For example, if you have an unavoidably slow query, then by all means cache it in Redis. But if all your queries are slow because you forgot to add indexes, maybe you should fix that instead of using Redis to cache every query!)
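To make the webhook item in the list above concrete, here is a minimal sketch in Python. It assumes a redis-py style client (anything with a `set(name, value, nx=..., ex=...)` method); the function name `first_time_seen` and the `webhook:seen:` key prefix are made up for illustration. Instead of keeping one big list, it stores one key per webhook id with a one-week expiry.

```python
WEEK_SECONDS = 7 * 24 * 60 * 60


def first_time_seen(client, webhook_id, ttl=WEEK_SECONDS):
    """Return True only for the first caller that reports this webhook id.

    `client` is assumed to behave like redis-py's `redis.Redis`.
    """
    # SET key value NX EX ttl: the write succeeds only if the key does not
    # exist yet, and the key expires on its own after `ttl` seconds.
    # redis-py returns True on success and None when NX blocks the write.
    created = client.set(f"webhook:seen:{webhook_id}", 1, nx=True, ex=ttl)
    return bool(created)
```

A handler would call `first_time_seen(client, event_id)` and skip processing when it returns False. Because the check and the write are a single command, two servers receiving the same webhook at the same moment cannot both get True.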
Redis can be a very powerful tool, but it can also be a sign that something has gone wrong.
Redis is a high performance key/value store with adjustable consistency. It can be used as a caching layer, and it can also do a solid job of representing a basic message queue. It typically fits in on the data layer to store computed results for fast retrieval, but it can also behave like a read replica (storing data from outside your domain).
That being said, when Redis becomes a critical part of your stack for performance reasons it usually means something has gone wrong in your data layer. I oftentimes see teams use an antipattern of taking data from their database, changing its shape to what they want, and then storing that new model in Redis. If you're the only consumer of your data, then your data model is incorrect and you're using Redis to paper over that fact.
This doesn't sound like the worst way to reduce expensive reads of normalised data. Is the implication that it should have been solved with views, was incorrectly normalised, or that a document/key-value store should have been used other than Redis?
Edit: I suppose a hypothetical can go many ways, it could be a poor data access pattern. What was the root cause in some of your experiences?
We use redis for session storage, jobs with celery/redis-rq, view fragment caching, and certain query caching.
It can really give you a good boost in performance, especially on frequently accessed pages where the content rarely changes. Database queries are often "expensive", so for frequently accessed data that doesn't change often, such as product descriptions, it can be a huge help.
STREAMS, best pubsub solution imho. I used it as a backing store for an MQTT frontend once, and also for generally coordinating worker processes to handle background tasks.
Just to add a bit on that, if you dynamically generate channel names according to a validatable naming convention that any consumer can predict (ideally with a client lib for generating them), you can do pretty complex message passing that doesn't blow up code complexity. Combine that with the locking and consumer groups built in, it's pretty much distributed computing "for free" even if stuck with multiprocessing for runtime scaling (e.g. Python/JS without the builtin concurrency or multithreading of VMs/hosted languages).
We use it for storing user sessions, for caching responses from a third-party API we access, and for imposing per-IP address rate limits on the use of that API. We've also previously used it for lightweight worker queues.
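For anyone wondering what a per-IP rate limit looks like in practice, here is a rough sketch of the simplest variant (a fixed window counter). It is not necessarily what the comment above uses. It assumes a redis-py style client with `incr` and `expire`; the function name and the `ratelimit:` key prefix are hypothetical.

```python
def allow_request(client, ip, limit=60, window_seconds=60):
    """Fixed-window limiter: at most `limit` requests per IP per window.

    `client` is assumed to behave like redis-py's `redis.Redis`.
    """
    key = f"ratelimit:{ip}"
    # INCR creates the key with value 1 if it does not exist yet.
    count = client.incr(key)
    if count == 1:
        # First hit in this window: start the countdown. INCR and EXPIRE are
        # two separate commands here, so a crash between them would leave a
        # counter without a TTL; a Lua script or MULTI/EXEC would close that gap.
        client.expire(key, window_seconds)
    return count <= limit
```

A fixed window allows short bursts around the window boundary, which is usually fine for protecting a third-party API quota.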
IF you want to use it for caching, THEN you would use it to cache stuff for your controller.
But Redis is much more than caching. It supports all kinds of fun data structures and features like sorted sets, timeouts, sets, pub-sub and more! You can almost think of it like memory that is held by another process. In that way, there are SO many use cases.
I think a good example of this is session storage - just store the session with a ttl and now redis will automatically "expire the session" when the time is over.
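A minimal sketch of that session idea, assuming a redis-py style client (`setex` and `get`) and JSON-serializable session data. The function names, the `session:` key prefix and the 30-minute default are just for illustration.

```python
import json


def save_session(client, session_id, data, ttl_seconds=1800):
    """Store the session; Redis drops the key by itself after ttl_seconds."""
    # SETEX key seconds value: write the value and its expiry in one command.
    client.setex(f"session:{session_id}", ttl_seconds, json.dumps(data))


def load_session(client, session_id):
    """Return the session dict, or None if it is missing or has expired."""
    raw = client.get(f"session:{session_id}")
    # redis-py returns bytes by default; json.loads accepts bytes and str.
    return None if raw is None else json.loads(raw)
```

Calling `save_session` again on each request would give you a sliding expiry, since it resets the TTL.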
We use it as the main database for low latency (less than 150ms) response time machine learning services. Store a pretty massive amount of data as well - close to 750GB.
However for your specific use case, considering a typical MVC web app with RDBMS data storage, you would add a check at the beginning of your Model method to return cached data if it exists, else go through the method execution and write the data to the cache just before returning it back to the controller. This way the cache would be 'warmed' on first call and data will be served directly from the cache (memory) next time till it is cleared, saving expensive disk I/O.
You need to be careful with caching inside models, because you want your models to reflect the current, completely up-to-date state of the application. Conceptually, the best place to do it is inside controllers, where you know when you can serve data which isn't completely up to date.
With skinny controllers, very often you have some specific places (eg service objects or similar layers) where the controller logic lives, and that is where you can do your caching.
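Here is roughly what the read-through pattern from this subthread looks like as a service-layer function. It is a sketch under assumptions: a redis-py style client, JSON-serializable data, and a hypothetical `load_from_db` callable standing in for the real model query.

```python
import json


def get_product(client, product_id, load_from_db, ttl_seconds=300):
    """Return the product, from the cache if possible, else from the DB.

    `client` is assumed to behave like redis-py's `redis.Redis`;
    `load_from_db` is a placeholder for the expensive query.
    """
    key = f"product:{product_id}"
    cached = client.get(key)
    if cached is not None:
        # Cache hit: no database round trip.
        return json.loads(cached)
    # Cache miss: do the expensive work once, then warm the cache.
    product = load_from_db(product_id)
    # The TTL bounds how stale the cached copy can get.
    client.set(key, json.dumps(product), ex=ttl_seconds)
    return product
```

Keeping this outside the model means the model always returns current data, and the caller decides where slightly stale data is acceptable.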
Our main use case in $JOB for Redis is distributed locking. We usually do not need key-value storage and even if we do, we just go with DynamoDB instead.
There's lots of functionality that Redis provides beyond a key-value store. For example the data structures that it supports. These are very powerful on their own. Also I understand after considerable investments vendor freedom is a bit of an illusion, but you know, if you can choose an OS technology under the hood, it's effectively like not nailing shut a door, but still leaving it closed.
I actually use it as the primary database for some parts of the MMORPG I'm developing. Redis has ACID capabilities, so it is very suitable as a primary database for gaming platforms. That being said, the main game server requires MongoDB due to the sheer size of the data.
Well, if you have only a single instance that doesn't support threading it is trivial to get those properties, but what about durability? Do you realize you store all of the data in the RAM?
For PHP applications Redis has been a must for storing sessions, especially when distributing load between multiple servers.
It is also used to collect data for Prometheus exporters, for example, for languages that don't share memory between requests.
As well as caching, we make really heavy use of Redis for leaderboards for our games. The sorted sets are perfect for storing scores along with the user ID. Scanning the leaderboard, finding neighboring scores, etc. are all really fast operations. This could probably be done with a number of other storage systems, but we already used Redis and we've never had a problem.
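For readers who haven't used sorted sets, a leaderboard along these lines might look like the sketch below. It assumes a redis-py style client (`zadd`, `zrevrank`, `zrevrange`); the function names and the `radius` parameter are invented for the example and not taken from the comment above.

```python
def submit_score(client, board, user_id, score):
    """Insert or overwrite the user's score (ZADD)."""
    client.zadd(board, {user_id: score})


def neighbors(client, board, user_id, radius=2):
    """Return (member, score) pairs around the user, best score first."""
    # ZREVRANK gives the 0-based position counted from the highest score,
    # or None if the user is not on the board.
    rank = client.zrevrank(board, user_id)
    if rank is None:
        return []
    start = max(rank - radius, 0)
    # ZREVRANGE bounds are inclusive on both ends. With a real redis-py
    # client the members come back as bytes unless decode_responses=True.
    return client.zrevrange(board, start, rank + radius, withscores=True)
```

Both the rank lookup and the range read are logarithmic in the size of the set, which is why scanning around a player stays fast even on large boards.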
It is many things, but started as more of an architecture for user-data interaction, where the data is in a computer system - and the user wants to interact with it (the user in user-model-view-controller unfortunately fell off at some point).
As for where something like redis fits - I don't think it'd show up on the design that concerns mvc, no (it could be a cache inside model, inside controller - or even inside view.. Caching ui fragments for example?).
I always use it as my default session store. I build out large Python backends and I have also been using it as a Celery backend, and I prefer it to memcached for those types of tasks too.
Use it to cache certain request payloads that are guaranteed not to change for a certain amount of time (e.g. 1-minute stock market aggregates that only change once a minute).
If you look at any single data type, you can see how Redis takes care to document the complexity (in Big O notation) of each data structure and its operations.
Many devs use it for caching, but in my opinion it is super nice for evil-write applications.
I mostly work on solutions used in-house by the client. The most used app that I had created was used by maybe 50 people at the same time, and it was mostly manipulating spreadsheet-like data, so querying the database directly was fast enough.
I know broadly what Redis can be used for, I was just asking for some practical tips.
It didn't, but we identified something that could be done without throwing away our vision. Basically Redis remains completely single threaded, but once it is about to re-enter the event loop, when we have to write replies to all the clients, we use threads just for a moment in that code path, and return single threaded immediately after. It is not as performant as full threading, but we believe the right thing to do is sharding and share-nothing. Yet this 2x improvement for the single instance was almost for free so... btw it's disabled by default.
Threaded I/O, server-assisted client caching, and Redis proxy are, in my opinion, important milestones. Threaded I/O will utilize multiple cores, and the improvements are amazing. Server-assisted caching will reduce client round-trips. Redis proxy will remove the debate about which client library to use. Jedis, for example, has not allowed reading from replicas or multiplexing distributed commands; the proxy is going to solve all those gripes. I can't wait for it to be stable and production ready.
One caching pattern I find myself doing a lot on Redis is where multiple clients try to access the same cached value, but if it doesn't exist only one is allowed to revalidate it, all other clients wait until it's revalidated and get notified with the new value.
Currently for the clients it involves:
1. subscribe for updates on the key you're trying to retrieve
2. retrieve cached value or set a lock
3. if lock acquired, unsubscribe, then fetch value
4. set the result in a key and publish event with the value
I wish there was a command that does:
retrieve value of a key, if it doesn't exist lock the key and notify me with an id, if it's locked subscribe for the next update on the key. With a second command that acknowledges the lock with a new value
Now I know this can be implemented and I've done so multiple times. It's just that it's tricky to get right and consistent.
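For reference, here is a simplified sketch of that "only one client revalidates" idea. To keep it short, waiters poll for the value instead of subscribing to a pub/sub channel, so it is a variant of the steps above and not the same protocol. It assumes a redis-py style client (`get`, `set` with `nx`/`ex`, `delete`); the function name, the `:lock` key suffix and all defaults are made up.

```python
import time


def get_or_compute(client, key, compute, lock_ttl=10, value_ttl=60,
                   poll=0.05, timeout=5.0):
    """Return the cached value, letting only one caller recompute it."""
    deadline = time.monotonic() + timeout
    while True:
        cached = client.get(key)
        if cached is not None:
            # With a real redis-py client this is bytes by default.
            return cached
        # SET NX EX: only one caller wins the lock; the TTL frees it if the
        # winner dies before finishing.
        if client.set(f"{key}:lock", 1, nx=True, ex=lock_ttl):
            try:
                value = compute()
                client.set(key, value, ex=value_ttl)
                return value
            finally:
                # Caveat: if compute() outlives lock_ttl, this can delete a
                # lock that another caller has since acquired. A per-caller
                # token checked in a Lua script would avoid that.
                client.delete(f"{key}:lock")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"gave up waiting for {key}")
        # Someone else is recomputing; wait briefly and look again.
        time.sleep(poll)
```

The polling interval trades latency for load on Redis; the pub/sub version avoids that trade-off at the cost of the subscribe and unsubscribe bookkeeping described above.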
You are a hero.
Congratulations, team!
- User session storage
- Asynchronous "queues" for background processing using Streams (was able to eliminate RabbitMQ from my stack when I switched to this)
- Rate-limiting with a GCRA (via https://github.com/brandur/redis-cell)
- Bloom filter checks, specifically for making sure people aren't using passwords in data breaches (via https://github.com/RedisBloom/RedisBloom with data from https://haveibeenpwned.com/passwords)
0. https://redis.io/topics/distlock
I'm not sure redis, a database, is relevant.
See:
http://heim.ifi.uio.no/~trygver/themes/mvc/mvc-index.html
2. Use it for quick lookups for user accounts
3. To queue up jobs that need to run whenever the job runner has slots available.
4. Use it to crash your entire web stack when you accidentally clear your redis instance
First you could check at
https://redis.io/topics/data-types-intro
Woah, I missed that redis changed their mind about this. Clearly haven't been paying attention.
Why is that?
Thanks for the explanation on Threaded I/O, I was about to ask the difference between Redis 6 and KeyDB.