Implementations that use advisory locks, like https://github.com/que-rb/que, are much more efficient (at least when I last tested) and will easily reach 10k jobs/s on even very modest hardware.
There is a Go port of Que, but you can also easily port it to any language you like. I have a currently non-OSS implementation in Rust that I might open-source someday when I have time to clean it up.
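Roughly, the advisory-lock trick looks like this. This is a minimal sketch, not Que's actual query (Que uses a recursive CTE so it never locks more rows than it returns), and the table/column names (jobs, run_at, payload) are assumptions:

```sql
-- Try to claim the oldest runnable job with a session-level advisory lock.
-- Unlike FOR UPDATE, the lock is independent of any transaction, so the
-- worker can run the job for minutes without holding a transaction open.
-- Zero rows back means the top candidate is already claimed and the
-- worker should retry, excluding that id.
WITH candidate AS (
  SELECT id, payload
  FROM jobs
  WHERE run_at <= now()
  ORDER BY run_at
  LIMIT 1
)
SELECT id, payload
FROM candidate
WHERE pg_try_advisory_lock(id);

-- On completion: delete the row, then release the lock.
-- DELETE FROM jobs WHERE id = $1;
-- SELECT pg_advisory_unlock($1);
```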
This is easily a few thousand dollars' worth of advice. I use PostgreSQL regularly but I would never have been able to come up with this solution myself.
Having queues on Postgres is going to be such a great addition to my tool belt, especially since there's always a Postgres instance running somewhere anyway.
I know the first use case that jumps to mind is job queues, but the method described in the article is quite low-level, which means it can be used as a base to solve many use cases:
* I won't have to reach for Kafka/Redpanda unless the rate of events/messages reaches 100k–1M per day.
* I can add a queue_id column, where each unique queue_id refers to a separate queue, which means a single table can serve multiple queues (see the sketch after this list).
* If I add an event_type column, I wonder whether it's possible to create a composite event queue: for example, the query must return exactly two rows, one with event_type=type_1 and one with event_type=type_2, with both rows locked and processed exactly once.
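A minimal sketch of both ideas, with illustrative names (events, payload, queue_id, event_type are assumptions, not anything from the article):

```sql
-- One table, many logical queues: filter the SKIP LOCKED poll by queue_id.
SELECT id, payload
FROM events
WHERE queue_id = 'invoices'
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- One possible answer to the composite-queue question: lock one row of
-- each event_type in a single statement. If either side comes up empty,
-- the join yields zero rows and the transaction can simply roll back,
-- releasing whatever was locked.
SELECT a.id AS first_id, b.id AS second_id
FROM (SELECT id FROM events WHERE event_type = 'type_1'
      ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED) AS a
CROSS JOIN
     (SELECT id FROM events WHERE event_type = 'type_2'
      ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED) AS b;
```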
Queuing means different things to different people; e.g. jobs vs events vs messages. Jobs typically run longer, events are typically reactive, messages are often part of the main flow and potentially part of an interactive operation.
I've implemented job queues in PostgreSQL, but I do polling in two phases. One phase updates the rows (using SKIP LOCKED) with metadata recording that a job has started, when it started, which agent is running it, etc. That enables monitoring for stuck jobs, looking at current status, incremental updates on long-running jobs, cooperative job cancellation, etc. It requires monitoring for abandoned jobs, but that's not too bad, and it avoids long-running transactions.
The other phase copies completed/aborted jobs to a history table for stats, failure reasons, etc., keeping the main queue table small.
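A sketch of what the two phases can look like, with assumed names (jobs, status, started_at, agent, jobs_history) standing in for the commenter's actual schema:

```sql
-- Phase 1: claim a batch and record who is running it. The transaction
-- around this UPDATE is short; the job itself runs outside it.
UPDATE jobs
SET status = 'running', started_at = now(), agent = $1
WHERE id IN (
  SELECT id FROM jobs
  WHERE status = 'queued'
  ORDER BY id
  LIMIT 10
  FOR UPDATE SKIP LOCKED
)
RETURNING id, payload;

-- Phase 2, after the work finishes: move the row to a history table.
WITH done AS (
  DELETE FROM jobs WHERE id = $2
  RETURNING id, payload, started_at
)
INSERT INTO jobs_history (id, payload, started_at, finished_at)
SELECT id, payload, started_at, now() FROM done;
```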
Another thing those posts rarely specify is what the target throughput is. The proper design for 1 job/s will be very different from the proper design for 10k jobs/s.
For example, the database-based queue we use in production is even more primitive than what this article describes (no batches, polling once a second, rows never deleted), but it works perfectly fine since we're processing fewer than 10 jobs/s and don't expect to grow beyond 100 jobs/s.
Keeping the transaction open while the job is running is probably dubious. Presumably you are using a queue because the job takes a long time to run. Long-running transactions tie up a scarce PG resource (connections) and also cause issues with vacuuming.
> Queueing jobs in Node.js using PostgreSQL like a boss
You can scale much higher in Postgres 14 if you use partitioned tables, both horizontally and in terms of day-to-day stability, because old partitions can be dropped in O(1) time, taking all their table bloat with them. Obviously it's more work to set up, though.
It's not O(1); it's O(size of the table in gigabytes). A DROP TABLE statement has to delete the files backing the table, and tables are split into files of 1 GB each.
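For the curious, the partitioned variant might look roughly like this (illustrative names, not a definitive setup; note the caveat above that dropping a partition still has to unlink its underlying data files):

```sql
-- Range-partition the queue by enqueue time.
CREATE TABLE jobs (
  id         bigserial,
  created_at timestamptz NOT NULL DEFAULT now(),
  payload    jsonb NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE jobs_2024_01 PARTITION OF jobs
  FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

-- Once every job in the old partition is finished, discard it wholesale,
-- bloat and all:
DROP TABLE jobs_2024_01;
```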
Does anyone have anything that resembles Celery or Huey for Python that uses the nice Postgres semantics for queueing? I know Ruby has something like this in the form of Que. My application has very modest needs for its task queue, probably a few thousand events per year, and Celery seems like massive overkill.
At our company we use Redis; it's lightweight and has a list structure that can be used as a push/pop queue.
The basic gist is that a producer pushes onto one end of the list and one or more consumers pop jobs off the other end and execute them. Fire-and-forget style.
I remember coming across a project that implemented a complete queueing system using SQL functions, so to add jobs you would just call a function, and the same for polling/listening for jobs. It was very well documented: each function was explained, along with how to use it. But I can't find it anymore.
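Whatever that project was, the style is easy to sketch: a pair of SQL functions wrapping an INSERT for enqueue and a SKIP LOCKED DELETE for dequeue. All names here are made up for illustration:

```sql
-- Enqueue: callers never touch the table directly.
CREATE FUNCTION enqueue(p_payload jsonb) RETURNS bigint
LANGUAGE sql AS $$
  INSERT INTO jobs (payload) VALUES (p_payload) RETURNING id;
$$;

-- Dequeue: atomically claim and remove one job; concurrent callers
-- skip each other's rows instead of blocking.
CREATE FUNCTION dequeue() RETURNS TABLE (job_id bigint, job_payload jsonb)
LANGUAGE sql AS $$
  DELETE FROM jobs
  WHERE id = (
    SELECT id FROM jobs ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED
  )
  RETURNING id, payload;
$$;
```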
Probably not the one you're thinking of, but I wrote something like this for C# once upon a time called OddJob [0]. If nothing else it shows decent patterns for a job queue that is friendly towards multiple producers/consumers.
[0] - https://github.com/to11mtm/oddjob/tree/cleaning-aug-19