A bit of the history as I've been told by 2nd Quadrant/EDB people (my teammates):
BDR1 [0] came first and was, and is, open source. pgactive is based on BDR1. BDR2 was a closed-source rewrite of BDR1 that was later abandoned.
pglogical v1 and v2 (PGL1, PGL2) were, and are, open-source [1].
pglogical v1, after heavy modification, was eventually merged into Postgres 10.
Based on learnings from this logical replication in Postgres 10, 2nd Quadrant started pglogical v2.
pgEdge is based on pglogical v2.
Then later 2nd Quadrant started pglogical v3 (closed source) and BDR v3 (closed source). They were merged into just BDR v4. At some point the BDR product was renamed to Postgres Distributed (PGD) [2].
2ndQuadrant was acquired by EDB. We (EDB) just released PGD v6.
Right, I'm not so familiar with it but from that page:
> The replication mechanism is based on logical decoding and an earlier version of the pglogical extension provided for community by the 2ndQuadrant team.
Looks like it uses Postgres Logical replication to share changes made on one postgres instance to another. Conflict resolution is last-write-wins based on timestamp. Conflicting transactions are logged to a special table (pgactive_conflict_history), so you can see the history, resolve, etc.
Sounds like "yes, with an if" where the "if" is "if you don't really care about data consistency".
"Last write wins" sounds like a recipe for disaster IMO.
This is still one of those things that keeps people on MySQL - there are not one, but two open-source solutions available that provide synchronous cluster replication, allowing for "safe" writes against multiple primaries.
Did Postgres ever get a built-in, blessed replication offering? It's been a while since I set it up, but I remember this was always a big missing feature compared to Mysql.
It's eventual consistency: Latest-write wins after the dust settles.
As I understand it, this is a wrapper on top of Postgres' native logical replication features. Writes are committed locally and then published via a replication slot to subscriber nodes. You have ACID guarantees locally, but not across the entire distributed system.
Tangential, but related. Is there a way to have a "locally writable" read replica, ie. a secondary db that reads from a primary, but that can also hold local changes that doesn't send back to the primary?
One of the use cases is to have a development db that can get data from production or staging (and doesn't send local changes back)
What I've done usually is have some script/cron/worker run periodically to get data, either via dump or running some queries, create a snapshot, store it in S3, then have a script on the local dev code that gets the snapshot and inserts/restores the data in the local db. This works for many cases, but index building can be a pain (take a long time), depending on the data
Just FYI that most people would recommend against doing this for legal reasons. PII information and the likes are not usually allowed to land in a staging or dev environment, for various reasons. Doing this or allowing it, is a huge liability.
Agreed, and there’s a few ways to deal with that, like not including certain tables or excluding the data via queries when creating the snapshot
Having said that, legal exposure and risk will highly depend on what you are working on. Probably for most projects this isn’t a big deal. IANAL, this is not legal advice
Load the snapshot to a "pristine" local database that you never modify. Whenever you need a "reset" of your dev database, drop it, then copy the pristine database using `createdb --template`. This copies prebuilt indexes rather than rebuild them, which is much faster.
You can use Postgres logical replication with a filter to create a one-way replica, then just disable the replication slot when you want to make local changes without affecting the primary.
> then just disable the replication slot when you want to make local changes without affecting the primary
This would be all the time, ie. in my local dev, I’d like to have an up to date (daily is ok), copy of the source db that I can modify anytime, but that everyday will sync with the source, without having to fully rebuild indices
How would you go about managing/coordinating that?
AFAIK that is the standard behavior with a Postgres logical replication setup. There is nothing preventing you doing writes on the replica, they just won't get sent back anywhere else.
Curious about this - How would local writes conflicting with remote updates be handled? I can't think of a merge strategy working on all scenario (or even most of the time)
Great question, I don't know. However, at least in my case, I wouldn't mind the source data always overwriting the local data. In fact, that's the way it works now when loading a newer snapshot, the local db is dropped and then re-built from the snapshot
Thinking about the developer experience though, when loading a snapshot manually, the dev knows they are overwriting their local db. However, if replication happened automatically/continuously on the background, it could lead to some really confusing/annoying behaviors
I'm not tired of reminding everyone that "conflict resolution" is no more than an euphemism for "breaking durability by dropping already committed and acknowledged data".
Either architect for no data overlap on writes across all the "actives" (in which case software like pgactive could be a good deal) or use a purely distributed database (like Yugabyte).
I could see in the docs they recommended a scenario like: each master is the only writer for a given schema, to avoid conflicts, but the replication gives them all a copy of all the schemas to read from.
And I was wondering what other ways, besides schemas, of dividing up 'writer responsibility' would also work? Partitions?
For inserts is "easy" if there's no possibility to write the same PK value twice (e.g. PK are random values like uuids or include as part of a natural key a identifier from the writer that's unique -shard id, region id, you name it-).
Once you have done that, for updates and deletes you need to keep the same rule (i.e. don't update "foreign" rows).
If you do this, no other technique is needed. Partitions, however, are potentially a good technique to enforce some of these invariants, which gives us quick understanding of where data is originating from given the table name. Same could apply to schemas.
In my experience multi-writer is because of latency or HADR stuff - have all your data in all regions at the same time, but the method (via the tlog) seems like it sort of defeats what those sorts of systems might be able to historically do (write multiple places from the app at the same time so as to have the lowest possible chance of data loss.)
After setting up numerous clusters with repmgr and patroni along with running them in zero down time production... This is the very last plugin i would ever install. I like to sleep at night.
Seems rather niche. You can already improve async write availability with CQRS and a (durable) queue. Systems like Kafka implement this out of the box
Seems sort of like a CQRS implementation on top of PG (you're using PG replication as the change queue to loosely separate writes/reads, losing transaction guarantees in the process)
BDR1 [0] came first and was, and is, open source. pgactive is based on BDR1. BDR2 was a closed-source rewrite of BDR1 that was later abandoned.
pglogical v1 and v2 (PGL1, PGL2) were, and are, open-source [1].
pglogical v1, after heavy modification, was eventually merged into Postgres 10.
Based on learnings from this logical replication in Postgres 10, 2nd Quadrant started pglogical v2.
pgEdge is based on pglogical v2.
Then later 2nd Quadrant started pglogical v3 (closed source) and BDR v3 (closed source). They were merged into just BDR v4. At some point the BDR product was renamed to Postgres Distributed (PGD) [2].
2ndQuadrant was acquired by EDB. We (EDB) just released PGD v6.
[0] https://github.com/2ndQuadrant/bdr/tree/bdr-plugin/REL1_0_ST...
[1] https://github.com/2ndquadrant/pglogical
[2] https://www.enterprisedb.com/docs/pgd/latest/
> The replication mechanism is based on logical decoding and an earlier version of the pglogical extension provided for community by the 2ndQuadrant team.
https://github.com/aws/pgactive/tree/main/docs
"Last write wins" sounds like a recipe for disaster IMO.
This is still one of those things that keeps people on MySQL - there are not one, but two open-source solutions available that provide synchronous cluster replication, allowing for "safe" writes against multiple primaries.
As I understand it, this is a wrapper on top of Postgres' native logical replication features. Writes are committed locally and then published via a replication slot to subscriber nodes. You have ACID guarantees locally, but not across the entire distributed system.
https://www.postgresql.org/docs/current/logical-replication....
Seems the same is playing out out in Postgres with this extension, maybe will take it another 20 years
One of the use cases is to have a development db that can get data from production or staging (and doesn't send local changes back)
What I've done usually is have some script/cron/worker run periodically to get data, either via dump or running some queries, create a snapshot, store it in S3, then have a script on the local dev code that gets the snapshot and inserts/restores the data in the local db. This works for many cases, but index building can be a pain (take a long time), depending on the data
Having said that, legal exposure and risk will highly depend on what you are working on. Probably for most projects this isn’t a big deal. IANAL, this is not legal advice
The goal is not necessarily having an easy way to reset, but rather an easy/quick way to load real data
This would be all the time, ie. in my local dev, I’d like to have an up to date (daily is ok), copy of the source db that I can modify anytime, but that everyday will sync with the source, without having to fully rebuild indices
How would you go about managing/coordinating that?
Thinking about the developer experience though, when loading a snapshot manually, the dev knows they are overwriting their local db. However, if replication happened automatically/continuously on the background, it could lead to some really confusing/annoying behaviors
Either architect for no data overlap on writes across all the "actives" (in which case software like pgactive could be a good deal) or use a purely distributed database (like Yugabyte).
And I was wondering what other ways, besides schemas, of dividing up 'writer responsibility' would also work? Partitions?
Once you have done that, for updates and deletes you need to keep the same rule (i.e. don't update "foreign" rows).
If you do this, no other technique is needed. Partitions, however, are potentially a good technique to enforce some of these invariants, which gives us quick understanding of where data is originating from given the table name. Same could apply to schemas.
RLS may also help enforce these invariants.
RDS uses block replication. Aurora uses it's own SAN replication layer.
DMS maybe?
But only last month did they officially release it as open source to the community https://aws-news.com/article/2025-06-09-announcing-open-sour...
[1]: https://www.allthingsdistributed.com/2025/05/just-make-it-sc...
[1]https://aws.amazon.com/rds/aurora/dsql/features/#topic-1
I don't think that is used for cross region replication
Deleted Comment
Pgactive: Active-Active Replication Extension for PostgreSQL on Amazon RDS - https://news.ycombinator.com/item?id=37838223 - Oct 2023 (1 comment)
This is not a way to get better performance or scalability in general.
Seems sort of like a CQRS implementation on top of PG (you're using PG replication as the change queue to loosely separate writes/reads, losing transaction guarantees in the process)