Using triggers + history tables (aka audit tables) is the right answer 98% of the time. Just do it. If you're not already doing it, start today. It is a proven technique, in use for _over 30 years_.
It's great if you can store immutable data. Really, really great. But you _probably_ have a ton of mutable data in your database and you are _probably_ forgetting a ton of it every day. Stop forgetting things! Use history tables.

cf. https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
Do not use Papertrail or similar application-space history tracking libraries/techniques. They are slow, error-prone, and incapable of capturing any DB changes that bypass your app stack (which you probably have, and should). Worth remembering that _any_ attempt to capture an "updated" timestamp from your app is fundamentally incorrect, because each of your webheads has its own clock. Use the database clock! It's the only one that's correct!
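For anyone who hasn't set this up before, here is a minimal sketch of the trigger + history table pattern. The `accounts` / `accounts_history` names are purely illustrative, and a production version would usually also record the acting role and a transaction id:

    CREATE TABLE accounts_history (
        history_id  bigserial   PRIMARY KEY,
        changed_at  timestamptz NOT NULL DEFAULT now(),  -- database clock, not app clock
        operation   text        NOT NULL,                -- 'INSERT' | 'UPDATE' | 'DELETE'
        row_data    jsonb       NOT NULL                 -- full row image
    );

    CREATE OR REPLACE FUNCTION accounts_history_fn() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'DELETE' THEN
            INSERT INTO accounts_history (operation, row_data) VALUES (TG_OP, to_jsonb(OLD));
            RETURN OLD;
        ELSE
            INSERT INTO accounts_history (operation, row_data) VALUES (TG_OP, to_jsonb(NEW));
            RETURN NEW;
        END IF;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER accounts_history_trg
    AFTER INSERT OR UPDATE OR DELETE ON accounts
    FOR EACH ROW EXECUTE PROCEDURE accounts_history_fn();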
> each of your webheads has its own clock. Use the database clock!
Yes, for consistency you should use the database clock by embedding the calls to `now()` or similar in the query instead of generating it on the client.
But it's not sufficient to use these timestamps for synchronization. The problem is that these timestamps are generated at the start of the transaction, not the end of the transaction when it commits. So if you poll a table and filter for recent timestamps, you'll miss some from transactions that are committing out of order. You can add a fudge factor like querying back an extra few minutes and removing the duplicates, but some transactions will take longer than a few minutes. There's no upper bound to how long a transaction can take in postgresql, and there's a lot of waste in querying too far back. This approach doesn't work if you care about correctness or efficiency.
There is a way in PostgreSQL to get the actual wall-clock time of the database server, regardless of transaction start time: `clock_timestamp()`.
There's also `statement_timestamp()` per the docs: "returns the start time of the current statement (more specifically, the time of receipt of the latest command message from the client)."

https://www.postgresql.org/docs/current/functions-datetime.h...
This isn't to say any of these methods are the best (or even good) in all cases. Time is tricky, especially if you're trying to do any sort of sequencing.
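For the record, a quick way to see the difference between the three functions inside one transaction (these are all standard Postgres functions, nothing specific to the thread):

    BEGIN;
    SELECT now();                  -- transaction start time, frozen for the whole transaction
    SELECT pg_sleep(5);
    SELECT now();                  -- still the same value, 5 seconds later
    SELECT statement_timestamp();  -- start time of the current statement
    SELECT clock_timestamp();      -- actual wall-clock time, advances even within a statement
    COMMIT;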
Estuary (https://estuary.dev ; I'm CTO) gives you a real-time, data-laked change log of all the changes happening in your database, stored in your cloud storage -- complete with log sequence number, database time, and even before/after states if you use REPLICA IDENTITY FULL -- with no extra setup in your production DB.
By default, if you then go on to materialize your collections somewhere else (like Snowflake), you get synchronized tables that follow your source DB as they update.
But! You can also transform or materialize the complete history of your tables for auditing purposes from that same underlying data-lake, without going back to your source DB for another capture / WAL reader.
That sure sounds cool but I can't tell from your website that it does any of that. Even giving up on the marketing copy and going straight to the docs... I can't follow them.
I found that referencing session variables from triggers lets me add additional information (eg. a comment on why the change is being made) to the history. I've only done it in a small personal project, but it's worked well there so far.
I agree and this is a good approach - and how we power the activity feed in our app. But it doesn’t solve the issue of “pushing the changes” out. Of course, you can always listen to the audit table WAL changes - best of both worlds.
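A minimal sketch of the session-variable trick, assuming a hypothetical `app.change_reason` setting read by a history trigger (table and values are illustrative):

    -- Application side, in the same transaction as the change:
    BEGIN;
    SET LOCAL app.change_reason = 'backfill for support ticket';
    UPDATE accounts SET plan = 'pro' WHERE id = 42;
    COMMIT;

    -- Trigger side: the second argument (missing_ok = true) avoids an error
    -- when the variable was never set, returning NULL instead.
    -- reason := current_setting('app.change_reason', true);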
I enjoyed this blog. I think it provides a great succinct overview of various approaches native to Postgres.
For the "capture changes in an audit table" section, I've had good experiences at a previous company with the Temporal Tables pattern. Unlike other major RDBMS vendors, it's not built into Postgres itself, but there's a simple pattern [1] you can leverage with a SQL function.
This allows you to see a table's state as of a specific point in time. Some sample use cases:
- "What was this user's configuration on Aug 12?"
- "How many records were unprocessed at 11:55pm last night?"
- "Show me the diff on feature flags between now and a week ago"

[1]: https://github.com/nearform/temporal_tables
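For context, a sketch of what the pattern from [1] looks like in use -- table, column names, and dates are illustrative; `versioning()` is the trigger function that project provides:

    -- The current table carries a system period; a history table mirrors it.
    ALTER TABLE subscriptions
        ADD COLUMN sys_period tstzrange NOT NULL DEFAULT tstzrange(now(), null);
    CREATE TABLE subscriptions_history (LIKE subscriptions);

    CREATE TRIGGER versioning_trigger
    BEFORE INSERT OR UPDATE OR DELETE ON subscriptions
    FOR EACH ROW EXECUTE PROCEDURE versioning('sys_period', 'subscriptions_history', true);

    -- "What did this look like on Aug 12?" -- check both current and history rows.
    SELECT * FROM subscriptions         WHERE sys_period @> timestamptz '2023-08-12'
    UNION ALL
    SELECT * FROM subscriptions_history WHERE sys_period @> timestamptz '2023-08-12';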
I once consulted at a company with a very large monolithic SQL Server. It actually wasn't Postgres but let's pretend it was.
It had been around for decades and over time it had ended up being used for all sorts of things within the company. In fact, it was more or less true that every application and business process within the whole company stored its data within this database.
A key issue was that, because many different applications queried this database and a huge number of processes and procedures inserted or updated data within it, queries would sometimes break when upstream insert/update processes were amended, when new ones were added that broke application-level invariants, or when a normal process behaved differently in the presence of bad data.
It was very difficult to work out what had happened, because often everything you looked at had been written a decade earlier and the employees responsible had long since left the company.
Would it be possible to capture changes from a Postgres database in some kind of DAG in order that you could find out things like:
- What processes are inserting, updating or deleting data, and how have they behaved historically? For example, do they ever behave differently?
- How are different applications querying this data? Are there any statistics about their queries which are generally true? How have those statistics changed over time?
I don't know if there is prior art here, or what kind of approach might allow a tool like this to be made?
(I've thought of making something like this before but I think this is an area in which you'd want to be a core Postgres engineer to make good choices.)
Logical replication, in Postgres, contains all of the information about the change statement (insert/update/delete) in order to logically recreate the same state in another database.
You won't get client-level provenance data with each change...
However you could hack around that. The logical replication stream can also include informational messages from the "pg_logical_emit_message" function to insert your own metadata from clients. It might be possible to configure your clients to emit their identifier at the beginning of each transaction.
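A sketch of that hack -- the prefix and client identifier here are made up, and the decoding side has to be set up to surface message callbacks:

    BEGIN;
    -- Transactional message: it travels through the WAL as part of this transaction,
    -- so a logical decoding consumer sees it alongside the transaction's changes.
    SELECT pg_logical_emit_message(true, 'client_id', 'billing-worker');
    UPDATE orders SET status = 'shipped' WHERE id = 123;
    COMMIT;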
I’m not sure how to handle queries, but for inserts/updates I have a column that tracks the event source (last updated by). Maybe this is an anti-pattern - I’d love a more robust solution.
Technically, logical replication captures everything done by every client, and if you are careful with triggers you can track everything yourself as well, using a DDL/DML capture table (DCL too, if you're worried!).
These approaches work on basically every type of SQL solution that uses WAL/triggers.
For your specific question, I have used a trigger approach many times in SQL Server, but it has a tendency to slow things down if you are logging every query, so designing an insertion mechanism that doesn't bog down production is tricky; you might want to perform some sampling.
An approach I've taken is temporal tables with Application and UpdatedBy fields. That gives you a permanent record of every change -- what application did it, what user performed the action, and at what time -- and then lets you query the database as if you were querying it at that point in time. If you want to get really paranoid, you can add triggers that fail any write where those fields are not set.
There's a lot of overhead to this in terms of storage, so it's not suitable for high-throughput or cost-constrained transactional systems, but it's something for the toolbox.
Even if it doesn't, you can start by adding the application name to the connection string, and you could probably do something gnarly with triggers to write that into a table and get it pushed through logical replication.
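Something along these lines -- `application_name` is a standard connection parameter; the `audit_log` table and the function name are hypothetical:

    -- Client side (libpq-style connection string):
    --   postgres://user:pass@db-host/mydb?application_name=billing-service

    -- Server side: a trigger function can stamp rows with whatever the connection declared.
    CREATE OR REPLACE FUNCTION audit_with_app_name() RETURNS trigger AS $$
    BEGIN
        INSERT INTO audit_log (table_name, operation, app_name, changed_at)
        VALUES (TG_TABLE_NAME, TG_OP, current_setting('application_name'), now());
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;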
I had the following idea recently: go through all scripts/programs that send queries to the db and append a comment to each query containing a unique id that links it back to the script/program. The query log hopefully shows the comment with the id, so you can trace the origin.
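For example (the id scheme here is invented); the comment survives into the server log and `pg_stat_activity.query`:

    /* origin: reporting/daily_rollup, query_id: Q-0042 */
    SELECT account_id, sum(amount)
    FROM payments
    GROUP BY account_id;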
For the "I just need to emit and visualize a DAG" problem, at one point I wrote a python script that would filter for the relevant data, emit line-by-line mermaidJS-flavored markdown, and then shove that into an HTML file that imported the MermaidJS library.
MermaidJS solves for the DAG layout and visualizes it, and your browser has enough context to let you CTRL-F for any interesting phrases you put in the labels.
Relax, don't do it. When you want this, you're turning the relations in postgres into a contract. No service gets to persist internal state. If you're _really_ committed to domain driven design it could work... but you'd be better off with a light (but real) event driven system.
Sequences kind of have the same issue, because you don't know if a gap is because of a rollback or an uncommitted transaction. Though with some logic you can do a pretty good job at this with sequences. And then you're not in the realm of "simple" anymore, at all.
Woah, that's news to me. Is that true even if triggers are used to update a column?
    -- Trigger function that stamps updated_at with now() on every insert/update.
    CREATE OR REPLACE FUNCTION update_updated_at_function()
    RETURNS TRIGGER AS $$
    BEGIN
        NEW.updated_at = now();
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER update_updated_at_trigger
    BEFORE INSERT OR UPDATE ON "my_schema"."my_table"
    FOR EACH ROW EXECUTE PROCEDURE update_updated_at_function();
Is it possible for two rows to end up with `updated_at` timestamps whose order differs from the transaction commit order, even if the above function and trigger are used? It's alright if `updated_at` and the commit timestamp are not the same, but `updated_at` must reflect commit order, accurate to the millisecond/microsecond.
To confirm your fear: you can't use the updated_at timestamp as a proxy for commit order. The commits can happen in a different order, and the timestamp and the commit can be arbitrarily far apart -- hours or days, depending on how long your transactions can last.
now() is the timestamp the transaction began at. There is no function to return the commit timestamp because you have to write the value before you commit.
For polling, instead of updated_at, I use a _txid column that gets set by a trigger to the current transaction ID. Then, when polling, use txid_current() to see which transactions have committed and which haven’t. It’s a little dicey and super easy to hit fencepost errors, but it’s been running smoothly in production for a few years.
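A sketch of that scheme, with a hypothetical `events` table: the trigger stores `txid_current()` in `_txid`, and the poller only advances its cursor up to the snapshot's xmin, below which every transaction is guaranteed to be finished:

    -- Trigger side (inside a BEFORE INSERT OR UPDATE trigger function):
    --   NEW._txid := txid_current();

    -- Poller side: :last_cursor is the xmin recorded by the previous poll.
    SELECT *
    FROM events
    WHERE _txid >= :last_cursor
      AND _txid <  txid_snapshot_xmin(txid_current_snapshot());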
There’s been attempts to revise the SQL standard to accommodate various types of temporality as a “first party” feature.
I think that we won’t see traction at the RDBMS “kernel space” until it’s in the SQL standard. There are many valid and complex options to choose from, and there are successful solutions in user space that aren’t overly burdened, performance-wise, from being in user space.
FWIW, the “audit table” approach is the approach that people who study this field gravitate towards. Mainly because it maintains consistent ACIDity in the database, and maintains Postgres as the single point of failure (a trade off vs introducing a proxy/polling job).
Here's a quick rundown of how to do it generically https://gist.github.com/slotrans/353952c4f383596e6fe8777db5d... (trades off space efficiency for "being easy").
I have my own SQLite implementation of a similar pattern (but using columns rather than JSON) which I describe here: https://simonwillison.net/2023/Apr/15/sqlite-history/
https://github.com/pgaudit/pgaudit/blob/master/README.md
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appen...
Event-driven anything is 1000x more complex.
1. Transaction A starts, its before trigger fires, Row 1 has its updated_at timestamp set to 2023-09-22 12:00:01.
2. Transaction B starts a moment later, its before trigger fires, Row 2 has its updated_at timestamp set to 2023-09-22 12:00:02.
3. Transaction B commits successfully.
4. Polling query runs, sees Row 2 as the latest change, and updates its cursor to 2023-09-22 12:00:02.
5. Transaction A then commits successfully.
A simple way to avoid this issue is to not poll close to real-time, as the order is eventually consistent.
Perhaps a more robust suggestion would be to use a sequence? Imagine a new column, `updated_at_idx`, that increments every time a row is changed.
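A sketch of that, reusing the table from the earlier snippet; note it carries the same caveat as the timestamp, since the sequence value is assigned before commit, so commit order can still differ:

    CREATE SEQUENCE row_version_seq;

    CREATE OR REPLACE FUNCTION bump_updated_at_idx() RETURNS trigger AS $$
    BEGIN
        NEW.updated_at_idx := nextval('row_version_seq');
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER bump_updated_at_idx_trigger
    BEFORE INSERT OR UPDATE ON "my_schema"."my_table"
    FOR EACH ROW EXECUTE PROCEDURE bump_updated_at_idx();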
If you're an Elixir & Postgres user, I have a little library for listening to WAL changes using a similar approach:
https://github.com/cpursley/walex
Would be great if Postgres innovated in this area.