Hey HN! For a few months, I've been building an in-memory version of Postgres at work. It has full feature parity with production databases.
The cool thing about it is that you don't need any external processes or proxies. If your platform can run WASM (Node.js, browser, etc.), it can probably run pgmock. Creating a new database with mock data is as simple as creating a JavaScript object.
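For example, spinning one up and connecting with node-postgres looks roughly like this (a sketch; check the README for the exact method names):

    import { PostgresMock } from "pgmock";
    import pg from "pg";

    // Boot the in-memory Postgres and expose it over the normal wire protocol.
    const mock = await PostgresMock.create();
    const connectionString = await mock.listen(5432);

    // Any standard Postgres client can connect to it.
    const client = new pg.Client(connectionString);
    await client.connect();
    console.log(await client.query("SELECT $1::text AS hello", ["world"]));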
It's a bit different from the amazing pglite [1] (which inspired me to open-source pgmock in the first place). pgmock runs an x86 emulator with the original Postgres inside, while pglite compiles a Postgres fork to native WASM directly and is hence much faster and more lightweight. However, pglite only supports single-user mode and a select few extensions, so you can't connect to it with normal Postgres clients (which is quite crucial for E2E testing).
Theoretically, it could be modified to run any Docker image on WebAssembly platforms. Anything specific you'd like to see?
Happy hacking!

[1] https://github.com/electric-sql/pglite
Correct on PGlite only being single-user at the moment, and that certainly is a problem for using it for integration tests in some environments. But I'm hopeful we can bring a multi-connection mode to it; I have a few ideas how, but it will be a while before we get there.
There are a few other limitations with PGlite at the moment (related to it being single-user mode), such as lacking support for pg_notify (we have plans to fix this too), whereas with pgmock these should "just work", as it's much closer to a real Postgres.
I think there is a big future for these in-memory Postgres projects for testing; it looks like test run times can be brought down to less than a quarter of what they were.
(I work on PGlite)
Ooh! The 'docker image on WASM' thing sounds promising for a wide range of problems. Recently I wanted to run an FFmpeg/SoX pipeline on the client - too many dependencies to easily recompile with Emscripten; could your approach help there?
https://github.com/ffmpegwasm/ffmpeg.wasm
If it could support the pgvector extension it would be a super fast vector database with all the power of Pg: the relational aspect brings the ability to add and query using rich domain-specific metadata usually contained in relational databases.
And then lancedb released their embedded client for Rust, so I went towards that. But it's still lacking FTS, so I fell back to SQLite. Have some notes here: https://shelbyjenkins.github.io/blog/retrieval-is-all-you-ne...
Yes, that's a nonstandard function provided by V8, so it wouldn't work on Firefox. [1]
[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
This can be worked around by just constructing an Error and taking its stack property; captureStackTrace is just a convenience function, so hopefully they can fix that.
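For example, a minimal portable shim:

    // Use Error.captureStackTrace where V8 provides it; otherwise fall back
    // to reading .stack off a freshly constructed Error.
    function captureStack(holder: { stack?: string }): void {
      if (typeof Error.captureStackTrace === "function") {
        Error.captureStackTrace(holder, captureStack); // hides captureStack's own frame
      } else {
        holder.stack = new Error().stack; // Firefox/Safari path
      }
    }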
Explicitly mentioned in the comment as a drawback. In practice E2E means "E2E as much as humanly possible", and I'm glad to see any work that can help.
Why can't Postgres compile to WASM instead of x86?
Why not just run Postgres with its files on a ramdisk?
Update: this can apparently run in a browser/Node environment, so it can be created/updated/destroyed by the tests. I guess I'm too much of a backend dev to understand the advantage over a more typical dev setup. Can someone elaborate on where/when/how this is better?
That's more or less what happens inside the emulator (the emulated disk is an in-memory 9P file system). It's in WebAssembly because that makes it more portable (same behaviour across platforms, architectures, and even in the browser or edge environments), and there are no external dependencies (not even Docker).
Because the emulator lets us boot an "already launched" state directly, it's also faster to boot up the emulated database than spinning up a real one (or Docker container), but this was more of a happy accident than a design goal.
Can you give a specific / concrete example of why I would want to use this instead of running a postgres server a different way (docker, binary, whatever) and having the tests connect to that server? I really don't understand when this would be useful.
It is annoying if you want to run your tests inside a container for CI: now you are running a container in a container, with all the issues that come with it.
It's the same amount of code and on Mac you still run a full VM to load containers (with a network stack), so I'm not really sure what your point is. If anything it's less code because the notion of the container is entirely abstracted away, and the whole thing is entirely a wasm dependency that you load as a normal import.
Why not use something like https://testcontainers.com/? Is a container engine as an external dependency that bad?
The fact that this can run in-process is a big deal, as it means you don't have to worry about cleanup.
As soon as you have external processes that your tests depend on, your tests need some sort of wrapper or orchestrator to set everything up before starting tests, and ideally tear it down after.
In 90% of cases I see, that orchestration is done in an extremely non-portable way (like leveraging tools built in to your CI system) which can make reproducing test failures a huge pain in the ass.
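With an in-process database, the whole lifecycle can live in the test file itself. Roughly (jest-style hooks; the pgmock method names here are my guess from the post, not confirmed):

    import { PostgresMock } from "pgmock"; // assumed import

    let mock: PostgresMock;

    beforeAll(async () => {
      mock = await PostgresMock.create();                  // assumed API
      process.env.DATABASE_URL = await mock.listen(5432);  // assumed API
    });

    afterAll(async () => {
      await mock.destroy();                                // assumed API
    });

No CI-specific orchestration, and nothing left running if a test run dies halfway.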
The whole purpose of end-to-end testing is that you're testing the system in a real state. It's an emulation of your live environment. Because of that you can do interesting things like find out what happens if you pull the plug, or run out of disk, or ....
The moment that you shove a mock in there, you're unit testing. Effective, but not the same. One of the critical points of E2E is that without mocks you know that your tests are accurate. Because this isn't Postgres, I'm testing the mock every time and not the real system.
>> Can someone elaborate on where/when/how this is better?
If you're building PG for an embedded, lightweight, or underpowered system, then this would make sense for verification testing before the real, much slower E2E testing. (A use case I have.)
Other than that, it's just a cool project, and if you ever need a PG shim it's there.
I think you're being a little absolutist about this. Swapping out a possibly equivalent database engine does not turn anything into a unit test, which is defined by testing individual units of code in relative isolation. You can argue that it's not true end to end testing. But almost every E2E test I've seen involves some compromises compared with the true production environment to save money, time, or effort.
> If you're building PG for an embedded, lightweight, or underpowered system, then this would make sense for verification testing before the real, much slower E2E testing. (A use case I have.)
If this is actually just Postgres running in an x86 emulator (*edit: originally this said "compiled to wasm"), then how could this be faster than Postgres in any given environment? I don't understand — if it were faster, wouldn't you just want to deploy this in prod in your weird environment rather than Postgres? Why limit this to mocking?
Nah: by having in-memory versions of your dependencies that fulfill the same interfaces as those used in your E2E tests (or the majority of them), you unlock running your entire E2E test suite in milliseconds-to-seconds instead of minutes-to-seconds. And because they're E2E tests that work with any implementation, you can still run the exact same test suite against your "real" E2E dependencies in a CI step, to be super sure both implementations behave the same.
I've done this across multiple jobs, and it's amazing to be able to run your "mostly-E2E" tests in 1-2 seconds while developing, and the same suite in the full E2E env in CI. It makes developing with confidence fast and mostly stress-free (diverging behavior is admittedly annoying, but usually rare). I highly recommend using these if feasible.
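Concretely, the trick is one suite parameterized over the implementation. A sketch (all names invented for illustration; jest-style globals):

    interface UserStore {
      add(name: string): Promise<void>;
      count(): Promise<number>;
    }

    // Fast in-memory implementation used while developing.
    async function createInMemoryStore(): Promise<UserStore> {
      const names: string[] = [];
      return {
        add: async (n) => { names.push(n); },
        count: async () => names.length,
      };
    }

    // Real implementation, wired to Postgres elsewhere in the codebase.
    declare function createPostgresStore(): Promise<UserStore>;

    function userStoreSuite(label: string, makeStore: () => Promise<UserStore>) {
      describe(`UserStore (${label})`, () => {
        test("counts added users", async () => {
          const store = await makeStore();
          await store.add("alice");
          expect(await store.count()).toBe(1);
        });
      });
    }

    userStoreSuite("in-memory", createInMemoryStore);   // every save, milliseconds
    if (process.env.CI) {
      userStoreSuite("postgres", createPostgresStore);  // full E2E, CI only
    }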
Until you trust that every part of the mock behaves the same as every part of the real database you use… Most often, though, the db is your boundary, with nothing further downstream. At that point it really is just a faster disposable database, and totally valid for acceptance tests of the e2e system.
Also nothing stops you from using a mock for some tests and a real database for others. It just comes down to trust.
It could be useful for test isolation; moving the Redis backend to FakeRedis in tests fixed quite a bit of noise in our test suite. With Postgres we use savepoints, which is not very fast, even on a ramdisk.
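For reference, the savepoint pattern is roughly this (node-postgres with jest-style hooks):

    import pg from "pg";

    const client = new pg.Client(process.env.DATABASE_URL);

    // Wrap the whole suite in one transaction and give each test its own
    // savepoint, so whatever a test writes is rolled back before the next one.
    beforeAll(async () => { await client.connect(); await client.query("BEGIN"); });
    beforeEach(() => client.query("SAVEPOINT test_case"));
    afterEach(() => client.query("ROLLBACK TO SAVEPOINT test_case"));
    afterAll(async () => { await client.query("ROLLBACK"); await client.end(); });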
I used to run all kinds of (custom) fake in-memory servers in my tests. Nowadays I just run the real thing using Testcontainers (https://testcontainers.com)
For prisma/nodejs devs who just want postgres-in-a-can for local dev, you are better off using the recently released server wrapper for pglite, pglite-server: https://github.com/kamilogorek/pglite-server
It's faster and can persist data to the fs, though it's less stable under heavy use than the full x86 emu E2E test server. I found pglite-server uses only 150MB of RAM compared to 830MB for pgmock-server. You can then use dotenv with a new .env.local containing an updated DATABASE_URL for all your nextjs/prisma package.json run scripts:
DATABASE_URL="postgresql://postgres@localhost:5432/awesomeproject"
"db:pushlocal": "dotenv -e .env.local -- pnpm prisma db push"
Very easy to add to any project. No wonder Neon is sponsoring this space.
Hate to be a downer, but I'd never consider this for use.
For trivial applications maybe it'd work, but with more complexity (anything that has a risk of deadlocking or depends on the shape of the database) such a solution subtracts value, as even a small shift in behavior can snowball into critical problems.
Today I lean towards a resource-constrained E2E environment, so that local test runs have the opportunity to break if someone writes anything grossly underperforming.
Not to mention that snapshotting the DB after seeding and distributing that snapshot to test partitions is super fast, and has many times shaved multiple minutes off test suites.
It’s an interesting idea and definitely great learning experience but I think that target audience is limited.
Off-topic, but the title confused me a bit - "...I built at work." Doesn't this imply that the intellectual property for this project belongs to your employer, assuming you used resources from work? If so, are you technically allowed to open-source it?
* What was the inspiration for developing this project at work? Was running Postgres in a Docker container too slow?
* What did your CI setup for E2E tests look like before and after integrating pgmock into the flow?
* Was migrating over to this solution difficult?
Thanks!