For document databases, I'm more interested in things like PoloDB and SurrealDB.
Thanks for sharing! My choices are pretty coloured by personal experience, and I didn't want to re-tread anything from the book (Redis/Valkey, Neo4j etc) other than Postgres - mostly due to Postgres changing _a lot_ over the years.
I had considered an OSS Dynamo-like (Cassandra, ScyllaDB, kinda), or a Calvin-like (FaunaDB), but went with FoundationDB instead because to me, that was much more interesting.
After a decade of running DBaaS at massive scale, I'm also pretty biased towards easy-to-run.
ditto, "worked on local" is a meme for a reason.
1. I think the idea of local-equal-to-prod is noble, and getting them as close as possible should be the goal, but true parity isn't achievable. In the example, they're using a dockerized Postgres; prod is probably a managed DB service. They're using docker compose; prod is likely ECS/K8S/DO/some other service that runs the image (with more complicated service definitions). Local is probably some VM Linux kernel; prod is some other kernel. Your local dev is using mounted code; prod is probably baked-in code. Maybe local is ARM64 and prod is AMD64.
I say this not because I want to take away from the idea of matching dev and prod as much as possible, but to highlight that they're inherently going to be very different. So deploying your code with linters enabled, or in debug mode, and getting slower container start times at best and worse production performance at worst - just to pretend that wildly different envs aren't different - seems silly. Moreover, if you test in CI, you're much more likely to get to prod-like infra than on a laptop.
2. Cost will also prohibit this. Do you have your APM service running on every dev node? Are you paying for that across all developer machines, for no benefit, just so things are the same? If you're integrating with Salesforce, do you pay for a sandbox for every dev so things are the same? Again, keeping things as similar as possible should be a critical goal, but there are cost realities that make it impossible to be perfect.
3. In my experience, if you actually want to achieve this, you need a remote dev setup: have your code deployed in K8S / ECS / whatever, with remote dev tooling in place. That way your DNS discovery is the same, kernels are the same, etc. Sometimes this is worth it, sometimes it isn't.
I don't want to be negative, but if one of my engineers came to me saying they wanted to deploy images built on their machine, with all the dev niceties enabled, straight to prod, rather than going through proper CI/CD with prod-optimized images, I'd have a hard time being sold on that.
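Concretely, a rough sketch of that split (this assumes a multi-stage Dockerfile with separate `dev` and `prod` stages; the stage names, image tags, and registry are made up for illustration):

```sh
# Local development: a "dev" build stage with linters/debug tooling, source
# mounted into the container for live reload. This image never leaves the laptop.
docker build --target dev -t myapp:dev .
docker run --rm -v "$PWD":/app myapp:dev

# CI/CD: the "prod" stage only -- no dev tooling, code baked into the image --
# built and pushed from CI, never from a developer machine.
docker build --target prod -t registry.example.com/myapp:1.2.3 .
docker push registry.example.com/myapp:1.2.3
```

The dev image can carry whatever conveniences you like precisely because it's never shipped; only the CI-built prod stage goes anywhere near production.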
[1] When you run `git annex add`, it hashes the file and moves the original into a content-addressed store under `.git/annex/objects`, keyed by the hash, much like git's own object store. It then replaces the original file with a symlink to that hashed path. The file is marked read-only, so any command in any language that tries to write to it will error (you can always `git annex unlock` it so you can write to it). If you have duplicated files, they simply point to the same hashed location. As long as you git push normally and back up `.git/annex/objects`, you're fully version controlled, and you can share whatever subset of files you need.
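A minimal sketch of that flow (file names are illustrative; the exact object path under `.git/annex` depends on the key backend in use):

```sh
git init photos && cd photos
git annex init

cp ~/raw/IMG_0001.cr2 .
git annex add IMG_0001.cr2     # hashes the content, moves it under .git/annex,
                               # and leaves a read-only symlink in its place
ls -l IMG_0001.cr2             # -> symlink into the content-addressed store
git commit -m "add raw photo via git-annex"

# Writes through the symlink fail until the file is unlocked:
git annex unlock IMG_0001.cr2
```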
_Docker_ is a security hazard, and anything it touches is toxic.
Every single package, every single dependency that has an actively exploited security flaw is being exploited in the Docker images you're using - unless you built them yourself, with brand-new binaries. Do not trust anyone except official distro packages (unless you're on Ubuntu, then don't trust them either).
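For what it's worth, a rough sketch of the "build it yourself" approach (base image, tag, and package list are just placeholders):

```sh
# Build from an official distro base with freshly updated packages, and rebuild
# regularly, instead of trusting a months-old prebuilt application image.
docker build --pull --no-cache -t mybase:$(date +%Y%m%d) - <<'EOF'
FROM debian:bookworm-slim
RUN apt-get update \
 && apt-get -y upgrade \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && rm -rf /var/lib/apt/lists/*
EOF
```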
And if you're going to do that... just go to _actual_ orchestration. And if you're not going to do that, because orchestration is too big for your use case, then just roll normal, actual long-lived VMs the way we've done it for the past 15 years.