We run all of our stateful and stateless workloads on 10+ Kubernetes clusters at work, across multiple datacenters on multiple continents, and we serve 500 million users a month with them.
I wrote the first Borg version of the DFP backend systems at Google, where we served billions of ads a day to billions of users, and we used stateful infrastructure management on some of the first container runtime systems that inspired k8s during its development.
Using the native failover strategy of RabbitMQ and "most databases" is fine for toy projects, but when you're operating at this scale, you need automated infrastructure provisioning and all of the automated tooling around it.
Maybe a T instead of a B.
I run https://atomictessellator.com solo on Kubernetes. My database, MinIO object store, application servers, quantum workers, everything is on Kubernetes. It’s self-healing, and it’s much simpler to run all the infrastructure the same way.
Recently I had a node failure while I was sleeping, and the whole system healed itself. The monitoring system didn’t even alert me, because the small blip of increased latency while the pods rebalanced wasn’t above the alert threshold, so it didn’t wake me up.
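To make the alerting point concrete, here is a rough sketch of one way a rule like that could behave. It is not my actual monitoring config: the `LatencyAlert` class, the 500 ms threshold, and the "3 consecutive samples" rule are all made up for illustration. The idea is just that a short blip during a rebalance never pages anyone, while a sustained breach does.

```python
from collections import deque


class LatencyAlert:
    """Hypothetical alert rule: fire only on a sustained threshold breach."""

    def __init__(self, threshold_ms, sustained_samples):
        self.threshold_ms = threshold_ms
        # Remember whether each of the last N samples breached the threshold.
        self.recent = deque(maxlen=sustained_samples)

    def observe(self, latency_ms):
        """Record one latency sample; return True if the alert should fire."""
        self.recent.append(latency_ms > self.threshold_ms)
        window_full = len(self.recent) == self.recent.maxlen
        return window_full and all(self.recent)


# Assumed numbers, purely illustrative.
blip_alert = LatencyAlert(threshold_ms=500, sustained_samples=3)
blip = [120, 130, 650, 700, 140, 125]      # two slow samples while pods rebalance
fired_blip = [blip_alert.observe(ms) for ms in blip]

outage_alert = LatencyAlert(threshold_ms=500, sustained_samples=3)
outage = [120, 800, 900, 950]              # latency stays high
fired_outage = [outage_alert.observe(ms) for ms in outage]

print(any(fired_blip))     # the blip never fires the alert
print(fired_outage[-1])    # the sustained breach does
```

In a real setup this logic would live in whatever monitoring stack you use rather than in application code; the sketch only shows why a brief rebalance doesn't have to wake anyone up.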
What happens in the article's infra when the RabbitMQ or database nodes fail? The whole system goes offline, which seems like a very silly setup when you have Kubernetes sitting right there, whose primary function is to handle all of this.
Edit: Now China’s shooting them down! https://www.forbes.com/sites/mattnovak/2023/02/12/china-says...