I hope to find the time to build a PoC one day…
1) A staging cluster for testing updates is really a must. YOLO-ing prod updates on a Sunday is no one's idea of fun.
2) Application-level replication is king, followed by block-level replication (we use OpenEBS/Mayastor). After going through all the Postgres operators, we found StackGres to (currently) be the best.
3) The Ansible playbooks are your assets. Once you have them down and well-commented for a given service, re-deploying that service in other cases (or again in the future) becomes straightforward.
4) If you can, I'd recommend a dedicated 10G network to connect your servers. 1G just isn't quite enough when it comes to the combined load of prod traffic, plus image pulls, plus inter-service traffic. This also gives a 10x latency improvement over AWS intra-AZ.
5) If you want network redundancy you can create a 1G vSwitch (VLAN) on the 1G ports for internal use. Give each server a loopback IP, then use BGP to distribute routes (bird).
6) MinIO clusters (via the operator) are not that tricky to operate as long as you follow the well-trodden path. This provides you with local high-bandwidth, low-latency object storage.
7) The initial investment to do this does take time. I'd put it at 2-4 months of undistracted skilled engineering time.
8) You can still push ancillary/annoying tasks off onto cloud providers (personally I'm a fan of Cloudflare for HTTP load balancing).
[1]: https://lithus.eu
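To put rough numbers on the 1G vs 10G point above, here is a back-of-the-envelope sketch. The 2 GB image size is a made-up figure for illustration, and the calculation assumes the link is otherwise idle, with no protocol overhead, registry limits, or disk bottlenecks, so real pulls will be slower.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Idealised time to move size_bytes over a link.

    Ignores latency, protocol overhead and competing traffic.
    """
    return size_bytes * 8 / link_bits_per_second


GBIT = 1_000_000_000          # 1 Gbit/s in bits per second
IMAGE_BYTES = 2 * 1000**3     # hypothetical 2 GB container image (assumption)

for name, rate in (("1G", 1 * GBIT), ("10G", 10 * GBIT)):
    print(f"{name}: {transfer_seconds(IMAGE_BYTES, rate):.1f} s per image pull")
```

On an idle link that is about 16 s versus 1.6 s per pull. The gap matters more once several nodes pull at the same time as prod and inter-service traffic share the same 1G port.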
Are you willing to share example config for that part?
I don't think building something like `ko` with it should be that much work. (I know, famous last words)
I'll have to check with $employer if it's ok to open source it.
Of all the cloud storage services we've tried with Rclone, Azure Blobs has turned out to be the fastest so far.
The bitter part of the story is that I did not know Rclone existed until a year or two ago. I still feel ashamed about that.
I started my Nix journey a little over a year ago, and I regret not having switched sooner. A package manager that also ships an operating system that can be customized from the bootloader up, using a purely functional programming language, is the perfect configuration management tool!
It does have some rough edges, and I did lose some hair figuring things out early on, but it has been getting better with each passing day. Pretty much the entirety of my setup at home is now built by Nix and runs NixOS, including my MacBook Air (runs NixOS on ZFS) and two Mac minis that PXE-boot a custom NixOS served by a Raspberry Pi 4. The Pi runs its own custom NixOS configuration and also acts as a firewall, connecting wirelessly to my ISP's router. The Mac minis double as build machines, which makes for a pretty smooth experience when I'm building anything on my work laptop (a 2020 MacBook Pro running Big Sur) that I dock with my Cinema Display, which is wired through an unmanaged switch to the rest.
So far I haven't run into a package that I couldn't find in nixpkgs or customize just the way I wanted. The community is pretty responsive and quick to merge any pull requests for fixes/upgrades. I would wholeheartedly recommend switching to Nix/NixOS/Nixpkgs.
They do warehouse-scale computing, with Borg operating large clusters. Borg is at the bottom of the stack.
The workloads spanning dev, test, and prod then run on these clusters. By having large clusters with lots of things running on them they get high utilization of the hardware and need less hardware.
It's amusing to see k8s used in such a different way, one that often uses a lot more hardware and drives up costs, when these are concepts Google used to lower costs.
Or, maybe I read the papers and book wrong.
I like the idea of higher utilization and better efficiency because it uses fewer resources, which is greener.
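A toy illustration of the utilization argument above, not a model of how Borg or the Kubernetes scheduler actually places work. It packs made-up CPU requests onto 16-core nodes with a simple first-fit-decreasing heuristic, once with a separate cluster per environment and once with everything sharing one cluster. All workload sizes and the node size are assumptions.

```python
def nodes_needed(demands, node_capacity):
    """Count nodes used by first-fit-decreasing bin packing."""
    free = []  # remaining capacity of each node opened so far
    for demand in sorted(demands, reverse=True):
        if demand > node_capacity:
            raise ValueError("workload larger than a single node")
        for i, room in enumerate(free):
            if demand <= room:
                free[i] = room - demand  # place on the first node with room
                break
        else:
            free.append(node_capacity - demand)  # nothing fits: open a new node
    return len(free)


NODE_CORES = 16  # hypothetical node size
# Hypothetical CPU requests (in cores) per environment.
envs = {"dev": [6, 3], "test": [5, 4], "prod": [10, 9, 7]}

separate = sum(nodes_needed(d, NODE_CORES) for d in envs.values())
shared = nodes_needed([c for d in envs.values() for c in d], NODE_CORES)
total = sum(sum(d) for d in envs.values())

for label, nodes in (("separate clusters", separate), ("one shared cluster", shared)):
    print(f"{label}: {nodes} nodes, {100 * total / (nodes * NODE_CORES):.0f}% utilized")
```

With these numbers the separate clusters need 4 nodes (about 69% utilized) and the shared cluster needs 3 (about 92%). The smaller dev and test workloads fill the gaps left by prod, which is the effect the comment describes.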
I don't have a PM.
If you have a manager who decides this, or the PO decides this, IMO that’s a key problem with your Scrum implementation.