I like that better than bringing racks into it, because once you have multiple machines in a rack you've got distributed systems problems, and there's a significant overlap between "big data" and the problems that a distributed system introduces.
It cuts down on the per-instance memory overhead, for cases where you're creating a ton of these objects. It can be useful even when not memory-constrained, because it will throw AttributeError, rather than succeeding silently, if you make a typo when assigning to an object attribute.
0: https://www.python.org/dev/peps/pep-0557/#support-for-automa...
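A minimal sketch of the __slots__ behavior being described (the class and attribute names here are just illustrative):

    class Point:
        # No per-instance __dict__; instances may only have these attributes.
        __slots__ = ("x", "y")

        def __init__(self, x, y):
            self.x = x
            self.y = y

    p = Point(1, 2)
    p.x = 10  # fine
    try:
        p.z = 3  # typo'd attribute name
    except AttributeError as e:
        print(e)  # 'Point' object has no attribute 'z'

Without __slots__, the `p.z = 3` assignment would silently create a new attribute instead of raising.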
http://techspot.zzzeek.org/2015/02/15/asynchronous-python-an...
This means that the "frontend" of a service can be asyncio, allowing it to support features like WebSockets that are non-trivial to support without aiohttp or a similar asyncio-native HTTP server [2], while the "backend" of the service can be multi-threaded or multi-process for CPU-bound work.
0: https://docs.python.org/3/library/asyncio-eventloop.html#exe...
1: https://docs.python.org/3/library/asyncio-eventloop.html#asy...
2: Flask-SocketIO, for example, requires that you use eventlet or gevent, which are the "legacy" ways of doing asynchronous IO: https://flask-socketio.readthedocs.io/en/latest/
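A rough sketch of the frontend/backend split using the executor API from [0]/[1]; `cpu_bound` is a hypothetical stand-in for whatever heavy work the backend actually does:

    import asyncio
    from concurrent.futures import ProcessPoolExecutor

    # Hypothetical CPU-bound work; in a real service this would be the expensive part.
    def cpu_bound(n: int) -> int:
        return sum(i * i for i in range(n))

    async def handle_request(pool: ProcessPoolExecutor) -> int:
        loop = asyncio.get_running_loop()
        # Offload CPU-bound work to the process pool so the event loop
        # (and any WebSocket connections it is serving) stays responsive.
        return await loop.run_in_executor(pool, cpu_bound, 10_000_000)

    async def main() -> None:
        with ProcessPoolExecutor() as pool:
            print(await handle_request(pool))

    if __name__ == "__main__":
        asyncio.run(main())

Swapping ProcessPoolExecutor for ThreadPoolExecutor gives the multi-threaded variant, which is more appropriate when the work releases the GIL (e.g. C extensions or blocking I/O).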
EDIT: I should have clarified, I want to self-host this on our internal VMWare cluster, rather than run it on GKE.
Similar use case: self-hosted VMs for low-traffic internal tools, with no need for autoscaling.
I can't speak to how well it integrates with Gitlab's Auto DevOps, but Nomad integrates very well with Terraform[1] and I'd be surprised if there wasn't a way to plug Terraform into Gitlab's process.
Under classic RDS, when your application makes a SQL connection (the data plane) it's talking to a more or less stock Postgres instance, the same as you would have if you ran it locally.
Aurora, on the other hand, is involved in both the control plane and data plane. Your SQL connection is to a Postgres instance that's been forked/modified to work within Aurora.
Then, as more and more properties are redeveloped, the old character is lost to everybody: the people who lived there as well as the new people who moved there for that character!
It's not a new conundrum: what do you do about people moving to a community for its desirable character, but killing that character in the process? It hurts everyone involved.
HN frequently likes to take the position that the parcel you bought is yours, but if your neighbor wants to bootstrap a red light district you just have to deal. Yet all over the country we have HOAs and zoning and so forth. Turns out, people want to come together and live in a community. Nobody particularly likes living in a free-for-all, so mutual agreements were set up to ensure the neighborhood you bought into doesn't turn into something totally different overnight.
Here's what I think is the central (and flawed) assumption in this line of reasoning - people move to an area because of its "character". And that "character" is an intangible, immeasurable quality, but it is somehow diminished if more people move to the area.
I grew up in Seattle. Both of my grandparents, when I was a kid, lived in Seattle's Fremont neighborhood. I live in Fremont today. From one perspective, the Fremont of my childhood is completely changed. On the other hand, it's still Fremont, with the Center of the Universe sign and the statue of Lenin and many other things I remember from childhood. Does it have the same "character"? Does it have a newer, different, but just as good, "character"?
Those are impossible questions and it boils down to a Ship of Theseus style argument. Either way, I can't bring myself to assert that the housing supply of Fremont should be artificially constrained by zoning policies, in order to preserve my ideal of what Fremont "should be" or "used to be".
Single-server setups larger than 2U but (usually) smaller than 1 rack can give tremendous bang for the buck, no matter if your "bang" is peak throughput or total storage. (And, no, I don't mean spending inordinate amounts on brand-name "SAN" gear).
There's even another category of servers, arguably non-commodity since it carries roughly a 2x price premium (for the server itself, not the storage), that can quadruple the CPU and RAM capacity, if not the I/O throughput, of the cheaper version.
I think ignorance of what hardware capabilities are actually out there ended up driving well-intentioned (usually software) engineers toward distributed-systems solutions, with all their ensuing complexity.
Today, part of the driver is how few underlying hardware choices one has from "cloud" providers and how anemic the I/O performance is.
It's sad, really, since SSDs have so greatly reduced the penalty for data not fitting in RAM (while still being local). The penalty for being at the end of an ethernet, however, can be far greater than that of a spinning disk.
As a software engineer who builds their own desktops (and has for the last 10 years) but mostly works with AWS instances at $dayjob, are there any resources you'd recommend for learning about what's available in the land of that higher-end rackmount equipment? Short of going full homelab, tripling my power bill, and heating my apartment up to 30C, I mean...