For what it's worth, I've worked at multiple places that ran shell scripts just fine for their deploys.
- One had only 2 services [php] and ran over 1 billion requests a day. Deploys were trivial: ssh some new files to the server, run a migration, zero downtime (a rough sketch of that style of deploy follows after this list).
- One was in an industry that didn't need "Webscale" (retirement accounts). Prod deploys were just docker commands run by Jenkins. We ran two servers per service from the day I joined to the day I left 4 years later (3x growth), and ultimately removed one service and one database during all that growth.
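For concreteness, here is roughly what that style of deploy looks like as a script. This is a hedged sketch, not the actual script from either job; the hosts, paths, and migration command are placeholders.

    #!/usr/bin/env bash
    # Sketch of an "ssh/rsync the files, run a migration" deploy. All names are placeholders.
    set -euo pipefail

    HOSTS=("web1.example.com" "web2.example.com")
    APP_DIR="/var/www/app"

    for host in "${HOSTS[@]}"; do
      # Sync only the changed files; PHP picks up new code on the next request,
      # so there is no process restart and effectively no downtime.
      rsync -az --delete ./src/ "deploy@${host}:${APP_DIR}/"
    done

    # Run schema migrations once, from a single host (the migration command is a stand-in).
    ssh "deploy@${HOSTS[0]}" "cd ${APP_DIR} && php bin/migrate.php"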
Another outstanding thing about both of these places was that we had all the testing environments you need, on-demand, in minutes.
The place I'm at now is trying to do kubernetes and is failing miserably (an ongoing nightmare 4 months in, with probably at least 8 to go, when it was allegedly supposed to take only 3 in total). It has one shared test environment, and it takes 3 hours to see your changes in it.
I don't fault kubernetes directly, I fault the overall complexity. But at the end of the day kubernetes feels like complexity trying to abstract over complexity, and I often find that's less successful than removing the complexity in the first place.
If your application doesn't need and likely won't need to scale to large clusters, or multiple clusters, then there's nothing wrong, per se, with your solution. I don't think k8s is that hard, but there are a lot of moving pieces and there's a bit to learn. Finding someone with experience to help you can make a ton of difference.
Questions worth asking:
- Do you need a load balancer?
- TLS certs and rotation?
- Horizontal scalability?
- HA/DR?
- Dev/stage/production environments, plus being able to test/stage your complete stack on demand?
- CI/CD integrations with tools like ArgoCD or Spinnaker?
- Monitoring and/or alerting with Prometheus and Grafana?
- Would you benefit from being able to deploy a lot of off-the-shelf software (say, Elasticsearch, some random database, or a monitoring stack) quickly and easily via Helm? (a quick Helm sketch is at the end of this comment)
- "Ingress"/proxying?
- DNS integrations?
If you answer yes to many of those questions there's really no better alternative than k8s. If you're building web applications at a large enough scale, almost all of these will end up being yes at some point.
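As a concrete example of the Helm point above, this is roughly what "off the shelf in one command" looks like. The chart and values here are illustrative, not a recommendation:

    # Add a public chart repo and install a component with sane defaults.
    helm repo add grafana https://grafana.github.io/helm-charts
    helm repo update

    # One command stands up Loki for log aggregation; override only the values you care about.
    helm install loki grafana/loki --namespace monitoring --create-namespace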
Every item on that list is "boring" tech. Approximately everyone has used load balancers, test environments and monitoring since the 90s just fine. What is it that you think makes Kubernetes especially suited for this compared to every other solution of the past three decades?
There are good reasons to use Kubernetes, mainly if you are using public clouds and want to avoid lock-in. I may be partial, since managing it pays my bills. But it is complex, mostly unnecessarily so, and no one should be able to say with a straight face that it achieves better uptime or requires less personnel than any alternative. That's just sales talk, and should be a big warning sign.
Kubernetes is a great example of the "second-system effect".
Kubernetes only works if you have a webapp written in a slow interpreted language. For anything else it is a huge impedance mismatch with what you're actually trying to do.
P.S. In the real world, Kubernetes isn't used to solve technical problems. It's used as a buffer between the dev team and the ops team, who usually have different schedules/budgets, and might even be different corporate entities. I'm sure there might be an easier way to solve that problem without dragging in Google's ridiculous and broken tech stack.
> If you answer yes to many of those questions there's really no better alternative than k8s.
This is not even close to true, even with a small number of resources. The notion that k8s is somehow the only choice is right along the lines of “Java Enterprise Edition is the only choice” — i.e. a real failure of the imagination.
For startups and teams with limited resources, DO, fly.io and render are doing lots of interesting work. But what if you can’t use them? Is k8s your only choice?
Let’s say you’re a large org with good engineering leadership, and you have high-revenue systems where downtime isn’t okay. Also, for compliance reasons, public cloud isn’t okay.
DNS in a tightly controlled large enterprise internal network can be handled with relatively simple microservices. Your org will likely have something already though.
Dev/Stage/Production: if you can spin up instances on demand this is trivial. Also financial services and other regulated biz have been doing this for eons before k8s.
Load Balancers: lots of non-k8s options exist (software and hardware appliances).
Prometheus / Grafana (and things like Netdata) work very well even without k8s.
Load balancing and ingress are definitely the most interesting piece of the puzzle. Some choose nginx or Envoy, but there are also teams that use their own ingress solution (sometimes open-sourced!)
But why would a team do this? Or more appropriately, why would their management spend on this? Answer: many don’t! But for those that do — the driver is usually cost*, availability and accountability, along with engineering capability as a secondary driver.
(*cost because it’s easy to set up a mixed ability team with experienced, mid-career and new engineers for this. You don’t need a team full of kernel hackers.)
It costs less than you think, it creates real accountability throughout the stack and most importantly you’ve now got a team of engineers who can rise to any reasonable challenge, and who can be cross pollinated throughout the org. In brief the goal is to have engineers not “k8s implementers” or “OpenShift implementers” or “Cloud Foundry implementers”.
> If you answer yes to many of those questions there's really no better alternative than k8s.
Nah, most of that list is basically free for any company that uses an Amazon load balancer and an autoscaling group. In terms of likelihood of incidents, time, and cost, each of those will be an order of magnitude higher with a team of kubernetes engineers than with a less complex setup.
Containerization and orchestration of containers vs. learning how to configure HAProxy and how to use Certbot, hmmmm
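To be fair, the "configure HAProxy, use Certbot" path is only a handful of commands. A rough sketch (the domain and paths are placeholders; the renewal hook and hardening are left out):

    # Obtain a certificate, then bundle chain + key the way HAProxy expects.
    certbot certonly --standalone -d example.com
    cat /etc/letsencrypt/live/example.com/fullchain.pem \
        /etc/letsencrypt/live/example.com/privkey.pem \
        > /etc/haproxy/certs/example.com.pem

    # haproxy.cfg then binds with:  bind :443 ssl crt /etc/haproxy/certs/example.com.pem
    systemctl reload haproxy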
The questions you pose are legit skills web developers need to have.
Nothing you mentioned is obviated by K8s or containerization.
"oh but you can get someone elses pre-configured image" uh huh... sure, you can also install malware. You will also need to one day maintain or configure the software running in them.
You may even need to address issues your running software causes.
You can't do that without mastering the software you are running!
> People really underestimate the power of a shell scripts and ssh and trusted developers.
On the other hand, you seem to be underestimating the fact that even the best, most trusted developer can make a mistake from time to time. It's no disgrace, it's just life.
Besides the fact that shell scripts aren't scalable (in terms of horizontal scalability, like the actor model), I would also like to point out that shell scripts should be simple; if you want to handle something that big, you are essentially using shell as a programming language in disguise -- not ideal, and I would rather go with Go or Rust instead.
On the other hand, my team slapped 3 servers down in a datacenter, had each of them configured in a Proxmox cluster within a few hours. Some 8-10 hours later we had a fully configured kubernetes cluster running within Proxmox VMs, where the VMs and k8s cluster are created and configured using an automation workflow that we have running in GitHub Actions. An hour or two worth of work later we had several deployments running on it and serving requests.
Kubernetes is not simple. In fact it's even more complex than just running an executable with your linux distro's init system. The difference in my mind is that it's more complex for the system maintainer, but less complex for the person deploying workloads to it.
And that's before exploring all the benefits of kubernetes-ecosystem tooling like the Prometheus operator for k8s, or the horizontally scalable Loki deployments, for centrally collecting infrastructure and application metrics, and logs. In my mind, when you make the most of these kinds of tools, things start to look a bit easier even for the systems maintainers.
Not trying to discount your workplace too much. But I'd wager there's a few people that are maybe not owning up to the fact that it's their first time messing around with kubernetes.
As long as your organisation can cleanly either a) split the responsibility for the platform from the responsibility for the apps that run on it, and fund it properly, or b) do the exact opposite and accommodate all the responsibility for the platform into the app team, I can see it working.
The problems start when you're somewhere between those two points. If you've got a "throw it over the wall to ops" type organisation, it's going to go bad. If you've got an underfunded platform team so the app team has to pick up some of the slack, it's going to go bad. If the app team have to ask permission from the platform team before doing anything interesting, it's going to go bad.
The problem is that a lot of organisations will look at k8s and think it means something it doesn't. If you weren't willing to fund a platform team before k8s, I'd be sceptical that moving to it is going to end well.
> I can't see how it can take 4 months to figure it out.
Well have you ever tried moving a company with a dozen services onto kubernetes piece-by-piece, with zero downtime? How long would it take you to correctly move and test every permission and environment variable, and deal with every issue you run into?
Then if you get a single setting wrong (e.g. memory size) and don't load-test with realistic traffic, you bring down production, potentially lose customers, and have to do a public post-mortem about your mistakes? [true story for current employer]
I don't see how anybody says they'd move a large company to kubernetes in such an environment in a few months with no screwups and solid testing.
Using microk8s or k3s on one node works fine. As the author of "one big server," I am now working on an application that needs some GPUs and needs to be able to deploy on customer hardware, so k8s is natural. Our own hosted product runs on 2 servers, but it's ~10 containers (including databases, etc).
I never chose any single thing in my job just because of how it would look on my resume.
After 20+ years of Linux sysadmin/devops work, and because of a spinal disc herniation last year, I'm now looking for a job.
99% of job offers will ask for EKS/Kubernetes now.
It's like the VMware of the years 200[1-9], or like the "Cloud" of the years 201[1-9].
I've always specialized in physical datacenters and servers, be it on-premises, colocation, embedded, etc... so I'm out of the market now, at least in Spain (which always runs about 8 years behind the market).
You can try to avoid it, and it's nice when you save your company thousands of operational/performance/security/etc. issues and dollars over the years, and you look to your boss like a guru who stays ahead of industry problems, but it will make finding a job... 99% harder.
It doesn't matter if you demonstrate the highest level in Linux, scripting, ansible, networking, security, hardware, performance tuning, high availability, all kinds of balancers, switching, routing, firewalls, encryption, backups, monitoring, log management, compliance, architecture, isolation, budget management, team management, provider/customer management, debugging, automation, full-stack programming, and a long etcetera. If you say "I never worked with Kubernetes, but I learn fast", with your best sincerity at the interview, then you're automatically out of the process. No matter if you're talking with human resources, an assistant to the CTO, or the CTO. You're out.
I think porting to k8s can succeed or fail, like any other project. I switched an app that I alone worked on, from Elastic Beanstalk (with Bash), to Kubernetes (with Babashka/Clojure). It didn't seem bad. I think k8s is basically a well-designed solution. I think of it as a declarative language which is sent to interpreters in k8s's control plane.
Obviously, some parts of it took a while to figure out. For example, I needed to figure out an AWS security group problem with Ingress objects that I recall wasn't well documented. So I think parts of that declarative language can suck if the declarative parts aren't well factored out from the imperative parts, or if the log messages don't help you diagnose errors, or if there isn't some kind of (dynamic?) linter that helps you notice problems quickly.
In your team's case, more information seems needed to help us evaluate the problems. Why was it easier before to make testing environments, and harder now?
So, my current experience somewhere most old apps are very old school:
- most server software is waaaaaaay out of date, so getting a dev / test env is a little harder (the last problem we hit was that the HAProxy version doesn't do ECDSA keys for TLS certs, which are the default with certbot; see the workaround sketch after this list)
- yeah, pushing to prod is "easy": FTP directly. But now, which version of which files are really in prod? No idea. Yeah, when I say old school, it's old school from before things like Jenkins.
- need something done around the servers? That's the OPS team's job. A team which also has too much other work to do, so now you'll have to wait a week or two for a simple "add an upload file" endpoint on this old API, because you need somewhere to put those files.
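For what it's worth, one likely workaround for that ECDSA problem (hypothetical for this setup, but the flag is real) is to ask certbot for an RSA key, which older HAProxy builds accept:

    # Request an RSA certificate instead of the default ECDSA one; the domain is a placeholder.
    certbot certonly --standalone --key-type rsa -d legacy.example.com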
Now we've started setting up some on-prem k8s nodes for the new developments. Not because we need crazy scaling, but so the dev team can do most of the OPS they need. It takes time to get everything set up, but once it started chugging along it felt good to be able to just declare whatever we need and get it.
You still need to get the devs to learn k8s which is not fun but that's the life of a dev: learning new things every day.
Also, k8s does not do data. If you want a database or anything that manages files, you want to do most of that job outside k8s.
Kubernetes is so easy that you only need two or three dedicated full-time employees to keep the mountains of YAML from collapsing in on themselves before cutting costs and outsourcing your cluster management to someone else.
Sure, it can be easy, just pick one of the many cloud providers that fix all the complicated parts for you. Though, when you do that, expect to pay extra for the privilege, and maybe take a look at the much easier proprietary alternatives. In theory the entire thing is portable enough that you can just switch hosting providers, in practice you're never going to be able to do that without seriously rewriting part of your stack anyway.
The worst part is that the mountains of YAML were never supposed to be written by humans anyway, they're readable configuration your tooling is supposed to generate for you. You still need your bash scripts and your complicated deployment strategies, but rather than using them directly you're supposed to compile them into YAML first.
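For instance (purely illustrative, with a hypothetical template file), "the tooling generates the YAML" can be as simple as substituting variables into a manifest before applying it:

    # envsubst fills a deployment template from exported environment variables;
    # helm template or kustomize build play the same role at larger scale.
    export APP_NAME=myapp APP_IMAGE=registry.example.com/myapp:1.2.3 REPLICAS=3
    envsubst < deployment.tmpl.yaml > deployment.yaml
    kubectl apply -f deployment.yaml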
Kubernetes is nice and all but it's not worth the effort for the vast majority of websites and services. WordPress works just fine without automatic replication and end-to-end microservice TLS encryption.
I went down the Kubernetes path. The product I picked 4 years ago is no longer maintained :(
The biggest breaking change to docker compose since it was introduced was that the docker-compose command stopped working and I had to switch to «docker compose» with a space. Had I stuck with docker and docker-compose I could have trivially kept everything up to date and running smoothly.
I ran a small bootstrapped startup and used GKE. Everything was templated.
Each app had its own template (e.g. nodejs-worker), and you didn't change the template unless you really needed to.
I spent ~2% of my time (as manager + eng leader + hiring manager + god knows what else people do at a startup) managing 100+ microservices, because they were templated.
This is an anti-pattern in my opinion. If you're on cloud provider A, might as well just write code for cloud provider A. If and when you'll be asked to switch to B you'll change the code to work on both A and B.
This is so unnuanced that it reads like rationalization to me. People seem to get stuck on mantras that simple things are inherently fragile which isn't really true, or at least not particularly more fragile than navigating a jungle of yaml files and k8s cottage industry products that link together in arcane ways and tend to be very hard to debug, or just to understand all the moving parts involved in the flow of a request and thus what can go wrong. I get the feeling that they mostly just don't like that it doesn't have professional aesthetics.
> People seem to get stuck on mantras that simple things are inherently fragile which isn't really true...
Ofc it isn't true.
Kubernetes was designed at Google at a time when Google was already a behemoth. 99.99% of all startups and SMEs out there will never have the same scaling issues and automation needs that Google has.
Now that said... When you begin running VMs and containers, even only a very few of them, you immediately run into issues and then you begin to think: "Kubernetes is the solution". And it is. But it is also, in many cases, a solution to a problem you created. Still... the justification for creating that problem, if you're not Google scale, is highly disputable.
And, deep down, there's another very fundamental issue IMO: many of those "let's have only one process in one container" solutions actually mean "we're totally unable to write portable software working on several configs, so let's start with a machine with zero libs and dependencies and install exactly the minimum deps needed to make our ultra-fragile piece of shit of a software kinda work. And because it's still going to be a brittle piece of shit, let's make sure we use heartbeats and try to shut it down and back up again once it'll invariably have memory leaked and/or whatnots".
Then you also gained the right to be sloppy in the software you write: not respecting it. Treating it as cattle to be slaughtered, so it can be shitty. But you've now added an insane layer of complexity.
How do you like your uninitialized var when a container launches but then silently doesn't work as expected? How do you like them logs in that case? Someone here has described the lack of instant failure on any uninitialized var as the "billion dollar mistake of the devops world".
Meanwhile look at some proper software like, say, the Linux kernel or a distro like Debian. Or compile Emacs or a browser from source and marvel at what's happening. Sure, there may be hiccups but it works. On many configs. On many different hardware. On many different architectures. These are robust software that don't need to be "pid 1 on a pristine filesystem" to work properly.
In a way this whole "let's have all our software each as pid 1 each on a pristine OS and filesystem" is an admission of a very deep and profound failure of our entire field.
I don't think it's something to be celebrated.
And don't get me started on security: you now have ultra-complicated LANs and VLANs, with near-impossible-to-monitor traffic, with shitloads of ports open everywhere, the most gigantic attack surface of them all, and heartbeats and whatnots constantly polluting the network, where nobody even knows anymore what's going on. Where the only actual security seems to rely on the firewall being up and correctly configured, which is incredibly complicated to do given the insane network complexity you've added to your stack. "Oh wait, I have an idea, let's make configuring the firewall a service!" (and make sure not to forget to initialize one of the countless vars, or it'll all silently break and just not configure firewalling for anything).
Now, though, love is love: even at home I'm running a hypervisor with VMs and OCI containers ; )
> Meanwhile look at some proper software like, say, the Linux kernel or a distro like Debian. Or compile Emacs or a browser from source and marvel at what's happening. Sure, there may be hiccups but it works. On many configs. On many different hardware. On many different architectures. These are robust software
Lol no. The build systems flake out if you look at them funny. The build requirements are whatever Joe in Nebraska happened to have installed on his machine that day (I mean sure there's a text file supposedly listing them, but it hasn't been accurate for 6 years). They list systems that they haven't actually supported for years, because no-one's actually testing them.
I hate containers as much as anyone, but the state of "native" unix software is even worse.
I love that the only alternative is a "pile of shell scripts". Nobody has posted a legitimate alternative to the complexity of K8S or the simplicity of docker compose. Certainly feels like there's a gap in the market for an opinionated deployment solution that works locally and on the cloud, with less functionality than K8S and a bit more complexity than docker compose.
Nomad was amazing at every step of my experiments with it, except one. Simply pushing a file from the Nomad control plane to the Nomad host is... impossible? I saw indications of how to tell the host to get it from a file host, and I saw people complaining that they had to do it through the file host, with the response being security (I have thoughts about this and so did the complainants).
I was rather baffled to an extent. I was just trying to push a configuration file that would be the primary difference between a couple otherwise samey apps.
This looks cool and +1 for the 37Signals and Basecamp folks. I need to verify that I'll be able to spin up GPU enabled containers, but I can't imagine why that wouldn't work...
Docker Swarm is exactly what tried to fill that niche. It's basically an extension to Docker Compose that adds clustering support and overlay networks.
Docker Swarm is a good idea that sorely needs a revival. There are lots of places that need something more structured than a homemade deploy.sh, but less than... K8s.
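A minimal sketch of that workflow, assuming a compose file with a service named "web" (the names here are placeholders):

    # Turn the current Docker host into a single-node swarm.
    docker swarm init

    # Deploy the same compose file used in development as a stack;
    # Swarm adds replicas, overlay networking, and rolling updates on top.
    docker stack deploy -c docker-compose.yml myapp

    docker service ls                 # inspect services and replica counts
    docker service scale myapp_web=3  # scale the "web" service to 3 replicas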
This is basically exactly what we needed at the start up I worked at, with the added need of being able to host open source projects (airbyte, metabase) with a reasonable level of confidence.
We ended up migrating from Heroku to Kubernetes. I tried to take some of the learnings to build https://github.com/czhu12/canine
It basically wraps Kubernetes and tries to hide as much of Kubernetes' complexity as possible, and only exposes the good parts, which will be enough for 95% of web application workloads.
I've personally been investing heavily in [Incus](https://linuxcontainers.org/incus/), which is the Linux Containers project fork and continuation of LXD post Canonical takeover of the LXD codebase. The mainline branch has been seeing some rapid growth, with the ability to deploy OCI Application Containers in addition to the System containers (think Xen paravirtualized systems if you know about those) and VMs, complete with clustering and SDN. There's work by others in the community to create [incus-compose](https://github.com/bketelsen/incus-compose), a way to use Compose spec manifests to define application stacks. I'm personally working on middleware to expose instance options under the user keyspace to a Redis API compliant KV store for use with Traefik as an ingress controller.
Too much to go into with what Incus does to tell you everything in a comment, but for me, Incus really feels like the right level of "old school" infrastructure platform tooling with "new school" cloud tech to deploy and manage application stacks, the odd Windows VM that accounting/HR/whoever needs to do that thing that can't be done anywhere else, and a great deal more.
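To give a flavor of the day-to-day workflow (instance names are placeholders; this is just the stock incus CLI, not anything specific to my setup):

    # A system container and a full VM, managed through the same tool.
    incus launch images:debian/12 web01
    incus launch images:ubuntu/24.04 vm01 --vm
    incus list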
Docker Swarm mode? I know it’s not as well maintained, but I think it’s exactly what you talk about here (forget K3s, etc). I believe smaller companies run it still and it’s perfect for personal projects. I myself run mostly docker compose + shell scripts though because I don’t really need zero-downtime deployments or redundancy/fault tolerance.
Looking at your page, it looks like Lambdas/Functions but on your system, not Amazon/Microsoft/Google.
Every company I've ever seen try to do this has ended in tears after some part of the system doesn't fit neatly into the serverless box and it becomes painful to extract from your system into "run FastAPI in containers."
Capistrano, Ansible et al. have existed this whole time if you want to do that.
The real difference in approaches is between short lived environments that you redeploy from scratch all the time and long lived environments we nurse back to health with runbooks.
You can use lambda, kube, etc. or chef, puppet etc. but you end up at this same crossroad.
Just starting a process and keeping it alive for a long time is easy to get started with but eventually you have to pay the runbook tax. Instead you could pay the kubernetes tax or the nomad tax at the start instead of the 12am ansible tax later.
I remember using shell scripts to remove some insane node/js-brain-thonk hints, because it was easier than trying to reverse engineer how the project was supposed to be "compiled" to properly use those hints.
You mean the list of calls right there in the shell script?
> Who will know about those undocumented sysctl edits you made on the VM?
You mean those calls to `sysctl` conveniently right there in the shell script?
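i.e. something like this, where the "undocumented" kernel tweaks are documented by living in the setup script itself (the values are illustrative):

    # Kernel tuning applied by the provisioning script...
    sysctl -w net.core.somaxconn=4096
    sysctl -w vm.swappiness=10
    # ...and persisted across reboots by the same script.
    printf 'net.core.somaxconn=4096\nvm.swappiness=10\n' > /etc/sysctl.d/99-app.conf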
> your app needs to programmatically spawn other containers
Or you could run a job queue and push tasks to it (gaining all the usual benefits of observability, concurrency limits, etc), instead of spawning ad-hoc containers and hoping for the best.
"We don't know how to learn/read code we are unfamiliar with... Nor do we know how to grok and learn things quickly. Heck, we don't know what grok means "
I'm giggling at the idea you'd need Kubernetes for a mere two servers. We don't run any application with less than two instances for redundancy.
We've just never seen the need for Kubernetes. We're not against it as much as the need to replace our working setup just never arrived. We run EC2 instances with a setup shell script under 50loc. We autoscale up to 40-50 web servers at peak load of a little over 100k concurrent users.
Different strokes for different folks but moreso if it ain't broke, don't fix it
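For flavor, a setup script in that spirit really can stay tiny. This is a hypothetical sketch (the package, artifact URL, and port are placeholders), not the actual sub-50-line script:

    #!/usr/bin/env bash
    # EC2 user-data style setup: install deps, fetch the build, run it under systemd.
    set -euo pipefail

    yum install -y nginx
    aws s3 cp s3://example-artifacts/app-latest.tar.gz /tmp/app.tar.gz
    mkdir -p /srv/app && tar -xzf /tmp/app.tar.gz -C /srv/app

    cat > /etc/systemd/system/app.service <<'EOF'
    [Unit]
    Description=app
    [Service]
    ExecStart=/srv/app/bin/server --port 8080
    Restart=always
    [Install]
    WantedBy=multi-user.target
    EOF

    systemctl enable --now app nginx
    # The autoscaling group handles capacity; the load balancer handles health checks.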
Highly amateurish take if you call shell spaghetti a Kubernetes, especially if we compare the complexity of both...
You know what would be even worse? Introducing Kubernetes for your non-Google/Netflix/WhateverPlanetaryScale app instead of just writing a few scripts...
Hell, I’m a fan of k8s even for sub-planetary scale (assuming that scale is ultimately a goal of your business, it’s nice to build for success). But I agree that saying “well, it’s either k8s or you will build k8s yourself” is just ignorant. There are a lot of options between the two poles that can be both cheap and easy and offload the ugly bits of server management for the right price and complexity that your business needs.
Both this piece and the piece it’s imitating seem to have 2 central implicit axioms that in my opinion don’t hold. The first, that the constraints of the home grown systems are all cost and the second that the flexibility of the general purpose solution is all benefit.
You generally speaking do not want a code generation or service orchestration system that will support the entire universe of choices. You want your programs and idioms to follow similar patterns across your codebase and you want your services architected and deployed the same way. You want to know when outliers get introduced and similarly you want to make it costly enough to require introspection on if the value of the benefit out ways the cost of oddity.
The compiler one read to me like a reminder not to ignore the lessons of compiler design. The premise being that even though you have a small-scope project compared to a "real" compiler, you will evolve towards analogues of those design ideas. The database and k8s pieces are more like: don't even try a small-scope project, because you'll want the same features eventually.
I suppose I can see how people are taking this piece that way, but I don't see it like that. It is snarky and ranty, which makes it hard to express or perceive nuance. They do explicitly acknowledge that "a single server can go a long way" though.
I think the real point, better expressed, is that if you find yourself building a system with like a third of the features of K8s but composed of hand-rolled scripts and random third-party tools kludged together, maybe you should have just bit the bullet and moved to K8s instead.
You probably shouldn't start your project on it unless you have a dedicated DevOps department maintaining your cluster for you, but don't be afraid to move to it if your needs start getting more complex.
> You generally speaking do not want a code generation or service orchestration system that will support the entire universe of choices.
This. I will gladly give up the universe of choices for a one size fits most solution that just works. I will bend my use cases to fit the mold if it means not having to write k8s configuration in a twisty maze of managed services.
https://blog.bradfieldcs.com/you-are-not-google-84912cf44afb
I've only used it managed. There is a bit of a learning curve but it's not so bad. I can't see how it can take 4 months to figure it out.
Did each employee have 2 to 3 services to maintain? If so, that sounds like an architectural mistake to me.
[1] http://widgetsandshit.com/teddziuba/2010/10/taco-bell-progra...
"""After all, functionality is an asset, but code is a liability."""
I say this all the time! But I've not heard others saying it. Great to see some like-minded developers!!
99.99% of startups and SMEs should not be writing microservices.
But "I wrote a commercial system that served thousands of users, it ran on a single process on a spare box out the back" doesn't look good on resumes.
If you need a quick scheduler, orchestrator, and services control plane without fully embracing containers, you might soon be out of luck.
“People will always defend complexity, stating that the only alternative is shell scripts”.
I saw people defending docker this way, ansible this way and most recently systemd this way.
Now we’re on to kubernetes.
To be fair, most people attacking systemd say they want to return to shell scripts.
Wait. Wouldn't that be a good idea?
https://kamal-deploy.org/
So we started by using docker swarm mode for our dev env, and made it all the way to production using docker swarm. Still using it happily.
You should check out DBOS and see if it meets your middle ground requirements.
Works locally and in the cloud, has all the things you’d need to build a reliable and stateful application.
[0] https://dbos.dev
(and yeah, obviously "don't deploy that stuff" is the solution)
---
That being said, is it all OSS? I can see some stuff here that seems to be, but it mostly seems to be the client side stuff?
https://github.com/dbos-inc
Probably because except in specific niche industries, every Linux box you ever experience is extremely likely to have Bash and Python installed.
Also, because Powershell is hideously verbose and obnoxious, and JS and its ilk belong on a frontend, not running servers.
This is about the worst encoding for network rules I can think of.
It's a fun philosophy for online debates, but an expensive one to use in real engineering.
Only offering the correction because I was confused at what you meant by “out ways” until I figured it out.