lifeisstillgood · 5 years ago
>>> We no longer provision load balancers, or configure DNS; we simply describe the resources that we need, and Kubernetes makes it happen.

This is (part of) what keeps me in the stone age. You are still provisioning load balancers and DNS - just one step removed, through k8s.

And my prior is that we need to understand that, be aware of it, have a model of what is going on to help develop and debug.

And so it feels a bit like "magic abstraction". And then to peek through the abstraction you suddenly need to know not only about DNS and which machine is running bind, but also how Kubernetes internally stores its DNS config, how it spits that out, and which version changed that behaviour.

In other words, you have to become an expert in two things to debug it.

And maybe it's worth it - but I struggle to see why it's not simpler to keep my install scripts going.

(OK, I guess I am writing my own answer - but surely the question is: what is the simplest set of thin install scripts needed to deploy containers?)

mjgs · 5 years ago
I’m currently seeing things the same way you are.

I love the idea of using Kubernetes - it sounds amazing at first - but then every single article I read about it turns into some epic blog post that leaves me worried that the whole house of cards could easily come crashing down.

Maybe in the future the abstraction will become rock solid, easy to install and manage and ‘just work’, but it doesn’t feel like that to me now. There are too many ‘we had to come up with / use hack X to integrate it with software Y’ stories.

Until then, if you are on a small team with a small budget, I reckon keeping it simple is the better approach. Standard OSs with some bash scripts for provisioning, build and deploy. Even if it’s more manual work, and takes a bit longer, having an understanding of the platform you are building on is crucial.

BTW if you are looking for a description of a way to do this sort of thing ‘the boring way’:

Robust NodeJS Deployment Architecture

https://blog.markjgsmith.com/2020/11/13/robust-nodejs-deploy...

gerhardlazu · 5 years ago
K8S is an API that the majority agrees on, which is rare. There is a lot of amazing tooling and a staggering amount of ongoing innovation, all built on solid concepts: declarative models, emitted metrics (the /proc equivalent, but with a larger scope) and versioned infrastructure as data (a.k.a. GitOps).

For someone known as the King of Bash (self-proclaimed) - https://speakerdeck.com/gerhardlazu/how-to-write-good-bash-c... - and after a decade of Puppet, Chef, Ansible and oh wow that sweet bash https://github.com/gerhard/deliver - even though all my workstations and work servers (yup, all running k3s) are provisioned with Make (bash++), I still think that K8S is the better approach to running production infrastructure. Using simple, well-defined components (e.g. external-dns, ingress-nginx, prometheus-operator etc.) that adhere to a universal API and are maintained by many smart people all around the world is a better proposition than scripting, in my opinion.
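
To make that concrete, here is a rough sketch of the kind of manifest those components act on. The app name and hostname are made up, and it assumes ingress-nginx and external-dns are already installed in the cluster; declaring it is enough for the ingress controller to route traffic and for external-dns to publish the DNS record:

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: myapp                       # hypothetical app name
    spec:
      ingressClassName: nginx           # handled by ingress-nginx
      rules:
      - host: myapp.example.com         # external-dns creates the DNS record for this host
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp             # assumes a Service named "myapp" exists
                port:
                  number: 80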

At the end of the day, I'm in it for the shared mindset, great conversations and a genuine desire to do better, which I had not seen before K8S & the wider CNCF. I will go out on a limb here and assume that I love scripting just as much as you do, but go beyond this aspect and you will discover that there is more to it than "thin install scripts that deploy containers" (and containers are not just glorified jails or unikernels).

ClumsyPilot · 5 years ago
I think you've hit the nail on the head - the point is not just Kubernetes itself, it's that you can build standard infrastructure on top. Any software can (in theory) be set up with a Helm chart and configured in a standard way through YAML ConfigMaps, rather than some esoteric config files or scripts which are different for every piece of software.
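
For illustration, a minimal ConfigMap sketch (the name and values are hypothetical); the app's Deployment would reference it via envFrom/configMapRef instead of shipping its own config file format:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: myapp-config              # hypothetical name
    data:
      LOG_LEVEL: "info"               # plain key/value pairs, same shape for any app
      DATABASE_HOST: "db.internal"
    # consumed from a container spec, e.g.:
    #   envFrom:
    #   - configMapRef:
    #       name: myapp-config
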
tgsovlerkhgsel · 5 years ago
By using K8s and similar technologies, you're buying standardization at the cost of underlying complexity and reduced efficiency.

In many cases, it's a good tradeoff, because you can now use standard tooling on everything.

Just like it's cheaper to ship an entire (physical) shipping container that's half-full than to ship the same stuff loosely. Or why companies will send you two separate letters on the same day with a small note that this is more efficient for them than collating them.

I assume that k8s also makes it much easier to move to a different cloud provider if you're unhappy with one (or the new one offers better pricing). Instead of rewriting your bespoke scripts that only you understand, anyone familiar with the technology will know which modules to swap to make it work with the new provider.

smallnamespace · 5 years ago
> And my prior is that we need to understand that, be aware of it, have a model of what is going on to help develop and debug.

If it's a sufficiently robust abstraction, you don't, you just learn the abstraction. Kubernetes has reached that point for many folks.

I no longer have a detailed mental model of how my compiler or LLVM works; I just trust that it does. When was the last time you needed to debug (or were capable of debugging) a bug in your compiler? A couple of human generations of work went into making that happen.

Note that it turns out compiling code well, or making a reliable orchestration system, is an enormously complex problem. At some point, the complexity outstrips the ability of even generalists in the field to keep up, yet the systems keep getting more reliable.

So in these types of cases, you can either do it yourself poorly (you're an amateur), do it yourself well (congrats, you've become an expert), or delegate.

This isn't really limited to computing. I delegate maintenance on my car to a mechanic, while I'm pretty sure a generation ago, everybody (in the US) changed their own oil and understood how the carb worked. Times change.

lifeisstillgood · 5 years ago
The car analogy is tempting as an argument clincher. My problem is the army - a lot of their trucks and so forth are not the computer-controlled, this-generation-of-Toyota abstraction, but have stayed inefficient-but-repairable trucks. That's because they need trucks that are repairable in the field, and because all the computer understands is nice paved roads - which is exactly what armies don't get to drive on.

But yes, mostly I am not an army.

kbar13 · 5 years ago
i think the idea is that k8s does away with having to glue all those pieces of infra together, not that you lose understanding of how it all works together. part of the headache with managing infra is that it rots over time... things come and go (sysv to systemd, apt/snap/whatever, config files change, things break). it's easier to keep up to date on k8s than all the disparate parts of the OS and provider-specific APIs and whatnot
lifeisstillgood · 5 years ago
That's an interesting perspective I have not heard before.

Does this imply there is a cloud abstraction layer that should come (assuming all providers can put aside commercial interests etc)?

And is k8s the simplest possible abstraction? And if not - what is?

theptip · 5 years ago
In the happy path, your developers no longer need to worry about this stuff. It’s possible for a team to stand up a new service and plumb it through all the way to the external LB just using k8s yaml templates.
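
As a rough sketch of what that looks like (the service name, label and ports are hypothetical), a single manifest like this is enough for the cloud provider's controller to provision the external load balancer:

    apiVersion: v1
    kind: Service
    metadata:
      name: myapp                 # hypothetical service name
    spec:
      type: LoadBalancer          # the cloud controller provisions the external LB
      selector:
        app: myapp                # assumes pods labelled app: myapp are already running
      ports:
      - port: 80                  # externally exposed port
        targetPort: 8080          # container port the pods listen on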

In the unhappy path, sure, you need someone who knows how to debug networking issues, and in some cases it’s going to be harder to debug because of the layers of indirection. But the total amount of toil is significantly reduced.

A bad abstraction doesn’t carry its weight relative to the complexity it adds. A good abstraction allows you to ignore the lower levels most of the time without missing something important; I’d put k8s firmly in the latter category.

ojhughes · 5 years ago
I agree, and in my experience with k8s the unhappy path is rare. Everything just works most of the time.
brainzap · 5 years ago
I have switched from dev to cloud engineer, and in my experience all abstractions leak and you need to master all the layers.
ithkuil · 5 years ago
Let's turn the knobs on the scenario and see how it appears:

>>> >>> with the advent of the first programming languages you no longer had to think in terms of registers, loading operands from memory, storing the results back, spilling registers to the stack.

>>> You _are_ handling registers and memory spills, but just one step removed through the use of C.

The analogy may not be perfect, but I think it makes obvious some of the things also mentioned in sibling comments: it's all about habit, maturity and thus trust.

If you trust that your tools are working correctly, and if you know how to deal with their well-known quirks, you'll just rebase on top of that layer and hopefully boost your productivity and tackle more complex problems more easily.

Maturity is important because if today you're more likely to blame yourself than the compiler when your stuff doesn't work, it's just because you're lucky to work with mature and popular toolchains. (And I'm not only talking about the past, when compilers were new and unproven; it still happens today on some niche embedded toolchains.)

So, yes, it's indeed rational to be wary of new unproven abstraction layers as they could bring more pain than help.

It's hard to judge when that line is crossed though.

I personally like to know how stuff works under the hood anyways. I find it useful in practice and it gives me confidence in using the higher layers, when they make sense, or stay with the lower level layers, when they make sense.

Occasionally I still write some assembly. But most of the time, for most of the stuff, it just makes more sense to use a higher level programming language.

I see k8s in a similar way. We have operating systems, programming languages, etc; all sorts of abstractions that help us separate concerns and have specialists dealing with the nitty gritty details of some stuff, so that everybody else can be specialized in something else (just like in real life)

lifeisstillgood · 5 years ago
But I think this is apples and oranges - C maps pointers and memory allocation in a very direct, robust manner; even Python allows you to dis down and see the stack machine underneath.

But this hard line to the underlying reality is unlikely to exist when you look at how this month's k8s will configure last year's AWS Route 53.

"It just works" 80% of the time is a disaster; 98% of the time might be bearable. Is it above 99%?

enos_feedler · 5 years ago
Isn’t this true of all operating systems? Debugging is going to require some knowledge of the OS and the syscalls it makes, but you don’t want to write directly to the machine as an OS would. The same goes for internet-connected clustered machines.
PietKachelhout · 5 years ago
I'm disappointed that blogs and podcasts keep promoting Linode. I currently maintain about 30 Linodes and I have been doing so for the past 2 years.

Some things I noticed:

* The internal network is not private, but people don't realise it. You share a /16 with other Linodes - so many open databases, file shares and other services in there.

* Block storage performance is really poor, around 100 iops. Same as a SATA disk from 10 years ago.

* No proper snapshot / image functionality.

* Linode Kubernetes Engine was based on Debian oldstable when it launched.

* Excessive CPU steal, even on dedicated cores. 25% CPU steal is considered normal. Over 50% happens a lot.

* Problems with their hosts. I can only guess what the reason is, but 4 to 8 hours of unannounced downtime of a VM happened to me 6 times in the past 2 years.

Yes, support is friendly. But my international phone bill is huge because the fastest way to get them to do something is to call.

brobinson · 5 years ago
With all these negatives, there must be a really compelling reason to stay. What is it?
shaicoleman · 5 years ago
In a previous job a couple of years ago, I experienced regular issues with Linode (hypervisor errors, storage errors, performance issues, networking issues, etc.).

Despite all that, management decided to stay with Linode for the following reasons:

* Change is hard, and "better the devil you know" mentality.

* The instance pricing looks cheap compared to AWS, e.g. a c5.xlarge ($124) vs the 8GB Balanced plan ($40) that Linode charges. In reality it isn't so cheap, because it's poor, oversold technology.

* AWS/GCP have exorbitant bandwidth pricing. Linode bandwidth is very generous, as it's pooled across all servers in the account.

* Having someone to pick up the phone 24/7 when there's a problem is a big plus in theory. However, it's much better not to need to call in the first place because things just work.

* Migrating providers can be an expensive and time-consuming endeavour.

* Technical debt, interdependencies, manually configured snowflake servers and infrastructure, no documentation, etc. makes changes risky.

* Not enough DevOps people on the team, too many fires to put out, and shiny features to ship mean cloud provider migration is low on the priority list.

PietKachelhout · 5 years ago
The owner of the company thinks they are great because you can call them when there is a problem. I'm having a hard time convincing him that these are issues I've never had at other providers. Certainly not so many.

We are moving everything away. Most of our servers are with another provider already. And we haven't had any similar issues there. I've never called them!

And I forgot to mention the connectivity issues at Linode. When the whole London datacenter was unreachable for 2 hours we lost some customers.

threwawaysoff · 5 years ago
I ditched Linode years ago because it just wasn't that great. AWS if I need ephemeral boxes (I used to be an enterprise on-prem AWS consultant), but I run most everything on a home box - 96 EPYC cores, 512 GB RAM, SSD and HDD (ZFS) - running KVM, Docker, and Open vSwitch. It just isn't worth it to rent slow, expensive servers when I need lots of them and need them to be fast. I don't have any problems remoting in with DDNS and WireGuard.
gerhardlazu · 5 years ago
My first Supermicro just turned 9 and it's still running strong, with a fresh install of Ubuntu 20.04 & k3s over the holidays. The second Supermicro turned 5, and has been running FreeBSD all this time like a champ. They are both loft guardians.

A bunch of bare metal hosts run on Scaleway / Online, and various VMs & managed services run in Digital Ocean, Linode, AWS & GCP. I sometimes spin up the odd bare metal instance on Equinix Metal (formerly Packet).

A diverse fleet means that there's always something new to learn and try out. A single large host would make me anxious, as no internet provider or power grid is 100% reliable and available. Also, software upgrades sometimes fail, and things get messed up all the time, which is when I find it most efficient to just start from scratch. A single host makes that less convenient.

Every approach has its pros and cons, which is why my main workstation is a 20-core Xeon W with 64GB RAM & 1TB NVMe : ). Yes, there is a backup workstation which doubles as a mobile one, meaning that it can work without power or wired internet for almost a day. Options are good ; )

js4ever · 5 years ago
I was baffled by this: "The worst part is that serving the same files from disks local to the VMs is 44x faster than from persistent volumes (267MB/s vs 6MB/s)."

Is it a configuration issue on their side, or are LKE volumes really limited to 6MB/s on Linode?

How can you be happy with this for production??

gerhardlazu · 5 years ago
Block storage is an area that we are working with Linode to improve. That's the random read/write performance, as measured by fio.

We have mostly sequential reads & writes (mp3 files) that peak at 50MB/s, then rely on CDN caching (Fastly makes us happy in this respect).

CDN caching is something that we are currently improving, which will make things quicker and more reliable.

The focus is on reality vs the ideal, and the path that we are taking to improve not just changelog.com, but also our sponsors' products. No managed K8S or IaaS is perfect, but we enjoy the Linode partnership & collaboration ;)

sweeneyrod · 5 years ago
Hey, that speed was good enough in ~1995!
tracker1 · 5 years ago
It's a relatively common problem... if you need more IO, you may want to try a stripe of 4-8 block storage connections.

Then again, depends on what you're doing.

gerhardlazu · 5 years ago
For what it's worth, Rook, OpenEBS or Longhorn are worth exploring.
remram · 5 years ago
Isn't changelog.com mostly a static site? What kind of workloads do they run/monitor/update with all this infrastructure?
awinter-py · 5 years ago
I think they're sponsored by linode, and they're developer-themed -- there may be team / content reasons to use a lot of unnecessary tools in order to review them
jerodsanto · 5 years ago
This quote from Adam on our episode about the setup explains some of our motivations here:

> It’s worth noting that we don’t really need what we have around Kubernetes. This is for fun, to some degree. One, we love Linode, they’re a great partner… Two, we love you, Gerhard, and all the work you’ve done here… We don’t really need this setup. One, it’s about learning ourselves, but then also sharing that. Obviously, Changelog.com is open source, so if you’re curious how this is implemented, you can look in our codebase. But beyond that, I think it’s important to remind our audience that we don’t really need this; it’s fun to have, and actually a worthwhile investment for us, because this does cost us money (Gerhard does not work for free), and it’s part of this desire to learn for ourselves, and then also to share it with everyone else… So that’s fun. It’s fun to do.

manigandham · 5 years ago
The static files should definitely be on some kind of object storage like S3, that's what it's built for. Much faster, more reliable, more scalable, and likely much cheaper too.

As for persistent volumes, might be better to just offload Postgres to a managed DB service and downsize the K8S instances, or use something like CockroachDB which is natively distributed and can make use of local volumes instead.

gerhardlazu · 5 years ago
Yes, it does make sense to move static files to object storage, especially the mp3s. There is some ffmpeg-related refactoring that we need to do before we can do this though, and it's not a quick & easy task, so we have been deferring it since it's not that high priority, and there are simpler solutions to this particular problem (i.e. improved CDN caching).

Other static files such as css, js, txt make sense to remain bundled with the app image, which is stateless and a prime candidate for horizontal scaling. Also, CDN caching makes small static files that change infrequently a non issue, regardless of their origin.

The managed Postgres service from Linode's 2021 roadmap is definitely something that we are looking forward to, but the simplest thing might be to provision Postgres with local volumes instead. We are already using a replicated Postgres via the Crunchy PostgreSQL Operator, so I'm looking forward to trying this approach out first.

CockroachDB is on my list of cool tech to experiment with, but that will use an innovation token, and we only have a few left for 2021, so I don't want to spend them all at once.

manigandham · 5 years ago
Yea the small static files that are part of your webapp can stay with it, but media files are best on S3. If you need a block interface though, I recommend something like ObjectiveFS: https://objectivefs.com/

If you're using an operator then local volumes is a good middleground if it automates the replication already. CockroachDB also has a kubernetes operator although it's only for GKE currently. There are also other options like YugabyteDB which is another cloud-native postgres-compatible DB.

johnchristopher · 5 years ago
It seemed interesting and I tried to subscribe, but the process doesn't work without disabling uBlock/adblock. I am okay with that, so I disabled it, but then I failed two captcha rounds (find bikes in thumbnails; some thumbnails are upside down?!) :(
jerodsanto · 5 years ago
Sorry for the hassle! We’ve been hit by lots of spammers lately so I battened down the hatches. Unfortunately this has the side effect of also blocking some legit humans as well. :(
johnchristopher · 5 years ago
Could 2FA or a mobile number be a good/practical alternative ?
skinnyarms · 5 years ago
I'm a fan of the show; it was really entertaining to listen to you knocking the site over "live".
gerhardlazu · 5 years ago
That was my favourite part too!

Yes, we could have mitigated that entirely with CDN stale caching, but it was good to see what happens today, and then iterate towards better Fastly integration.

jacques_chester · 5 years ago
I'd be interested in learning more about the move from Concourse to Circle (I'm a notorious Concourse fanboy). What went well, what didn't, what you miss, what prompted it -- that sort of thing.
gerhardlazu · 5 years ago
The primary reason behind the move was not wanting to manage CI. Since there were no options for a managed Concourse in 2018, we migrated to Circle, one of the Changelog sponsors at the time.

Concourse worked well for us; we didn't have any issues big enough to remember. You may be interested in this screenshot that captured the changelog.com pipeline from 2017: https://pipeline.gerhard.io/images/small-oss.png

I missed the simple Concourse pipeline view at first, but CircleCI improved by leaps and bounds in 2020, and the new Circle pipeline view equivalent is even better (compared to Concourse, clicking on jobs always works): https://app.circleci.com/pipelines/github/thechangelog/chang...

The Circle feature which I didn't expect to like as much as I do today, is the dashboard view (list of all pipeline/workflow runs). This is something that Concourse is still missing: https://app.circleci.com/pipelines/github/thechangelog

My favourite Circle 2020 feature is the Insights: https://app.circleci.com/insights/github/thechangelog/change.... Yup, we were one of the first ones to ask for it in 2019.

In 2021, I expect us to spend one migration credit on GitHub Actions, as a Circle replacement. Argo comes a close second, but that requires an innovation credit, which is more precious to us. Because we are already using GitHub Actions for some automation, it would make sense to consolidate, and also leverage the GitHub Container Registry as a migration from Docker Hub. Watch https://github.com/thechangelog/changelog.com to see what happens : )

jacques_chester · 5 years ago
Thanks, a very comprehensive answer. I agree especially that the lack of a hosted option has hamstrung Concourse's success.

I couldn't use the CircleCI links; they're auth-gated.