Nah docker is excellent and far far preferable to the standard way of doing things for most people.
All the traditional distribution methods have barely even figured out how to uninstall a piece of software. `apt remove` will often leave other things lying around.
He complains about configuration being more complex because it's not a file? Except it is, and it's so much simpler to just have a compose file that tells you EXACTLY which files are used for configuration and where they are. This is still not easy with normal linux packages: you have to google or dig through a list of 10-15 places that may be used for config.
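To illustrate (a minimal sketch; the service name, image tag, and paths are placeholders), a compose file makes every config location an explicit bind mount:

```yaml
# Hypothetical compose file: every configuration file the service reads
# is an explicit bind mount, so there is nothing to hunt for.
services:
  web:
    image: nginx:1.25
    ports:
      - "8080:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro   # the only config file in play
      - ./site:/usr/share/nginx/html:ro         # static content
```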
The other brilliant opinion is that docker makes the barrier to entry for packaging too low. But the alternative is not those things being packaged well in apt or whatever, it's them not being packaged at all. This is not a win.
What's the alternative to running a big project like, say, Sourcegraph through a docker compose instance? You have to set up ~10 services yourself including Redis, Postgres, some logging thing, blah blah. I do not believe this is ever easier than `docker compose up -d`.
And if you want to run it without docker, the docker images are basically a self-documenting system for how to do that. This is strictly a win over previous systems, where there generally just is no documentation for that.
Personally, docker lowers the activation energy to deploy something so much that I can now try/run complex pieces of software easily. I run sourcegraph, nginx, postgres, redis on my machine without fuss. A week ago I wanted to learn more about data engineering - got a whole Apache Airflow cluster set up with a single compose file, took a few minutes. That would have been at least an hour of following some half-outdated deploy guide before, so I just wouldn't have done it.
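For reference, the Airflow quick start described above boils down to roughly the following; the compose file URL and init step come from the official docs and may change between releases, so treat this as a sketch:

```sh
# Fetch the reference compose file the Airflow docs publish (check the docs for the current URL)
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/stable/docker-compose.yaml'
mkdir -p ./dags ./logs ./plugins
echo "AIRFLOW_UID=$(id -u)" > .env   # run the containers as your own UID
docker compose up airflow-init       # one-off metadata DB init
docker compose up -d                 # webserver, scheduler, workers, redis, postgres
```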
---
The beginning of the post is most revealing though:
> The fact that I have multiple times had to unpack flatpaks and modify them to fix dependencies reveals ... Still, these systems work reasonably well, well enough that they continue to proliferate...
Basically you are okay with opening up flatpaks to modify them but not opening up docker images. It just comes down to familiarity.
The thing I'm always interested in finding out is how people using lots of containers deal with security updates.
Do they go into every image and run the updates for the base image it's based on? Does that just mean you're now subject to multiple OSes' security patching and best practices?
Do they think it doesn't matter and just get new images when they're published, which from what I can see is just when the overlaid software has an update and not the system and libraries it relies on?
When glibc or zlib have a security update, what do you do? For RHEL and derivatives it looks like quay.io tries to help with that, but what's the best practice for the rest of the community?
We haven't really adopted containers at all at work yet, and to be honest this is at least part of the reason. It feels like we'd be doing all the patching we currently are, and then a bunch more. Sure, there are some gains, but it also feels like there's a lot more work involved there too?
So I primarily use containers on my local machine, walled off from the internet, so it's not a big concern for me. Watchtower (https://github.com/containrrr/watchtower) is popular among home server users too; it automatically updates containers to the latest image.
For production uses I think companies generally build their own containers. They would have a common base Linux container and build the other containers based off that with a typical CI/CD pipeline. So if glibc is patched, it's probably patched in the base container and the others are then rebuilt. You don't have to patch each container individually, just the base. Once the whole build pipeline is automated it's not too hard to add checks for security updates and rebuild when needed. Production also minimizes the scope of containers, with nothing installed except what's necessary, so they have few dependencies.
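A minimal sketch of that pattern (the registry and image names are invented): every service derives from one internal base image, so a glibc fix only needs to land in the base, and rebuilding the services with `--pull` propagates it.

```Dockerfile
# Internal base image, rebuilt whenever the distro publishes fixes:
#   FROM debian:bookworm-slim
#   RUN apt-get update && apt-get -y upgrade && rm -rf /var/lib/apt/lists/*

# Each service image just builds on top of it; `docker build --pull`
# in CI picks up the freshly patched base layers.
FROM registry.internal.example/base:bookworm
COPY ./app /opt/app
CMD ["/opt/app/server"]
```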
As one of those people: we don’t do much, if anything at all. I use containers to try things out behind a VPN like Tailscale. I use containers at work to deploy software but there it is put behind some kind of SSO proxy and lots of different rules apply. Even if I deploy something externally facing, 3rd party containers are going to be on a private docker network - much easier setup than dealing with allow/blocklists and/or firewall rules. Of course I need to take care of my code’s security but that is true with or without containers. For the situations where you deploy 3rd party services facing externally there are solutions (like Watchtower) that I can run and be more confident that this update would not ripple through the system with unknown (and sometimes hidden) consequences.
You can use tools like Snyk to scan images for vulnerabilities (even the Docker(tm) cli tool has one now), and you can do things like failing your CI pipeline if there are critical vulnerabilities.
You can also use Dependabot (and others) to update your images on a cronjob-like schedule.
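Wiring that into CI can be a single scanner invocation that exits non-zero on findings; Trivy is shown here only as an example, and Snyk or the Docker CLI's scan command offer similar switches (exact flags vary by tool):

```sh
# Fail the pipeline if the image contains known HIGH/CRITICAL CVEs
trivy image --exit-code 1 --severity HIGH,CRITICAL registry.example.com/app:1.2.3
```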
> What's the alternative to running a big project like, say, Sourcegraph through a docker compose instance? You have to set up ~10 services yourself including Redis, Postgres, some logging thing, blah blah. I do not believe this is ever easier than `docker compose up -d`.
Even with Docker, it's not as easy as `docker compose up -d`.
You need to set up backups too. How do you back up all these containers? Do they even support it? Meanwhile I already have Postgres and Redis tuned, running, and backed up. I'd even have a warm Postgres replica if I could figure out how to make that work.
Then you need to monitor them for security updates. How are you notified your Sourcegraph container and its dependencies' containers need to be updated and restarted? If they used system packages, I'd already have this solved with Munin's APT plugin and/or Debian's unattended-upgrades. So you need to install Watchtower or equivalent. Which can't tell the difference between a security update and other updates, so you might have your software updated with breaking changes at any point.
Alternatively, you can locate and subscribe to the RSS feed for the repository of every Dockerfile involved (or use one of those third-party services providing RSS feeds for Dockerhub) and hope they contain a changelog. If I installed from Debian I wouldn't need that, because I'm already subscribed to debian-security-announce@lists.debian.org.
Long lived data is stored as a volume which is just a directory on the host. You just backup the directory like you would any other.
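For example (volume, container, and database names below are made up), a named volume can be archived with a throwaway container, though for databases a logical dump is usually the safer route:

```sh
# Archive a named volume into the current directory
docker run --rm \
  -v myapp_pgdata:/data:ro \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/pgdata-backup.tar.gz -C /data .

# For a running Postgres container, prefer a logical dump
docker exec -t my_postgres pg_dump -U postgres mydb > mydb.sql
```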
I feel like people are misunderstanding: containers are a wrapper. You can run whatever APT plugin or unattended-upgrades in the container as well; it's just Linux. You can then even snapshot this new state of the container into a new image if you want to persist it. You can fully simulate the usual workflow of a regular server if you really want. Containers don't take away any functionality.
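As a rough illustration of that point, assuming a Debian-based container named `myapp` (not a recommendation over rebuilding the image from an updated base):

```sh
# Patch the running container like any other Debian box
docker exec myapp apt-get update
docker exec myapp apt-get -y upgrade

# Optionally freeze the patched state into a new image
docker commit myapp myapp:patched
```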
Another thing is docker is not necessarily the be-all end-all way to deploy things especially in production. If I was running Sourcegraph seriously in production, I might not use it. But it does make it so much easier to just try things out or run them as a hobbyist.
I don't back up containers, why would you? I back up the attached volumes, or, more often, their state is fully in a database, which isn't in a container (it doesn't need to be), so I just back the database up.
Wait wait wait. Docker has two use cases and you're conflating them. The original use case is:
Project Foo uses libqxt version 7. Project Bar uses libqxt version 8. They are incompatible so I'd need two development workstations (or later two LXC containers). This is slow and heavy on diskspace; docker solves that problem. This is a great use of docker.
The second use case that it has morphed into is:
I've decided stack management is hard and I can't tell my downstream which libraries they need because even I'm no longer sure. So I'll just bundle them all as this opaque container and distribute it that way and now literally nobody knows what versions of what software they are running in production. This is a very harmful use case of docker that is unfortunately nearly universal at this point.
I don't understand how stack management without Docker is any better. As far as I can tell, the alternative is the same: a list of `apt install`s with no versions listed or mention of dependencies at all. If you used a lockfile in a language package manager, you can use the same lockfile in the Docker image too.
> Project Foo uses libqxt version 7. Project Bar uses libqxt version 8
Pretty much fixed by asdf or any sane version manager. Also, two containers aren't slow at all, but rebuilding one completely from scratch is.
The latter is a good use case. It's no big deal to include a commit/version number and a list of the packages/libraries used in a lock file.
The problems with docker I have are: no lock files to pin versions. No integration with other package management tools. For example, I want to install my dependencies as a step in Docker. I don't want to manually copy over the list of dependencies from my lockfiles (gemfile, package, etc). The reason why I'd want this is that it speeds up builds.
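The usual workaround for the build-speed half of this is layer caching keyed on the lockfile itself: copy only the manifests first, so the dependency layer is rebuilt only when the lockfile changes. Node is used here purely as an example; the same shape works for a Gemfile.lock, poetry.lock, and so on.

```Dockerfile
FROM node:20-slim
WORKDIR /app
# Only the manifests: this layer and the `npm ci` below stay cached until
# the lockfile actually changes.
COPY package.json package-lock.json ./
RUN npm ci
# The rest of the source; editing it does not invalidate the deps layer.
COPY . .
CMD ["node", "server.js"]
```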
Exactly, Docker + Compose is the best way of running server software these days.
I keep my compose files in source control and I have got a CI server building images for anything that doesn’t have first party images available.
Updates are super easy as well: just update the pinned version at the top of the compose file (if not using latest), then `docker-compose pull` followed by `docker-compose up -d`.
The entire thing is so much more stable and easier to manage than the RedHat/Ubuntu/FreeBSD systems I used to manage.
(I use Alpine Linux + ZFS for the host OS.)
I spent the last couple of days trying to set up some software: wading through open source repositories, fixing broken deps, pinning minor versions, browsing SO for obscure errors... I wish I had a Docker container instead.
I just found about a half a gigabyte of files from programs I uninstalled two laptops ago on my machine, and files from a Steam game I got a refund for because it would crash after fifteen minutes. It’s frustrating.
> What's the alternative to running a big project like, say, Sourcegraph through a docker compose instance? You have to set up ~10 services yourself including Redis, Postgres, some logging thing, blah blah. I do not believe this is ever easier than `docker compose up -d`.
Maybe my recollection is just fuzzy, but it seems to me back in the day many projects just had fewer dependencies and more of them were optional. "For larger installations and/or better performance, here's where to configure your redis instance."
Instead now you try and run someone's little self-hosted bookmark manager and it's "just" docker-compose up to spin up the backend API, frontend UI, Postgres, redis, elasticsearch, thumbor, and a KinD container that we use to dynamically run pods for scraping your bookmarked websites!
I'd almost _rather_ that sort of setup be reserved for stuff where it's worth investing the time to set it up.
All of this complexity is easier to stand up this way, but that doesn't make it easier to properly manage, or a _good_ way to do things. I'd much rather run _one_ instance of Postgres, set up proper backups _once_, perform upgrades _once_, etc. Even if I don't care about the hardware resource usage, I do care about my time. How do I point this at an external postgres instance? Unfortunately, the setup instructions for many services these days start _and end_ at `docker-compose up`.
And this idea of "dockerfiles as documentation" should really die. There are often so many implicit assumptions baked into them as to make them a complete minefield for use as a reference. And unless you're going to dig into a thousand lines of bash scripts, they're not going to answer the questions you actually need answers to like "how do I change this configuration option?".
> figured out how to uninstall a piece of software. `apt remove` will often leave other things lying around.
What's the Docker way of uninstall? In most cases Docker packaged software uses some kind of volume or local mount to save data. Is there a way to remove these when you remove the container? What about networks? (besides running prune on all available entities)
You can `docker rm -v <container>` to remove a container + its volumes. For larger stuff I typically just read the docker-compose file, which has all of that listed in one place, so it's pretty easy to just `docker rm` the volumes and networks. With apt I have no idea what files were added or modified and there's no simple "undo".
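For a compose-managed stack the teardown is even more direct, since the compose CLI knows which containers, networks, and volumes belong to the project:

```sh
docker compose down -v            # remove containers, networks, and named volumes of this stack
docker compose down -v --rmi all  # additionally delete the images it pulled or built
```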
> All the traditional distribution methods have barely even figured out how to uninstall a piece of software. `apt remove` will often leave other things lying around.
Sounds like an issue that needs to be fixed instead of working around it. Also dependency hell. Few distros manage those hard problems nicely but they do exist.
> A week ago I wanted to learn more about data engineering - got a whole Apache Airflow cluster setup with a single compose file, took a few minutes.
Off-topic, but how did you like it? Tried it out a couple of years ago and felt like it overcomplicates things for probably 99% of use cases, and the overhead is huge.
There probably are good reasons to use it when you have complex distributed DAGs though.
> All the traditional distribution methods have barely even figured out how to uninstall a piece of software.
The most traditional way is to compile under /usr/local/application_name, and symlinking to /usr/local/(s)bin. Remove the folder and the links, and you're done.
> `apt remove` will often leave other things lying around.
"remove" is designed to leave config and database files in place, assuming that you might want to install it later without losing data. apt has "purge" option for the last decade which removes anything and everything completely.
> What's the alternative to running a big project like, say, Sourcegraph through a docker compose instance? You have to set up ~10 services yourself including Redis, Postgres, some logging thing, blah blah. I do not believe this is ever easier than `docker compose up -d`.
Install on a single server or on a couple of pet servers. Configure them, forget them. If you fancy, spawn a couple of VMs and snapshot them daily. While it takes a bit more time, for a one-time job I don't mind.
Docker's biggest curse is that it enables many bad practices and advertises them as best practices. Why enable SSL on a service when I can add an SSL-terminating container? Why tune a piece of software if I can spawn five more copies of it for scaling? Etc.
Docker is nice when it's packaged and documented well, and good for packaging stable meta-utilities (e.g.: A documenting pipeline which runs daily and terminates), but using it as a silver bullet and accepting its bad practices as gold standards is wasting resources, space, time; and creating security problems at the same time.
Basically you are okay with installing containers blindly but not with installing services and learning apt purge. It just comes down to familiarity.
> Anyway, what I really wanted to complain a bit about is the realm of software intended to be run on servers.
Okay.
> I'm not sure that Docker has saved me more hours than it's cost
I'm not sure what the alternative for servers is here. Containers have certainly saved me a lot of headache and created very little overhead. N=1 (as it seems to be for the OP).
> The problem is the use of Docker as a lowest common denominator [...] approach to distributing software to end users.
Isn't the issue specific for server use? Are you running random images from the internet on your servers?
> In the worst case, some Docker images provide no documentation at all
Well, in the same vein as my last comment, Docker is not a silver bullet for everything. You still have to take care of what you're actually running.
Honestly the discussion is valid, but I think the OP aimed at "the current state of things" and hit a very valuable tool that doesn't deserve some of the targeted criticism I read here.
edit: my two cents for those who cannot bother and expect just because it's a container, everything will magically be solved: use official images and those from Bitnami. There, you're set.
> I'm not sure what's the alternative for servers here.
Nixos/nixpkgs: isolated dependencies / services, easy to override if needed, configs follow relatively consistent pattern (main options exposed, others can be passed as text), service files can do isolation by whitelisting paths without going full-blown self-contained-os container.
> Are you running random images from the internet on your servers?
Many home server users do this. In business use, unless you invest lots of time into this, a part of your services is still effectively a random image from the internet.
It's always been surprising to me how little people seem to care about the provenance of their images. It's even more surprising that infosec isn't forcing developers to start their images `FROM scratch`.
> Isn't the issue specific for server use? Are you running random images from the internet on your servers?
Well exactly, that's what the author is writing about.
The whole article is dedicated to the problem of Docker being used as a distribution method, that is as a replacement for say Debian package.
So in order to use that software you need to run a Docker image from the internet which is often poorly made and incompatible with your infrastructure. Had a package been available, you'd simply do "apt-get install" inside your own image, built with your infrastructure in mind.
> You still have to take care of what you’re actually running.
This is the central thesis of OP, though. Pre-made/official images are not very good and docker in general doesn’t provide any means to improve/control quality.
You know who really knows how to package software? Mark Russinovich, Nir Sofer, and all the others who gave us beautiful utilities in standalone EXE's that don't require any dependencies.
For the longest time I stayed on older versions of .NET so any version of Windows since 2003 could run my software out of the box. Made use of ILMerge or a custom AssemblyResolve handler to bundle support DLL's right into my single-file tools - it wasn't hard.
I have no complaints about Docker, but I do find where I used to be able to download simple zip files and place their contents into my project I now just get a black box Docker link with zero documentation and that makes me sad.
Exactly my thoughts. The Linux guys have been discussing the merits of package management and various related systems since I started getting interested in computers.
Yet after all this time they have not come close to something as simple as the double-click-to-run .exe or self-installing binary you can find on Windows (macOS also has completely self-contained apps). So having managed linux servers and relatives' machines, I'm a bit confused that we are still here, discussing the merits of stupid packaging software that follows some sort of ideal but never actually works properly (at least, scales very badly and reacts poorly to changes).
Everything he said about docker is true, but it also applies to the regular package management in various linux distros. In the age of very fast upload bandwidth and very affordable storage, docker is even more suspect compared to a regular lightweight VM: it doesn't have as good security separation, is more annoying to reproduce, and requires more setup, while a bit-for-bit VM copy is also extremely simple. I believe one reason we got docker is that they couldn't figure out how to partition hardware at the machine level efficiently, so instead they partitioned CPUs. Easier to manage in hardware, more of a pain in software...
But the reason no user facing operating system ever uses this kind of software management is that it never works, no matter the ideals, a lot like communism. What a waste of time, I guess at least it makes for some fun discussion from time to time.
>One of the great sins of Docker is having normalized running software as root. Yes, Docker provides a degree of isolation, but from a perspective of defense in depth running anything with user exposure as root continues to be a poor practice.
>Perhaps one of the problems with Docker is that it's too easy to use
If you've ever had to make a nonroot docker image (or an image that runs properly with the `--read-only` flag), it's not as trivial and fast to get things going—if it was default, perhaps docker wouldn't have been so successful in getting engineers of all types and levels to adopt it?
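For what it's worth, the non-root variant is only a few extra lines; a sketch for a Debian-based image (user name and paths are arbitrary):

```Dockerfile
FROM debian:bookworm-slim
# Create an unprivileged user and run as it
RUN groupadd --system app && useradd --system --gid app --create-home app
COPY --chown=app:app ./server /home/app/server
USER app
CMD ["/home/app/server"]
```

At run time that combines with `docker run --read-only --tmpfs /tmp ...` for the read-only case mentioned above.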
It's rare to find tooling in the DevOps/SRE world that's easy to just get started with productively, so docker's low barrier to entry is an exception IMO. Yes, the downside is you get a lot of poorly-made `Dockerfiles` in the wild, but it's also easy to iterate and improve them, given that there's a common ground. It's a curse I suppose, but I'd rather have a well-understood curse than the alternative being an arbitrary amount of bespoke curses.
> One of the basic concepts shared by most Linux systems is centralization of dependencies. Libraries should be declared as dependencies, and the packages depended on should be installed in a common location for use of the linker. This can create a challenge: different pieces of software might depend on different versions of a library, which may not be compatible. This is the central challenge of maintaining a Linux distribution, in the classical sense: providing repositories of software versions that will all work correctly together.
Maybe someone with more knowledge of Linux history can explain this for me, because I never understood it: Why is it so important that there must always only be one single version of a library installed on the entire system? What keeps a distribution from identifying a library by its name and version and allowing application A to use v1.1 and application B to use v1.2 at the same time?
Instead the solution of distros seems to be to enforce a single version and then to bend the entire world around this restriction - which then leads to unhappy developers that try to sidestep distro package management altogether and undermine the (very reasonable) "all software in a distro is known to work together" invariant.
So, why?
If there's a chain of dependencies (libraries depending on other libraries), a single process might end up with different versions of the same library in its memory.
That's not going to work, since the interface/API of the library is typically not versioned.
For security (and general bug fixing). If a security issue is found you want to only fix it in one place. The container alternative is tracking down how all the containers were built, which might be very varied and some not even reproducible, and fixing them all.
Indeed, containers suck for this: not only do you have to look into each container separately, there is also no requirement that all containers use the same structure, as you said.
Yet this is exactly what happens in practice if the one-version-for-all paradigm is impractical for developers.
So it would be in the interest of distros to give some ground in that regard: keep the centralised dependency dogma in place, but allow multiple independent versions and configurations to be stored.
Then bug fixing might be slightly harder than with a single version, because you might have to fix 3 versions instead of one, but still much easier than monkey-patching half a dozen containers.
(It would still be good to "nudge" developers towards a canonical version, e.g. by sending them reminders when the software uses a lower version or a different configuration - or if you want, even a warning that everything below a certain minimum version will be rejected)
The idea is as basic as you can get - and has probably been thought up by every dev after their first major dependency conflict. I'm pretty sure distro maintainers know about it too.
So my question is just what the problems are that prevent adoption, even after decades of dealing with the problem, even in the face of growing threats from devs to abandon the distro model altogether.
>Doing anything non-default with networks in Docker Compose will often create stacks that don't work correctly on machines with complex network setups.
I run into this often. Docker networking is a mess.
Depending on the load and use case, encapsulating docker itself in an lxc container or a standalone vm can be a semi maintainable and separated solution.
Dockerfiles are really, really simple. They do essentially three things: set a base image, set environment variables, and run scripts. And then, as sort of a meta-thing, they prove that the steps in the dockerfile actually work: you start the docker container and see that it works.
If you don't want to run in docker, a dockerfile is still a perfect setup script: open it up, see what it does, and use that as install instructions.
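To make the "three things" concrete, a representative (entirely hypothetical) Dockerfile reads top to bottom like an install guide:

```Dockerfile
# 1. the base image
FROM debian:bookworm-slim
# 2. environment variables
ENV APP_ENV=production
# 3. the scripts you would otherwise run by hand
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates \
 && rm -rf /var/lib/apt/lists/*
COPY ./app /opt/app
CMD ["/opt/app/run"]
```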
I’ve debugged project Dockerfiles to discover that they were pulling dependencies from URLS with “LATEST” in them. A Dockerfile isn’t really proof that anything currently works.
I remember how things were before docker. Better is not a word I'd use for that.
It sucked. Deploying software meant dealing with lots of operating system and distribution specific configuration, issues, bugs, etc. All of that had to be orchestrated with complicated scripts. First those were hand written, and later we got things like chef and puppet. Docker wiped most of that out and replaced it with simple build time tools that eliminate the need for having a lot of deploy time tools that take ages to run, are very complex to maintain, etc.
I also love to use it for development a lot. It allows me to use lots of different things that I need without having to bother installing those things. Saves a lot of time.
Docker gives us a nice standard way to run and configure whatever. Mostly configuration only gets hard when the underlying software is hard to configure. That's usually a problem with the software, not with docker. These days if you are building something that requires fiddling with some configuration file, it kind of is broken by design. You should design your configuration with docker in mind and not force people to have to mount volumes just so they can set some properties.
The reason docker is so widespread is that it is so obviously a good idea and there hasn't been anyone that came along with something better that actually managed to get any traction worth talking about. Most of the docker alternatives tend to be compatible with docker to the point that the differences are mostly in how they run dockerized software.
And while I like docker, I think Kubernetes is a hopelessly over engineered and convoluted mess.
> Deploying software meant dealing with lots of operating system and distribution specific configuration
But Docker didn't solve that at all, unless you consider "we only support Linux, so just run Linux or fuck you" as a "solution". And starting a Linux VM on Windows or macOS doesn't really count (and comes with a lot of issues).
Containers as a concept are fine. Docker as an implementation is not very good. These are people who released a program in 2014 that can only run as root, which should say a thing or two about the engineering ethos. What is this, 1982?
> All of that had to be orchestrated with complicated scripts.
Hacking deploy scripts (and configure scripts and Makefiles) wasn't fun either, but at least I could understand it. Hacking on Docker once something goes wrong is pretty much impossible.
It really is hugely over-engineered for what it does. You really really don't need a million lines of code to build and run containers on Linux.
I've had a number of machines where binary JSON files in /var/lib/docker got corrupted and the only solution I've ever been able to find is "completely wipe away the lot and start from scratch". The entire overlayfs thing they have can really go haywire for reasons I've never been able to reproduce or figure out (and the "solution" is similar: wipe away /var/lib/docker and start from 0), and things like that. It all "works", but it's a hugely untransparent black box where your only solution when things go wrong is to shrug and give up (or spend days or even weeks on figuring it all out).
I've had enough issues and outright bugs that I probably spent more time on Docker and docker-compose than I saved. At my last job I just bypassed the "official Docker development environment" with a few small shell scripts to run things locally, because it was a never-ending source of grief. I only had to support my own Linux system so I had it a bit easy, but it wasn't much more than running our programs with the right flags. The platform-specific stuff was "my-pkg-manager install postgres redis", and I don't see what's so hard about that.
How is writing those complicated scripts but in a dockerfile or docker-compose file any better?
Software written with docker in mind is easier to manage because it generally follows better design principles, such as separating configuration from state, being failure tolerant, treating the network as opaque, etc. This software would be easy to deploy without using docker as well.
If you're trying to deploy some complex piece of software which doesn't follow these principles, it's exactly as hard or even harder with docker. Unless you outsource the work to random people on the internet, but then you are not building production systems.
Containers are great for lots of things and containerization in general has forced developers to write better software, but there really isn't a lot of difference in difficulty in running that webapp in a container vs just running it on the machine directly.
> When glibc or zlib have a security update, what do you do?
This is a solved problem
I am of course partially joking.
But seriously, use musl libc, build static binaries, build the images from scratch, and have a CI server handle it for you to keep it updated.
Alternatively use a small image like Alpine as base if you want some tools in the image for debugging.
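One common shape of that approach, shown for a Go service since Go makes fully static builds easy (module layout and names are made up; other languages would use a musl-based build stage instead):

```Dockerfile
# Build stage: produce a fully static binary
FROM golang:1.22-alpine AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/server ./cmd/server

# Final image: nothing but the binary
FROM scratch
COPY --from=build /out/server /server
ENTRYPOINT ["/server"]
```

With no shell or libc in the final image there is very little left to patch, at the cost of not being able to `docker exec` into it, which is where the Alpine variant mentioned above comes in.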
For containers we create ourselves, we automatically rebuild them each night, which pulls in the latest security updates.
[1] https://github.com/fmartinou/whats-up-docker
TL/DR: they don't. #yolo, don't be a square.
> `apt remove` will often leave other things lying around.
If you mean configuration files, then this is by design.
`apt purge` removes those as well.
Obligatory XKCD: https://xkcd.com/1988/
> and those from Bitnami
Yes, that's a random image from the internet.
By that definition, you are running it on a random OS, random processor with some random network infra.
> So in order to use that software you need to run a Docker image from the internet which is often poorly made and incompatible with your infrastructure.
In other words, random images from the internet.
> It's a curse I suppose, but I'd rather have a well-understood curse than the alternative being an arbitrary amount of bespoke curses.
It’s all fun and games until the bills come due.
> Keep the centralised dependency dogma in place, but allow multiple independent versions and configurations to be stored.
Fedora had support for modularity: https://docs.fedoraproject.org/en-US/modularity/ . Join the Fedora project, please.