outworlder · 5 years ago
> A “container” is just a term people use to describe a combination of Linux namespaces and cgroups. Linux namespaces and cgroups ARE first class objects. NOT containers.

Amen.

Somewhat tangential note: most developers I have met do not understand what a 'container' is. There's an aura of magic and mystique around them. And a heavy emphasis on Docker.

A sizable fraction will be concerned about 'container overhead' (and "scalability issues") when asked to move workloads to containers. They are usually not able to explain what the overhead would be, or what could potentially be causing it. No mention of storage, or of how networking would be impacted; just CPU. And that's usually said without measuring actual performance first.

When I press further, what I most commonly get is the sense that they believe containers are "like VMs, but lighter" (I've literally been told that a few times, especially when interviewing candidates). To this day, I've heard CGroups being mentioned only once.
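For the curious, both are plain kernel objects you can poke at from any shell. A minimal illustration (assuming a cgroup v2 host with util-linux installed):

    lsns                   # list the namespaces on this host
    ls /sys/fs/cgroup/     # the cgroup hierarchy, one directory per control group
    cat /proc/self/cgroup  # which cgroup the current shell belongs to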

I wonder if I'm stuck in the wrong bubble, or if this is widespread.

CBLT · 5 years ago
> To this day, I've heard CGroups being mentioned only once.

See https://www.kernel.org/doc/Documentation/cgroup-v2.txt

> "cgroup" stands for "control group" and is never capitalized. The singular form is used to designate the whole feature and also as a qualifier as in "cgroup controllers". When explicitly referring to multiple individual control groups, the plural form "cgroups" is used.

To this day, I've heard cgroup mentioned only once...

To put forth a more substantive argument, everybody has a layer of abstraction they do not peek under. You interviewed people that didn't peek under container. You went a layer deeper, but never peeked at the source tree to learn what cgroup really is. Does it really feel that much better to be one level above others?

xelxebar · 5 years ago
> To put forth a more substantive argument, everybody has a layer of abstraction they do not peek under.

Sure. Though it's reasonable to want your level N developers to have some idea of what goes on at levels N-1 and perhaps N-2, cf. Law of Leaky Abstractions etc. It's similar to wanting your developers to be aware of their users' needs, which are level N+1 concerns.

thaumasiotes · 5 years ago
> The singular form is used to designate the whole feature and also as a qualifier as in "cgroup controllers". When explicitly referring to multiple individual control groups, the plural form "cgroups" is used.

They're free to say this, but since it violates the rules of the language they're never going to get any significant level of compliance.

the8472 · 5 years ago
> To put forth a more substantive argument, everybody has a layer of abstraction they do not peek under.

I hope you meant "have not peeked under, so far". Any obscure problem will send you down the rabbit hole. If you just stop looking, then how can you solve problems?

Harlekuin · 5 years ago
> everybody has a layer of abstraction they do not peek under

Reminds me of this xkcd: https://xkcd.com/435/

The beauty of programming is abstraction: you write a function to do one thing, do it well, and then you can abstract that concept away and reuse it. But it's only an abstraction. A container is "like" a lightweight VM, and you can use it like that until it stops acting like a lightweight VM. At that point you have to dabble one layer deeper, just enough to understand why the abstraction at your layer isn't a perfect analogy.
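A concrete example of the leak (a sketch; the flag and image are just illustrative): a memory-limited container still reads the host's /proc, so naive tools report host totals.

    docker run -m 256m alpine free -m
    # the "total" column shows the host's RAM, not 256m, because
    # /proc/meminfo inside the container comes from the host kernel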

So if you're looking for an expert in layer n, then basic knowledge of layer n - 1 might be a decent proxy for expertise.

ZoomZoomZoom · 5 years ago
>Does it really feel that much better to be one level above others?

Yes? Because levels are finite and quantifiable.

matharmin · 5 years ago
I feel that nowadays "containers" generally refers to the concept, and cgroups and namespaces are implementation details of a specific container runtime. They are very important details for security and performance, but they don't fundamentally impact how you structure your containerized application.

You can take the same container image, and run using Docker, Firecracker, gVisor, or many other container runtimes. Some of them are exactly "like a VM, but lighter".
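For instance, with gVisor installed and registered as an extra runtime in the Docker daemon (a sketch; assumes runsc is configured), switching runtimes for the same image is one flag:

    docker run --rm nginx                     # default runc: namespaces + cgroups
    docker run --rm --runtime=runsc nginx     # gVisor: syscalls go to a user-space kernel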

cmckn · 5 years ago
Agreed. The post feels a bit pedantic; I don't know any dev doing "cool things" with the underlying namespaces/cgroups. They're just using Docker. De-mystifying containers has value, but so does the abstraction.
mav3rick · 5 years ago
Please don't propagate this. Running in a hypervisor with a possibly different kernel vs. running on the same kernel, in the same ring as the host, are two very different things, with very different implications.
mulmen · 5 years ago
I think it is widespread because containers are (seemingly) marketed as being some kind of magic. The impression I get is that the benefit of containers is that you don't have to think about them. This may be more a product of Docker, but I think containers and Docker have become synonymous.

I'm not sure this is a fair comparison but that is my impression. I could be in a bubble too.

nurpax · 5 years ago
I’d like to think so too. It’s hard to actually understand how Docker works by reading their documentation.

Everything looks so simple... and it magically runs on macOS and Windows too. No wonder people think it’s some sort of a VM.

takeda · 5 years ago
It is a product of Docker. If you deployed applications using namespaces and cgroups directly, it is very likely you would see things the same way the author does.
fluffything · 5 years ago
I think the way you are asking candidates the question might be unfortunate. A FreeBSD, macOS, or Windows dev who knows their OS might never tell you "namespaces + cgroups".

Hell if you try to explain containers to 99% of the world programmers by saying "namespace + cgroup", I'd bet you that 0% of them will understand what you mean.

Instead, if you tell them it's "like a VM, but faster, because it 'reuses' the host's kernel", you might reach a substantial number of them, and that level of abstraction would be enough for most of them to do something useful with containers.

Maybe the question you should be asking your candidates is: "How would you implement a Docker-like container app on Linux?". That question specifically constrains the level of abstraction of the answer to at least one layer below Docker, and also specifies that you are interested in hearing the details for Linux.
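As a starting point for that question, the kernel-facing core fits in a few lines of shell (a rough sketch, not a secure runtime; assumes a root filesystem unpacked at ./rootfs, e.g. from a busybox tarball):

    # new mount, UTS, IPC, network and PID namespaces; --fork so the
    # child becomes PID 1; then give it its own root and /proc
    sudo unshare --mount --uts --ipc --net --pid --fork \
        chroot ./rootfs /bin/sh -c 'mount -t proc proc /proc && exec /bin/sh'

A cgroup to bound its resources, per the kernel doc linked elsewhere in the thread, would be the other half of the answer.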

kqr · 5 years ago
> A “container” is just a term people use to describe a combination of Linux namespaces and cgroups. Linux namespaces and cgroups ARE first class objects. NOT containers.

Wow.

I have always wondered in which cases one would use "containers". I have asked so many Docker enthusiasts, "Why would I use containers instead of the strong process separation functionality that's built into the operating system, like users, cgroups, SELinux, etc?"

The answer has freakishly invariably been "Containers are lighter than virtual machines. Virtual machines use up all your resources. Containers can run just a single application, where virtual machines must emulate a whole operating system."

Now you're probably going, "How is that an answer to the question?"

And I know! It's not. It's been impossible to get a real answer out of these Docker enthusiasts. They appear to just be parroting whatever they've heard from someone else, as long as it seems vaguely related to the question.

Now that I finally have an answer to my original question, it all makes sense again. And I'm inclined to agree that if you're stuck in the wrong bubble, we're both stuck in the same bubble.

Jnr · 5 years ago
I started using Linux containers a long time ago, and I have used them to achieve compatibility and ease of use.

Before LXC was introduced, it was somewhat painful to manage multiple environments: using chroot, managing networking, etc.

And running all your software in a single environment wasn't always easy, either. You had to take care of different software versions, and it wasn't uncommon for things to break because of that.

While things like Python's venv, Ruby's rvm, etc. helped deal with it, there was no universal tool for whole-filesystem environments besides virtualization.

When LXC came out, I started using it for everything. Nowadays I use LXD, and sometimes Docker, and it is so nice and requires minimal effort. I know that without those tools it would be very inconvenient to manage my own servers. I have separate auto-updating containers for everything, and if one thing breaks, it doesn't take everything down with it. And when everything is contained and each system has a minimal set of packages set up, hardly anything ever breaks over the years.
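These days that workflow is a couple of commands (a sketch; the image alias and names are illustrative):

    lxc launch ubuntu:20.04 web      # create and start a container
    lxc exec web -- bash             # get a shell inside it
    lxc snapshot web before-upgrade  # cheap rollback point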

And let's not forget that these Linux features also enabled the universal packages (flatpaks, snaps, etc.) which make it easier for Desktop users to get up to date software easily.

Of course I know that it is not virtualization. But why do people say "containers are not really containers"? A container still contains a distinct environment. No one said it was about containing memory or anything else.

marcus_holmes · 5 years ago
I figure Docker wraps stuff up into a nice "thing" that you can use, with documentation, a logo, and mindspace. You can put "Docker" on your CV and there's some hope that a recruiter will know what that means.
tyingq · 5 years ago
Have them look at bocker (docker-like in ~100 lines of bash).

It makes it very clear what Docker is, and isn't.

https://github.com/p8952/bocker

Specifically, the bocker_run function: https://github.com/p8952/bocker/blob/master/bocker#L61

hadsed · 5 years ago
So what's the gap here with Docker? What incorrect assumptions would I make from taking this as a model for containers, if anyone knows?
erjiang · 5 years ago
This is true for many new waves of popular technologies.

1) A new technology or method becomes popular.

2) Developers find new advantages in using the technology.

3) Understanding of the tech and the original advantage is somewhat lost.

For example: containers are now widely used as part of a scriptable application build process, e.g. the Dockerfile. There are probably many developers out there who care about this and not about how containers are run and how they interact with the kernel. And for their use cases, that is probably the thing that matters most anyways.

fiddlerwoaroof · 5 years ago
A downside is that people feel like they have to bundle an entire Linux rootfs, because they think of a container as a lightweight VM. If they thought of it as an OS process running inside various namespaces, they might be more inclined to ship only what they actually need.
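With a static binary, that mindset produces images with no distro at all (a minimal sketch; the binary name is hypothetical and must be statically linked):

    cat > Dockerfile <<'EOF'
    FROM scratch
    COPY myapp /myapp
    ENTRYPOINT ["/myapp"]
    EOF
    docker build -t myapp .   # the image ships the process, not an OS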
pojzon · 5 years ago
This is widespread, unfortunately. Developers have turned into "users" and no longer pursue the details of solutions. I'm not gonna say it's the dominant behaviour in the field right now, but I see it more and more often, at various experience levels.
jagged-chisel · 5 years ago
An experienced software engineer (yeah, developer) has experience engineering software. It’s been less than 10yrs since the advent of containerized deployments, and the space has been fraught with change nearly on par with the front end JavaScript ecosystem. Might as well just stick to writing code. OK, that’s the perception of my own peers, but I assume it scales.

DevOps is a recent advent, too, and sounds to me like it should be populated with folks who can participate in development and operations. Most developers I’ve ever known aren’t interested in operations.

justanotherc · 5 years ago
You talk as if that's a bad thing.

As a developer I don't want to wade into the details of systems I'm using, I want to spend my time writing code that solves the business problems I'm tasked with solving.

If there is a system that allows me to do that by abstracting away the details I don't care about, why wouldn't I use that system?

nunez · 5 years ago
This is extremely widespread amongst the really large companies I've consulted for (with equally large development teams to boot). "Containers are VMs, but smaller and/or faster" is an extremely common school of thought. I would like to think that this viewpoint is dying somewhat, especially now that many large organizations have at least experimented with orchestrators like Kubernetes or ECS.

I can't blame them, however.

If you're a dev at a company where such misunderstandings are pervasive, and you're interested in trying to get your app running in this "Docker thing that you've heard of", you will, probably:

- need to seek out a Linux dev server, because you probably don't have admin rights on your machine, and getting Docker installed onto your machine in a way that doesn't suck is more trouble than it's worth,

- have engineering management that are being told that containers are like VMs, but smaller, likely from magazines or sales/pre-sales consultants,

- have to wait days/weeks to get SSH credentials to a dev server that has RHEL 7 on it, hoping that it has Docker (a thing you've heard of at this point, but don't really know much about otherwise),

- have to wait even more time to get Docker installed for you on the dev server by a sysadmin that dislikes Docker because "it's insecure" or something, and

- be constantly reminded that your job is shipping features before anything else, usually at the cost of learning new things that make those features more stable

The point here is that environments like this are barely conducive for getting engineering done, let alone learning about new things. Moreover, the people that will do anything to scratch that itch usually find themselves out of companies like that eventually. It's a circle of suck for everyone involved.

So when I introduce containers (Docker is by far the most common runtime, so I introduce them as "Docker" to avoid confusion) to someone who doesn't know what cgroups or namespaces are, or someone who responds with something about containers being VMs or whatever, I happily meet them where they are and do my best to show them otherwise.

cortesoft · 5 years ago
Part of it might be that many developers target Linux but code on a Mac... and on a Mac, Docker containers DO run in a VM.
BiteCode_dev · 5 years ago
With the new DevOps craze, we expect people to understand backend dev, and frontend dev, and design, and sysadmin work, and networking, and project management, and infra, and product ownership. Not to mention be proficient with the tools related to all those things.

The result is not people becoming experts at all those things, but people becoming capable of producing something with all those things.

Obviously, to do so, people must take shortcuts, and container ~= Docker ~= "like VMs, but lighter" is a good enough simplification for producing something.

Now there is something to be said about the value, quality, durability and ethics of what is produced.

But that's a choice companies make.

tasogare · 5 years ago
The "containers are just XX Linux technology" comes regularly in the comments but it’s untrue: Windows containers are obviously not based on any Linux tech proper.

Also, the overhead intuition exists for a reason: on both macOS and Windows, when Linux containers are used there is actually a whole VM running Linux underneath. And Windows containers come in two flavors, one being Hyper-V based, so again VM tech comes into play.

So there are technical reasons why containers are "misunderstood": most people don’t run Linux natively, and on their stack, containers are more than just cgroups and namespaces.

wmf · 5 years ago
People love to bring this up, but if Linux did have first-class containers, how would the developer's experience be different?
cyphar · 5 years ago
All system programs could operate on and manage the containers running on the system, meaning you could use your existing knowledge to manage a bunch of containers.

For instance, you could run your package manager across all containers to see if they have packages with known CVEs. Or manage the filesystems of all containers on the system (the usefulness of this is only clear with filesystems like ZFS and btrfs). This is effectively what you can do with Solaris Zones.
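Today the closest you can get is exec'ing into every container in a loop (a rough sketch; assumes dpkg-based images):

    # a poor man's "package manager across all containers"
    for c in $(docker ps -q); do
      echo "== container $c =="
      docker exec "$c" dpkg -l 2>/dev/null | head -n 20
    done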

These kinds of usability improvements aren't as sexy now that everyone is really excited about Kubernetes (where you blow away containers at whim), but they are useful for the more traditional container use cases that Zones and Jails were designed for. LXC (and LXD) is probably the Linux container runtime closest to that original container model.

There's also a very big security argument -- it is (speaking as a maintainer of runc) very hard to construct a secure container on Linux. There are dozens of different facilities you need to individually configure in the right order, with hundreds of individual knobs that some users might want to disable or quirks you need to work around. It's basically impossible to implement a sane and secure container runtime without having read and understood the kernel code which implements the interfaces you're using. If containers were an in-kernel primitive then all of the security design would rest in one single codebase, and all of the policies would be defined by one entity (the kernel).
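A small taste of those facilities, in roughly the shape a runtime has to deal with them (an illustrative, incomplete sketch, not in the correct order):

    # inside the container's new mount namespace:
    mount --bind /dev/null /proc/kcore    # mask sensitive host files
    mount --bind /proc/sys /proc/sys
    mount -o remount,bind,ro /proc/sys    # make other paths read-only
    # ...plus dropping capabilities, seccomp filters, no_new_privs,
    # user namespace ID mappings, rlimits, device cgroup rules, etc.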

zurn · 5 years ago
Docker would probably diverge less from LXC, because it was first built on LXC and only later got its own low-level implementation using namespaces and the other low-level machinery. Hard to say if that alternative world would have been better or worse; a lot of LXC/LXD implementation details seem more technically competent than Docker's.
lmm · 5 years ago
There would maybe be more consistency. E.g. currently if I say an application is running in a container, do you expect there is virtual networking in place, or not?
loeg · 5 years ago
> Somewhat tangential note: most developers I have met do not understand what a 'container' is.

The problem is at least in part that the term is basically meaningless. It's too broad and flexible to be descriptive. As you quoted:

> A “container” is just a term people use to describe a combination of Linux namespaces and cgroups.

sanderjd · 5 years ago
I'm more sympathetic to the mainstream usage. People are (attempting to) use an abstraction: a "container" is an isolated process runtime environment without virtualization overhead. That abstraction seems useful to me. Ideally it would be usable without too much leakiness, in which case its users would not need to be aware of implementation details like cgroups and namespaces. In practice, all abstractions are leaky to some degree and effective use of an abstraction often eventually requires a more sophisticated understanding of the details beneath the veil of the abstraction. But that doesn't mean the abstraction is totally useless or completely a mirage or anything, it's just a leaky abstraction like all others.

If you say that a container is not a first class object but cgroups and namespaces are, I can just as easily say that cgroups and namespaces aren't first class objects, they are just terms people use to describe a combination of system calls. It's just abstractions with different amounts of leakage the whole way down.

didibus · 5 years ago
I don't know, container is an abstract idea, and that's all. Can you run apps within a contained OS environment?

LXC is one way to do so, runc is another, Docker a third, all for Linux. Now, if you took some other OS, there'd be different solutions, each with slightly different details and thus properties, but the same idea.

I mean, do you ask people what SQL is? And get frustrated if they don't start talking about MySQL specific details like InnoDB and what not?

I don't know, I feel I can't agree with you. Do devs feel there's less magic involved in VMs? Honestly, I have less of an idea what VMs are built on top of than I do for containers.

peteretep · 5 years ago
> container is an abstract idea

In this context, it's not; it's specifically referring to a process running in a cgroup.

> LXC is one way to do so, runc is another way to do so, docker is a third way to do so

Docker is a suite of tools for managing containers run via LXC or runc, both of which set up processes running under cgroups.

> Now if you took some other OS, there'd be different solutions, each with slightly different details and thus properties, but same idea

You should read the article.

mav3rick · 5 years ago
A process in a namespace runs just like any other process managed by your kernel. Depending on how you set up networking, you may face an extra hop for packets. I don't know what other scalability issues there would be; it's literally a process running like any other process.

Can you shed light on some of these? Maybe I just haven't encountered them in my day-to-day. (Please note I am not talking about containers running in VMs, which apparently Docker does now.)
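The potential extra hop is visible if you set up the plumbing by hand (a sketch of what bridge-style container networking does):

    ip netns add demo                            # a new network namespace
    ip link add veth0 type veth peer name veth1  # a virtual cable
    ip link set veth1 netns demo                 # one end inside the namespace
    # traffic now crosses the veth pair (and typically a bridge + NAT)
    # rather than hitting the host NIC directly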

fulafel · 5 years ago
Docker's usage of cgroups is optional, and cgroups aren't really an integral part of containers. People also use cgroups without containers (unlike namespaces).
k__ · 5 years ago
Doesn't surprise me when I think about all the people who suddenly need a K8s cluster...
taf2 · 5 years ago
Thanks. I’d have fallen into your category of developers, because in large part I never bothered with containers: we have everything running on VMs and we’ve already isolated things, so I’ve had only partial interest in exploring them... but now that I know that on Linux it’s cgroups and namespaces, that helps a lot in understanding. Thanks.
Thaxll · 5 years ago
Overhead is close to 0.
dehrmann · 5 years ago
> like VMs, but lighter

This is both very right and very wrong.

jjtheblunt · 5 years ago
Widespread, in my experience; it reminds me of users of npm not seeming to understand what Node is, for example.


MuffinFlavored · 5 years ago
> A sizable fraction will be concerned about 'container overhead' (and "scalability issues") when asked to move workloads to containers. They are usually not able to explain what the overhead would be, and what could potentially be causing it.

For what it's worth, one of the biggest "containerization" recommendations is to not run your database (example: Postgres) in a container, correct? Due to I/O performance decrease?

wmf · 5 years ago
No. Docker volumes aka bind mounts have little or no overhead. You don't want to run a database in an ephemeral "cattle" container without some kind of HA because you'd lose data.
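i.e. roughly (a sketch; the host path and tag are illustrative):

    # the data directory lives on the host filesystem at native I/O speed;
    # only the postgres binaries run "inside" the container
    docker run -d -e POSTGRES_PASSWORD=secret \
        -v /srv/pgdata:/var/lib/postgresql/data postgres:12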
lazyant · 5 years ago
Funny, I looked for papers or articles about performance issues with containerized RDBMS _binaries_ and didn't find anything relevant. Of course you want the data mounted outside the container so it's not ephemeral.

I ran some casual tests and found that there is a performance hit when using DB binaries inside a Docker container, due to Docker networking (it differs for different types of networking).
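One way to isolate that variable is host networking, which skips the veth/NAT path entirely (Linux only; a sketch):

    docker run -d -e POSTGRES_PASSWORD=secret --network=host postgres:12
    # shares the host's network stack: no veth/NAT hop,
    # a useful baseline for benchmarks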


aprdm · 5 years ago
I think it is more because containers should be stateless, and you cannot make a database stateless.

We do run databases that do 100k's of ops/s in containers, but we don't run them in Kubernetes. We just mount the VM's hard drive into the container.

moonchild · 5 years ago
I'm a bit disappointed it didn't go into detail on the way jails differ from zones. VMs I understand, but it seemed like the main point of the post was to distinguish containers from the other three.
nickik · 5 years ago
All the detail you could possibly want:

https://www.youtube.com/watch?v=hgN8pCMLI2U


mooreds · 5 years ago
Note this is from 2017. Previous discussion: https://news.ycombinator.com/item?id=13982620
dang · 5 years ago
Year added. Thanks!
dirtydroog · 5 years ago
For my workload, I've struggled to see the advantage containers would give me. Maybe someone here can convince me, rather than offering the current justification of 'docker all the things'.

We have servers; they handle a lot of traffic. It's the only thing running on the machines, and it takes over all the resources of each machine: it needs all the RAM, and all 16 vCPUs are at ~90%.

It's running on GCP. To roll out, we have a Jenkins job that builds a tag, creates a package (dpkg), and builds an image. Another Jenkins job deploys the new image to all regions and starts the update process, autoscaling and all that.

Can containers help me here?

pbecotte · 5 years ago
If you already have all of that working, why would you change? Containers are valuable for a couple of things:

1. Packaging and distribution: it's very easy to set up a known-good filesystem using Docker images and reuse it. There are other solutions; dpkg plus Ansible would be an example.

2. Standardized control: running all apps via 'docker run', vs. a mix of systemd units and shell scripts, can simplify things.

3. Lets you tie into higher-level orchestration layers like k8s, where you can view your app instances as a single thing. There are other solutions here as well.

4. You can use the same image on dev machines as in prod, instead of needing two parallel setup schemes.

If you're already happy with your infra, certainly don't change it. I think once you know containers, they're a convenient solution to those problems; but if your stuff is already set up, they've missed their shot.

nfoz · 5 years ago
So.... are any or all of these what you would call a process "sandbox"? Do operating systems make it easy to sandbox an application from causing harm to the system? What more could be done to make that a natural, first-class feature?

Like, let's say you found some binary and you don't know what it does, and don't want it to mess anything up. Is there an easy way to run it securely? Why not? And how about giving it specific, opt-in permissions, like limited network or filesystem access.


codeape · 5 years ago
I do not understand Docker on Windows.

If I understand correctly, when I run a Docker image on Linux, the dockerized process's syscalls are all executed by the host kernel (since, again if I understand correctly, the dockerized process executes more or less like a normal process, just in isolated process and filesystem namespaces).

Is this correct?
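One quick way to check that mental model from the host side (a sketch; the name and command are illustrative):

    docker run -d --name demo alpine sleep 1000
    ps aux | grep 'sleep 1000'   # the "containerized" process shows up
                                 # in the host's process table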

But how does Docker on Windows work?

deg4uss3r · 5 years ago
My only problem with this article is that there is no such thing as "Legos". Jess is brilliant and explains things super well here.