KaiserPro · 4 years ago
Normally I'm all for people using tried and tested primitives, but I think that in this case unix sockets are probably not the right choice.

Firstly, you are creating a hard dependency on the two services sharing the same box, with a shared file system (which is difficult to coordinate and secure). And should you add a new service that also wants to connect via unix socket, things could get tricky to orchestrate.

It also limits your ability to move things around, should you need to.

Inside a container, I think it's probably a perfectly legitimate way to do IPC. Between containers, I suspect you're asking for trouble.

akvadrako · 4 years ago
Within a pod it seems pretty reasonable.
erulabs · 4 years ago
Exactly, and this is where the concept of "Pod" really shines. Sharing networking and filesystems between containers is, from time to time, exactly the right strategy, as long as it can be encapsulated.
auspex · 4 years ago
Unix sockets require elevated privileges that containers shouldn't have in the first place.
paulfurtado · 4 years ago
Unix sockets only require as much permission as creating a file in a directory. If a program has write access to a directory, it can create a unix socket there. File permissions on the socket then dictate which users may connect to it.
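
As a minimal sketch (the directory path and mode bits here are just illustrative, not from the article), an unprivileged process can create and lock down a unix socket using nothing more than ordinary file permissions:

```python
import os
import socket

# Any process that can write to this directory can create a socket in it --
# no root, no extra capabilities. The path is hypothetical.
sock_path = "/tmp/demo/app.sock"
os.makedirs(os.path.dirname(sock_path), exist_ok=True)

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
if os.path.exists(sock_path):
    os.remove(sock_path)   # a stale socket file must be removed before bind()
server.bind(sock_path)     # creates the socket file, owned by this user

# Ordinary file permissions gate who may connect:
# 0o660 = owner and group may connect, everyone else is refused.
os.chmod(sock_path, 0o660)

server.listen()
conn, _ = server.accept()  # blocks until a permitted client connects
```

A client in the right group then just does `socket.socket(socket.AF_UNIX).connect("/tmp/demo/app.sock")`.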
snicker7 · 4 years ago
Doesn't 700 requests per second for such a trivial service seem kinda slow?
fyrn- · 4 years ago
Yeah, it's so slow that I'm wondering if they were actually measuring TCP/unix socket overhead. I wouldn't expect to see a difference at such a low frequency.
Tostino · 4 years ago
Yeah, seems like there was some other bottleneck. Maybe changing the IPC method makes the small difference we are seeing, but we should be seeing orders of magnitude higher TPS prior to caring about the IPC method.
lmeyerov · 4 years ago
I didn't look too closely, but I'm wondering if this is Python's GIL. So instead of nginx -> multiple ~independent Python processes, each async handler is fighting for the lock, even if running on a different core. So read it as 700 queries / core vs 700 queries / server. If so, in a slightly tweaked Python server setup, that'd be 3K-12K/s per server. For narrower benchmarking, keeping it sequential and pinning container 1 to core 1 <-> container 2 to core 2 might paint an even clearer picture.

I did enjoy the work overall, incl the Graviton comparison. Likewise, OS X's painfully slow Docker impl has driven me to Windows w/ WSL2 -> Ubuntu + nvidia-docker, which has been night and day, so I'm not as surprised that those numbers are weird.

miketheman · 4 years ago
Glad you liked it!

In my example, I'm using an unmodified gunicorn runner to load the uvloop worker, so I'm still only using a single worker process. Once I start tweaking the `--workers` count, I get much higher queries per second.
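
For illustration, a gunicorn config along these lines (the file name and worker class are my assumptions, not taken from the post) is one way to go from that single worker process to one per core:

```python
# gunicorn.conf.py -- a sketch, not the actual configuration from the post.
import multiprocessing

# One worker process per CPU core sidesteps the single-process GIL ceiling
# discussed upthread; each worker is an independent Python interpreter.
workers = multiprocessing.cpu_count()

# Run the ASGI app on uvicorn's gunicorn worker class (which uses uvloop).
worker_class = "uvicorn.workers.UvicornWorker"

bind = "0.0.0.0:8000"
```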

And you're correct - this is a narrow benchmark, not designed to test total TPS or saturate resources.

miketheman · 4 years ago
It may, but this is considering the overhead incurred in the local setup - calling the app through a Docker Desktop exposed port. Running locally on macOS produces ~5000 TPS. The `ab` parameters also don't use any concurrency; requests are performed sequentially. The test is not designed to maximize TPS or saturate resources, but rather to isolate variables for the comparison.
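
Roughly what that looks like as a client, if you wanted to reproduce it without `ab` (the URL and request count here are made up): one request at a time, so the number reflects per-request overhead rather than server capacity:

```python
import time
import urllib.request

URL = "http://localhost:8000/"  # hypothetical endpoint
N = 1000

start = time.perf_counter()
for _ in range(N):
    # Strictly sequential, no concurrency -- the same shape as the `ab` run
    # described above, so the figure measures latency, not saturation.
    with urllib.request.urlopen(URL) as resp:
        resp.read()
elapsed = time.perf_counter() - start

print(f"{N / elapsed:.0f} requests/second (sequential)")
```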
hamburglar · 4 years ago
But this test becomes a lot more interesting when the bottleneck is the actual thing under scrutiny (socket performance), rather than Whatever Python Is Doing. You can probably get 10-20x the RPS out of a trivial golang server, which might shift the bottleneck closer to socket perf.
Jhsto · 4 years ago
A bit beside the point, but how many of you still run nginx inside container infrastructures? I've had container hosts behind a firewall without explicit WAN access for a long time -- to expose public services, I offload the nginx tasks to CloudFlare by running a `cloudflared` tunnel. These "Argo" tunnels are free to use, and essentially give you a managed nginx for free. Nifty if you are using CloudFlare anyway.
akvadrako · 4 years ago
Not nginx, but I run haproxy, which serves the reverse proxy role.

I use it instead of Google's own ingress because it gives you better control over load balancing and can react faster to deployments.

digianarchist · 4 years ago
We run Traefik in our stack at work.
shipit · 4 years ago
I think this is where `gRPC` shines. It can feel tedious, but really: define the interface, use the tooling to generate the stubs, implement, and done. It saves you from having to think up and implement a protocol (and, importantly, versioning) for if/when the features of the containerized apps start to grow or change.
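
As a sketch of what the calling side ends up looking like in Python (the `echo_pb2*` modules, `EchoStub`, and socket path below are hypothetical, generated by `grpcio-tools` from whatever .proto you define), note that gRPC can target a unix socket just as easily as TCP:

```python
import grpc

# Hypothetical modules generated from an echo.proto by grpcio-tools;
# the .proto file is where the versioned interface definition lives.
import echo_pb2
import echo_pb2_grpc

# gRPC accepts "unix:" targets, so the same generated client works whether
# the other container listens on a unix socket or on a TCP port.
channel = grpc.insecure_channel("unix:///tmp/echo.sock")
stub = echo_pb2_grpc.EchoStub(channel)

reply = stub.Say(echo_pb2.EchoRequest(message="hello"))
print(reply.message)
```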
3np · 4 years ago
The results are only relevant for AWS ECS Fargate due to the specifics of how they do volumes and CNI.
miketheman · 4 years ago
Truth! Wouldn't have it any other way. ;)
astrea · 4 years ago
The multiple layers of abstraction here make this test sorta moot. You have the AWS infra, the poor macOS implementation of Docker, the server architecture. Couldn't you have just used a vanilla Ubuntu install, curled some dummy load n times, and gathered some statistics from that?
miketheman · 4 years ago
Possibly - however the premise is that I'm running an application on cloud infrastructure that I don't control - which is common today. I tried to call that out in the post.
2OEH8eoCRo0 · 4 years ago
https://podman.io/getting-started/network

> By definition, all containers in the same Podman pod share the same network namespace. Therefore, the containers will share the IP Address, MAC Addresses and port mappings. You can always communicate between containers in the same pod, using localhost.

I'm a noob here but why wouldn't you use IPC?

https://docs.podman.io/en/latest/markdown/podman-run.1.html#...

kodah · 4 years ago
Network and IPC namespacing are just two different ways of communicating using namespaces. I can't imagine that the overhead of either is significant.

I can see, from a design perspective, that network namespacing is likely more scalable. Network addresses can be local or WAN, while unix sockets tie you to a single node. That implies a very different and slightly more rigid scaling strategy with IPC-based communication.

miketheman · 4 years ago
For one, I'm not using `podman`. I'm using Amazon Elastic Container Service (Amazon ECS) on top of the Fargate compute layer, so I don't have to manage much beyond my application.
2OEH8eoCRo0 · 4 years ago
Ah gotcha. I understand some of those words.