Not just the JVM: a lot of libraries size their thread pools based on the number of processors, and some of them even hard-code the multipliers (N times the number of processors, etc.).
Configuring each one becomes a lot of work fast, so we wrote an LD_PRELOAD hook that overrides the sysconf[1] _SC_NPROCESSORS_* calls to report a specified number of available/online processors; it is baked into the Docker image by default during builds.
[1] http://man7.org/linux/man-pages/man3/sysconf.3.html
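For illustration only, a minimal sketch of how such an interposer is typically wired up (the file name nproc_override.c, the library path, and the NPROC_OVERRIDE variable are placeholders, not the poster's actual setup): a small shared library wraps sysconf(3), returns a fixed value for _SC_NPROCESSORS_ONLN/_SC_NPROCESSORS_CONF, and forwards everything else to the real sysconf.
# build the interposer once during the image build
gcc -shared -fPIC -o /usr/local/lib/libnproc.so nproc_override.c -ldl
# preload it for every process in the image so anything that sizes thread
# pools via sysconf sees the pinned CPU count
export LD_PRELOAD=/usr/local/lib/libnproc.so
export NPROC_OVERRIDE=2
java -jar app.jar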
Can we expect that kind of horror to fade away as Java evolves?
In the .Net world there's the ThreadPool class to manage the single process-wide thread-pool, and the Task class to enqueue and orchestrate concurrent jobs (it uses ThreadPool and almost completely abstracts away the thread management).
(You could write your own thread-pool, of course, but for most code that wouldn't make sense.)
As I understand it, the JVM is rather behind in this department. (Not to mention async/await.)
> In the .Net world there's the ThreadPool class to manage the single process-wide thread-pool, and the Task class to enqueue and orchestrate concurrent jobs (it uses ThreadPool and almost completely abstracts away the thread management).
> (You could write your own thread-pool, of course, but for most code that wouldn't make sense.)
Java has these concepts too (ExecutorService since at least Java 5, circa 2004). The problem is not with the JDK libraries, it's been in the JVM's assumption that it's running on bare metal or a VM.
Linux containers leak information about the "true" environment in a way that upset JVM assumptions before 9 and 10.
.NET has the same problem: Each process would create one thread pool thread per CPU core, and you could end up with excessive context switching.
Java possibly makes it a bit easier to work around, really, in that Java forces you to initialize your thread pool, so it wouldn't feel quite so weird to add a "# of threads in thread pool" setting to an app config file for something. I'm guessing that's not the Docker way of doing it, though.
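As an illustration of a more container-friendly setup (a sketch, not something the commenters describe; the WORKER_THREADS variable and property name are made up): drive both the JVM's notion of CPU count and your own pools from one knob. Newer HotSpot builds also accept -XX:ActiveProcessorCount to cap what the runtime reports to the GC and common pools.
# one knob for parallelism, passed in at deploy time; the JVM flag caps what
# the runtime reports internally, the system property feeds your own pools
docker run -e WORKER_THREADS=4 my-app \
  java -XX:ActiveProcessorCount=4 -Dapp.workerThreads=4 -jar app.jar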
IMO these are a great improvement but one still needs to be aware of the underlying container limits. E.g., understanding why setting MaxRAMFraction=1 might not be a good idea: the heap can then grow to the entire container limit, leaving no headroom for metaspace, thread stacks, and other native memory, so the container might get killed...
This. I've seen container builds for Java applications broken in the described ways fail in production over and over again. Just because Docker makes building an image very easy doesn't mean that you will end up with a production ready image using three lines of code in a Dockerfile. Most of the time people don't even bother to use a non-root user to execute the jvm in their container...
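On the non-root point, the fix is a couple of Dockerfile lines (a sketch assuming a Debian-based openjdk image; the user and group names are arbitrary):
RUN groupadd -r app && useradd -r -g app app
USER app
ENTRYPOINT ["java", "-jar", "/opt/app.jar"]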
That's why I feel platforms like Cloud Foundry are a much better fit for teams that don't have tons of container experience but want to get the benefits of a containerized runtime. The CF java buildpack[1] for example automatically handles OOM and heap settings calculation while building your application container.
disclaimer: co-founder at meshcloud, we offer a public cloud service for Cloud Foundry and K8s hosted in German datacenters.
I wrote up my experience[0] on containerizing JVM-based applications a while ago using the Cloud Foundry Java buildpack's memory calculator. Fortunately the JVM now has a way to respect cgroup memory[1], making it a bit simpler.
[0]: https://medium.com/@matt_rasband/dockerizing-a-spring-boot-a... [1]: https://twitter.com/codepitbull/status/934384652806221825?s=...
In particular, the Java buildpack uses a component developed entirely to calculate JVM settings.
I am not sure what the plans are for Java 9 and 10 yet. Ben Hale works on the JBP pretty much full-time and the Spring team tend to experiment pretty early on new JDKs. So I can't see it falling far behind.
[0] https://github.com/cloudfoundry/java-buildpack-memory-calcul...
Fabric8 has really good base Java images[1] with a script that simply sets environment variables with the right GC and CPU parameters before launching Java, with nice sane defaults.
Heavily encourage anyone running Java in containers to use their base image, or for larger organizations to create standard base image dockerfiles that set these JVM envvar parameters. A simple contract is: ENTRYPOINT belongs to the base image, CMD belongs to downstream application images (unless something else essential).
Just don't use vanilla "FROM openjdk:8-jre" and expect it to work. That's the worst way to kill application performance and reliability in a container.
[1] https://github.com/fabric8io-images/java/blob/master/images/...
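A minimal sketch of that ENTRYPOINT/CMD contract (the base image name, wrapper script path, and jar path are placeholders, not the Fabric8 layout):
# base image Dockerfile: owns the JVM setup and the ENTRYPOINT
FROM openjdk:8-jre
COPY run-java.sh /opt/run-java.sh
# run-java.sh derives GC/CPU/heap flags from env vars, then execs java "$@"
ENTRYPOINT ["/opt/run-java.sh"]

# downstream application Dockerfile: only supplies the artifact and the CMD
FROM my-org/java-base:latest
COPY target/app.jar /opt/app.jar
CMD ["-jar", "/opt/app.jar"]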
I disagree. They seem mostly targeted at low-end cloud providers who overcommit on memory and ignore application response time (gc latency). And they don't even do a good job at this.
Their configuration uses the ParallelOld GC and tunes it to aggressively shrink the heap. What that means is that they don't care about frequent and long GC pauses (unless you're running small heaps below 1 GB); they just care about reducing the memory footprint of the application. On multi-gigabyte heaps you accept full GCs that take several seconds. They also increase the number of concurrent GC threads to the number of cores. This defeats the whole purpose of the concurrent GC threads, which are supposed to run concurrently with your application without stopping it; that value should be below the number of cores.
GC logging does not work on Java 9 or Java 10.
If you really care about reducing memory usage you probably should do this in addition:
-XX:-TieredCompilation hurts startup performance, but we just established that we don't care about performance, so that's fine; it easily takes 35% out of the code cache.
-Xss512k cuts the per-thread stack memory usage in half; this can usually be done without any issues, and often -Xss256k works as well. We run Spring inside a full-profile Java EE application server with -Xss256k.
And finally, the most important option of them all, -XX:+HeapDumpOnOutOfMemoryError, is missing. You absolutely, positively need this, always. It's the only way to debug OutOfMemoryErrors.
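Pulling those suggestions together into one illustrative command line (the dump path is an example, not from the comment):
# smaller stacks, no tiered compilation (saves code cache at the cost of
# startup speed), and always write a heap dump on OutOfMemoryError
java -Xss256k -XX:-TieredCompilation \
  -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/dumps \
  -jar app.jar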
We tried those flags in the beginning, since they were introduced in the Docker openjdk image[1].
When we dug in further, we found it's just not trouble-free (i.e., still experimental). The default is to use 1/4 of RAM, which is entirely inefficient[2]. The "MaxRAMFraction" parameter only allows specifying a 1/n fraction, so it is not possible to efficiently use 65% or 75% of memory. The only place to start is to set MaxRAMFraction=2, and that already means only 50% of memory is used for heap. That produces a lot of wastage. A lot of resource efficiency is gained by starting with 65% or 80%.
OpenJDK 10 is introducing a new option, "MaxRAMPercentage"[3], and that goes a long way toward making a script unnecessary.
TL;DR - The default flags are still experimental in JDK 8/9, and deemed to be better on Java 10. A script is just better for consistency.
[1] https://github.com/docker-library/docs/pull/900
[2] https://news.ycombinator.com/item?id=16636544
[3] https://bugs.openjdk.java.net/browse/JDK-8186248
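To make the arithmetic concrete (illustrative numbers, assuming a 4 GB container limit):
# JDK 8u131+/9: heap = limit / MaxRAMFraction, so only 1/1, 1/2, 1/3... are possible
java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap \
  -XX:MaxRAMFraction=2 -jar app.jar       # ~2 GB heap in a 4 GB container

# JDK 10+ (and later 8 updates): pick an arbitrary percentage of the limit
java -XX:MaxRAMPercentage=75.0 -jar app.jar   # ~3 GB heap in a 4 GB container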
+1 for Fabric8. It makes running Java in containers much more pleasant, and provides a complete opinionated ecosystem, if that's desired. I work at an enterprise shop that is a Red Hat customer (so we get commercial support on this stuff), and it has made our lives much easier in many respects.
Anyone here actually use Java containers in production?
Sadly the article mentions very little in terms of practical advice. We've tried running some small Java 8 Spring Boot containers in Kubernetes which are configured to use max ~50M heap and ~150M total off-heap yet decide to use in excess of double that memory so we end up with either a lot of OOMKills or overly large memory limits.
Yes, we do. It's actually a rather small setup (13 dedicated servers, about 100 containers).
The absolutely most basic advice is probably: "-Xmx" does not represent the actual upper limit for memory usage. We actually most often set only 50% of the assigned memory for the JVM.
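A sketch of why that 50% rule of thumb exists (flag values are illustrative): total JVM usage is roughly heap plus metaspace, thread stacks, code cache, GC bookkeeping, and direct buffers, and several of those can be capped explicitly:
# e.g. inside a 1 GB container: heap plus explicitly capped non-heap areas
java -Xmx512m -XX:MaxMetaspaceSize=128m -XX:ReservedCodeCacheSize=64m \
  -Xss512k -XX:MaxDirectMemorySize=64m -jar app.jar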
You may be experiencing the same bug as we did: memory freed by the GC of a containerised JVM was not able to be returned to the host machine.
IIRC it was due to a bug in glibc >= 2.10. Something about how mallocs are pooled into per-thread arenas. IIRC you need to tune it to be <= the number of physical threads. Usually people advise 4 or 2.
# openjdk 8 bug: HotSpot leaking memory in long-running requests
# workaround:
# - Disable HotSpot completely with -Xint
# - slows the memory growth, but does not halt it:
MALLOC_ARENA_MAX=4
So, ensure that your java process is launched with that environment var (so, export it in the same shell, or precede your java command with it).
If you happen to be using Tomcat, I recommend putting:
export MALLOC_ARENA_MAX=4
into:
/usr/local/tomcat/bin/setenv.sh
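In a container image, the simplest place for this is usually the Dockerfile itself, so every process inherits it (a sketch):
# in the application's Dockerfile, inherited by every process in the image
ENV MALLOC_ARENA_MAX=4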
As for how much memory you allocate to your containers: as of JRE 8u131 you can make this far more container-friendly:
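The flags themselves are not quoted in the comment; what JRE 8u131 added, to the best of my knowledge, were the experimental cgroup options below (superseded by full container support in JDK 10):
# size the heap from the cgroup memory limit instead of host RAM
java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap \
  -jar app.jar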
We use Java containers in production on a very large scale system. We are actively migrating stuff away from Java to Go, because the JVM is a nightmare in containers. Sure we could make do and configure the JVM to hell and hack things to get it to work...but why bother? We have to allocate hundreds of MB of memory for a simple HTTP server. The tons of configuration and hacks are maintenance nightmares. It's been a terrible experience all around. Java is bloated, it's a resource hog, and the ecosystem doesn't care about cold start times. It's just a terrible fit for a containerized environment.
Spring Boot is, by design, eager to load everything it senses you might need. Most of the memory usage is not Spring itself, it's the libraries it's pulling in on your behalf.
In Spring Boot 2 I'm told you can use functional beans to cut down on RAM usage. Not sure how it works.
But really, it comes down to deciding if you need a dependency or not.
Where I work most of our dockerized services are Spring Boot in Kubernetes. They do need more memory than what you've posted and generally run with about 300-600 MB usage depending on what they need to do.
You can also use smaller frameworks (Vert.x? Javalin? possibly Spring Boot 2). I hope that with Java 9 we won't see this amount of memory usage anymore, though our organisation isn't there yet.
Yeah, we deploy containerized Jenkins environments for almost 100 teams on VMs running Docker. These are massive-heap containers (20+ GB in some cases). Probably not the best use of Docker, but we are actually doing pretty well. Working towards migrating to an OpenShift environment and then evaluating some new tech from CloudBees in this area.
Honestly I don't feel there is any need to run Java in containers. The war/jar file is its own container with its own dependencies. The JVM still makes the same syscalls as it would inside a Docker/Kubernetes container.
In fact I would rather look at serverless architecture before considering docker/Kubernetes.
when you run a polyglot stack with java/python/go/node on top of a cluster of machines, you will love to have them containerized and uniform. It makes scripting and CI so much easier.
or, when you have a legacy app that relies on java 6, but you want everything else to run on java 8, the ability to drop everything into a container with its runtime is a life saver.
source: I'm the devops person that's responsible for making this work
The real killer app would be the ability to fully containerize all jvm instances running on any given box.
It will be a setup where one jvm instance on the host basically serves the role of "master" in terms of class data and shared object loading while each container instance uses its memory allotment only for running computations specific to the application in that container while sharing memory objects with other containers as much as possible.
It is possible to do something similar at the moment but it requires going through a hodge-podge of painful hacks. A seamless solution to this would basically make the jvm an out of the box poor man's polyglot PaaS platform.
> It will be a setup where one jvm instance on the host basically serves the role of "master" in terms of class data and shared object loading while each container instance uses its memory allotment only for running computations specific to the application in that container while sharing memory objects with other containers as much as possible.
Eclipse OpenJ9 does something like this.
> A seamless solution to this would basically make the jvm an out of the box poor man's polyglot PaaS platform.
Trying to recreate OS resource-allocation guarantees without the OS has been a bit of a fool's errand, historically. The OS has privileged access to hardware -- it has a view of activity and an ability to enforce guarantees that processes lack.
Add to that the amount of time and effort that has gone into operating systems to cover allocation of so many different kinds of resource under so many different conditions. It is really expensive to re-engineer that capability.
I've seen a lot of attempts at trying to share ostensibly multiuser systems above the OS level and they have mostly been unhappy once there is heavy usage. Databases, queues, the JVM, everything eventually needs to be isolated from unrelated workloads at some point. Containers and VMs are much better at providing that capability.
My OS internals knowledge is rusty so I am not sure they are quite the same but likely similar.
If you think of a typical JVM application (true for non-JVM apps as well), a significant chunk of class data will be shared, since apps typically use the same libraries (with deltas in versions), so allowing easy reuse of class data across all container instances on a host would be a major scalability advance.
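As a concrete (editorial) illustration of class-data reuse across containers: Eclipse OpenJ9's shared class cache, mentioned above, can live on a volume mounted into several containers. The paths, image name, and cache name below are made up:
# each container mounts the same cache directory and reuses class/AOT data
docker run -v /opt/jvmcache:/cache my-java-app \
  java -Xshareclasses:cacheDir=/cache,name=shared -Xscmx256m -jar app.jar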
Similar to the goals of the multi-tenant virtual machine (MVM)?
http://www.oracle.com/technetwork/articles/javase/mvm-141094...
https://www.jcp.org/en/jsr/detail?id=121
I feel like all this cloud, container stuff is incrementally, painfully evolving towards grid computing and agents. Kinda like reinventing LISP or Prolog like features with your own 'C' runtime, instead of just using LISP or Prolog.
Java has had thread pools and tasks since 2004 (https://docs.oracle.com/javase/1.5.0/docs/api/java/util/conc...). For typical Java programs, no one is writing their own thread pool functionality as far as I know.
Java just lacks async/await primitives.
edit: at the end of the article, it is acknowledged that there are improvements in Java 10 and the following link is provided: https://bugs.openjdk.java.net/browse/JDK-8146115
https://reddit.com/r/java/comments/85t7dt/java_on_docker_wil...
https://blogs.oracle.com/java-platform-group/java-se-support...
Somebody mentioned https://github.com/cloudfoundry/java-buildpack-memory-calcul... which seems pretty interesting.
https://github.com/moby/moby/issues/15020
https://github.com/docker-library/openjdk/issues/57
https://bugs.openjdk.java.net/browse/JDK-8170888
As is normal for questions about Spring Boot performance, Dave Syer has done a lot of investigating: https://github.com/dsyer/spring-boot-memory-blog
If you want to be more clever, you could try this: https://github.com/cloudfoundry/java-buildpack-memory-calcul...
(Electron i guess being the obvious JS-to-native example)
(And isn't Spring Boot the light-weight solution to the Spring memory problem? /s)
https://github.com/sladeware/groningen
Unfortunately the experiment state persistence management capability is broken.