Episode 1: “The Evolution” — Java JIT Hotspot and C2 Compilers (Building “Super Optimum Java Microservices Architecture” Series)
This blog is the first in the series, where we will explore Java JIT, HotSpot, Graal, Truffle, Quarkus, and how we can build optimal Java MicroServices.
In my search for the optimal container tech, I have been playing around with various combinations of open-source runtimes and frameworks.
In this blog, I will walk you through what I think is one of the most effective container stacks.
Before I dig into the stack, let me spend some time walking through some of the non-functional requirements of a container and Serverless/FaaS-based MicroServices architecture.
IMHO, the following are some of the key requirements:
Smaller Footprint
Eventually, all of these MicroServices are going to run on the cloud, where we “pay for what we use.” What we need is a runtime that has a smaller footprint and runs with fewer CPU cycles, so that we can run more workloads on less infrastructure.
Faster Bootup
Scalability is one of the most important aspects of a container-based MicroServices architecture. The faster a container boots up, the faster the cluster can scale. This is even more important for Serverless architectures.
Built on Open Standards
We must have the underlying platform/runtime built on open standards, as that makes it easy to port and run workloads in a hybrid, multi-cloud world and avoid vendor lock-in.
Faster Build Time
In this agile world, where we roll out fixes, features, and updates very frequently, builds and rollouts must be quick, including real-time deployment of changes during development, so we can test as we develop.
Let's park these requirements for now. Let me go down to the foundational elements of the stack and work my way up, to build what I believe is the optimal container platform, one that delivers the above requirements.
Since there is a lot to go through, I have divided this into four episodes.
- Episode 1: “The Evolution” — Java JIT Hotspot & C2 compilers (the current episode…scroll down)
- Episode 2: “The Holy Grail” — GraalVM: In this blog, I will talk about how GraalVM embraces polyglot, providing interoperability between various programming languages. I will then cover how it extends from HotSpot and provides faster execution and smaller footprints with Ahead-of-Time compilation and other optimizations.
- Episode 3: “The Leapstep” — Quarkus+CRI-O: In this blog, I will talk about how Quarkus takes a leap-step and provides the fastest, smallest, and the best developer experience in building Java MicroServices. I will also introduce CRI-O, and how it brings its ecosystem of tools.
- Episode 4: “The Final Showdown” — Full-stack MicroServices/Serverless Architecture: In this blog, I will put all the pieces together and talk about how they build a robust, scalable, fast, thin MicroServices Architecture.
I hope you will enjoy this series…
Java JIT, HotSpot, and C2 Compilers
With Java, we achieved the “write once, run anywhere” dream in the mid-90s. The approach was very simple: Java programs are compiled to “bytecode.”
Interesting fact: bytecode is called bytecode because each opcode is one byte long, so instructions can be loaded efficiently into the CPU cache. In fact, Java CPUs (such as Sun's picoJava) were even built to execute bytecode directly, though they never took off.
We have JVM implementations for each supported operating system. The respective JVM “interprets” the bytecode into machine instructions (using something like a lookup table). Obviously, this is slow, as the interpreter goes one instruction at a time.
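To make “interpreting one instruction at a time” concrete, here is a toy, hypothetical stack-machine interpreter (the class, opcodes, and program are mine, purely for illustration; real JVM interpreters are far more sophisticated, e.g., template interpreters generated at startup). It sketches the dispatch loop: fetch an opcode, look up what it means, execute it, repeat.

```java
// ToyInterpreter.java - a drastically simplified, hypothetical sketch of a
// bytecode dispatch loop: fetch one opcode, decode it, execute it, repeat.
public class ToyInterpreter {

    // A tiny instruction set: each opcode is a single byte.
    static final byte PUSH = 0x01;  // push the next byte as an operand
    static final byte ADD  = 0x02;  // pop two values, push their sum
    static final byte MUL  = 0x03;  // pop two values, push their product

    static int run(byte[] code) {
        int[] stack = new int[16];
        int sp = 0;                 // stack pointer
        int pc = 0;                 // program counter
        while (pc < code.length) {
            byte op = code[pc++];   // fetch-decode-execute, one at a time
            if (op == PUSH) {
                stack[sp++] = code[pc++];
            } else if (op == ADD) {
                int b = stack[--sp], a = stack[--sp];
                stack[sp++] = a + b;
            } else if (op == MUL) {
                int b = stack[--sp], a = stack[--sp];
                stack[sp++] = a * b;
            } else {
                throw new IllegalStateException("unknown opcode " + op);
            }
        }
        return stack[--sp];         // result is on top of the stack
    }

    public static void main(String[] args) {
        // (2 + 3) * 4
        byte[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL };
        System.out.println(run(program)); // 20
    }
}
```

The per-instruction loop overhead is exactly why plain interpretation is slow, and why compiling hot code to native instructions pays off.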
To speed this up, it makes sense to identify the code that runs most often, compile it to native machine code once, and cache the result.
That is exactly what later versions of the JVM started doing. A performance counter was introduced that counts the number of times a particular method or snippet of code is executed. Once a method is invoked a certain number of times (the compilation threshold), that code is compiled, optimized, and cached by the “C1 compiler.”
The next time that code is called, the JVM executes the compiled machine instructions directly from the code cache, rather than going through the interpreter. This brought in the first level of optimization.
While the code is executing, the JVM performs runtime profiling and identifies hot code paths. It then runs the “C2 compiler” to further optimize those hot paths, hence the name “HotSpot.”
C1 is faster to compile and good for short-running applications, while C2 is slower and heavier but ideal for long-running processes like daemons and servers, where the code performs better over time.
In Java 6, we had the option to use either C1 or C2, with a command-line argument: -client (for C1) or -server (for C2). In Java 7, we could use both together (tiered compilation), and from Java 8 onwards, tiered compilation became the default behavior.
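To watch tiered compilation happen, here is a minimal, hypothetical example (the class and method names are mine): a small method called far more times than the default compilation thresholds. Running it with java -XX:+PrintCompilation HotLoopDemo logs each compilation event; you should see the method compiled first at the C1 tiers (levels 1-3) and then at the C2 tier (level 4).

```java
// HotLoopDemo.java - a hypothetical "hot" method that the JIT will promote
// from interpretation to C1-compiled and then C2-compiled code.
public class HotLoopDemo {

    // Small, frequently called method: a prime candidate for JIT compilation.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;   // hot loop body
        }
        return total;
    }

    public static void main(String[] args) {
        long result = 0;
        // Call the method well past the default compilation thresholds so
        // it becomes "hot" and gets compiled, tier by tier.
        for (int i = 0; i < 20_000; i++) {
            result = sumOfSquares(1_000);
        }
        System.out.println(result); // 333833500
    }
}
```

Run with java -XX:+PrintCompilation HotLoopDemo (a standard HotSpot diagnostic flag); the trailing number on each log line is the compilation tier.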
The below diagram illustrates the flow…
Here are some of the code optimizations that the JIT compiler performs:
- Removing null checks (for variables that can never be null)
- Inlining small, frequently called methods, reducing method-call overhead
- Optimizing loops by combining, unrolling, and inversion
- Removing code that is never called (dead-code elimination)
and many more…
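As a hypothetical illustration of the code shapes these optimizations target (the names below are mine, and the comments describe what HotSpot typically does, not guaranteed behavior), here is naive code we would write, leaving the optimization to the JIT:

```java
// InliningDemo.java - hypothetical code shapes the JIT optimizes: a tiny
// accessor (inlining candidate) and a counted loop (unrolling candidate).
public class InliningDemo {

    private final int value;

    InliningDemo(int value) { this.value = value; }

    // Small, frequently called accessor: the JIT typically inlines this,
    // removing the method-call overhead entirely.
    int getValue() { return value; }

    static int sum(InliningDemo[] items) {
        int total = 0;
        // A counted loop over an array: a candidate for unrolling. After
        // inlining getValue(), the null check on `item` can typically be
        // folded into the memory access itself (an implicit null check)
        // instead of costing an explicit branch per iteration.
        for (InliningDemo item : items) {
            total += item.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        InliningDemo[] items = new InliningDemo[100];
        for (int i = 0; i < items.length; i++) {
            items[i] = new InliningDemo(i + 1);
        }
        System.out.println(sum(items)); // 5050
    }
}
```

The point is that we keep the code readable and let the runtime profile decide which of these transformations are worth applying.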
All said and done, JIT (Just-In-Time) compilation is slow to warm up, as there is a lot of work the JVM has to do at runtime.
An Ahead-of-Time (AOT) compilation option was introduced in Java 9, where you can generate the final machine code directly, using the experimental jaotc tool (JEP 295).
This code is compiled for a target architecture, so it is not portable. On x86-64 (the initially supported platform), we can have both Java bytecode and AOT-compiled code working together.
The bytecode goes through the approach I explained previously (interpreter, C1, C2), while the AOT-compiled code goes directly into the code cache, reducing the load on the JVM. Typically, the most frequently used libraries can be AOT-compiled for faster responses.
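As a sketch of what this looked like in practice (assuming a Linux/x64 JDK between 9 and 16, where the experimental jaotc tool from JEP 295 was available; it was removed in JDK 17, and the class name here is a made-up example):

```java
// AotDemo.java - a class we might AOT-compile with the experimental
// jaotc tool (available in JDK 9-16 only).
public class AotDemo {

    static String greet(String name) {
        return "Hello, " + name;
    }

    public static void main(String[] args) {
        System.out.println(greet("AOT"));
    }
}
// Typical usage on Linux/x64 (JDK 9-16):
//   javac AotDemo.java
//   jaotc --output libAotDemo.so AotDemo.class
//   java -XX:AOTLibrary=./libAotDemo.so AotDemo
// The AOT-compiled methods land directly in the code cache; everything
// else still goes through the interpreter/C1/C2 pipeline.
```

On a modern JDK, the class still runs normally; only the jaotc step is unavailable.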
This is the story of the Java VM, and pretty much every language has a similar story: it goes through a similar inception, and over time, the compiler/VM gets optimized to run faster.
In the next episode, we will look at how GraalVM takes this further by reducing the footprint, optimizing execution, and bringing in support for polyglot, multi-language interoperability.
Episode 2: GraalVM - "The Holy Grail" - (Coming Soon)
Published at DZone with permission of A B Vijay Kumar. See the original article here.