Beginner's Guide to Compilation in Java
Java applications are compiled to bytecode, and then the JVM and its JIT compiler take care of code execution. Here you will find some insight into how the JIT compiler works.
I am guessing that many of you use Java as your primary language in your day-to-day work. Have you ever thought about why HotSpot is even called HotSpot or what the Tiered Compilation is and how it relates to Java? I will answer these questions and a few others through the course of this article. I will begin this by explaining a few things about compilation itself and the theory behind it.
Turning Source Code to Machine Code
In general, we can differentiate two basic ways of translating human readable code to instructions that can be understood by our computers:
Static (native, AOT) Compilation. After the code is written, a compiler will take it and produce a binary executable file. This file will contain a set of machine code instructions targeted at a particular CPU architecture. The same binary should be able to run on CPUs with a similar instruction set, but in more complex cases your binary may fail to run and may require recompiling for the target machine. We lose the ability to run on multiple platforms in exchange for faster execution on a dedicated platform.
Interpretation. An interpreter turns existing source code into machine instructions line by line, while each line is being executed. Thanks to this, the application may run on every CPU that has the correct interpreter. On the other hand, execution will be slower than in the case of statically compiled languages. We gain the ability to run on multiple platforms but lose execution speed.
As you can see, both approaches have their advantages and disadvantages, are dedicated to specific use cases, and will probably fall short if used in the wrong one. You may ask: if there are only two ways, does that mean Java is an interpreted or a statically compiled language?
JIT Compilation to the Rescue
We may have two basic ways of translating the code, but we as humans always want to improve the things we use. So, we created a thing called JIT. It stands for just-in-time compilation and is an attempt to combine the pros of static compilation and interpretation.
In most cases, such a compiler creates some sort of intermediate-level code (in the case of Java, it is bytecode) which is further read and translated by a specific interpreter (for Java, the JVM). Thanks to just-in-time compilation, we keep the ability to run the application on multiple platforms, as long as they have the correct interpreter, and the performance overhead is far lower than for standard interpreted code.
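To make the intermediate step concrete, here is a minimal sketch (the class and method names are mine, for illustration only): `javac` compiles the source to a `.class` file of bytecode, and any JVM can then run it.

```java
// `javac Greeter.java` produces Greeter.class, which contains bytecode
// rather than native machine code. The same .class file runs unchanged
// on any platform with a JVM, which interprets it and JIT-compiles the
// hot parts.
public class Greeter {

    static String greeting() {
        return "Hello from bytecode";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

Running `java Greeter` on Linux, macOS, or Windows gives the same result, which is exactly the portability that static compilation gives up.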
After describing the basics of compilation, we can dive deeper into topics related to Java.
Let's start by answering the first question stated in the introduction.
Why Is HotSpot Even Called HotSpot?
This name is connected to the way the JVM compiles code. In general, every application has fragments of code that are executed with very high frequency, and these fragments play the biggest part in overall application performance. Such places are called "hotspots": the more frequently a particular fragment is executed, the "hotter" it gets from the JVM's perspective.
Essentially, not every piece of our bytecode is going to be compiled. For performance reasons, sections with a low call frequency, or sections called only once, may be more efficient to simply interpret than to compile.
On the other hand, if a particular section is called frequently, then its compilation is worth spending CPU cycles on. An additional benefit of frequently called methods is that the JVM is able to gather more information about them, and based on this information it can perform more complex optimizations.
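A sketch of what a hotspot looks like in practice. `-XX:+PrintCompilation` is a standard HotSpot flag that prints a line every time the JVM compiles a method; run the program below with it and `square` should eventually show up in that log once its call counter passes the threshold (the class and method names are illustrative):

```java
public class HotLoop {

    // Called a million times below, so from the JVM's perspective this
    // method quickly becomes "hot" and is handed to the JIT compiler.
    static long square(long x) {
        return x * x;
    }

    public static void main(String[] args) {
        long sum = 0;
        for (long i = 0; i < 1_000_000; i++) {
            sum += square(i);
        }
        // Printing the result keeps the loop from being optimized away.
        System.out.println("sum of squares: " + sum);
    }
}
```

Try it with `java -XX:+PrintCompilation HotLoop` and watch for lines mentioning `HotLoop::square`; the exact threshold at which it appears depends on the JVM version and settings.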
How Many Compilers Do We Have in Java?
Brace yourselves because this one will be long.
A quick answer to this one — we have two compilers in Java.
Because we do not like simple answers, we will dive deeper. For a start, a short history of Java: a long, long time ago (before Java 8), the JDK was available in two versions, client and server, each of them having its own compiler.
The most commonly used names for them are C1 and C2, so I will use these names throughout this article. C1 is the name of the client compiler, while C2 is the server compiler's name. Moreover, we had to specify which one exactly we wanted to use when launching our application.
As of now, this distinction is not as important as it used to be, but it is still good to know the difference between these two compilers.
The main differences between them are the moment when they start compiling code and the performance of the final code:
C1 starts compiling sooner than C2 and does not try to perform many costly optimizations. At the beginning of program execution, the C1 compiler will be faster because it compiles more code in the same amount of time as C2.
C2 starts compiling later than C1, but it collects plenty of useful information while waiting. Thanks to all this information, it can perform complex optimizations. In the end, C2-compiled code will be much faster than code compiled by C1; its performance can even compete with compiled C++ code.
It is widely believed that, for various reasons, no more major enhancements to the C2 compiler are possible.
In the old days, if you were interested in optimizing application startup time, the C1 compiler was the best option, but if you preferred a long-running application with strict performance requirements, C2 was the choice for you.
Does this differentiation even have to exist? Can we maybe optimize it even further and use both compilers at the same time? It turns out that the Java creators asked themselves a very similar question (at least I suppose so).
Therefore, since Java 8 this distinction has been removed, and Tiered Compilation, introduced in Java 7, became the default technique for compiling code in the JVM.
In Java 7 it was not enabled by default, and you could turn it on with the -XX:+TieredCompilation flag. In Java 8 it is on by default, but it can still be disabled with -XX:-TieredCompilation.
What Is Tiered Compilation?
It is a technique that combines the C1 and C2 compilers. The JVM will start with C1 as the default compiler and then use C2 to recompile code as it gets hotter.
There are five tiers of Tiered Compilation in Java:
0 – Interpreted code (the bytecode produced by javac)
1 – Simple C1 code
2 – Limited C1 code
3 – Full C1 code
4 – C2 compiled code
From the performance point of view, the most profitable transition is 0 → 3 → 4. Our code gains the most in performance while spending as few CPU cycles as possible. In fact, it is also the most common case for methods to be compiled to level 4 after the first C1 compilation at level 3. Transitions between levels 1, 2, and 3 and other states are more complex and will not be covered in this article.
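One quick way to confirm that both compilers are in play is the standard java.lang.management API; on a stock HotSpot JVM the compiler name itself usually mentions tiering (the exact string varies between JVM builds, so treat the output in the comment as typical rather than guaranteed):

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class CompilerInfo {
    public static void main(String[] args) {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        // On HotSpot this typically prints something like
        // "HotSpot 64-Bit Tiered Compilers".
        System.out.println("JIT compiler: " + jit.getName());
        if (jit.isCompilationTimeMonitoringSupported()) {
            System.out.println("Total JIT time (ms): " + jit.getTotalCompilationTime());
        }
    }
}
```

To see the tier numbers themselves, run any program with -XX:+PrintCompilation; each log line includes the tier level at which the method was compiled.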
Tiered Compilation Trade-Offs
It can optimize startup time beyond what C1 alone achieves, because the final code produced by C2 may already be available during the early stages of application initialization.
Moreover, C1 generates instrumented, profiling-enabled versions of methods, and because compiled code is faster than interpreted code, the C2 compiler is able to gather more information during the profiling phase. Thanks to this feature of Tiered Compilation, we may achieve better peak performance than by using the C2 compiler alone. The more information C2 has, the more complex optimizations it is able to perform.
When Will My Code Be Compiled by C2?
The JVM uses two counters to determine if a method is "worthy" of C2 compilation. Before each execution of a method, the JVM will check these counters and decide whether the method is worth compiling. The first counter is a simple method invocation counter, while the second stores the number of times each loop within the method has been executed. Here a concept known as back branching shows up: a loop is said to branch back when it jumps back to its beginning, for example when it reaches the end of its body or executes a statement like continue.
If a particular loop within a method exceeds a defined threshold, it is marked as "worthy" of compilation. It is important to note that only the loop will be compiled, not the entire method.
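Back branching can be illustrated with a single long-running loop. `main` below is invoked only once, so the method call counter alone would never mark it as hot; it is the back-edge counter, incremented on every jump back to the loop header, that triggers compilation of the running loop (the HotSpot mechanism for this is known as on-stack replacement; the class name and loop body here are illustrative):

```java
public class BackBranch {
    public static void main(String[] args) {
        long acc = 0;
        // Every iteration ends with a jump back to the loop header,
        // a "back branch". The JVM counts these jumps and, once the
        // back-edge threshold is exceeded, compiles the loop itself
        // even though main() was called only once.
        for (int i = 0; i < 5_000_000; i++) {
            acc += i % 7; // cheap work so the loop is not eliminated
        }
        System.out.println(acc);
    }
}
```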
To be clear: if Tiered Compilation is enabled, the compilation thresholds (the maximum values of the counters from the previous paragraph) are computed dynamically and cannot be changed. If you want to change them, you first need to disable Tiered Compilation and then set the -XX:CompileThreshold flag.
There are more flags connected to thresholds and Tiered Compilation but they will not be mentioned here as I believe them to be too complex. Just remember that in most cases such JVM tuning will not bring you any performance benefits and can mostly result in a performance decrease.
Does It Mean That All Bytecode Will End Up C2 Compiled?
Simply stated: no, because in specific situations the JVM will reduce the value of both counters. This means that the JVM actually measures only recent method "hotness", not overall "hotness", so even if our application runs forever, not the whole code base will be compiled by C2.
GraalVM and Its Revolution
Now that we have discussed some basics and older ways of doing things, it is time to come back to more recent times. In this and the following paragraphs, I will describe some of the ideas behind the new VM (the first production-ready release took place in May 2019), which is based on the already existing JVM. It is named GraalVM and brings a few interesting new features to the Java ecosystem. Furthermore, GraalVM comes in two versions, Enterprise and Community. Both include support for Java 8 and 11 (the current LTS). The Enterprise version is based on Oracle JDK, while Community is based on OpenJDK.
Below I have listed (with some more insight) the three GraalVM features which, in my opinion, are the most notable:
It can provide a runtime for applications written in many languages, so you are able to run a Python application on the same VM as Ruby and Java.
With Graal, we are able to compile our jar into a native platform executable.
New Implementation of C2 Compiler
It should provide a notable performance increase, especially in the case of new features from the latest Java releases.
AOT Compilation With GraalVM
With Native Image technology, we are able to compile our jars into native platform executables. Such executable files are called native images and contain all the code necessary for the application to run. Moreover, they include all the necessary runtime components, like memory management and thread scheduling.
Furthermore, AOT compilation greatly reduces the memory footprint and startup time, which can be a great advantage if you are using the cloud or prefer a microservice architecture. On the other hand, code optimization is not as good as that done by the C2 compiler, so we can expect some decrease in peak performance.
A Few More Words About the New Compiler
The new implementation is written in Java, as opposed to the previous one, which was written in C++. Moreover, Graal is able to achieve a performance increase by utilizing new and more aggressive, complex compiler optimizations. Quoting from the Graal page: "the compiler in GraalVM Enterprise includes 62 optimizations".
Additionally, the Graal compiler is able to remove costly object allocations, so applications that run on GraalVM need to spend less time on memory management and garbage collection.
That is all for today; I hope you managed to read the whole article and reach this paragraph. I wanted to give you a better understanding of what is going on inside our JVMs and how we can turn it to our advantage. Hopefully you learned something new today and it will come in handy at some point.
Source code of Java 8, where you can find more information about calculating compilation thresholds.
A talk from Scala Days where Chris Thalinger from Twitter describes what they managed to achieve using GraalVM features.