Beyond Bytecode: Exploring the Relationship Between JVM, JIT, and Performance
JIT compilation boosts Java performance by converting bytecode to native code at runtime, optimizing execution while balancing startup speed.
In computing, the execution of programs written in high-level languages requires that the source code be compiled to a low-level or native language. This compilation is referred to as Ahead-of-Time (AOT) compilation and is typically done at build time, effectively reducing the work to be done at runtime.
In the case of Java, AOT compilation produces an intermediate binary, viz. bytecode, which is then translated to native machine code during execution by the Java Virtual Machine (JVM). This is in line with Java’s philosophy of Write Once, Run Anywhere (WORA), or simply put, platform independence.
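To make this pipeline concrete, here is a minimal sketch (the class name is ours, purely for illustration): javac performs the AOT step that produces bytecode, javap lets you inspect that bytecode, and the JVM translates it to native code at runtime.

```java
// Hello.java — a trivial class to illustrate the compile-to-bytecode step.
// Compile:            javac Hello.java   (produces Hello.class, i.e., bytecode)
// Inspect bytecode:   javap -c Hello     (disassembles the .class file)
// Run on the JVM:     java Hello         (bytecode is interpreted/JIT-compiled)
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello, JIT!");
    }
}
```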
During program execution, the JVM identifies frequently executed code, referred to as hotspots, that could be optimized. This optimization is done by the Just-In-Time (JIT) compiler at runtime. Fun fact: this is how the HotSpot VM gets its name.
JIT compilation is used across other platforms as well, such as the .NET CLR and JavaScript’s V8 engine, and in some contexts within Python and PHP. In this article, we will focus on JIT in Java only.
JIT in JVM
At runtime, the JVM loads the compiled code, viz. bytecode, and determines the semantics of each bytecode instruction for appropriate computation. Interpreting bytecode at runtime consumes computing resources, such as processor and memory, resulting in slower execution compared to a native application. JIT helps optimize Java program performance by compiling bytecode to native code at runtime. The resulting natively compiled code is cached for later (re)use.
During JVM startup, a large number of methods are called. If all of these methods were compiled immediately, startup time would suffer significantly. Thus, as a tradeoff between startup time and long-term performance, only methods that are called frequently are compiled; less-used methods are compiled later or not at all, depending on usage.
The JVM internally maintains a per-method invocation counter to determine the threshold at which a method should be compiled. The counter is decremented on every invocation; once it reaches zero, the JIT kicks in and compiles the method.
Another counter maintained by the JVM is for loop back-edges. On every loop iteration, this counter is checked against a threshold, beyond which the loop, too, is JIT-compiled for optimization.
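To watch these counters pay off, the following sketch (class and method names are ours) invokes a method often enough to cross the compilation threshold; launching with -XX:+PrintCompilation makes the JVM log each method as it is compiled.

```java
// HotLoop.java — a method invoked often enough to trigger JIT compilation.
// Compile and run with compilation logging:
//   javac HotLoop.java
//   java -XX:+PrintCompilation HotLoop
// Watch the log for HotLoop::square appearing once its invocation counter trips.
public class HotLoop {
    static long square(long n) {
        return n * n;
    }

    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {   // many invocations + loop back-edges
            sum += square(i);
        }
        System.out.println(sum);                 // keep the result live
    }
}
```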
JIT Compilers
JIT compilers come in the following two flavors:
- C1: Client Compiler → The C1 compiler has a low threshold for compilation and is thus optimized for quick application startup.
- C2: Server Compiler → The C2 compiler has a higher threshold for compilation. Due to this, the profiling information available before compilation is much richer, so C2-compiled code is highly optimized for performance. Moreover, C2 can accurately identify methods that lie on the critical execution path of the application. (A sketch contrasting the two follows this list.)
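One hands-on way to feel the C1/C2 tradeoff is to run the same workload with compilation capped at C1 and compare it against the default setup. The class name and workload below are ours, chosen only for illustration:

```java
// CompilerCompare.java — the same hot loop run under different compiler setups.
// Default (C1 + C2):        java CompilerCompare
// C1 only (stop at tier 1): java -XX:TieredStopAtLevel=1 CompilerCompare
// Interpreter only:         java -Xint CompilerCompare
// Expect the C1-only run to start fast but plateau below the default's peak speed.
public class CompilerCompare {
    public static void main(String[] args) {
        long start = System.nanoTime();
        long acc = 0;
        for (int i = 0; i < 50_000_000; i++) {
            acc += (acc ^ i) % 7;               // cheap but non-trivial work
        }
        long elapsed = System.nanoTime() - start;
        System.out.println(acc + " in " + elapsed / 1_000_000 + " ms");
    }
}
```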
Tiered Compilation
JIT can compile code at various optimization levels depending upon its usage and complexity. At a higher level of optimization, the program performs better, but the compilation itself is costlier in terms of resource utilization, viz. CPU and memory.
To get the best of both C1 and C2, not only are they bundled together, but tiered compilation across multiple levels is performed as described below.
Levels of Tiered Compilation
- Level 0: Interpreted Code → During startup, all bytecode is interpreted; no native compilation or optimization is done at this level. However, frequently executed code is identified and profiled, and this information is utilized at later levels for optimization.
- Level 1: Simple C1 Compiled Code → At this level, low-complexity methods that the JVM considers trivial are compiled. No profiling is done on these methods, nor are they optimized further.
- Level 2: Limited C1 Compiled Code → At this level, only a few of the hot methods are compiled with whatever profiling information is available. These methods are compiled for early optimization without waiting for C2. Note that these methods could later be (re)compiled at higher levels, viz. 3 or 4, once additional profiles are captured.
- Level 3: Full C1 Compiled Code → At this level, all non-trivial hot methods are compiled with full profiling information available. In most cases, the JIT jumps directly from level 0 (interpreted code) to level 3, unless the compiler queues are full.
- Level 4: C2 Compiled Code → At this level, the JIT compiles the code with maximum optimization using all the available rich profiling information. This compiled code is best suited for long-running execution. Since this is the peak of optimization, no further profiling information is captured.
It's interesting to note that code could be (re)compiled multiple times for higher-level optimization, as deemed appropriate by the JIT. The sketch below shows one way to observe these tier transitions.
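Under -XX:+PrintCompilation, each log line carries a small integer column indicating the tier level, so a method can be watched climbing from level 3 to level 4. A minimal sketch, with names and iteration counts of our choosing:

```java
// TierWatch.java — observe a method being compiled first at level 3 (C1, full
// profiling) and later at level 4 (C2), per the levels described above.
//   javac TierWatch.java
//   java -XX:+PrintCompilation TierWatch | grep TierWatch::work   (Unix-like shells)
// In the log, the integer column after the compile ID is the tier level; expect
// an entry with "3" followed later by one with "4" for the same method.
public class TierWatch {
    static long work(long x) {
        for (int i = 0; i < 100; i++) x = x * 31 + i;   // back-edges drive profiling
        return x;
    }

    public static void main(String[] args) {
        long acc = 1;
        for (int i = 0; i < 200_000; i++) acc = work(acc);
        System.out.println(acc);   // keep the result live
    }
}
```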
Deoptimization
While the JIT continuously strives to improve or optimize performance, there could be instances where previously optimized methods become irrelevant, or where the compiler's assumptions no longer hold as a method's behavior changes. For such instances, the JIT temporarily reverts the optimization level back to a previous one, or directly to level 0. This is known as deoptimization.
Do note that these methods can be optimized again with newer profiling information. However, it's advisable to monitor such switching, and recommended to adapt the source code accordingly to avoid the cost of frequent switching. A sketch that deliberately provokes deoptimization follows.
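A classic way to provoke deoptimization is to break a speculative type assumption: train a hot call site with a single receiver type, then introduce a second one. The sketch below (all names are illustrative) typically causes the compiled method to be flagged "made not entrant" in the -XX:+PrintCompilation log:

```java
// DeoptDemo.java — invalidate a speculative inlining assumption to trigger deopt.
//   javac DeoptDemo.java
//   java -XX:+PrintCompilation DeoptDemo
// Look for DeoptDemo::callShape lines marked "made not entrant" once the
// second shape type shows up.
interface Shape { double area(); }
class Circle implements Shape { public double area() { return 3.14159 * 2 * 2; } }
class Square implements Shape { public double area() { return 4.0; } }

public class DeoptDemo {
    static double callShape(Shape s) {
        return s.area();   // JIT speculates on the single receiver type seen so far
    }

    public static void main(String[] args) {
        Shape circle = new Circle(), square = new Square();
        double total = 0;
        for (int i = 0; i < 1_000_000; i++) total += callShape(circle); // train: monomorphic
        for (int i = 0; i < 1_000_000; i++) total += callShape(square); // assumption broken
        System.out.println(total);
    }
}
```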
Configurations
JIT and tiered compilation are enabled by default. They can still be disabled for strong reasons, e.g., to diagnose JIT-induced errors (which are quite rare in nature); otherwise, disabling them should be avoided.
To disable JIT, specify either -Djava.compiler=NONE or -Xint as an argument during JVM startup.
To disable tiered compilation completely, specify -XX:-TieredCompilation.
For granular control, e.g., to use only the C1 compiler, specify -XX:TieredStopAtLevel=1.
To control the respective thresholds of the various tiers from 2 to 4, refer to the flags below (use them by replacing Y with the tier number):
-XX:TierYCompileThreshold=0
-XX:TierYInvocationThreshold=100
-XX:TierYMinInvocationThreshold=50
-XX:TierYBackEdgeThreshold=1500
Do note that tweaking any of these configurations will affect the program's performance. Thus, it is advised to tweak them only after thorough benchmarking.
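As a concrete, hedged example, the flags above can be exercised against a small workload; the class name and threshold values below are purely illustrative, not recommendations:

```java
// ThresholdDemo.java — a workload for experimenting with the flags above.
// Illustrative launches (values are examples only, not recommendations):
//   javac ThresholdDemo.java
//   java -XX:Tier4InvocationThreshold=100 -XX:Tier4BackEdgeThreshold=1500 \
//        -XX:+PrintCompilation ThresholdDemo
// Lowering the tier-4 thresholds should make C2 compilation appear earlier in
// the log; benchmark thoroughly before adopting any such change.
public class ThresholdDemo {
    static long mix(long x, int i) { return x * 31 + i; }

    public static void main(String[] args) {
        long acc = 0;
        for (int i = 0; i < 1_000_000; i++) acc = mix(acc, i);
        System.out.println(acc);   // keep the result live
    }
}
```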
Conclusion
JIT compilation enhances Java program performance by converting bytecode to native code at runtime, optimizing frequently used methods while balancing startup speed. Tiered compilation further refines this process by progressively optimizing code based on profiling data. While default JIT settings work well, fine-tuning configurations requires careful benchmarking to prevent performance drawbacks.
For most applications, these optimizations happen seamlessly, without requiring developer intervention. However, understanding Just-in-Time (JIT) compilation is crucial for high-performance applications, where fine-tuning compilation settings can significantly impact execution efficiency.
JDK Hotspot vs. GraalVM JIT
GraalVM JIT is another implementation; its core differences from the JDK HotSpot JIT are described below:
- JDK HotSpot → The standard HotSpot JVM uses a tiered JIT approach — featuring the simpler C1 compiler for quick optimizations and the more aggressive C2 (server) compiler for deeper optimizations. These compilers are predominantly written in C/C++ and have been honed over many years to offer stable and reliable performance for general Java workloads.
- GraalVM JIT → GraalVM builds on the HotSpot foundation by replacing the traditional C2 compiler with the Graal compiler. Written in Java, the Graal compiler introduces advanced optimizations such as improved inlining, partial escape analysis, and speculative optimizations. Additionally, GraalVM extends beyond just JIT improvements; it supports polyglot runtimes, enabling languages such as JavaScript and Python, and offers ahead-of-time (AOT) compilation to improve startup times and reduce memory overhead in suitable scenarios.
In essence, while HotSpot remains a battle-tested and stable platform for running Java applications, GraalVM pushes the boundaries of performance and flexibility with its modern JIT compiler and additional runtime features. The choice between them usually depends on the specific workload and the performance or interoperability requirements of the application.