How to Tune Garbage Collection in Java
Want to learn more about performance tuning and garbage collection in Java?
Garbage collection is the mechanism by which the JVM reclaims memory on behalf of the application when it's no longer needed. At a high level, it consists of finding objects that are no longer in use, freeing the memory associated with those objects, and occasionally compacting the heap to prevent memory fragmentation.
The garbage collector performs its work using one or more threads. To track down object references and move objects around in memory, it needs to make sure that the application threads are not currently using those objects. If, for example, an application thread is using an object and the memory location of that object changes due to GC, the thread could read or write the wrong memory, with unpredictable results. This is why garbage collectors must pause all application threads when performing certain tasks. These pauses are sometimes called Stop-The-World pauses, and minimizing them is the primary concern of GC tuning, as they can have a huge impact on the performance of a Java application.
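To get a rough feel for how much collection work a running JVM has done, you can read the counters the JVM exposes through the standard `java.lang.management` API. The sketch below is only an illustration: the class name `GcStats` is made up for this example, and the reported time is accumulated collection time as the JVM defines it, which for a concurrent collector is not necessarily the same thing as Stop-The-World pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not from the article): sums the GC counters the JVM reports.
final class GcStats {

    // Total number of collections across all collectors; undefined (-1) values are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds; undefined (-1) values are skipped.
    static long totalCollectionTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("collections so far: " + totalCollections());
        System.out.println("approx. GC time (ms): " + totalCollectionTimeMillis());
    }
}
```

For pause-by-pause detail, the GC log remains the better source; these counters are only a quick sanity check.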
Sizing the Heap
The first step in garbage collection tuning is tuning the size of the heap. If the heap is too small, too many GCs will occur in order to reclaim memory, which reduces overall application throughput. If the heap is too big, there will be fewer GCs, but each one will take longer and your response time metrics will take a hit. The Parallel collector is especially vulnerable to this problem, so if you need both a large heap and low pause times, you should try the G1GC collector.
Side Note: The Concurrent Mark Sweep (CMS) collector has been deprecated since Java 9 and the Shenandoah Garbage Collector is still considered "experimental" as of the time of this writing. Therefore, if you are running an online interactive application, then the G1GC should be your default choice, and if you are running an offline batch application, then the Parallel collector should be your first choice.
The size of the heap is controlled by two values: an initial value, specified with the -Xms flag, and a maximum value, specified with the -Xmx flag.
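A quick way to check which heap bounds a JVM actually picked up is to ask `Runtime` from inside the process. The following is a minimal sketch; the class name `HeapSizes` and the sizes in the comment are illustrative assumptions, not recommendations.

```java
// Hypothetical example: prints the heap sizes the running JVM reports.
// One possible way to run it (Java 11+), with illustrative sizes:
//   java -Xms512m -Xmx2g HeapSizes.java
final class HeapSizes {

    // Converts a byte count to whole mebibytes (rounding down).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // totalMemory() is the heap currently committed; it starts near the initial size.
        System.out.println("committed heap (MiB): " + toMiB(rt.totalMemory()));
        // maxMemory() is the most the JVM will attempt to use; it tracks the maximum size.
        System.out.println("max heap (MiB): " + toMiB(rt.maxMemory()));
    }
}
```

The reported maximum can be slightly below the requested value, depending on the collector in use.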
Having an initial and maximum size for the heap allows the JVM to autotune the heap size, depending on the workload. If the JVM is experiencing memory pressure and observes that it is doing too much GC, it will keep increasing the heap until the memory pressure dissipates or until the heap hits its maximum size. And if memory pressure is low, the JVM can also decide to reduce pause times by shrinking the heap. This process is called adaptive sizing, and it adjusts not only the overall size of the heap but also the size and ratio of the young and old generations.
If you have taken the time to finely tune your application's GC behavior and sizes, you may choose to turn off adaptive sizing. This can save the JVM the small amount of time it takes to calculate what the heap size should be. You can do so by setting the flag UseAdaptiveSizePolicy to false (-XX:-UseAdaptiveSizePolicy).
Also, setting the initial heap size to the same value as the max heap size or the initial new gen size to the same value as the max new gen size effectively turns off part of the adaptive sizing behavior.
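If you want to confirm from inside the process whether adaptive sizing is on, one option on HotSpot-based JVMs is the `HotSpotDiagnosticMXBean`. This is a sketch under that assumption: the class name `VmFlags` is hypothetical, and the bean is specific to HotSpot, so it may be unavailable on other JVM implementations.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reads the current value of a HotSpot VM flag by name.
final class VmFlags {

    // Returns the flag value as a string, for example "true" or "false" for boolean flags.
    // Throws IllegalArgumentException if the JVM does not know the flag.
    static String flagValue(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("UseAdaptiveSizePolicy = " + flagValue("UseAdaptiveSizePolicy"));
    }
}
```

Launching the same program with `-XX:-UseAdaptiveSizePolicy` should change the printed value accordingly.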
A strongly recommended guideline for setting the max heap size is that the max heap size should not exceed the amount of physical memory on the machine. And if you have multiple JVMs running, the sum of the max heap sizes should not exceed the machine's physical memory.
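The guideline above is simple arithmetic, sketched here for a machine hosting several JVMs. The class and method names are made up for the example, and the byte counts are illustrative. Keep in mind that a JVM also uses memory outside the heap (thread stacks and metaspace, for instance), so passing this check is a necessary condition rather than a guarantee.

```java
// Hypothetical check: do the configured max heaps fit into physical memory?
final class HeapBudget {

    // Returns true when the sum of all max heap sizes does not exceed physical memory.
    static boolean fitsInPhysicalMemory(long physicalBytes, long... maxHeapBytes) {
        long sum = 0;
        for (long heap : maxHeapBytes) {
            sum = Math.addExact(sum, heap); // fail loudly on overflow instead of wrapping
        }
        return sum <= physicalBytes;
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024L * 1024L;
        // Three JVMs with 4 GiB max heap each on a 16 GiB machine.
        System.out.println(fitsInPhysicalMemory(16 * gib, 4 * gib, 4 * gib, 4 * gib));
        // Three JVMs with 6 GiB max heap each on the same machine.
        System.out.println(fitsInPhysicalMemory(16 * gib, 6 * gib, 6 * gib, 6 * gib));
    }
}
```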
A more general recommendation for setting max heap size is that the size should be set so that the heap is about 30 percent occupied after a full GC. To calculate this, you can look in the GC log for an entry where a full GC takes place and observe how much memory is used when the GC completes. Or, you can run the application until it has reached a steady-state and then force a full GC with
Tuning GC Performance
If adaptive sizing is turned on, you can use the MaxGCPauseMillis flag to tune GC behavior. This flag sets a target for the maximum GC pause time. When used with the Parallel collector, the JVM adjusts the sizes of the young and old generations to try to meet that goal. It then adjusts the size of the heap so that the share of time spent in GC does not exceed a certain value, which is 1 percent by default.
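The text does not say where the 1 percent figure comes from. Assuming it reflects HotSpot's `-XX:GCTimeRatio` flag, whose default for the Parallel collector is 99, the goal works out as 1 / (1 + GCTimeRatio). The sketch below only reproduces that arithmetic; the class name is hypothetical.

```java
// Hypothetical helper: the GC time goal implied by a GCTimeRatio value.
// Assumption: goal = 1 / (1 + GCTimeRatio), as documented for HotSpot's throughput goal.
final class GcTimeGoal {

    // Fraction of total time the JVM aims to spend in GC at most.
    static double maxGcTimeFraction(int gcTimeRatio) {
        if (gcTimeRatio < 0) {
            throw new IllegalArgumentException("gcTimeRatio must not be negative");
        }
        return 1.0 / (1.0 + gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.println(maxGcTimeFraction(99)); // a ratio of 99 gives a 1 percent goal
        System.out.println(maxGcTimeFraction(19)); // a ratio of 19 gives a 5 percent goal
    }
}
```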
One of the goals of G1GC was that it would need minimal tuning. So, in G1GC, a single tuning parameter, MaxGCPauseMillis, drives all of the following adjustments in order to try to achieve the specified pause time goal:
- Adjust the size of the heap.
- Start background processing sooner.
- Adjust the tenuring threshold for objects to be promoted to the old generation.
- Adjust the number of old regions processed during a mixed GC cycle.
In G1GC, the default value of the flag is 200 ms. You may be tempted to set it to something really small, like 20 ms. Note, however, that to try to achieve such a goal, the garbage collector will shrink the young generation to a really small size and collect less of the old generation on each cycle. Eventually, too much garbage accumulates in the old generation and the system has to perform a full GC, which is undesirable.
Fixing Concurrent Mode Failures
G1GC is a concurrent collector. This means that some phases of the garbage collection process run concurrently with the application threads. Since the running application can continue to produce garbage, we can run into a situation where the application runs out of old generation memory while the garbage collector is still in the middle of the garbage collection process. In other words, the running application is producing garbage faster than it can be cleaned up. This situation is known as a concurrent mode failure, a promotion failure, or an evacuation failure, depending on when the failure occurs. If you see a lot of these errors in the GC logs, the solution is to increase the size of the heap, start the G1 background processing earlier, or speed up GC processing by using more background threads.
To perform G1 background activities more frequently, you can reduce the threshold at which a G1 cycle is triggered. This is achieved by reducing the value of the
This flag is set to 45 by default. This means that a GC cycle is triggered when the heap becomes 45 percent full. Reducing this value means GC is triggered earlier and more often. Take care not to set the value too low, though, as that would result in GCs happening too frequently.
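The flag's name does not survive in the text above. Assuming it is G1's `-XX:InitiatingHeapOccupancyPercent`, whose default is indeed 45, the occupancy at which a cycle starts can be estimated as a simple percentage of the heap. The sketch below shows only that arithmetic with a hypothetical class name; recent JDKs may also adjust this threshold adaptively at run time, so treat the result as a starting estimate.

```java
// Hypothetical helper: heap occupancy (in bytes) at which a G1 cycle would be triggered.
// Assumption: the threshold is a plain percentage of the total heap size.
final class G1Trigger {

    static long triggerBytes(long heapBytes, int occupancyPercent) {
        if (heapBytes < 0 || occupancyPercent < 0 || occupancyPercent > 100) {
            throw new IllegalArgumentException("heap must be >= 0 and percent within 0..100");
        }
        return Math.multiplyExact(heapBytes, (long) occupancyPercent) / 100;
    }

    public static void main(String[] args) {
        long heap = 4L * 1024 * 1024 * 1024; // illustrative 4 GiB heap
        System.out.println("trigger at 45%: " + triggerBytes(heap, 45) + " bytes");
        System.out.println("trigger at 30%: " + triggerBytes(heap, 30) + " bytes");
    }
}
```

Lowering the percentage from 45 to 30 in this example moves the trigger point from roughly 1.8 GiB to roughly 1.2 GiB of occupied heap, which is the "earlier and more often" effect described above.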
To increase the number of background threads use the
The default value for this flag is the value of ParallelGCThreads plus 2, divided by 4. As long as you have sufficient CPU available on the machine, you can increase this value without incurring any performance penalties.
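The default described above is easy to work out by hand. The sketch below assumes integer division (the result rounds down) and a minimum of one thread; both are assumptions on my part rather than statements from the text, and the class name is hypothetical.

```java
// Hypothetical helper: default number of background GC threads,
// computed as (ParallelGCThreads + 2) / 4.
final class GcThreads {

    static int defaultConcurrentThreads(int parallelGcThreads) {
        if (parallelGcThreads < 0) {
            throw new IllegalArgumentException("parallelGcThreads must not be negative");
        }
        // Integer division rounds down; the floor of 1 is an assumption.
        return Math.max(1, (parallelGcThreads + 2) / 4);
    }

    public static void main(String[] args) {
        System.out.println(defaultConcurrentThreads(8));  // (8 + 2) / 4 = 2
        System.out.println(defaultConcurrentThreads(13)); // (13 + 2) / 4 = 3
    }
}
```

With 8 parallel GC threads, for example, only 2 background threads are used by default, which leaves room to raise the value on a machine with spare CPU.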
If tuning the heap size and tuning the collector doesn't work for you, then you can try another collector. And if you still aren't getting good results, then you need to look at tuning the application code itself.
Published at DZone with permission of Tim Ojo, DZone MVB. See the original article here.