
Revisiting the Advanced Theories of ‘Java Garbage Collection’

This post enumerates each aspect of JVM garbage collection while detailing APIs in a lucid yet self-explanatory manner.


The JVM, or Java Virtual Machine, relies on garbage collection to manage memory automatically while an application runs. Without a garbage collector, the programmer would have to deallocate memory explicitly in every application, which can hurt productivity in most cases. With a garbage collector in place, the programmer can concentrate on problem solving and let the JVM handle memory management. Because garbage collection is a complicated process, a detailed analysis is appropriate when dealing with the advanced concepts of the JVM. This post walks through each aspect in turn, along with the related APIs.


Figure 1: Simplifying Garbage Collection

Pitfalls of Explicit Garbage Collection

Reclaiming unused memory explicitly is hard to get right, because mistakes in deallocation code lead to erroneous programs and unexpected behavior. The two most persistent problems are dangling references and memory leaks. A dangling reference arises when a block of memory is freed while another reference still points to it, which typically happens when two references share one object and only one of them knows about the deallocation. C and C++ pointers are the classic illustration. A memory leak is the opposite case: memory is allocated but never released, for example when the root of a data structure is deleted without freeing the nodes it pointed to. The leaked memory keeps consuming resources and cannot be reclaimed later.

Resorting to Implicit GC

Implicit garbage collection manages memory automatically, a feature found in many object-oriented programming languages. Its primary jobs are allocating memory, keeping every referenced object alive in memory, and recovering the memory of objects that no live reference can reach any longer.

Layout of the Garbage Collector

The Java memory manager divides memory into three categories: the young, old, and permanent generations. This division lets the garbage collector reclaim memory in waves. New objects are allocated in the young generation, and objects that survive there long enough are promoted to the old generation. Some larger objects, however, can be placed in the old generation almost directly.


Figure 2: Detailed GC Layout

The permanent generation mainly holds metadata about classes and methods. (Since Java 8, HotSpot keeps this metadata in a native-memory area called Metaspace instead.) The young generation is further divided into Eden and the survivor spaces. New objects are created in Eden. Those that survive a collection wave move to a survivor space and, after surviving further waves, to the old generation. When the young generation fills up, the GC runs a minor collection, which reclaims dead objects and moves the live ones onward. When the old generation fills up, a major collection is triggered that collects across the generations.
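The layout described above can be inspected on a running JVM. The following is a minimal sketch using the standard `java.lang.management` API; the class name `MemoryPoolLister` and its helper method are illustrative assumptions, and the pool names printed depend on the JVM version and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
final class MemoryPoolLister {

    // Returns one "name (type)" entry per memory pool the running JVM exposes.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryType type = pool.getType(); // HEAP or NON_HEAP
            lines.add(pool.getName() + " (" + type + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Pool names vary by JVM and collector, so no fixed output is assumed here.
        describePools().forEach(System.out::println);
    }
}
```

On a HotSpot JVM the list typically includes pools for Eden, the survivor space, and the old generation, with names that differ from one collector to another.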

Myriad GC Algorithms

The Java HotSpot Virtual Machine offers four GC algorithms. The first is the Serial GC, which collects the young and old generations serially and pauses the application while a collection is underway. Next in line is the Parallel GC, which is a multithreaded version of the first. It suits machines with multiple processors and a larger-than-usual heap.

The Concurrent Mark Sweep collector (CMS) is designed to avoid long pauses. It does not compact the old generation and instead manages reclaimed space through free lists. Its main strength is that most of the mark and sweep work runs concurrently with the application. Latency-sensitive applications, such as the live-streaming apps Cinemabox and Playbox, stand to benefit the most from this algorithm, since shorter pauses keep the application responsive.

Lastly, there is the Garbage First (G1) algorithm, which is aimed mainly at large heaps. It brings configurability and predictability and works as a replacement for CMS. Rather than eliminating pauses completely, it lets us specify a target pause time. That target is a goal, however, not a guarantee: G1 will not always meet it.
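On HotSpot, the collector is usually selected with a command-line flag such as `-XX:+UseSerialGC`, `-XX:+UseParallelGC`, `-XX:+UseConcMarkSweepGC`, or `-XX:+UseG1GC`, and the G1 pause goal is set with `-XX:MaxGCPauseMillis`. Which flags are available depends on the JDK version (CMS, for instance, was removed in JDK 14). To check which collectors are actually active, something like the following sketch may help; the class name `ActiveCollectors` is an illustrative assumption, while the management API calls are standard.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
final class ActiveCollectors {

    // Names of the collectors the running JVM has registered.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints each collector with its cumulative collection count and time so far.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": " + gc.getCollectionCount() + " collections, "
                    + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running the same program under different flags, for example `java -XX:+UseSerialGC ActiveCollectors.java` and then `java -XX:+UseG1GC ActiveCollectors.java`, should show different collector names for the young and old generations.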

Summoning the GC

The JVM summons the garbage collector whenever the system runs low on memory, and a collection can also be requested from application code by calling System.gc(). However, the call does not guarantee that a collection will happen. It is merely a suggestion to the JVM.

According to the Java API documentation, gc() behaves as follows:

“Runs the garbage collector. Calling this method suggests that the Java virtual machine expends effort toward recycling unused objects in order to make the memory they currently occupy available for quick reuse. When control returns from the method call, the virtual machine has made its best effort to recycle all discarded objects. The name gc stands for "garbage collector." The virtual machine performs this recycling process automatically as needed, in a separate thread, even if the gc method is not invoked explicitly. The method System.gc() is the conventional and convenient means of invoking this method.”
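The "suggestion" nature of the call can be observed with a small experiment. This is a minimal sketch, assuming a default heap large enough for a 16 MB allocation; the class `GcHintDemo` and its methods are hypothetical names, and the measured numbers will differ from run to run and from JVM to JVM.

```java
// Hypothetical demo class; the name is illustrative only.
final class GcHintDemo {

    // Used heap in bytes, as reported by the runtime at this instant.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Allocates garbage, requests a collection, and returns {usedBefore, usedAfter}.
    static long[] measureAroundGc() {
        byte[][] garbage = new byte[64][];
        for (int i = 0; i < garbage.length; i++) {
            garbage[i] = new byte[256 * 1024]; // 64 x 256 KiB = 16 MiB in total
        }
        long before = usedHeapBytes();
        garbage = null;  // drop the only strong reference to the arrays
        System.gc();     // a request only; the JVM may ignore or defer it
        long after = usedHeapBytes();
        return new long[] {before, after};
    }

    public static void main(String[] args) {
        long[] used = measureAroundGc();
        System.out.println("Used before request: " + used[0] + " bytes");
        System.out.println("Used after request:  " + used[1] + " bytes");
    }
}
```

On most default configurations the second figure comes out lower, but nothing guarantees it: HotSpot ignores the request altogether when started with `-XX:+DisableExplicitGC`.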

Types of References

Java offers several reference types, each of which gives a different meaning to how an object is held. A program can use whichever reference type suits it, and the GC treats each type differently when collecting. Those of us who deal only with strong references might not be familiar with the full set on offer.

The ordinary reference type looks somewhat similar to the following:

String s = new String("Hello");

While this is a strong reference, there are three more kinds: soft, weak, and phantom, provided by the classes SoftReference<T>, WeakReference<T>, and PhantomReference<T>. All three are subclasses of the abstract base class Reference<T>.

Soft References: These are mainly used for memory-sensitive caches, because the GC clears them only when memory runs low.

Phantom References: These help schedule pre-mortem cleanup actions in a more flexible way than the Java finalization mechanism allows.

Weak References: These are used to implement canonicalizing mappings that do not prevent their keys or values from being reclaimed.

These reference types correspond to different reachability levels. From strongest to weakest, an object can be strongly reachable, softly reachable, weakly reachable, phantom reachable, or unreachable.
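The reference classes above can be exercised with a short sketch. It uses only the standard `java.lang.ref` classes; the class `ReferenceKindsDemo` and its method names are illustrative assumptions. The two helper methods show behavior the API specifies, while the `main` method shows behavior that depends on whether the collector actually runs.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical demo class; the name is illustrative only.
final class ReferenceKindsDemo {

    // While a strong reference exists, a weak reference keeps returning its referent.
    static boolean strongKeepsReferentAlive() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // even if a collection runs, 'strong' is still in use below
        return weak.get() == strong;
    }

    // PhantomReference.get() returns null by specification, even for a live referent.
    static boolean phantomGetIsAlwaysNull() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        Object data = new Object();
        SoftReference<Object> soft = new SoftReference<>(data);
        WeakReference<Object> weak = new WeakReference<>(data);

        data = null;  // the object is now only softly (and weakly) reachable
        System.gc();  // a request only; results below are not guaranteed

        // A soft reference usually survives unless memory is tight.
        System.out.println("soft cleared: " + (soft.get() == null));
        // The weak reference stays set as long as the soft reference does.
        System.out.println("weak cleared: " + (weak.get() == null));
        System.out.println("strong keeps referent: " + strongKeepsReferentAlive());
        System.out.println("phantom get() is null: " + phantomGetIsAlwaysNull());
    }
}
```

The reference queue passed to the phantom reference is where the GC enqueues it once the referent has become phantom reachable, which is the hook a program would poll to run its cleanup.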

Bottom Line

Garbage collection as a whole operates under the aegis of the JVM, which is where its most crucial work is initiated. Even so, the programmer remains responsible for choosing algorithms and handling references with due care. While this post has taken us behind the GC's working style, detailed code listings will be presented in subsequent posts.
