LJV: What We Can Learn From Java Data Structures Visualization
What we can learn from Java data structures visualization, where we can illustrate what's going on inside of the data structures.
Join the DZone community and get the full member experience.Join For Free
When I started to prepare a course of lectures on the Java language for the Moscow Institute of Physics and Technology, I became to search for material that can be used as illustrations. ‘One picture is worth a thousand words’ as they say, and sometimes it’s worth even more because it seems impossible to explain e. g. how a hash table works without drawing something. My task was to explain to students many Java concepts, from string pool to advanced data structures. I was looking for a solution that is able to make the visualization of Java data structures in the easiest and precise way, ideally compatible with ‘presentation as code’ technology.
There exists Aleksey Shipilev’s JOL (Java Object Layout) tool known among Java performance engineers. JOL analyzes the memory layout for any given object, including all the auxiliary data and fields, and the graph of objects reachable from the given instance. This tool gives accurate estimations of the object size and also shows the addresses in memory where objects are located. However, it’s not yet able to visualize the object graph and gives too many low-level details that are irrelevant for the students who just started to learn Java.
While surfing the web looking for something that better fits my needs, I came across a page on the University of Auckland’s website dedicated to the GPL-licensed program called LJV (Lightweight Java Visualizer). This tool was developed by Dr. John Hamer in Java version 1.4 in 2004, which is the year when Maven 1.0 was released, and four years before GitHub appeared. Although it hadn’t been developed since then, it still worked perfectly and was able to run at least on Java 8 and do its job! I used LJV extensively while preparing the slides for my lectures, and in the fall of 2020 two of my students, Ilya Selivanov and Nuras Nogaev updated the code to Java 11, added tests, CI, documentation, a couple of new features, and published it on GitHub and Maven Central as a part of their semester project. Here I would like to briefly explain the LJV working principle and show some examples of its usage.
The idea behind LJV is very simple. It takes an arbitrary object and then uses Java Reflection API to traverse the graph of objects reachable from a given one. It forms a representation of the graph in DOT language that can be later converted into
svg image using Graphviz software. Actually we even need not to have graphviz installed: GraphvizOnline can render an image in a browser.
Thus, this one-liner
gives us the following in the console:
In order to convert this to an image one might use
or, even simpler, just paste the resulting script to GraphViz Online and get a picture:
By calling configuration methods on
LJV the object we can customize the way it visualizes data. We can choose fonts and colors for classes and fields, whether to hide null-valued fields, which objects should be represented only by their
toString method invocation result, diagram drawing direction (top-to-bottom or left-to-right), etc., etc.
The things that we see as a result of the visualization are, in my opinion, interesting and instructive not only for Java beginners but also for all the practicing Java engineers, even those who already know about the effects shown.
Let’s see some examples.
The most widely used type of data in Java is, of course,
String. Starting from Java 9, the internal representation of
String has changed:
char was replaced by
coder flag was introduced in order to switch between 8-bit and 16-bit character representation. This allowed significant memory optimization for strings that contain only LATIN-1 charset characters:
|Java 8-: one 16-bit
char per character
|Java 9+: coder set to 0 and one byte per LATIN-1 character
||Java 9+: coder set to 1 and 2 bytes
per character if there are symbols
outside LATIN-1 set
One rarely needs to create
new operator, however, it’s worth noticing that
String(String original) constructor reuses the internal byte array of its argument. Concatenation (even with an empty string!) always produces a full new copy:
intern() deduplicates all the
String objects and reduce them to a single value kept in the
String pool (compare with the previous example):
Boxed Primitives Caching
Usually, we create boxed primitives via autoboxing. In rare cases when we do need to create e. g.
Integer object explicitly, the correct way to do this is with
Integer.valueOf the method. This method deduplicates values in the range from -128 to 127 or
Values outside this range will not be deduplicated even when autoboxing is used.
Integer created with constructor will never be deduplicated, and this constructor is deprecated since Java 9.
A linked list is a data structure with theoretical O(1) efficiency for adding/removing its random node that can act both as
Deque. In Java practice, however,
LinkedList is superseded by
ArrayDeque in all the cases, and it’s questionable whether this class is needed in the standard library at all.
LinkedList, then what? Java has a number of high-performant array-based data structures.
ArrayList is well-known, but there are also
ArrayDeque based on the looped array and
PriorityQueue based on a balanced binary heap, which is actually also an array.
Let’s see, for example, how looped buffer of
This structure implements queue capabilities. If the maximum number of elements in the queue does not grow over time, this data structure works very fast and memory-efficient, with constant time for every operation.
|3 values put to the queue
||2 values polled
||The end of the buffer is reached, started from the beginning
HashMap is a widely used data structure in Java. For many people, implementation details of
HashMap is also a favorite topic of discussion in a Java programmer job interview.
There are a number of ways to implement hash collisions resolution in a hash map, developers of the Java platform chose linked lists:
While the number of collisions on a single
HashMap bucket is small, the linked list keeps growing:
However, if a single bucket becomes overloaded with collisions, and keys implement
Comparable interface, the linked list turns to a tree.
This reduces the search time in a bucket from O(N) to O(log(N)) and mitigates a certain kind of DDoS attacks:
One of the features of
HashMap is that this data structure completely 'forgets' the order of insertion of its elements. Also, iteration over
HashMap is not very effective from a performance point of view. When insertion order matters, we can use
LinkedHashMap, which is actually a
HashMap combined with a linked list. One of the possible use cases for
LinkedHashMap is LRU cache implementation.
TreeMap in Java is a Red-Black tree that implements
NavigableMap interface. This implementation provides guaranteed O(log(N)) time cost for get/put/remove operations, which in practice is inferior to
TreeMap when we need
higherKey(..) and other
NavigableMap capabilities not provided by a simple
ConcurrentSkipListMap is a thread-safe
NavigableMap implementation, that uses quite a complex non-blocking algorithm involving random numbers generator. That’s why for a given input its internal representation never looks the same from one run to another:
I hope you liked these examples. Maybe you can think of other examples of data layout visualization that are worth adding here. In this case, experiment with Lightweight Java Visualizer and share your ideas!
Opinions expressed by DZone contributors are their own.