
Class Sharing in Eclipse OpenJ9: How to Improve Memory, Performance (Part 2)

Learn how to reduce your memory footprint and improve startup performance in this tutorial on class sharing in Eclipse OpenJ9.


Memory footprint and startup time are important performance metrics for a Java virtual machine (JVM). The memory footprint becomes especially important in the cloud environment since you pay for the memory that your application uses. In this tutorial, we will show you how to use the shared classes feature in Eclipse OpenJ9 to reduce the memory footprint and improve your JVM startup time.

Runtime Bytecode Modification

Runtime bytecode modification is a popular way of instrumenting behavior into Java classes. It can be performed using the JVM Tools Interface (JVMTI) hooks (details can be found here). Alternatively, the class bytes can be replaced by the class loader before the class is defined. This presents an extra challenge to class sharing, because one JVM may cache instrumented bytecode that should not be loaded by another JVM sharing the same cache.

However, because of the dynamic nature of the OpenJ9 Shared Classes implementation, multiple JVMs using different types of modification can safely share the same cache. Indeed, if the bytecode modification is expensive, caching the modified classes has an even greater benefit, as the transformation only needs to be performed once. The only provision is that the bytecode modifications should be deterministic and predictable. Once a class has been modified and cached, it cannot be changed further.

Modified bytecode can be shared by using the modified=<context> sub-option to -Xshareclasses. The context is a user-defined name that creates a logical partition in the shared cache, into which all the classes loaded by that JVM are stored. All JVMs using that particular modification should use the same modification context name, so that they all load classes from the same shared cache partition. Any JVM using the same shared cache without the modified sub-option finds and stores vanilla classes as normal.
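
As a rough illustration of the option syntax described above, the following sketch assembles the -Xshareclasses option for a launcher script. The cache name "demo" and the context name "tracingAgent" are hypothetical, chosen only for this example; the helper class itself is not part of OpenJ9.

```java
class ModifiedContextArgs {

    /**
     * Builds the -Xshareclasses option for a named cache. A null or empty
     * context yields the option that a JVM loading vanilla classes would use.
     */
    static String shareClassesOption(String cacheName, String modifiedContext) {
        StringBuilder option = new StringBuilder("-Xshareclasses:name=").append(cacheName);
        if (modifiedContext != null && !modifiedContext.isEmpty()) {
            // Every JVM applying the same modification passes the same context name.
            option.append(",modified=").append(modifiedContext);
        }
        return option.toString();
    }

    public static void main(String[] args) {
        // "demo" and "tracingAgent" are hypothetical names used for illustration.
        System.out.println("java " + shareClassesOption("demo", "tracingAgent"));
        System.out.println("java " + shareClassesOption("demo", null));
    }
}
```

Both command lines name the same cache, but only the first one reads from and writes to the partition reserved for the modified classes.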

Potential Pitfalls

If a JVM is running with a JVMTI agent that has registered to modify class bytes and the modified sub-option is not used, class sharing with other vanilla JVMs or JVMs using other agents is still managed safely, albeit with a small performance cost due to extra checking. Thus, it is always more efficient to use the modified sub-option.

Note that this is only possible because the use of the JVMTI API tells the JVM when bytecode modification is occurring. Redefined and retransformed classes are not stored in the cache. The JVM stores vanilla class byte data in the shared cache, which allows the JVMTI ClassFileLoadHook event to be triggered for all classes loaded from the cache. Therefore, if a custom class loader modifies class bytes before defining the class without using JVMTI and without using the modified sub-option, the classes being defined are assumed to be vanilla and could be incorrectly loaded by other JVMs.

For more detailed information on sharing modified bytecode, see here.

Using the Helper API

The Shared Classes Helper API is provided by OpenJ9 so that developers can integrate class sharing support into custom class loaders. This is only required for class loaders that do not extend java.net.URLClassLoader, because class loaders that do extend it inherit class-sharing support automatically.

A comprehensive tutorial on the Helper API is beyond the scope of this article, but we will provide a general overview. If you'd like to know more details, you can find the Helper API implementation on GitHub.

The Helper API: a Summary

All the Helper API classes are in the com.ibm.oti.shared package. Each class loader wishing to share classes must get a SharedClassHelper object from a SharedClassHelperFactory. Once created, the SharedClassHelper belongs to the class loader that requested it and can only store classes defined by that class loader. The SharedClassHelper gives the class loader a simple API for finding and storing classes in the shared cache. If the class loader is garbage collected, its SharedClassHelper is also garbage collected.

Using the SharedClassHelperFactory

The SharedClassHelperFactory is a singleton that is obtained using the static method com.ibm.oti.shared.Shared.getSharedClassHelperFactory(), which returns a factory if class sharing is enabled in the JVM; otherwise, it returns null.
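
The null check described above might look like the following minimal sketch. It looks up the static method by reflection, which is our own choice so that the example also compiles and runs on JVMs that do not ship the com.ibm.oti.shared package; on OpenJ9 you would normally call the method directly. The class name SharedFactoryProbe is hypothetical.

```java
import java.lang.reflect.Method;

class SharedFactoryProbe {

    /**
     * Returns the SharedClassHelperFactory, or null when the Shared class is
     * absent (not an OpenJ9 JVM) or class sharing is not enabled.
     */
    static Object findFactory() {
        try {
            Class<?> shared = Class.forName("com.ibm.oti.shared.Shared");
            Method getter = shared.getMethod("getSharedClassHelperFactory");
            return getter.invoke(null); // static method, so no receiver object
        } catch (ReflectiveOperationException | LinkageError e) {
            return null; // treat "API not available" the same as "sharing disabled"
        }
    }

    public static void main(String[] args) {
        Object factory = findFactory();
        System.out.println(factory == null
                ? "Class sharing is unavailable; load classes the normal way."
                : "Obtained factory: " + factory.getClass().getName());
    }
}
```

A custom class loader would perform this check once and fall back to its ordinary loading path whenever the result is null.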

Using the SharedClassHelpers

There are three different types of SharedClassHelper that can be returned by the factory. Each is designed for use by a different type of class loader:

  • SharedClassURLClasspathHelper: This helper is designed for use by class loaders that have the concept of a URL classpath. Classes are stored and found in the shared cache using the URL classpath array. The URL resources in the classpath must be accessible on the filesystem for the classes to be cached. This helper also carries some restrictions on how the classpath can be modified during the lifetime of the helper.
  • SharedClassURLHelper: This helper is designed for use by class loaders that can load classes from any URL. The URL resources given must be accessible on the filesystem for the classes to be cached.
  • SharedClassTokenHelper: This helper effectively turns the shared class cache into a simple hash table — classes are stored against string key tokens that are meaningless to the shared cache. This is the only helper that doesn't provide dynamic update capability because the classes stored have no filesystem context associated with them.

Each SharedClassHelper has two basic methods, the parameters of which differ slightly between helper types:

  • byte[] findSharedClass(String classname...) should be called after the class loader has asked its parent for the class (if one exists). If findSharedClass() does not return null, the class loader should call defineClass() on the byte array returned. Note that this function returns a special cookie for defineClass(), not actual class bytes, so the bytes cannot be instrumented.
  • boolean storeSharedClass(Class clazz...) should be called immediately after a class has been defined. The method returns true if the class was successfully stored and false otherwise.
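
The call order described in the two bullets above can be sketched as follows. This is not the com.ibm.oti.shared API: ClassStore and InMemoryStore are hypothetical stand-ins so that the example runs on any JVM, the stand-in keeps real class bytes rather than a cookie, and its store method takes an extra bytes parameter that a real helper does not need. The generated empty class stands in for bytes that a real loader would read from disk.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class SharingLoaderSketch extends ClassLoader {

    /** Hypothetical stand-in for a helper; NOT the com.ibm.oti.shared API. */
    interface ClassStore {
        byte[] findSharedClass(String className);
        // The bytes parameter exists only because this stand-in has no JVM access.
        boolean storeSharedClass(Class<?> clazz, byte[] bytes);
    }

    /** In-memory stand-in that counts how often a lookup succeeds. */
    static final class InMemoryStore implements ClassStore {
        private final Map<String, byte[]> classes = new ConcurrentHashMap<>();
        final AtomicInteger hits = new AtomicInteger();

        @Override
        public byte[] findSharedClass(String className) {
            byte[] found = classes.get(className);
            if (found != null) {
                hits.incrementAndGet();
            }
            return found;
        }

        @Override
        public boolean storeSharedClass(Class<?> clazz, byte[] bytes) {
            return classes.putIfAbsent(clazz.getName(), bytes) == null;
        }
    }

    private final ClassStore store;

    SharingLoaderSketch(ClassStore store) {
        this.store = store;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // loadClass() calls findClass() only after the parent failed to load the class.
        byte[] cached = store.findSharedClass(name);
        if (cached != null) {
            return defineClass(name, cached, 0, cached.length);
        }
        byte[] bytes = emptyClassBytes(name); // stand-in for reading the class from disk
        Class<?> defined = defineClass(name, bytes, 0, bytes.length);
        store.storeSharedClass(defined, bytes); // store immediately after defining
        return defined;
    }

    /** Generates a minimal class file (no fields or methods) for the given name. */
    static byte[] emptyClassBytes(String name) throws ClassNotFoundException {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(0xCAFEBABE);                                // magic
            out.writeShort(0);                                       // minor version
            out.writeShort(52);                                      // major version (Java 8)
            out.writeShort(5);                                       // constant pool count (entries 1..4)
            out.writeByte(7); out.writeShort(2);                     // #1 Class -> #2
            out.writeByte(1); out.writeUTF(name.replace('.', '/'));  // #2 Utf8: this class
            out.writeByte(7); out.writeShort(4);                     // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");      // #4 Utf8: superclass
            out.writeShort(0x0021);                                  // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                       // this_class
            out.writeShort(3);                                       // super_class
            out.writeShort(0);                                       // interfaces count
            out.writeShort(0);                                       // fields count
            out.writeShort(0);                                       // methods count
            out.writeShort(0);                                       // attributes count
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    /** Loads the same (hypothetical) class through two loaders sharing one store. */
    static int runDemo() {
        InMemoryStore store = new InMemoryStore();
        try {
            new SharingLoaderSketch(store).loadClass("demo.Probe"); // miss: generated, then stored
            new SharingLoaderSketch(store).loadClass("demo.Probe"); // found in the store
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return store.hits.get();
    }

    public static void main(String[] args) {
        System.out.println("store hits: " + runDemo());
    }
}
```

The second loader never reaches the "read from disk" branch, which is the behavior a real helper provides across separate JVMs rather than across loaders in one process.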

Other Considerations

When deploying class sharing with your application, you need to consider factors such as security and cache tuning. These considerations are briefly summarized below.

Security

By default, shared caches are created with user-level security, so only the user that created the shared cache can access it. For this reason, the default cache name is different for each user so that clashes are avoided. On UNIX, the groupAccess sub-option gives access to all users in the primary group of the user that created the cache.

In addition to this, if there is a SecurityManager installed, a class loader can only share classes if it has been explicitly granted the correct permissions. Refer to the user guide here for more details on setting these permissions.

Garbage Collection and Just-in-time Compilation

Running with class sharing enabled has no effect on class garbage collection (GC). Classes and class loaders are still garbage collected, just as they are in the non-shared case. Also, there are no restrictions placed on GC modes or configurations when using class sharing.

It is not possible to cache just-in-time (JIT) compiled code in the class cache. AOT code stored in the shared cache remains subject to JIT compilation, and its presence affects how and when a method is JIT-compiled. In addition, JIT hints and profile data can be stored in the shared cache. You can use the -Xscmaxjitdata and -Xscminjitdata options to set the maximum and minimum shared cache space for this JIT data.

Cache Size Limits

The current maximum theoretical cache size is 2GB. The cache size is limited by factors such as available system memory, available virtual address space, available disk space, etc. More details can be found here.

Example

To practically demonstrate the benefits of class sharing, this section provides a simple graphical demo. The source and binaries are available on GitHub.

The demo app works on Java 8 and looks for the jre\lib directory and opens each JAR, calling Class.forName() on every class it finds. This causes about 16,000 classes to be loaded into the JVM. The demo reports on how long the JVM takes to load the classes. This is a slightly contrived example, but it effectively demonstrates the benefits of class sharing. Let's run the application and see the results.
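
To make the description above concrete, here is a minimal command-line sketch of the same idea. It is not the shcdemo source: the class name JarClassLoadTimer is hypothetical, there is no GUI, and loading without initialization is our own choice to keep the run free of side effects.

```java
import java.io.File;
import java.io.IOException;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class JarClassLoadTimer {

    /** Converts a JAR entry name such as "java/lang/String.class" to a class name. */
    static String toClassName(String entryName) {
        return entryName.substring(0, entryName.length() - ".class".length()).replace('/', '.');
    }

    /** Calls Class.forName() on every class entry in the JAR; returns how many loaded. */
    static int loadAll(File jar) throws IOException {
        int loaded = 0;
        try (JarFile jarFile = new JarFile(jar)) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String entryName = entries.nextElement().getName();
                if (!entryName.endsWith(".class")) {
                    continue;
                }
                try {
                    // initialize=false: load the class without running static initializers
                    Class.forName(toClassName(entryName), false, ClassLoader.getSystemClassLoader());
                    loaded++;
                } catch (ClassNotFoundException | LinkageError e) {
                    // Some classes cannot be loaded in isolation; skip them.
                }
            }
        }
        return loaded;
    }

    public static void main(String[] args) throws IOException {
        // Assumes a Java 8 layout, where java.home points at the jre directory.
        File libDir = new File(System.getProperty("java.home"), "lib");
        File[] jars = libDir.listFiles((dir, name) -> name.endsWith(".jar"));
        long start = System.nanoTime();
        int total = 0;
        if (jars != null) {
            for (File jar : jars) {
                total += loadAll(jar);
            }
        }
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Loaded " + total + " classes in " + millis + " ms");
    }
}
```

Running it twice, once with -Xshareclasses:none and once with a named cache, gives a rough feel for the comparison the graphical demo makes.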

Class-loading Performance

1. Download a JDK with OpenJ9 from the AdoptOpenJDK project, or pull the Docker image.

2. Download shcdemo.jar from GitHub.

3. Run the test a couple of times without class sharing to warm up the system disk cache, using the command in Listing 11:

Listing 11. Warming up the Disk Cache

C:\OpenJ9>wa6480_openj9\j2sdk-image\bin\java -Xshareclasses:none -cp shcdemo.jar ClassLoadStress


When the window in Figure 1 appears, press the button. The app will load the classes.

Figure 1. Press the button


Once the classes have loaded, the application reports how many it loaded and how long it took, as Figure 2 shows:

Figure 2. Results are in!


You'll notice that the application probably gets slightly faster each time you run it; this is because of operating system optimizations.

4. Now, run the demo with class sharing enabled, as Listing 12 illustrates. A new shared cache is created. You can specify a cache size of about 50MB to ensure that there is enough space for all the classes. Listing 12 shows the command line and some sample output.

Listing 12. Running the Demo With Class Sharing Enabled

C:\OpenJ9>wa6480_openj9\j2sdk-image\bin\java -cp shcdemo.jar -Xshareclasses:name=demo,verbose -Xscmx50m ClassLoadStress
[-Xshareclasses persistent cache enabled]
[-Xshareclasses verbose output enabled]
JVMSHRC236I Created shared classes persistent cache demo
JVMSHRC246I Attached shared classes persistent cache demo
JVMSHRC765I Memory page protection on runtime data, string read-write data and partially filled pages is successfully enabled
JVMSHRC168I Total shared class bytes read=1111375. Total bytes stored=40947096
JVMSHRC818I Total unstored bytes due to the setting of shared cache soft max is 0. Unstored AOT bytes due to the setting of -Xscmaxaot is 0. Unstored JIT bytes due to the setting of -Xscmaxjitdata is 0.


You can also check the cache statistics using printStats, as Listing 13 shows:

Listing 13. Checking the Number of Cached Classes

C:\OpenJ9>wa6480_openj9\j2sdk-image\bin\java -cp shcdemo.jar -Xshareclasses:name=demo,printStats

Current statistics for cache "demo":

Cache created with:
        -Xnolinenumbers                      = false
        BCI Enabled                          = true
        Restrict Classpaths                  = false
        Feature                              = cr

Cache contains only classes with line numbers

base address                         = 0x0000000011F96000
end address                          = 0x0000000015140000
allocation pointer                   = 0x000000001403FF50

cache size                           = 52428192
softmx bytes                         = 52428192
free bytes                           = 10874992
ROMClass bytes                       = 34250576
AOT bytes                            = 1193452
Reserved space for AOT bytes         = -1
Maximum space for AOT bytes          = -1
JIT data bytes                       = 28208
Reserved space for JIT data bytes    = -1
Maximum space for JIT data bytes     = -1
Zip cache bytes                      = 902472
Data bytes                           = 351648
Metadata bytes                       = 661212
Metadata % used                      = 1%
Class debug area size                = 4165632
Class debug area used bytes          = 3911176
Class debug area % used              = 93%

# ROMClasses                         = 17062
# AOT Methods                        = 559
# Classpaths                         = 3
# URLs                               = 0
# Tokens                             = 0
# Zip caches                         = 5
# Stale classes                      = 0
% Stale classes                      = 0%

Cache is 79% full

Cache is accessible to current user = true
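
As a quick sanity check on the statistics above, the "Cache is 79% full" figure is consistent with subtracting the free bytes from the cache size. The formula below is our inference from the numbers in Listing 13, not something taken from the OpenJ9 source, and the class name is hypothetical.

```java
class CacheFullness {

    /** Percentage of the cache in use, rounded down to a whole number. */
    static long percentUsed(long cacheSizeBytes, long freeBytes) {
        // long arithmetic avoids int overflow when multiplying by 100
        return (cacheSizeBytes - freeBytes) * 100 / cacheSizeBytes;
    }

    public static void main(String[] args) {
        long cacheSize = 52_428_192L; // "cache size" from Listing 13
        long freeBytes = 10_874_992L; // "free bytes" from Listing 13
        System.out.println("Cache is " + percentUsed(cacheSize, freeBytes) + "% full");
    }
}
```

With the values from Listing 13, the computed percentage matches the 79% that printStats reports.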


5. Now, start the demo again with the same Java command line. This time, it should read the classes from the shared class cache, as you can see in Listing 14.

Listing 14. Running the Application With a Warm Shared Cache

C:\OpenJ9>wa6480_openj9\j2sdk-image\bin\java -cp shcdemo.jar -Xshareclasses:name=demo,verbose -Xscmx50m ClassLoadStress
[-Xshareclasses persistent cache enabled]
[-Xshareclasses verbose output enabled]
JVMSHRC237I Opened shared classes persistent cache demo
JVMSHRC246I Attached shared classes persistent cache demo
JVMSHRC765I Memory page protection on runtime data, string read-write data and partially filled pages is successfully enabled
JVMSHRC168I Total shared class bytes read=36841382. Total bytes stored=50652
JVMSHRC818I Total unstored bytes due to the setting of shared cache soft max is 0. Unstored AOT bytes due to the setting of -Xscmaxaot is 0. Unstored JIT bytes due to the setting of -Xscmaxjitdata is 0.


You can clearly see the significant (about 40 percent) improvement in class load time in Figure 3. Again, you should see the performance improve slightly each time you run the demo because of operating system optimizations.

Figure 3. Warm cache results


There are a few variations you can experiment with. For example, you can use the javaw command to start multiple demos and trigger class loading in all of them at the same time to see the concurrent performance.

In a real-world scenario, the overall JVM startup time benefit that can be gained from using class sharing depends on the number of classes that are loaded by the application. A HelloWorld program will not show much benefit, whereas a large web server certainly will. However, this example has hopefully demonstrated that experimenting with class sharing is very straightforward, so you can easily test the benefits.

Memory Footprint

It is also easy to see the memory savings when running the example program in more than one JVM.

Below are four VMMap snapshots obtained using the same machine as the previous examples. In Figure 4, two instances of the demo have been run to completion without class sharing. In Figure 5, two instances have been run to completion with class sharing enabled, using the same command lines as before.

Figure 4. Two instances of demo with no class sharing


Figure 5. Two instances of demo with class sharing enabled


The shared cache size is 50MB in the experiment, so the Mapped Files size of each instance in Figure 5 is 50MB larger (56736KB - 5536KB) than in Figure 4.

You can clearly see that the memory usage (Private WS) is significantly lower when shared classes are enabled. A saving of about 70MB of Private WS is achieved for two JVM instances, and the saving grows as more instances of the demo are launched with class sharing enabled. The test results above were obtained on a Windows 10 laptop with 32GB RAM, using an Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz.

We performed the same memory footprint experiment on a Linux x64 machine as well. Listing 15 shows the result of two JVM instances with no class sharing, and Listing 16 shows the result of two JVM instances with class sharing enabled.

Looking at the results, RSS does not show much improvement when class sharing is enabled. This is because the whole shared cache is included in the RSS of each process. But if we look at the PSS, which charges only half of the shared cache to each JVM (as it is shared by two JVMs), there is a total saving of about 34MB.

Listing 15. Footprint on Linux With Class Sharing Disabled

pmap -X 9612
9612:   xa6480_openj9/j2sdk-image/jre/bin/java -cp shcdemo.jar ClassLoadStress
Address Perm  …   Size    Rss     Pss Referenced Anonymous Swap Locked Mapping 
…
                ======= ======= ===== ========   ========= ==== ====
                2676500 118280 106192 118280     95860     0    0 KB
pmap -X 9850
9850:   xa6480_openj9/j2sdk-image/jre/bin/java -cp shcdemo.jar ClassLoadStress
Address Perm  …   Size    Rss     Pss Referenced Anonymous Swap Locked Mapping 
…
                ======= ======= ===== ========   ========= ==== ====
                2676500 124852 112792 124852     102448    0    0 KB


Listing 16. Footprint on Linux With Class Sharing Enabled

pmap -X 4501
4501:   xa6480_openj9/j2sdk-image/jre/bin/java -Xshareclasses:name=demo -Xscmx50m -cp shcdemo.jar ClassLoadStress
Address Perm  …   Size    Rss     Pss Referenced Anonymous Swap Locked Mapping
…
7fe7d0e00000 rw-s 4       4       2       4        0    0      0 C290M4F1A64P_demo_G35
7fe7d0e01000 r--s 33356   33356   16678   33356    0    0      0 C290M4F1A64P_demo_G35
7fe7d2e94000 rw-s 11096   48      24      48       0    0      0 C290M4F1A64P_demo_G35
7fe7d396a000 r--s 5376    1640    832     1640     0    0      0 C290M4F1A64P_demo_G35
7fe7d3eaa000 rw-s 296     0       0       0        0    0      0 C290M4F1A64P_demo_G35
7fe7d3ef4000 r--s 1072    0       0       0        0    0      0 C290M4F1A64P_demo_G35
…
                  ======= ======= ===== ======== ====== ====== ====
                  2732852 120656  90817 97988    62572  0      0 KB
pmap -X 4574
4574:   xa6480_openj9/j2sdk-image/jre/bin/java -Xshareclasses:name=demo -Xscmx50m -cp shcdemo.jar ClassLoadStress
Address Perm  …   Size    Rss     Pss Referenced Anonymous Swap Locked Mapping 
…
7f308ce00000 rw-s 4       4       2       4        0    0      0 C290M4F1A64P_demo_G35
7f308ce01000 r--s 33356   33356   16678   33356    0    0      0 C290M4F1A64P_demo_G35
7f308ee94000 rw-s 11080   48      24      48       0    0      0 C290M4F1A64P_demo_G35
7f308f966000 r--s 5392    1632    824     1632     0    0      0 C290M4F1A64P_demo_G35
7f308feaa000 rw-s 296     0       0       0        0    0      0 C290M4F1A64P_demo_G35
7f308fef4000 r--s 1072    0       0       0        0    0      0 C290M4F1A64P_demo_G35
…
                  ======= ======= ===== ======== ====== ====== ====
                  2730800 122832  92911 102584   64812  0      0 KB
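
The saving of about 34MB can be reproduced from the PSS totals in Listings 15 and 16. The following short calculation uses only those four totals; the class and method names are hypothetical.

```java
import java.util.Locale;

class PssSavings {

    /** Sums per-process PSS totals, in KB. */
    static long totalKb(long... perProcessKb) {
        long sum = 0;
        for (long kb : perProcessKb) {
            sum += kb;
        }
        return sum;
    }

    /** PSS saved by enabling class sharing, in KB, using the listing totals. */
    static long savedKb() {
        long withoutSharing = totalKb(106_192, 112_792); // PSS totals from Listing 15
        long withSharing = totalKb(90_817, 92_911);      // PSS totals from Listing 16
        return withoutSharing - withSharing;
    }

    public static void main(String[] args) {
        long saved = savedKb();
        System.out.printf(Locale.ROOT, "Saved %d KB (about %.1f MB)%n", saved, saved / 1024.0);
    }
}
```

The difference is 35256 KB across the two JVMs, which is where the figure of roughly 34MB comes from.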


Conclusion

The Shared Classes feature in the OpenJ9 implementation offers a simple and flexible way to reduce memory footprint and improve JVM startup time. In this article, you have seen how to enable the feature, how to use the cache utilities, and how to get quantifiable measurements of the benefits.


Topics:
java ,tutorial ,helper api ,jvm ,memory footprint ,jvmti

Published at DZone with permission of
