Reloading Java Classes: HotSwap and JRebel — Behind the Scenes
In this article we’ll review how classes can be reloaded without dynamic classloaders. We will take a look at the JVM HotSwap class reloading support, Instrumentation API and ZeroTurnaround's JRebel.
HotSwap and Instrumentation
In 2002, Sun introduced a new experimental technology into the Java 1.4 JVM, called HotSwap. It was incorporated within the Debugger API, and allowed debuggers to update class bytecode in place, using the same class identity. This meant that all objects could refer to an updated class and execute new code when their methods were called, preventing the need to reload a container whenever class bytecode was changed. All modern IDEs (including Eclipse, IDEA and NetBeans) support it. As of Java 5 this functionality is also available directly to Java applications, through the Instrumentation API.
Unfortunately, this redefinition is limited only to changing method bodies -- it cannot either add methods or fields or otherwise change anything else, except for the method bodies. This limits the usefulness of HotSwap, and it also suffers from other problems:
- The Java compiler will often create synthetic methods or fields even if you have just changed a method body (e.g. when you add a class literal, anonymous and inner classes, etc).
- Running in debug mode will often slow the application down or introduce other problems
This causes HotSwap to be used less than, perhaps, it should be.
Why is HotSwap limited to method bodies?
This question has been asked a lot during the almost 10 years since the introduction of HotSwap. One of the most voted for bugs for the JVM calls for supporting a whole array of changes, but so far it has not been implemented. A disclaimer: I do not claim to be a JVM expert. I have a good general idea how the JVM is implemented and over the years I talked to a few (ex-)Sun engineers, but I haven't verified everything I'm saying here against the source code. That said, I do have some ideas as to the reasons why this bug is still open (but if you know the reasons better, feel free to correct me). The JVM is a heavily optimized piece of software, running on multiple platforms. Performance and stability are the highest priorities. To support them in different environments the Sun JVM features:
- Two heavily optimized Just-In-Time compilers (-client and -server)
- Several multi-generational garbage collectors
These features make evolving the class schema a considerable challenge. To understand why, we need to look a little closer as to what exactly is necessary to support adding methods and fields (and even more advanced, changing the inheritance hierarchy). When loaded into the JVM, an object is represented by a structure in memory, occupying a continuous region of memory with a specific size (its fields plus metadata). In order to add a field, we would need to resize that structure, but since nearby regions may already be occupied, we would need to relocate the whole structure to a different region where there is enough free space to fit it in. Now, since we're actually updating a class (and not just a single object) we would have to do this to every object of that class.
In itself this would not be hard to achieve -- Java garbage collectors already relocate objects all the time. The problem is that the abstraction of one "heap" is just that, an abstraction. The actual layout of memory depends on the garbage collector that is currently active and, to be compatible with all of them, the relocation should probably be delegated to the active garbage collector. The JVM will also need to be suspended for the time of relocation, so doing GC at the same time makes sense.
Adding a method does not require updating the object structure, but it does require updating the class structure, which is also present on the heap. This would not likely require a relocation of the class structure as the method tables should be separate from the class structure. But consider this: the moment after a class has been loaded it is essentially is frozen forever. This enables the JIT to perform the main optimization that the JVM does -- inlining. Most of the method calls in your application hot spots are eliminated and the code is copied to the calling method. A simple check is inserted to ensure that the target object is indeed what we think it is.
Here's the punchline: the moment we can add methods to classes this "simple check" is not enough. We would need a considerably more complicated check that needs to ensure not only that no methods with the same name were added to the target class, but also to all it's superclasses. Alternatively we could track all the inlined spots and their dependencies and deoptimize them when a class is updated. Either way it will probably be a severe performance hit for the JVM and is likely the reason why it is not implemented.
On top of that, consider that we're talking about multiple platforms with varying memory models and instructions sets that probably require at least some specific handling and you get yourself a very expensive problem with noone willing to pay.
In 2007, ZeroTurnaround announced the availability of a tool called JRebel (then JavaRebel) that could update classes without dynamic class loaders and with very few limitations. Unlike HotSwap, which is dependent on IDE integration, the tool works by monitoring the actual compiled .class files on disk and updating the classes whenever the files are updated. This means that you can use JRebel with a text editor and command-line compiler if so willing. Of course, it's also integrated neatly into Eclipse, IntelliJ, and NetBeans. Unlike dynamic classloaders, JRebel preserves the identity and state of all existing objects and classes, allowing developers to continue using their application without delay.
How does this work?
For starters, JRebel works on a different level of abstraction than HotSwap. Whereas HotSwap works at the virtual machine level and is dependent on the inner workings of the JVM, JRebel makes use of two remarkable features of the JVM -- abstract bytecode and classloaders. Classloaders allow JRebel to recognize the moment when a class is loaded, then translate the bytecode on-the-fly to create another layer of abstraction between the virtual machine and the executed code. Others have used this features to enable profilers, performance monitoring, continuations, software transactional memory and even distributed heap. Combining bytecode abstraction with classloaders is a powerful combination, and can be used to implement a variety of features even more exotic than class reloading. As we examine the issue closer, we'll see that the challenge is not just in reloading classes, but also doing so without a visible degradation in performance and compatibility.
As we reviewed in Reloading Java Classes 101 the problem in reloading classes is that once a class has been loaded it cannot be unloaded or changed; but we are free to load new classes as we please. To understand how we could theoretically reload classes, let's take a look at dynamic languages on the Java platform. Specifically, let's take a look at JRuby (we'll simplify a lot, so don't crucify anyone important).
Although JRuby features "classes", at runtime each object is dynamic and new fields and methods can be added at any moment. This means that a JRuby object is not much more than a Map from method names to their implementations and from field names to their values. The implementations for those methods are contained in anonymously named classes that are generated when the method is encountered. If you add a method, all JRuby has to do is generate a new anonymous class that includes the body of that method. As each anonymous class has a unique name there are no issues loading it and as a result the application is updated on-the-fly.
Theoretically, since bytecode translation is usually used to modify the class bytecode, there is no reason why we can't use the information in that class and just create as many classes as necessary to fulfill its function. We could then use the same transformation as JRuby and split all Java classes into a holder class and method body classes. Unfortunately, such an approach would be subject to (at least) the following problems:
- Perfomance. Such a setup would mean that each method invocation would be subject to indirection. We could optimize, but the application would be at least an order of magnitude slower. Memory use would also skyrocket, as so many classes are created.
- Java SDK classes. The classes in the Java SDK are considerably harder to process than the ones in the application or libraries. Also they often are implemented in native code and cannot be transformed in the "JRuby" way. However if we leave them as is, then we'll cause numerous incompatibility errors, which are likely not possible to work around.
- Compatibility. Although Java is a static language it includes some dynamic features like reflection and dynamic proxies. If we apply the "JRuby" transformation none of those features will work unless we replace the Reflection API with our own classes, aware of the transformation.
Therefore, JRebel does not take such an approach. Instead it uses a much more complicated approach, based on advanced compilation techniques, that leaves us with one master class and several anonymous support classes that allow modifications to take place without any visible degradation in performance or compatibility. It also:
- Leaves as many method invocations intact as possible. This means that JRebel minimizes its performance overhead, making it lightweight.
- Avoids instrumenting the Java SDK except in a few places that are necessary to preserve compatibility.
- Tweaks the results of the Reflection API, so that we can correctly include the added/removed members in these results. This also means that the changes to Annotations are visible to the application.
Beyond Class Reloading - Archives
Reloading classes is something Java developers have complained about for a long time, but once we solved it, other problems turned up. The Java EE standard was developed without much concern for development Turnaround (the time it takes between making a change to code and seeing the effects of that change in an application). It expects that all applications and their modules be packaged into archives (JARs, WARs and EARs), meaning that before you can update any file in your application, you need to update the archive -- which is usually an expensive operation involving a build system like Ant or Maven. As we discussed in Reloading Java Classes 301 this can be minimized by using exploded development and incremental IDE builds, but for large application this is commonly not a viable option.
To solve this problem in JRebel 2.x we developed a way for the user to map archived applications and modules back to the workspace -- our users create a rebel.xml configuration file in each application and module that tells JRebel where the source files can be found. JRebel integrates with the application server, and when a class or resource is updated it is read from the workspace instead of the archive.
This allows for instant updates of not just classes, but any kind of resources like HTML, XML, JSP, CSS, .properties and so on. Maven users don't even need to create a rebel.xml file, since our Maven plugin will generate it automatically.
Beyond Class Reloading - Configurations and Metadata
En route to eliminating Turnaround, another issue becomes obvious: Nowadays, applications are not just classes and resources, they are wired together by extensive configuration and metadata. When that configuration changes it should be reflected in the running application. However it's not enough to make the changes to the configuration files visible, the specific framework must reload it and reflect the changes in the application.
To support these kinds of changes in JRebel we developed an open source API that allows our team and third party contributers to make use of JRebel's features and propagate changes in configuration to the framework, using framework-specific plugins. E.g. we support adding beans and dependencies in Spring on-the-fly as well as a wide variety of changes in other frameworks.
This article sums up the methods to reload Java classes without dynamic class loaders. We also discuss the reasons for HotSwap's limitations, how JRebel works behind the scenes and the problems that arise when class reloading is solved.