After the earlier Back to Basics: JDBC Revisited, here's a quick look at RMI, the enabling technology behind today's Enterprise Java and what makes it tick.
Intro and Basics
RMI forms the basis for Java's own remoting capabilities, which power the platform's dominance on servers everywhere. This article tries to see through all the frameworks, programming models and technology built on top of it and get down to the lower-level basics of RMI. RMI makes remoting transparent by making remote objects look like local objects. From the perspective of a user of a remote service, the only difference is how the object itself is obtained. RMI is based on three abstraction layers:
- Stub / Skeleton layer
- Remote Reference layer
- Transport layer
Stubs and Skeletons
Stubs and skeletons are special objects that make remoting with RMI possible. When the client requests a remote object, it in fact gets a stub to the object and not the object itself. The stub masks the complexity of invoking a method remotely and poses as a local object to the client. To make this process of remote method invocation transparent to the client, the programming model mandates the creation of an interface that extends java.rmi.Remote, which forms the contract between the client and the server. This interface defines the methods that are available for remote invocation, in other words, the methods a remote object wants to expose to the world. The client is only aware of this interface and not of the concrete classes that implement it. Therefore, when the client receives the stub, which implements the contractual remote interface, it cannot tell the difference between the actual implementation and the stub. When a method is invoked on the stub, the stub handles all the networking details and services the client just like any other method call.
The method call and the parameters are intercepted by the stub and marshalled (something like serialized; we'll look at this a bit later) across to the skeleton, a server-side object which unmarshals (deserializes) the method call and parameters and makes the call to the actual object. The return value from the actual method invocation is intercepted by the skeleton and sent back to the stub, which in turn returns it to the client. The stub is a proxy on the client side, and the skeleton is a helper class that acts as a liaison or connector between the stub and the actual object. The stub passes the data to the remote reference layer, which hands it to the skeleton. Note that even though we discuss skeletons here, from Java 2 onwards skeletons are no longer used or recommended.
Marshalling is very much like serialization; in fact, it's a smart abstraction that uses serialization under the hood. The difference is that while serialization is a general way to pack and transport objects, marshalling is RMI-sensitive: if the object to be serialized is an instance of java.rmi.Remote, or simply a remote object, the actual instance is not serialized; instead its stub is serialized and sent across the wire to the client. While serialization is pass-by-value for objects, marshalling can be thought of as pass-by-reference for remote objects.
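To make the stub idea concrete, here is a minimal sketch that exports an object and calls it through the stub inside a single JVM. The names Greeter, GreeterImpl and StubDemo are made up for illustration, and pinning java.rmi.server.hostname to 127.0.0.1 is an assumption that client and server share one machine. It relies on the two-argument UnicastRemoteObject.exportObject, which on current JDKs builds the stub at runtime as a dynamic proxy when no pre-generated stub class exists.

```java
import java.lang.reflect.Proxy;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

class StubDemo {
    /** Exports a Greeter, calls it through its stub and returns the reply. */
    static String greetThroughStub(String name) throws RemoteException {
        // Demo-only assumption: everything runs on one machine.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        GreeterImpl impl = new GreeterImpl();
        // Port 0 asks the RMI runtime to pick any free port.
        Greeter stub = (Greeter) UnicastRemoteObject.exportObject(impl, 0);
        try {
            // The stub implements the remote interface but is not the implementation.
            System.out.println("stub is a dynamic proxy: " + Proxy.isProxyClass(stub.getClass()));
            System.out.println("stub is the implementation: " + (stub == impl));
            // This call is marshalled, sent over a local socket and dispatched to impl.
            return stub.greet(name);
        } finally {
            // Unexport so the RMI runtime does not keep the JVM alive.
            UnicastRemoteObject.unexportObject(impl, true);
        }
    }

    public static void main(String[] args) throws RemoteException {
        System.out.println(greetThroughStub("world"));
    }
}

/** The contract shared by client and server (hypothetical example interface). */
interface Greeter extends Remote {
    String greet(String name) throws RemoteException;
}

/** The server-side implementation; clients never see this class directly. */
class GreeterImpl implements Greeter {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}
```

The caller only ever works with the Greeter type, which is exactly what lets a stub stand in for the real object.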
Remote Reference Layer
The remote reference layer defines the semantics for remote object references. It connects and governs the communication between the stub and the skeleton. It is JRMP-specific and defines classes like RemoteRef, which are used by the stub to get data across to the skeleton. From Java 2 on, the remote reference layer adds semantics for Activatable objects. Another example of the semantics defined and managed by this layer is multicast, in which the method is invoked on several remote implementations and the first response to return is used.
The Transport Layer
The transport layer is responsible for the low-level connection between the two JVMs and for the management of these connections. RMI defines a wire-level protocol called JRMP, which runs over TCP/IP and also provides some firewall penetration strategies. JRMP was updated in Java 2 to eliminate skeletons and use reflection instead.
Looking up Remote objects
A client can use either JNDI or the simpler RMI registry to look up remote objects. We will not discuss JNDI here. The RMI registry is itself a remote object and implements the interface java.rmi.registry.Registry. An RMI registry can be started as a separate process or programmatically by an object itself, and by default it runs on port 1099. Starting a registry means exporting the remote object that implements the Registry interface. Once a remote object is bound to a registry under a public name, a client can look up the remote object in the registry by the name with which it was bound. Here's what happens:
- The client tries to get a registry, which returns a stub that acts as the client proxy to the registry. [This step is usually transparent because JDK classes do the job of getting the registry.]
- The client makes a "look-up" on the registry with the public name with which the remote object was bound.
- A stub to the remote object is returned to the client.
- The client can now invoke methods on the remote object just as if it were a local object. The client does not know the difference between the stub and the actual object because the stub, even though it is generated by the developer, implements the remote interface containing the methods the client expects.
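The look-up steps above can be sketched roughly as follows, with the "server" and "client" halves squeezed into one JVM for brevity. Clock, ClockImpl, RegistryDemo and the bound name "clock" are hypothetical, and the registry is started programmatically with LocateRegistry.createRegistry instead of as a separate process.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class RegistryDemo {
    /** Binds a Clock in a fresh registry on the given port, then looks it up and calls it. */
    static String lookUpAndCall(int port) throws Exception {
        // Demo-only assumption: client and server share one machine.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");

        // Server side: start a registry, export the object, bind its stub under a name.
        Registry registry = LocateRegistry.createRegistry(port);
        ClockImpl impl = new ClockImpl();
        Clock exported = (Clock) UnicastRemoteObject.exportObject(impl, 0);
        registry.rebind("clock", exported);
        try {
            // Client side: get a stub to the registry, then a stub to the remote object.
            Registry clientView = LocateRegistry.getRegistry("127.0.0.1", port);
            Clock clock = (Clock) clientView.lookup("clock");
            // From here on the call looks exactly like a local one.
            return clock.describe();
        } finally {
            // Clean up so the RMI runtime does not keep the JVM alive.
            registry.unbind("clock");
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(lookUpAndCall(Registry.REGISTRY_PORT)); // 1099, the default port
    }
}

/** Hypothetical remote interface shared by both sides. */
interface Clock extends Remote {
    String describe() throws RemoteException;
}

class ClockImpl implements Clock {
    @Override
    public String describe() {
        return "tick from the server";
    }
}
```

In a real deployment the two halves would live in different processes, and only the Clock interface would need to be on the client's classpath.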
Parameter Passing
With RMI we have three types of parameters:
- Primitive parameters: these are passed by value, just like any other primitive parameter.
- Object parameters: since references in one JVM are meaningless in another JVM, objects are serialized and passed by value. So changing the properties of an object in the remote JVM does not make those changes visible to the local JVM.
- Remote parameters: remote parameters are parameters that happen to be remote objects themselves. When a remote object is used as a parameter or a return type, the JVM passes the stub to the remote object and not the remote object itself. Under the covers, the JVM checks the parameters and return types. If any of them is an instance of Remote, then its stub is fetched, serialized and sent across instead of the actual remote object. This rule applies to all references, including, say, returning 'this' from a remote object.
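A small sketch of the difference between object parameters and remote parameters, again with both ends in one JVM. Box, Counter, Worker and their implementations are hypothetical names invented for this example. The Box is serializable and travels by value, while the exported CounterImpl is swapped for its stub during marshalling and so behaves like a reference.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

class ParameterDemo {
    /** Returns { box.value after the by-value call, counter value after the by-reference call }. */
    static int[] run() throws RemoteException {
        // Demo-only assumption: client and server share one machine.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        WorkerImpl workerImpl = new WorkerImpl();
        Worker worker = (Worker) UnicastRemoteObject.exportObject(workerImpl, 0);
        CounterImpl counterImpl = new CounterImpl();
        UnicastRemoteObject.exportObject(counterImpl, 0);
        try {
            Box box = new Box();
            box.value = 1;
            worker.bumpCopy(box);           // the server increments its own deserialized copy
            worker.bumpRemote(counterImpl); // marshalling sends the stub, so the server calls back
            return new int[] { box.value, counterImpl.current() };
        } finally {
            UnicastRemoteObject.unexportObject(workerImpl, true);
            UnicastRemoteObject.unexportObject(counterImpl, true);
        }
    }

    public static void main(String[] args) throws RemoteException {
        int[] result = run();
        System.out.println("box.value after remote call: " + result[0]); // still 1
        System.out.println("counter after remote call: " + result[1]);   // now 1
    }
}

/** A plain serializable object: passed by value. */
class Box implements Serializable {
    private static final long serialVersionUID = 1L;
    int value;
}

/** A remote object: passed as a stub. */
interface Counter extends Remote {
    void increment() throws RemoteException;
}

class CounterImpl implements Counter {
    private int count;

    @Override
    public synchronized void increment() {
        count++;
    }

    synchronized int current() {
        return count;
    }
}

interface Worker extends Remote {
    void bumpCopy(Box box) throws RemoteException;
    void bumpRemote(Counter counter) throws RemoteException;
}

class WorkerImpl implements Worker {
    @Override
    public void bumpCopy(Box box) {
        box.value++; // only the copy on this side changes
    }

    @Override
    public void bumpRemote(Counter counter) throws RemoteException {
        counter.increment(); // counter is a stub; this is itself a remote call
    }
}
```

The caller's Box is untouched because the worker only ever saw a copy, whereas the increment reaches the original CounterImpl through its stub.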
Distributed Garbage Collection
RMI provides a distributed garbage collector to collect remote objects. Garbage collection in RMI is more complex because there may not be any local references to an object. Internally, the JVM keeps track of clients requesting access to a remote object and marks that object as dirty. When the remote reference is dropped, the object is marked clean; all clean objects are eligible for garbage collection when the distributed garbage collector runs. In addition to the reference counting mechanism, there is a lease mechanism: even if a client has requested access to a remote object, the connection has a timeout associated with it (10 minutes by default), and unless the client is actively using the connection, the object is marked clean once the timeout is reached. Much like the finalize() method, RMI has an interface, Unreferenced, with a single method unreferenced() that is invoked when no more clients hold a live connection to the remote object. Remember that the registry itself is a client of the remote object, so the unreferenced() method on a remote object (should it choose to implement Unreferenced) will not be invoked while the registry is holding a reference to it.
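Here is a rough sketch of what an Unreferenced implementation might look like. Session and SessionImpl are hypothetical names. Normally it is the RMI runtime that calls unreferenced(), once the last client reference is cleaned or its lease expires; the demo calls it by hand purely as a stand-in for that callback, so the effect is visible without waiting for a lease to run out. The lease length can be tuned with the java.rmi.dgc.leaseValue system property, given in milliseconds.

```java
import java.rmi.NoSuchObjectException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;
import java.rmi.server.Unreferenced;

class UnreferencedDemo {
    /** Exports a session, simulates the DGC callback and reports whether it released itself. */
    static boolean releaseOnUnreferenced() throws RemoteException {
        SessionImpl session = new SessionImpl();
        UnicastRemoteObject.exportObject(session, 0);
        // Stand-in for the RMI runtime: in real use the DGC makes this call.
        session.unreferenced();
        return session.isReleased();
    }

    public static void main(String[] args) throws RemoteException {
        System.out.println("released: " + releaseOnUnreferenced());
    }
}

/** Hypothetical remote interface for a per-client session object. */
interface Session extends Remote {
    String id() throws RemoteException;
}

class SessionImpl implements Session, Unreferenced {
    private volatile boolean released;

    @Override
    public String id() {
        return "session-1";
    }

    /** Called when no client holds a live reference any more: free resources here. */
    @Override
    public void unreferenced() {
        try {
            // Stop accepting calls so the object can be garbage collected locally too.
            released = UnicastRemoteObject.unexportObject(this, true);
        } catch (NoSuchObjectException e) {
            released = true; // it was not exported any more, nothing left to do
        }
    }

    boolean isReleased() {
        return released;
    }
}
```

Unexporting inside unreferenced() is one possible design choice; closing files or database connections held on behalf of the vanished client is another common use.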
Dynamic Class Loading
Normally, clients need to have the remote interface type's class files available on the client JVM so that they can cast the stub resulting from a look-up and invoke methods on it. However, RMI supports dynamic class loading as an alternative to this static approach. Dynamic class loading in RMI works much like applets or JNLP: a system property for a codebase is set, and the required class definitions are downloaded on demand from that codebase. The property is java.rmi.server.codebase, and its value is a URL that can be accessed with the file://, ftp:// or http:// protocols. Note that though we say "client JVMs download the class definition", this is not restricted to client JVMs; server JVMs can also use a codebase to download other remote object definitions or client-side classes for callbacks. When an object is bound in the registry, the codebase is also saved with it. When a client requests a remote object, the registry returns a stub to the remote object, and the client JVM looks for the class definition of the stub in its CLASSPATH; if it does not find it there, it uses the codebase from the registry to download the definitions. RMI has its own security manager, the RMISecurityManager, and the VM will not download any files unless a security manager is present.
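As a configuration sketch, the codebase can be set like this. The class CodebaseConfig and the URL are hypothetical. These properties are normally passed with -D on the command line, because RMI reads them when its classes are first initialized, so setting them in code only works if it happens before any RMI call. Note also that since Java 7 update 21 the receiving side ignores codebases announced by its peer unless java.rmi.server.useCodebaseOnly is set to false. Installing a security manager is left out here because it is deprecated in recent JDKs.

```java
class CodebaseConfig {
    /** Side that hands out stubs: announce where its classes can be downloaded from. */
    static void configureSender(String codebaseUrl) {
        // A URL that names a directory needs a trailing slash.
        System.setProperty("java.rmi.server.codebase", codebaseUrl);
    }

    /** Side that receives stubs: allow loading classes from the announced codebase. */
    static void configureReceiver() {
        System.setProperty("java.rmi.server.useCodebaseOnly", "false");
    }

    public static void main(String[] args) {
        // Hypothetical URL; equivalent to -Djava.rmi.server.codebase=http://localhost:8000/classes/
        configureSender("http://localhost:8000/classes/");
        configureReceiver();
        System.out.println(System.getProperty("java.rmi.server.codebase"));
    }
}
```

Downloading code from a peer is a real security risk, which is why the default became restrictive; enable it only for codebases you trust.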
Object Activation
One of the downsides of RMI is that remote objects have to be accessible at all times, even when no clients are executing or requesting their services, which eats up a lot of resources. To solve this problem, object activation was introduced to RMI. This is similar to having the remote objects available on an "on demand" basis and can be thought of as similar to lazily instantiated references. All the magic happens in the remote reference layer: the stub given to the client is not connected to a live remote object or its skeleton; instead it is connected to the activation system, which transparently makes resources available when the stub is actually used. The activation system is a complex ecosystem, and core to it is an RMI daemon, 'rmid', which is able to start new virtual machine instances for the required remote objects on demand.
Activatable remote objects extend the abstract class Activatable (no, this is not an interface, and they do not extend the usual UnicastRemoteObject). The process of registering an activatable remote object is also considerably different, though the client does not see any difference. The steps are:
- Install the SecurityManager - RMISecurityManager
- Create an ActivationGroup - which is an object that keeps a collection of Activatable objects
- Create an ActivationDesc - the activation descriptor instance, which provides all the information rmid (the Activator) needs to create a new instance of the implementation class.
- Register the remote interface with the Activator - Activatable.register() registers the remote interface with rmid, the Activator.
- Bind the stub that was returned as a result of register() with the RMI registry.
Once this rig is set up, the activation process works like this:
- The Stub for the Activatable object contains special information like the ActivationID that identifies an Activatable.
- The stub uses this ActivationID to call on the Activator (rmid) to activate the object.
- The activator (rmid) uses the ActivationDesc to find the JVM in which the object is to be activated.
- The activator locates the ActivationGroup. If the ActivationGroup does not yet exist, a new JVM is created for the ActivationGroup to run in. The request for the object's activation is then forwarded to the ActivationGroup.
- The ActivationGroup loads the class and creates a new instance of the remote object.
- The live reference of the real remote object is returned to the Activator, which records the ActivationID and reference pairing, and the live reference is returned to the stub.
- The stub now forwards all method invocations to the remote object via the live reference.
Because firewalls often prevent application-specific or non-standard ports from being opened up on a public network, we can use HTTP tunnelling to get RMI working. Tunnelling essentially means wrapping the RMI calls inside an HTTP POST request.
It's the responsibility of the transport layer to do this wrapping. There are two ways to do this:
- HTTP to port: used when the client of an RMI service is behind a firewall. The JRMP call data is automatically wrapped in an HTTP POST and sent to the client's proxy server, that is, the HTTP proxy server the client uses to connect to the public network. The only configuration required is to set the proxy server settings via Java system properties.
- HTTP to CGI: used when the server is behind a firewall and cannot accept incoming connections. Here the transport layer redirects HTTP-wrapped JRMP calls to a CGI script running in a web server. This CGI script is java-rmi and is part of the JDK (see the bin directory). A servlet version of the CGI is also available. The CGI intercepts requests arriving on port 80 that are intended for a non-default port (the client transport layer embeds this information when it wraps the JRMP call in HTTP) and forwards the call to the local port.
Although tunnelling is a very attractive solution to the problem of firewalls, it should be used carefully: there is severe performance degradation, the CGI script is actually a security hole, and tunnelled applications cannot use callbacks.
RMI-IIOP : CORBA Interoperability
RMI-IIOP was designed to make RMI applications interoperable with CORBA applications. This involves supporting both RMI's own wire protocol, JRMP, as well as CORBA's wire protocol, IIOP. The interoperability is achieved by letting Java developers use a Java interface and CORBA developers use IDL for coding. With an RMI-IIOP server, the remote interface can be used to generate the corresponding IDL for a CORBA C++ client application. Clients can also use an IDL-to-Java conversion to turn IDL into a Java interface, enabling an RMI-IIOP client to connect to a CORBA server.
CORBA-compliant remote objects are derived from PortableRemoteObject, and they need to be rmic'd with the -iiop option to tell the compiler to generate stubs and skeletons that can speak RMI-IIOP. They form the foundation for enterprise-level distributed computing in Java and the heart of the EJB programming model, though EJBs make most of this transparent to the developer. Another crucial difference is that, unlike in a plain vanilla RMI environment, in RMI-IIOP the RMI registry is replaced by JNDI using a COS Naming Service like tnameserv or an LDAP server. Distributed garbage collection is not supported by CORBA, which means objects have to be created and destroyed explicitly. Object management therefore remains in your hands, and you need to be pretty careful about it; this is again one of the reasons that fuelled the creation of the EJB spec. The RMI activation system is replaced by the PortableObjectAdapter, and remote references have to be downcast using the PortableRemoteObject.narrow() method rather than a normal Java cast.
We have barely scratched the surface of RMI, the underlying plumbing behind today's enterprise Java technologies. I hope this serves to whet your appetite for the behind-the-scenes work that we take for granted every day with modern frameworks. RMI alone is not what makes enterprise Java the behemoth that it is; it takes a host of other enabling technologies, which we shall examine in the future.