Safe Unsafe: How to Write Portable and Production Quality Code Using Unsafe
Unsafe! Get it out, get it out! But... wait. Maybe, if you're very careful, you can use Unsafe's benefits without causing your building to burn down.
Unsafe (sun.misc.Unsafe) is one of the least understood and most mysterious aspects of Java. As the name suggests, improper use of it can lead you to violate the ‘safety’ guarantees provided by Java, so the JDK builds in safeguards to make it difficult for the Java programmer to access it. In particular, trying to obtain a reference to its singleton instance in the JVM will throw a SecurityException in most cases.
The JDK itself uses Unsafe. You can see that in the Atomic classes: AtomicInteger, AtomicLong, etc., where you will see that Unsafe provides access to the CAS calls that these atomics use.
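To make the CAS idea concrete, here is a minimal sketch of a compare-and-set retry loop built only on the public AtomicInteger API. The class name CasCounter and its methods are hypothetical and exist only for illustration; this is not how the JDK implements its atomics internally.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class, not part of the article's utility code.
final class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    /** Adds delta with an explicit compare-and-set retry loop and returns the new value. */
    int add(int delta) {
        while (true) {
            int current = value.get();
            int next = current + delta;
            // compareAndSet succeeds only if no other thread changed the value in between
            if (value.compareAndSet(current, next)) {
                return next;
            }
            // otherwise another thread won the race: re-read and try again
        }
    }

    int get() {
        return value.get();
    }

    public static void main(String[] args) throws InterruptedException {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    counter.add(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(counter.get());
    }
}
```

The loop never blocks: a thread that loses the race simply re-reads the current value and retries.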
The reason why it is called Unsafe is because this class lets us manipulate memory addresses directly, much like in C/C++, and using this you can bypass the safety guarantees of Java. Unsafe also allows us to allocate and free memory directly, outside of the heap, and thus bypass the Garbage Collection mechanism for such memory. This means you’ll have to manage such memory yourself, and if not done properly this can lead to memory leaks.
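As a rough sketch of what off-heap allocation looks like, the following obtains the Unsafe singleton reflectively, writes a long to freshly allocated native memory, reads it back, and frees the memory. The class and method names are hypothetical. It assumes a HotSpot-style JDK that still exposes sun.misc.Unsafe; recent JDK releases have deprecated these memory-access methods, so it may produce warnings or stop working there.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical illustration class; assumes sun.misc.Unsafe is available on this JDK.
final class OffHeapLongDemo {

    /** Obtains the Unsafe singleton through its private static field. */
    static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe is not accessible on this JVM", e);
        }
    }

    /** Writes a long to off-heap memory and reads it back. */
    static long roundTrip(long value) {
        Unsafe unsafe = loadUnsafe();
        long address = unsafe.allocateMemory(8); // 8 bytes, invisible to the garbage collector
        try {
            unsafe.putLong(address, value);
            return unsafe.getLong(address);
        } finally {
            unsafe.freeMemory(address); // forgetting this is a native memory leak
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42L));
    }
}
```

The try/finally is the important design choice: nothing else will ever reclaim this memory.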
Motivation: Why Would You Want to Use Unsafe?
Most Java programmers will not and should not use Unsafe — everything that a typical application needs to do can be done by regular Java. However, if you are writing high-performance Java applications which are sensitive to latency and throughput you may at some point have the need to use Unsafe directly in your programs. You may want to allocate memory directly off the heap for faster access, and you may also want to bypass garbage collection for objects. Garbage collection also introduces pauses (jitter) that may be unacceptable to low latency applications, requiring the use of off heap data structures managed by the application.
Pitfall: Outside the Safe Sandbox
The Java sandbox provides several guarantees, one of which is that it guards you against memory corruption. As long as you are using regular Java, the JVM will make all necessary checks (null pointer checks, array index checks, etc.) and prevent you from corrupting memory. If it detects application misbehavior, it will throw an exception which you may choose to catch and handle. At worst, it will throw an error such as OutOfMemoryError, which may be unrecoverable, but you are still inside the sandbox.
This guarantee goes out of the window once you start using Unsafe. Since you are using memory addresses directly, you may end up getting a segmentation violation, much like in C/C++, if you are not careful. E.g., the following code snippet, which tries to read a byte from memory address -1L ...
long badAddress = -1L ;
byte b = unsafe.getByte(badAddress) ;
... results in the following error on my PC:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# EXCEPTION_ACCESS_VIOLATION (0xc0000005) at , ,
#
# JRE version: Java(TM) SE Runtime Environment (8.0_60-b27) (build 1.8.0_60-b27)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.60-b23 mixed mode windows-amd64 )
# Problematic frame:
# V [jvm.dll+0x1e48df]
#
Pitfall: Platform Dependence
As it is probably clear by now, code that uses Unsafe is highly platform dependent. In order to write code that will work you will have to be aware of the memory layout of Java objects, the addressing mechanism, as well as other quirks of the JVM and the platform. Which brings us to the natural question: why do it?
What’s Wrong With Java’s Write Once Run Everywhere?
Nothing. In fact, it’s a great feature of the language, probably one of the foremost reasons behind its popularity. The JVM abstracts the platform from the programmer and handles all the quirks, so that the application can focus on its own logic. However, this convenience comes at a price: relative slowness. Modern JITs have addressed many of these slowness issues, but the garbage collection pauses can still be significant.
As an example, if you are writing a market data handler that connects to an exchange which is pumping in a million messages per second over a channel (multicast address) and your handler needs to normalize, enrich and distribute those messages in real time while sustaining the message rate, your primary concern would not be to write a program that will run on every conceivable platform but to squeeze out the last ounce of processing horsepower given your organization’s platform.
In a real-life scenario, most organizations have a finite number of platforms that they have standardized on at any given time. As an example, the production servers may consist entirely of Linux hosts running 64-bit Java 8 with compressed oops. This means that your UAT and QA clusters will have similar machines. Your development environment may be Windows, Linux, or Mac, but you should still run the same Java version with the same architecture; otherwise, you will be in for surprises once you deploy your code.
In fact, if you are doing low latency programming then your Java code probably already has some platform dependency built into it in the form of ‘mechanical sympathy’ considerations.
Realistic Production Environments
At the current time, most organizations are running one of the following versions of Java in their production environments: Java 6, Java 7, or Java 8.
Each of these versions has a 32-bit as well as a 64-bit JVM flavor and the 64 bit JVM comes in two flavors: Compressed Oops and regular.
This gives us the following nine possible production configurations:
Fig. 1: Possible production configurations
This is good news: all you have to worry about is to ensure that your code which uses Unsafe will work in these configurations.
What Is Compressed Oops?
With a 32-bit architecture, the maximum memory that can be addressed is 2^32 bytes, which is about 4 GB. This may be insufficient for many modern applications; hence, we need to use a 64-bit architecture. With 64 bits we can address 2^64 bytes, about 2^34 GB, which is far more than any amount of physical memory we have. However, a 64-bit architecture also requires us to store each memory address (pointer/reference) as 64 bits, consuming twice the amount of storage required for storing the same number of memory addresses in a 32-bit architecture. This is why, when we move a 32-bit Java application to 64 bit, we may see a sudden increase in its heap usage.
To get around the problem of needing twice the amount of storage to store addresses, 64-bit JVMs now use a mechanism called Compressed Oops (Compressed Ordinary Object Pointers).
The word length of a 64-bit machine is 8 bytes. If we allow objects in memory to start only at word boundaries, i.e., at byte addresses which are multiples of 8, then the address of each such object will be a number that is divisible by 8, that is, it will have 3 trailing zeros in its binary representation. We can compress such an address by right shifting it by 3 bits to drop these zeros. In this way, we can fit an address of up to 35 bits into a 32-bit word. 2^35 bytes is 32 GB, so using this compression scheme we can address up to 32 GB while still using 32-bit addresses.
That is the default mode of behavior in Java 7 and 8 JVMs. If the JVM detects that the maximum heap size of the application is less than 32 GB, then it uses compressed oops by default. You can also turn this on or off explicitly using the -XX:+UseCompressedOops or -XX:-UseCompressedOops VM argument.
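The shift arithmetic described above can be sketched in plain Java, without touching Unsafe. This is an illustration only: the class and method names are hypothetical, and it assumes the simplest scheme from the text (a 3-bit shift and no base address added to the result).

```java
// Hypothetical illustration of the 3-bit shift scheme; assumes a zero base address.
final class CompressedOopMath {
    static final int SHIFT = 3;                             // log2 of the 8-byte alignment
    static final long MAX_ADDRESSABLE = 1L << (32 + SHIFT); // 2^35 bytes = 32 GB

    /** Packs an 8-byte-aligned address below 2^35 into 32 bits. */
    static int compress(long address) {
        if ((address & ((1L << SHIFT) - 1)) != 0) {
            throw new IllegalArgumentException("address is not 8-byte aligned");
        }
        if (address < 0 || address >= MAX_ADDRESSABLE) {
            throw new IllegalArgumentException("address does not fit in 35 bits");
        }
        return (int) (address >>> SHIFT); // drop the three trailing zero bits
    }

    /** Restores the full address: widen without sign extension, then shift back. */
    static long decompress(int narrow) {
        return (narrow & 0xFFFFFFFFL) << SHIFT;
    }

    public static void main(String[] args) {
        long address = 0x7_FFFF_FFF8L; // highest aligned address below 32 GB
        int narrow = compress(address);
        System.out.println(Integer.toHexString(narrow));
        System.out.println(Long.toHexString(decompress(narrow)));
    }
}
```

Note that the compressed value of a high address is negative when viewed as a Java int, which is why the decompression step has to mask before shifting.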
Using compressed oops poses some extra challenges in address arithmetic: now we have to decompress the address we obtain before we can use it. The following code snippet illustrates the steps.
int classAddress = unsafe.getInt (object, klassOffset) ;
long longClassAddress = signedToUnsigned (classAddress) ;
longClassAddress = longClassAddress << 3 ; // shift to decompress
Line 1 reads a 32-bit compressed oop using unsafe. Since Java types are signed while memory addresses are not, line 2 converts the 4-byte int read to an unsigned long of 8 bytes. The last line does the left shift to decompress the address. The resulting address will represent a memory location now.
private static long signedToUnsigned (int value) {
if (value >= 0) {
return value;
}
else {
return (~0L >>> 32) & value;
}
}
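To see why the helper is needed, the short sketch below contrasts a plain cast, which sign-extends, with the masking approach. It also calls Integer.toUnsignedLong, which does the same job but exists only from Java 8 onward, so it would not be an option for the Java 6 and 7 targets discussed here. The class name is hypothetical.

```java
// Hypothetical illustration class comparing widening strategies.
final class UnsignedWidening {

    /** Masks after widening so that the upper 32 bits are always zero. */
    static long maskWiden(int value) {
        return value & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        int raw = 0xF0000000; // high bit set, so negative as a Java int
        System.out.println(Long.toHexString((long) raw));                  // plain cast sign-extends
        System.out.println(Long.toHexString(maskWiden(raw)));              // masked widening
        System.out.println(Long.toHexString(Integer.toUnsignedLong(raw))); // Java 8+ library call
    }
}
```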
Memory Layout of Java Object Instances
Since we will be reading class and instance properties using Unsafe we need to understand how the JVM stores Java Object instances and Class instances in memory. The best resource for that is to read the C++ code that implements the JVM, specifically, the files:
oop.hpp (for ver 7 : http://hg.openjdk.java.net/jdk7/jdk7/hotspot/file/9b0ca45cd756/src/share/vm/oops/oop.hpp)
klass.hpp (for ver 7 : http://hg.openjdk.java.net/jdk7/jdk7/hotspot/file/d61c7c22b25c/src/share/vm/oops/klass.hpp)
oop.hpp provides the memory layout of each instance of a Java object while klass.hpp provides the layout of the shared class instance. Keep in mind that these implementations are version specific, so you’ll need to study the implementations of all the versions for which you plan to write your code.
From the version 7 code of oop.hpp in the above link we see the following declared at the beginning:
class oopDesc {
friend class VMStructs;
private:
volatile markOop _mark;
union _metadata {
wideKlassOop _klass;
narrowOop _compressed_klass;
} _metadata;
So, we know that the first variable in an object instance is called _mark, and that it is followed by a pointer to the class instance itself, which is called _metadata.
The comments section of klass.hpp shows :
// Klass layout:
// [header ] klassOop
// [klass pointer ] klassOop
// [C++ vtbl ptr ] (contained in Klass_vtbl)
// [layout_helper ]
// [super_check_offset ] for fast subtype checks
The field we are most interested in here is the fourth field, layout_helper, which stores the size of an object if it’s not an array. For an array, it stores structural information about the array. The JVM reads this field to know the amount of memory to allocate for new object instances, and we will do so too.
The layout shown above may change from version to version. In particular, in Java 8 the klass layout was changed to remove the first two fields, header and klass pointer, as part of the move of all class data from perm gen to metaspace.
Going through the files above yields a picture like this:
Fig. 2: Java 7 Object and Class instance memory layout
The left-hand layout above is of an object instance for Java 7 and gives its size for each of the three configurations. Notice how sizes increase as we go from 32 bit to 64 bit compressed and then to full 64 bit.
In particular, we are interested in the second field, _metadata, which stores the address of the shared class instance. It’s important to calculate the size of each field as above so that we know which offset to read a field from and how many bytes to read. For example, to read the _metadata field we’ll read 4 bytes from offset 8 in 64-bit compressed mode, but we’ll need to read 8 bytes from the same offset for full 64 bit.
The right-hand layout shows the shared class instance. Here we are interested in the 4-byte field layout_helper, which for non-arrays stores the size of each instance of the class. It is available at offset 12 or 24, depending on the configuration. For arrays, it stores array structure information.
The above layout holds for Java 6 too, but in Java 8 it changes as below.
Fig. 3: Java 8 Object and Class instance memory layout
Notice how they’ve gotten rid of two fields in the shared class instance.
Write Production Quality Code Using Unsafe
Using the information above, we are now in a position to write production quality code using Unsafe. Recall that our code will need to work in the nine configurations listed in Fig. 1.
We will implement two utility functions in this installment: a function to compute the shallow size of an object and a function to shallow clone an object. Both are useful in high-performance Java. While it is possible to estimate the size of an object, this function will give us the exact size allocated by the JVM. Further, since it is production quality, we will implement it for all our target configurations. Garbage collection is a significant performance issue, and it is related to the amount of memory the application is consuming and the number of temporary objects it is creating. You may want to incorporate the shallow sizeof as part of your testing of every significant object you create.
While Java provides a shallow cloning mechanism, it will only work if your class implements the Cloneable marker interface and overrides the clone() method. This can certainly be done for every class you want to clone, but you may want to clone objects whose source code you do not have or are not able to modify. In such cases, this clone function is very handy.
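For contrast, this is roughly what the regular Java cloning mechanism requires of a class. The Order class below is hypothetical and exists only to show the pattern: implement Cloneable, override clone(), and accept that the copy is shallow.

```java
// Hypothetical example class showing the standard Cloneable pattern.
final class Order implements Cloneable {
    final long id;
    final StringBuilder notes; // a mutable field, to show that the copy is shallow

    Order(long id, StringBuilder notes) {
        this.id = id;
        this.notes = notes;
    }

    @Override
    public Order clone() {
        try {
            // Object.clone() copies every field value as-is; references are shared
            return (Order) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError("cannot happen: Order implements Cloneable", e);
        }
    }

    public static void main(String[] args) {
        Order original = new Order(1L, new StringBuilder("rush"));
        Order copy = original.clone();
        System.out.println(copy != original);            // a distinct object
        System.out.println(copy.notes == original.notes); // but the same StringBuilder
    }
}
```

None of this is possible for a class you cannot edit, which is the gap the Unsafe-based clone is meant to fill.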
Detect the Configuration
Our first task in writing production quality sizeof and clone functions is to detect the configuration. We have to detect which of the nine configurations our code is running on and adjust accordingly.
We start by declaring a bunch of variables. Since all these properties are VM-wide, I will implement everything as static here:
// Constants
private static final String JAVA_VERSION_PROPERTY = "java.version";
private static final String HOTSPOT_BEAN_CLASS = "com.sun.management.HotSpotDiagnosticMXBean";
private static final String HOTSPOT_BEAN_TYPE = "com.sun.management:type=HotSpotDiagnostic" ; // full ObjectName of the HotSpot diagnostic MXBean
// The runtime bean to get information about the runtime
private static final RuntimeMXBean vmBean = ManagementFactory.getRuntimeMXBean( );
// Configuration variables
// a flag to tell us if it's okay to proceed with this configuration
private static boolean canDo = true ;
private static int version ; // the Java version
private static int addressSizeInBytes ; // whether 32 or 64 bit
// the offset of the _metadata field inside an object instance, this stores the
// address of the shared class instance
private static long klassOffset ;
// the offset of the layout_helper field inside the klass object
private static long layoutHelperOffset ;
private static boolean isCompressedOops ; // are we running on compressed oops
// a single element helper array to obtain object addresses
private static Object[] singleElementObjectArray = new Object[1] ;
// offset of the first (and only) element in the above array
private static long objArrayBaseOffset ;
private static Unsafe unsafe; // the unsafe object
Then, helper functions to detect the configuration and store the offsets, etc in these variables:
Determine 32- or 64-bit:
private static void getMemoryModel () {
String dataModel = System.getProperty("sun.arch.data.model") ;
try {
int bits = Integer.parseInt(dataModel) ;
if ((bits & 7) != 0) {
canDo = false ;
}
else {
addressSizeInBytes = bits / 8 ;
}
}
catch (Exception e) {
canDo = false ;
}
}
Get the Java version:
private static void getJavaVersion () {
String[] split = System.getProperty(JAVA_VERSION_PROPERTY).split("\\.") ;
if (split.length > 1) {
try {
int ver = Integer.parseInt(split[1]) ;
switch (ver) {
case 6:
case 7:
case 8:
version = ver ;
break ;
default :
canDo = false ;
break ;
}
}
catch (Exception e) {
canDo = false ;
}
}
else {
canDo = false ;
}
}
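The parser above relies on the legacy "1.x" version string, which is all the three target versions need. As a hedged aside, here is a sketch of how the major version could be extracted from both the legacy form and the newer form used from Java 9 onward (for example "17.0.2"). The class and method names are hypothetical, and supporting a newer version would still require verifying the object layout for that release.

```java
// Hypothetical helper; not part of the article's utility class.
final class JavaVersionParser {

    /**
     * Returns the major Java version for strings such as "1.8.0_60" (8) or "17.0.2" (17).
     * Throws NumberFormatException if the string does not start with a number.
     */
    static int majorVersion(String version) {
        String[] parts = version.split("[._\\-+]");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]); // legacy scheme: 1.6, 1.7, 1.8
        }
        return first;                          // newer scheme: 9, 11, 17, ...
    }

    public static void main(String[] args) {
        System.out.println(majorVersion(System.getProperty("java.version")));
    }
}
```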
Determine if the JVM is using compressed oops:
private static void getCompressedOops () {
if (addressSizeInBytes == 8) {
try {
final Class<?> beanClazz = Class.forName(HOTSPOT_BEAN_CLASS);
final Object hotSpotBean =
ManagementFactory.newPlatformMXBeanProxy(ManagementFactory.getPlatformMBeanServer(),
HOTSPOT_BEAN_TYPE, beanClazz);
if (hotSpotBean != null) {
final Method getVMOptionMethod = beanClazz.getMethod("getVMOption", String.class);
final Object vmOption = getVMOptionMethod.invoke(hotSpotBean, "UseCompressedOops");
isCompressedOops =
Boolean.parseBoolean(vmOption.getClass().getMethod("getValue").
invoke(vmOption).toString());
}
else {
canDo = false ;
}
}
catch (Exception e) {
canDo = false ;
}
}
else {
isCompressedOops = false ;
}
}
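If Java 6 support is not required, the same check can be written without reflection. The sketch below assumes a HotSpot-derived JDK where com.sun.management.HotSpotDiagnosticMXBean is available at compile time and at run time, and uses ManagementFactory.getPlatformMXBean, which was added in Java 7. The class name is hypothetical, and returning null for "unknown" is a choice made for this illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical illustration class; assumes a HotSpot-derived JDK, Java 7 or later.
final class CompressedOopsProbe {

    /** Returns TRUE or FALSE when the VM reports the flag, or null when it cannot be determined. */
    static Boolean usesCompressedOops() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return null;
            }
            // getVMOption throws IllegalArgumentException if the VM has no such flag
            return Boolean.valueOf(bean.getVMOption("UseCompressedOops").getValue());
        } catch (RuntimeException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops: " + usesCompressedOops());
    }
}
```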
Set offsets appropriately:
private static void setOffsets () {
if (addressSizeInBytes == 8) { // 64 bit
klassOffset = 8 ; // _mark is always size of native word
if (version == 8) {
layoutHelperOffset = 8 ; // _vtbl_ptr is always size of native word
}
else { // ver 6 & 7
layoutHelperOffset = 24 ; // for version 6 & 7 this is
// always 24 : 8 (_mark) + 8 (_klass) +
// 8 (_vtbl_ptr)
}
}
else { // 32 bit
klassOffset = 4 ; // _mark is always size of native word
if (version == 8) {
layoutHelperOffset = 4 ; // _vtbl_ptr is always size of native word
}
else { // ver 6 & 7
layoutHelperOffset = 12 ; // 4 (_mark) + 4 (_klass) + 4 (_vtbl_ptr)
}
}
}
Get the Unsafe object:
private static void getUnsafe () {
try {
Field field = Unsafe.class.getDeclaredField( "theUnsafe" );
field.setAccessible( true );
unsafe = (Unsafe)field.get( null );
}
catch (Exception e) {
canDo = false ;
}
}
Finally, wire everything up together in a static initializer:
static {
if (!vmBean.getVmName().toLowerCase().contains("hotspot")) {
canDo = false ;
}
else {
getMemoryModel () ;
if (canDo) {
getJavaVersion () ;
}
if (canDo) {
getCompressedOops () ;
}
if (canDo) {
setOffsets () ;
}
if (canDo) {
getUnsafe () ;
}
if (canDo) { // getUnsafe() may have failed, leaving unsafe null
objArrayBaseOffset = unsafe.arrayBaseOffset(Object[].class);
}
}
}
The shallowSizeOf Method:
public static long shallowSizeOf (Object object) throws UnsafeException {
if (canDo) {
if (object != null) {
if (object.getClass().isArray()) {
return getArrayInstanceSize (object) ;
}
else {
return unsafe.getInt(getAbsoluteClassAddress (object) + layoutHelperOffset );
}
}
else {
throw new UnsafeException ("Null object passed") ;
}
}
else {
throw new UnsafeException ("I don't know much about this VM") ;
}
}
We determine if the object passed is an instance of an array type or not. This is important because the memory layouts of the two are different.
In the case where it is a non-array object instance, we read the _metadata field using the appropriate offset, convert it to unsigned if needed, and uncompress it if needed. The result is the address of the shared class instance. Once we have that, we read the layout_helper field of the shared class instance, which is the size of each instance of the class.
// Return the address of the shared class instance of the object
private static long getAbsoluteClassAddress (Object object) {
long longClassAddress ;
if (addressSizeInBytes == 8) {
if (isCompressedOops) {
int classAddress = unsafe.getInt (object, klassOffset) ;
longClassAddress = signedToUnsigned (classAddress) ;
longClassAddress = longClassAddress << 3 ; // shift to decompress
}
else {
longClassAddress = unsafe.getLong (object, klassOffset) ;
}
}
else {
int classAddress = unsafe.getInt (object, klassOffset) ;
longClassAddress = signedToUnsigned (classAddress) ;
}
return longClassAddress ;
}
private static long signedToUnsigned (int value) {
if (value >= 0) {
return value;
}
else {
return (~0L >>> 32) & value;
}
}
In case of arrays, the logic is different. The length (size) of the array is stored in each instance, since instances of arrays of the same type can have different sizes. This is a field called length located right after the _metadata (klass pointer) field. We read this.
Inside the klass instance of the array type, the layout_helper has four different pieces of information, each in one byte. From the klass.hpp for version 7 we see the following structure for arrays:
// For arrays, layout helper is a negative number, containing four
// distinct bytes, as follows:
// MSB:[tag, hsz, ebt, log2(esz)]:LSB
// where:
// tag is 0x80 if the elements are oops, 0xC0 if non-oops
// hsz is array header size in bytes (i.e., offset of first element)
// ebt is the BasicType of the elements
// esz is the element size in bytes
The data we are interested in here are the least significant byte, which contains the Log 2 value of the element size, and the third byte, which gives the header size of the array. Knowing the header size, the number of elements and size of each element we can compute the total size required by the array. Then we need to round it up to the nearest multiple of 8 since Java objects start at 8-byte boundaries.
private static long getArrayInstanceSize (Object object) {
int layoutHelper = unsafe.getInt(getAbsoluteClassAddress (object) + layoutHelperOffset);
int baseOffset = (layoutHelper & 0x00ff0000) >>> 16 ; // header size is the 3rd byte from the LSB
int log2ElementSize = (layoutHelper & 0x000000ff) ; // LSB is log2 of the element size
int length = getLengthOfArray (object) ; // get number of elements
// left shifting length by log2ElementSize bits = length x element size;
// widen to long first so that very large arrays do not overflow an int
long sizeWithoutHeader = (long) length << log2ElementSize ;
long rawSize = baseOffset + sizeWithoutHeader ;
return ((rawSize + 7) / 8) * 8 ; // round up to 8 byte word alignment
}
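The listing calls getLengthOfArray, which is not shown. The sketch below is one possible stand-in rather than the original helper: it reads the length with the standard java.lang.reflect.Array.getLength instead of reading the length field through Unsafe, and it repeats the size arithmetic as pure functions so that it can be checked without a JVM-specific layout. The class name, method names, and the sample layout_helper values are assumptions made for illustration.

```java
import java.lang.reflect.Array;

// Hypothetical illustration class; the layout_helper values used in main are made up.
final class ArrayLayoutMath {

    /** Header size in bytes: the third byte of layout_helper, counting from the LSB. */
    static int headerSize(int layoutHelper) {
        return (layoutHelper >>> 16) & 0xFF;
    }

    /** log2 of the element size: the least significant byte of layout_helper. */
    static int log2ElementSize(int layoutHelper) {
        return layoutHelper & 0xFF;
    }

    /** Header plus elements, rounded up to the next multiple of 8 bytes. */
    static long arraySize(int layoutHelper, int length) {
        long raw = headerSize(layoutHelper) + ((long) length << log2ElementSize(layoutHelper));
        return (raw + 7) & ~7L;
    }

    /** Same computation, taking the length from the array itself without Unsafe. */
    static long arraySizeOf(int layoutHelper, Object array) {
        return arraySize(layoutHelper, Array.getLength(array));
    }

    public static void main(String[] args) {
        // Illustrative value: tag 0xC0, 16-byte header, element type byte 0x0A, 4-byte elements
        int intArrayLayout = 0xC0100A02;
        System.out.println(arraySizeOf(intArrayLayout, new int[20]));
    }
}
```

Using the reflective length keeps the example portable; the Unsafe-based read described in the text avoids the reflective call but depends on the header layout of the running configuration.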
Now for the shallowClone method. This method is passed two objects, the source and the destination. It copies the source to the destination byte by byte. You can use this to change immutables too:
public static void testShallowClone () throws UnsafeException {
Integer integerOne = new Integer (1) ;
Integer integerTwo = new Integer (2) ;
System.out.println ("integerTwo = " + integerTwo) ;
System.out.println ("integerTwo.equals (integerOne) = " + integerTwo.equals(integerOne)) ;
shallowClone (integerOne, integerTwo) ;
System.out.println ("integerTwo = " + integerTwo) ;
System.out.println ("integerTwo.equals (integerOne) = " + integerTwo.equals(integerOne)) ;
}
Outputs:
integerTwo = 2
integerTwo.equals (integerOne) = false
integerTwo = 1
integerTwo.equals (integerOne) = true
So, be careful.
public static void shallowClone (Object from, Object to) throws UnsafeException {
if (canDo) {
if (from == null || to == null || !isInstanceOfSameClass (from, to)) {
return ;
}
else {
unsafe.copyMemory(getInstanceAddress (from),
getInstanceAddress (to),
shallowSizeOf (from));
}
}
else {
throw new UnsafeException ("I don't know much about this VM") ;
}
}
private static boolean isInstanceOfSameClass (Object o1, Object o2) {
return getAbsoluteClassAddress (o1) == getAbsoluteClassAddress (o2) ;
}
private static long getInstanceAddress (Object instance) {
singleElementObjectArray[0] = instance ;
long longInstanceAddress ;
if (addressSizeInBytes == 8) {
if (isCompressedOops) {
int instanceAddress = unsafe.getInt (singleElementObjectArray, objArrayBaseOffset) ;
longInstanceAddress = signedToUnsigned (instanceAddress) ;
longInstanceAddress = longInstanceAddress << 3 ; // shift to uncompress
}
else {
longInstanceAddress = unsafe.getLong (singleElementObjectArray, objArrayBaseOffset) ;
}
}
else {
int instanceAddress = unsafe.getInt (singleElementObjectArray, objArrayBaseOffset) ;
longInstanceAddress = signedToUnsigned (instanceAddress) ;
}
return longInstanceAddress ;
}
Testing Your Code
I used the following to test my code and repeated the test under all nine configurations from Fig. 1 above:
public static void main (String[] args) throws UnsafeException {
printVMInfo () ;
System.out.println("");
testShallowSizeOf () ;
System.out.println("");
testShallowClone () ;
}
public static void printVMInfo () throws UnsafeException {
System.out.println("VM Name : " + vmBean.getVmName()) ;
System.out.println("VM Vendor : " + vmBean.getVmVendor()) ;
System.out.println("VM Version : " + vmBean.getVmVersion()) ;
System.out.println("Spec Name : " + vmBean.getSpecName()) ;
System.out.println("Spec Vendor : " + vmBean.getSpecVendor()) ;
System.out.println("Spec Version : " + vmBean.getSpecVersion()) ;
System.out.println("Java Version : " + System.getProperty(JAVA_VERSION_PROPERTY)) ;
System.out.println("Data Model : " + System.getProperty("sun.arch.data.model")) ;
System.out.println("canDo : " + canDo) ;
System.out.println("version : " + version) ;
System.out.println("addressSizeInBytes : " + addressSizeInBytes) ;
System.out.println("klassOffset : " + klassOffset) ;
System.out.println("layoutHelperOffset : " + layoutHelperOffset) ;
System.out.println("isCompressedOops : " + isCompressedOops) ;
System.out.println("unsafe : " + unsafe) ;
}
public static void testShallowSizeOf () throws UnsafeException {
System.out.println("sizeOf(String()) : " + shallowSizeOf (new String (""))) ;
System.out.println("sizeOf(String(abcd)) : " + shallowSizeOf (new String ("abcd"))) ;
System.out.println("sizeOf(Integer(0)) : " + shallowSizeOf (new Integer (0))) ;
System.out.println("sizeOf(Long(0)) : " + shallowSizeOf (new Long (0))) ;
System.out.println("sizeOf(byte[19]) : " + shallowSizeOf (new byte[19])) ;
System.out.println("sizeOf(byte[20]) : " + shallowSizeOf (new byte[20])) ;
System.out.println("sizeOf(short[20]) : " + shallowSizeOf (new short[20])) ;
System.out.println("sizeOf(int[20]) : " + shallowSizeOf (new int[20])) ;
System.out.println("sizeOf(long[20]) : " + shallowSizeOf (new long[20])) ;
System.out.println("sizeOf(double[20]) : " + shallowSizeOf (new double[20])) ;
System.out.println("sizeOf(String[20]) : " + shallowSizeOf (new String[20])) ;
}
public static void testShallowClone () throws UnsafeException {
Integer integerOne = new Integer (1) ;
Integer integerTwo = new Integer (2) ;
System.out.println ("integerTwo = " + integerTwo) ;
System.out.println ("integerTwo.equals (integerOne) = " + integerTwo.equals(integerOne)) ;
shallowClone (integerOne, integerTwo) ;
System.out.println ("integerTwo = " + integerTwo) ;
System.out.println ("integerTwo.equals (integerOne) = " + integerTwo.equals(integerOne)) ;
}
The UnsafeException class:
public static class UnsafeException extends Exception {
public UnsafeException(String message) {
super(message);
}
public UnsafeException(Throwable cause) {
super(cause);
}
public UnsafeException(String message, Throwable cause) {
super(message, cause);
}
}
Finally, it was interesting to see how sizes of some Java types change with the configurations:
That’s it for now. We will create off heap data structures and run some benchmarks in the next installment.