Shared Variable Optimization Within a Loop

Let's analyze an example Java application to look into its problems and learn the performance optimization possibilities.

Recently I attended the GeeCon Krakow conference, and during one of the talks, the famous Venkat Subramaniam shared an interesting small application which captured my attention and got stuck in my mind. Going home, I decided to zoom into the problem to better understand what happens under the hood in such cases.

Below is the initial code, inspired by Venkat's example (slightly modified, but the idea is the same):

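public class BusyWaitingLoopTrick {

    // Plain (non-volatile) flag shared between the main thread and the worker
    static boolean canRun = true;

    public static void main(String[] args) throws InterruptedException {

        Thread thread = new Thread(new Runnable() {
            public void run() {
                System.out.println("Thread starting to run");

                int localCounter = 0;
                while (canRun) {
                    localCounter++;
                }

                System.out.println("Thread exiting");
            }
        });
        thread.start();

        // Give the busy waiting loop enough time to become hot and get compiled
        Thread.sleep(5000);

        System.out.println("Main telling Thread to stop");
        canRun = false;
    }
}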

which prints:

Thread starting to run
Main telling Thread to stop

However, as you might notice, it does not print the message

Thread exiting

and the program hangs.

This case was presented by Venkat and I have decided to dig into it in order to better understand the cause, and to share it with you!

Since I am very keen on the runtime optimizations performed by the HotSpot JVM, I ran the same program and looked at the generated assembly. Below is a heavily simplified version of it (with a few sections removed):

       call r10                       //*getstatic canRun   
       movabs r10,0x6d264f8c0         // {oop(a 'java/lang/Class'{0x00000006d264f8c0} = 'BusyWaitingLoopTrick')}
       movzx r11d,BYTE PTR [r10+0x70]
       test r11d,r11d                 // EXPLICIT BOOLEAN VALUE CHECK OUTSIDE LOOP !
       je L0001
L0000: inc ebx                        // START LOOP 
       mov r10,QWORD PTR [r15+0x70]   // - BusyWaitingLoopTrick$1::run@19 (line 15)
       test DWORD PTR [r10],eax       // {poll} *** SAFEPOINT POLL ***
       jmp L0000                      // END LOOP  
L0001: mov edx,0xffffff86

The code between START LOOP and END LOOP (i.e. the compiled while loop) explains what really happens and why the program hangs:

  • The Just In Time Compiler hoists the conditional test (i.e. canRun == true) out of the busy waiting loop: the field is read and tested once, before the loop is entered. The loop itself ends with an unconditional jump (jmp L0000), so it loops forever, without any conditional exit.
  • The SAFEPOINT POLL is also inserted by the Just In Time Compiler. It lets the JVM bring the thread to a safepoint (e.g. for Stop The World pauses), but it has nothing to do with the correctness of the program.

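So, the run() method becomes hot and gets compiled (via On Stack Replacement, because of the long-running while loop) while the main thread sleeps for 5 seconds (i.e. Thread.sleep(5000)). Even though canRun is set to false afterwards (i.e. canRun = false), this has no impact on the compiled run() method, which no longer tests the flag inside the loop. The optimization is legal because canRun is a plain, non-volatile field and the loop contains no synchronization, so the Java Memory Model does not require the worker thread to ever observe the write.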
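To make the effect of the hoisted check more tangible, here is a small single-threaded sketch of the two loop shapes. It is only an illustration: the class and method names are hypothetical, the write to canRun is simulated inside the loop instead of coming from another thread, and the maxIterations bound exists only so the example terminates.

```java
// Hypothetical illustration; names and the iteration bound are not from the original program.
class HoistedReadSketch {

    static boolean canRun = true;

    // Mimics the compiled loop: the flag is read once, before the loop starts.
    static int loopWithHoistedRead(int flipAt, int maxIterations) {
        canRun = true;
        boolean snapshot = canRun;      // the single, hoisted read
        int localCounter = 0;
        while (snapshot && localCounter < maxIterations) {
            localCounter++;
            if (localCounter == flipAt) {
                canRun = false;         // stands in for the main thread's write
            }
        }
        return localCounter;
    }

    // Mimics the interpreted loop: the flag is re-read on every iteration.
    static int loopWithFreshRead(int flipAt, int maxIterations) {
        canRun = true;
        int localCounter = 0;
        while (canRun && localCounter < maxIterations) {
            localCounter++;
            if (localCounter == flipAt) {
                canRun = false;         // stands in for the main thread's write
            }
        }
        return localCounter;
    }

    public static void main(String[] args) {
        System.out.println(loopWithHoistedRead(10, 1_000)); // prints 1000
        System.out.println(loopWithFreshRead(10, 1_000));   // prints 10
    }
}
```

The first loop never notices that the flag changed and only stops at the artificial bound, while the second one exits as soon as the flag is cleared.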

Now, with a better understanding of what happens and why the program hangs, we might ask, "But how can we make the code work without the program hanging?" In the following sections, I will present four possible solutions (and of course, they are not exclusive).

Solution 1

The simplest one, which does not even touch the code, is to start the program with the Just In Time Compiler disabled (i.e. bypassing the compiler optimization of the conditional test within the loop).

This can be achieved by starting the HotSpot JVM with the flag -Djava.compiler=none (i.e. running only in the Interpreter). The standard -Xint option has the same effect and is the safer choice, since the java.compiler property has been phased out in newer JDK releases. Relaunching the program prints:

Thread starting to run
Main telling Thread to stop
Thread exiting

and the program terminates successfully. However, this solution is rarely feasible, because disabling the Just In Time Compiler slows the application down considerably.
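When experimenting with this solution, it can help to confirm that the JVM really runs in the Interpreter. The sketch below is a hypothetical helper, not part of the original program. It assumes a HotSpot JVM, where the java.vm.info system property typically reports something like "mixed mode" by default and "interpreted mode" when the JIT is disabled; other JVMs may not define the property at all, hence the fallback value.

```java
// Hypothetical helper; not part of the original program.
class ExecutionModeSketch {

    // "java.vm.info" is HotSpot-specific, so fall back to a placeholder elsewhere.
    static String executionMode() {
        return System.getProperty("java.vm.info", "unknown");
    }

    public static void main(String[] args) {
        System.out.println("Execution mode: " + executionMode());
    }
}
```

Running it once with and once without the interpreter-only flag shows whether the flag was picked up.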

Solution 2

Another approach is to simply make the canRun variable volatile, as follows:

static volatile boolean canRun = true;

In this case, the generated code for the while loop looks like:

L0000: mov r11,QWORD PTR [r15+0x70]     // START LOOP 
       inc ebx                          // - BusyWaitingLoopTrick$1::run@19 (line 15)
       test DWORD PTR [r11],eax         // {poll} *** SAFEPOINT POLL ***
L0001: movzx r11d,BYTE PTR [r10+0x70]   //*getstatic canRun 
       test r11d,r11d                   // EXPLICIT BOOLEAN VALUE CHECK ! 
       jne L0000                        // END LOOP

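As you can see, the explicit boolean check is now kept within the loop. Because the field is volatile, the Just In Time Compiler is not allowed to hoist or cache the read: every iteration has to reload canRun from memory, hence the check condition is preserved. The Java Memory Model also guarantees that the write performed by the main thread becomes visible to the worker thread.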
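For completeness, here is a minimal, self-contained sketch of the volatile variant. The class name, the workerStops helper, the daemon setting, and the timings are assumptions made for illustration. The idea is the same as in the original program: the worker spins on the flag, and the main thread clears it after a pause.

```java
// Hypothetical variant of the program above; names and timings are assumptions.
class VolatileFlagSketch {

    static volatile boolean canRun = true;

    // Starts the busy waiting worker, clears the flag after spinMillis,
    // and reports whether the worker exited within joinMillis.
    static boolean workerStops(long spinMillis, long joinMillis) {
        canRun = true;
        Thread worker = new Thread(() -> {
            int localCounter = 0;
            while (canRun) {            // volatile read on every iteration
                localCounter++;
            }
        });
        worker.setDaemon(true);         // do not keep the JVM alive if the worker never exits
        worker.start();
        try {
            Thread.sleep(spinMillis);   // let the loop run (and possibly get compiled)
            canRun = false;             // volatile write, visible to the worker
            worker.join(joinMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            canRun = false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Worker stopped: " + workerStops(2000, 1000));
    }
}
```

Dropping the volatile keyword from this sketch brings back the risk of the hang described earlier, although whether it actually shows up depends on the JVM and on how long the loop runs before the flag is cleared.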

Solution 3

This is actually Venkat's solution, which was presented during the talk. Basically, it inserts a Thread.sleep(0) in the busy waiting loop, as follows:

while (canRun) {
    localCounter++;
    try {
        Thread.sleep(0);
    } catch (Exception ex) { }
}

The generated optimized code looks like this:

L0000: inc ebp                  // START LOOP 
                                // - BusyWaitingLoopTrick$1::run@16 (line 15)
       xor edx,edx
       data16 xchg ax,ax
       call 0x00000200c4d17500  //*invokestatic sleep
                                // - BusyWaitingLoopTrick$1::run@20 (line 18)
L0001: movabs r10,0x6d264f938   // {oop(a 'java/lang/Class'{0x00000006d264f938} = 'BusyWaitingLoopTrick')}
       movzx r11d,BYTE PTR [r10+0x70] //*getstatic canRun
       test r11d,r11d           // EXPLICIT BOOLEAN VALUE CHECK ! 
       jne L0000                // END LOOP

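You might notice the boolean check is kept within the while loop together with the Thread.sleep() call, so the program finishes successfully after canRun is set to false by the main thread. Note, however, that this relies on how the current compiler treats the call rather than on a language guarantee: the Java Language Specification (§17.3) states that Thread.sleep has no synchronization semantics, so a compiler would still be free to read canRun only once.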

Solution 4

Another approach is to simply remove the Thread.sleep(5000) call. The main thread then sets canRun to false almost immediately, before the run() method has become hot enough to be compiled, so the loop still runs in the Interpreter and checks canRun on each iteration. Once the store performed by the main thread has drained from the CPU's store buffer, the cache coherency mechanisms propagate the updated canRun value to the other thread, and the program finishes almost immediately. This depends on timing, though, and is not guaranteed by the Java Memory Model.
