Shared Variable Optimization Within a Loop

Let's analyze an example Java application to look into its problems and learn the performance optimization possibilities.

by Ionut Balosin · May. 20, 18 · Tutorial

Recently I attended the GeeCon Krakow conference, and during one of the talks, the famous Venkat Subramaniam shared a small but interesting application which caught my attention and stuck in my mind. Back home, I decided to zoom in on the problem to better understand what happens under the hood in such cases.

Below is the initial code, inspired by Venkat (slightly modified, but the idea is the same):

public class BusyWaitingLoopTrick {

    static boolean canRun = true;

    public static void main(String[] args) throws InterruptedException {

        Thread thread = new Thread(new Runnable() {
            @Override
            public void run() {
                System.out.println("Thread starting to run");

                int localCounter = 0;
                while (canRun) {
                    localCounter++;
                }

                System.out.println("Thread exiting");
            }
        });
        thread.start();

        Thread.sleep(5000);
        System.out.println("Main telling Thread to stop");
        canRun = false;
    }
}

which prints:

Thread starting to run
Main telling Thread to stop

However, as you might notice, it does not print the message

Thread exiting

and the program hangs.

This case was presented by Venkat and I have decided to dig into it in order to better understand the cause, and to share it with you!

Since I am very keen on the performance optimizations triggered at runtime in the HotSpot JVM, I ran the same program and looked at the generated assembly. Below is a heavily simplified version of it (with a few sections removed):

...
       call r10                       //*getstatic canRun   
       movabs r10,0x6d264f8c0         // {oop(a 'java/lang/Class'{0x00000006d264f8c0} = 'BusyWaitingLoopTrick')}
       movzx r11d,BYTE PTR [r10+0x70]
       test r11d,r11d                 // EXPLICIT BOOLEAN VALUE CHECK OUTSIDE LOOP !
       je L0001
L0000: inc ebx                        // START LOOP 
       mov r10,QWORD PTR [r15+0x70]   // - BusyWaitingLoopTrick$1::run@19 (line 15)
       test DWORD PTR [r10],eax       // {poll} *** SAFEPOINT POLL ***
       jmp L0000                      // END LOOP  
L0001: mov edx,0xffffff86
...

Basically, the code between START LOOP and END LOOP (i.e. the code corresponding to the while loop) explains what really happens and why the program hangs:

  • The Just-In-Time (JIT) compiler optimizes the busy-waiting loop by hoisting the read of canRun out of the loop. Because canRun is neither volatile nor accessed under synchronization, the compiler is allowed to assume that no other thread changes it. The value is tested once, before the loop, and the loop itself ends in an unconditional jmp, so it spins forever without any conditional exit.
  • You might also notice the SAFEPOINT POLL inserted by the JIT compiler. It is there so that the thread can be stopped at a safepoint (i.e. for Stop-The-World pauses inside the JVM) and has nothing to do with the correctness of the program.

So, the run() method becomes hot and gets compiled (via On-Stack Replacement, because of the long-running while loop) while the main thread sleeps for 5 seconds in Thread.sleep(5000). Even though canRun is set to false afterwards, this has no effect on the worker thread's run() method, whose compiled code no longer re-reads canRun inside the loop.
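To make the effect of the hoisted check more tangible, here is a minimal single-threaded sketch written in plain Java. It is only an illustration of the transformation, not actual JIT output: the class name HoistedReadSketch, the helper falseAfter, and the maxIterations cap (added purely so the demo terminates) are all assumptions introduced for this example.

```java
import java.util.function.BooleanSupplier;

public class HoistedReadSketch {

    // What the source code says: re-read the flag on every iteration.
    static long loopWithCheck(BooleanSupplier canRun, long maxIterations) {
        long counter = 0;
        while (canRun.getAsBoolean() && counter < maxIterations) {
            counter++;
        }
        return counter;
    }

    // What the compiled code effectively does: read the flag once, then loop.
    static long loopWithHoistedCheck(BooleanSupplier canRun, long maxIterations) {
        long counter = 0;
        if (canRun.getAsBoolean()) {          // single read, outside the loop
            while (counter < maxIterations) { // cap exists only to end the demo
                counter++;
            }
        }
        return counter;
    }

    // Hypothetical flag that turns false after it has been read 'reads' times.
    static BooleanSupplier falseAfter(int reads) {
        int[] remaining = {reads};
        return () -> remaining[0]-- > 0;
    }

    public static void main(String[] args) {
        // The checked loop notices the change after 3 iterations.
        System.out.println(loopWithCheck(falseAfter(3), 1_000));        // prints 3
        // The hoisted loop never looks again and runs up to the cap.
        System.out.println(loopWithHoistedCheck(falseAfter(3), 1_000)); // prints 1000
    }
}
```

In the real program there is no cap, which is why the hoisted version never terminates.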

Now, with a better understanding of what happens and why the program hangs, we might ask, "But how can we make the code work without the program hanging?" In the following sections, I will present four possible solutions (which, of course, are not mutually exclusive).

Solution 1

The simplest one, which does not even touch the code, is to start the program with the Just-In-Time compiler disabled, thereby bypassing the optimization of the conditional test within the loop.

This can be achieved by starting the HotSpot JVM with the flag -Djava.compiler=none (i.e. running only in the interpreter; the standard HotSpot flag -Xint also forces interpreter-only execution). Re-launching the program, it now prints:

Thread starting to run
Main telling Thread to stop
Thread exiting

and the program stops successfully. However, this solution is rarely feasible, because disabling the Just-In-Time compiler slows the application down considerably.
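When experimenting with such flags, it can help to confirm which execution mode the JVM actually ended up in. The sketch below assumes a HotSpot JVM, which exposes the mode through the java.vm.info system property (typically containing "mixed mode" by default and "interpreted mode" when started with -Xint). The class name ExecutionModeCheck is hypothetical, and other JVMs may not set this property at all, hence the "unknown" fallback.

```java
public class ExecutionModeCheck {

    // Returns HotSpot's description of the execution mode,
    // or "unknown" if the property is not set (assumption: non-HotSpot JVM).
    static String vmInfo() {
        return System.getProperty("java.vm.info", "unknown");
    }

    public static void main(String[] args) {
        System.out.println("java.vm.info = " + vmInfo());
    }
}
```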

Solution 2

Another approach is to simply make the canRun variable volatile, as follows:

static volatile boolean canRun = true;

In this case, the generated code for the while loop looks like:

L0000: mov r11,QWORD PTR [r15+0x70]     // START LOOP 
       inc ebx                          // - BusyWaitingLoopTrick$1::run@19 (line 15)
       test DWORD PTR [r11],eax         // {poll} *** SAFEPOINT POLL ***
L0001: movzx r11d,BYTE PTR [r10+0x70]   //*getstatic canRun 
       test r11d,r11d                   // EXPLICIT BOOLEAN VALUE CHECK ! 
       jne L0000                        // END LOOP

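As you can see, the explicit boolean check is now kept within the loop. Because the field is volatile, the Just-In-Time compiler is not allowed to hoist or cache the read: every iteration must load the current value, so the loop condition is preserved and the write from the main thread becomes visible to the worker thread.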
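For completeness, here is a minimal runnable sketch of the volatile variant. It follows the structure of the original program, with a few assumptions of my own: the class name VolatileFlagSketch, the runDemo helper, a configurable sleep, a daemon worker thread, and a bounded join so the demo cannot hang even if something goes wrong.

```java
public class VolatileFlagSketch {

    // volatile forces the worker to re-read the flag on every iteration.
    static volatile boolean canRun = true;

    // Starts the busy-waiting worker, clears the flag after sleepMillis,
    // and reports whether the worker terminated within 5 seconds.
    static boolean runDemo(long sleepMillis) {
        canRun = true;
        Thread worker = new Thread(() -> {
            long localCounter = 0;
            while (canRun) {
                localCounter++;
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if the worker spins
        worker.start();
        try {
            Thread.sleep(sleepMillis);
            canRun = false;      // volatile write, visible to the worker
            worker.join(5_000);  // bounded wait instead of blocking forever
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo was interrupted", ex);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Worker stopped: " + runDemo(1_000));
    }
}
```

The only change that matters for correctness is the volatile modifier; the rest is scaffolding to observe the result.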

Solution 3

This is actually Venkat's solution, which was presented during the talk. Basically, it inserts a Thread.sleep(0) in the busy waiting loop, as follows:

while (canRun) {
    localCounter++;
    try {
        Thread.sleep(0);
    } catch (InterruptedException ex) {
        // Ignored: nothing interrupts this thread in the demo.
    }
}

The generated optimized code looks like this:

L0000: inc ebp                  // START LOOP 
                                // - BusyWaitingLoopTrick$1::run@16 (line 15)
       xor edx,edx
       data16 xchg ax,ax
       call 0x00000200c4d17500  //*invokestatic sleep
                                // - BusyWaitingLoopTrick$1::run@20 (line 18)
L0001: movabs r10,0x6d264f938   // {oop(a 'java/lang/Class'{0x00000006d264f938} = 'BusyWaitingLoopTrick')}
       movzx r11d,BYTE PTR [r10+0x70]  //*getstatic canRun
       test r11d,r11d           // EXPLICIT BOOLEAN VALUE CHECK ! 
       jne L0000                // END LOOP

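You might notice the boolean check is kept within the while loop together with the Thread.sleep() call, leading the program to successfully finish after canRun is set to false by the main thread. Keep in mind, though, that this works because of how the compiler treats the call here, not because of any guarantee: according to the Java Language Specification, Thread.sleep() has no synchronization semantics, so the compiler is not required to reload canRun after the call. Only Solution 2 is guaranteed to work by the Java Memory Model.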

Solution 4

Another approach is to simply remove the Thread.sleep(5000) call. The main thread then sets canRun to false almost immediately, before the worker thread's run() method has become hot enough to be compiled, so the loop still runs in the interpreter and checks canRun on each iteration. Once the main thread's store is drained from the CPU's store buffer, the cache coherency mechanisms propagate the updated canRun value to the other thread, and the program finishes almost immediately. Note that this relies purely on timing and is not a guarantee either.
