But It Works On My Machine...
What happens when we run main() in the code snippet below? Of course, the erudite, tech-savvy Java people can answer that in a heartbeat.
public class TestVolatile implements Runnable {
    // note: not volatile
    private boolean stopRequested = false;

    public void run() {
        while (!stopRequested) {
            // do something here...
        }
    }

    public void stop() {
        stopRequested = true;
    }

    public static void main(String[] args) throws InterruptedException {
        TestVolatile tv = new TestVolatile();
        new Thread(tv, "neverending").start();
        Thread.sleep(1000);
        tv.stop();
    }
}
The answer is that we don’t know for sure when the program will terminate, because the stopRequested variable is not marked volatile. So when we call stop() in the main thread, the neverending thread may never see that stopRequested has changed to true, and it just keeps going… and going… and going… just like the Energizer Bunny. We never know when it’s gonna stop.
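For comparison, here is a minimal sketch of the same program with the flag declared volatile, which is the fix the paragraph above points to. The class name VolatileStopFlag and the helper runAndStop() are made up for this illustration and are not part of the original snippet. With volatile, the Java Memory Model guarantees that the write in stop() becomes visible to the loop in run(), so the worker should terminate on any conforming VM.

```java
import java.util.concurrent.TimeUnit;

class VolatileStopFlag implements Runnable {
    // volatile: a write by one thread is guaranteed to be visible
    // to subsequent reads of this field by other threads
    private volatile boolean stopRequested = false;

    @Override
    public void run() {
        while (!stopRequested) {
            // do something here...
        }
    }

    public void stop() {
        stopRequested = true;
    }

    // Hypothetical helper: starts the worker, requests a stop, and
    // reports whether the worker thread ended within five seconds.
    static boolean runAndStop() {
        VolatileStopFlag worker = new VolatileStopFlag();
        Thread t = new Thread(worker, "neverending");
        t.setDaemon(true); // do not keep the JVM alive if something goes wrong
        t.start();
        try {
            TimeUnit.MILLISECONDS.sleep(200);
            worker.stop();
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stopped: " + runAndStop());
    }
}
```

The join timeout is only there so that the check itself cannot hang; it plays no part in the visibility guarantee.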
But… It Works Just Fine on My Machine
A friend of mine wasn’t convinced though, so just to prove it to him, I typed that into Eclipse and ran it, fully expecting that it would run forever.
It stopped in about one second. Hmmm. Weird.
It seems that the change made from the main thread to stopRequested is immediately visible from the neverending thread, although stopRequested is not volatile.
Out of curiosity, I modified the program a bit to look like this:
import java.util.concurrent.TimeUnit;

public class TestVolatile implements Runnable {
    private boolean stopRequested = false;
    private volatile long justAfterStopRequested;
    private volatile long afterNeverEndingHasStopped;

    public void run() {
        while (!stopRequested) {
            // do something
        }
        afterNeverEndingHasStopped = System.currentTimeMillis();
    }

    public void stop() {
        stopRequested = true;
        justAfterStopRequested = System.currentTimeMillis();
    }

    public static void main(String[] args) throws InterruptedException {
        TestVolatile tv = new TestVolatile();
        Thread t = new Thread(tv, "neverending");
        t.start();
        // let main thread sleep for 1 second before requesting
        // neverending to stop
        TimeUnit.SECONDS.sleep(1);
        tv.stop();
        // wait until neverending has stopped
        t.join();
        System.out.println(tv.afterNeverEndingHasStopped - tv.justAfterStopRequested);
    }
}
I’m trying to see how much time the neverending thread needs to realize that stopRequested has changed to true and get out of the loop.
The result is 0.
But it can’t be instantaneous; there must be some difference in time. So I changed the System.currentTimeMillis() calls to System.nanoTime(). The result is virtually the same, ranging from around -300 to +300 nanoseconds (we can’t say which one gets modified first, justAfterStopRequested or afterNeverEndingHasStopped; it differs on each run). But for all practical purposes, we can say that the neverending thread sees the change to stopRequested almost immediately.
Strange.
In Effective Java, 2nd ed., Joshua Bloch says that on his machine such a program never stops running. Not that I have a self-esteem problem or whatever, but when it comes to Java, if I have to choose whether to trust Joshua Bloch or me, I choose the former. Sorry, me.
On Another Machine, However…
But you know what? I found that TestVolatile did run forever on Solaris! (Well, it ran until I came back 15 minutes later and killed it, anyway.) Is it because of the differences between Windows and Solaris? Not really; it’s more about the differences between the server VM and the client VM. My Solaris test platform is a server-class machine, so by default I was using the server VM, whereas on my Windows machine I was running the client VM by default.
Indeed, when I ran the program again on Solaris with the client VM (with the -client option), it stopped after roughly a second, without having to make stopRequested volatile. Conversely, when I ran the program on Windows with the -server option, it never stopped, and would only stop if I made stopRequested volatile.
This shows that the client VM may deceive us into thinking that we don’t need volatile (until we run our program on a server-class machine and things start breaking in strange ways). A likely cause, and the one Bloch describes, is that the server VM’s optimizer hoists the read of the non-volatile field out of the loop, so the loop never rechecks it. Superficially, the client VM and the server VM may sound like they’re not that different. But some differences do matter: what we see here is a real example where the differences between the client VM and the server VM can make or break your application.
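Since the outcome depends on which VM is actually running, it can help to check rather than assume. The sketch below prints the standard java.vm.name system property. The class name WhichVm is made up for this illustration, and the exact text of the property is implementation-specific. On HotSpot it typically contains “Server VM” or “Client VM”, but treat that as an observation, not a contract. Some JDK builds also accept the -client option and silently ignore it, which is one more reason to look.

```java
class WhichVm {
    // Returns whatever name the running JVM reports for itself.
    // The property key is standard; the value's format is not.
    static String vmName() {
        return System.getProperty("java.vm.name");
    }

    public static void main(String[] args) {
        System.out.println("Running on: " + vmName());
    }
}
```

Running it once with -client and once with -server shows whether the flag had any effect on your particular installation.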
And that’s not all. There are other non-obvious things that may mask the need for volatile in our programs. Like System.out.println(), for example.
Hidden Synchronized Blocks Sometimes Hide the Need for volatile
If we print a counter variable within the loop like this:
public class TestVolatile implements Runnable {
    private boolean stopRequested = false;
    private int i = 0;

    public void run() {
        while (!stopRequested) {
            System.out.println(i++);
        }
    }

    public void stop() {
        stopRequested = true;
    }

    // the rest
}
then the program terminated every time I ran it, even without marking stopRequested as volatile. Why? Because there is a synchronized block inside System.out.println(). When the thread that runs the run() method calls System.out.println(), its copy of stopRequested gets updated with the latest value, and the while loop terminates.
Note, though, that this is not guaranteed by the Java Memory Model. The JMM guarantees visibility between two threads that enter and exit synchronized blocks protected by the same lock. If the synchronized blocks are protected by different locks, then the only safe assumption is that there’s no synchronization at all.
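To make the guaranteed case concrete, here is a minimal sketch in which both the read and the write of the flag go through synchronized blocks on the same lock. The class name SameLockStopFlag, the lock field, and the helper runAndStop() are made up for this illustration. Because both threads use the same monitor, the visibility of the write does not depend on what a particular VM happens to do.

```java
import java.util.concurrent.TimeUnit;

class SameLockStopFlag implements Runnable {
    private final Object lock = new Object();
    private boolean stopRequested = false; // guarded by lock, not volatile

    private boolean isStopRequested() {
        synchronized (lock) {       // acquire: sees writes made before
            return stopRequested;   // the last release of the same lock
        }
    }

    public void stop() {
        synchronized (lock) {       // release at block exit publishes the write
            stopRequested = true;
        }
    }

    @Override
    public void run() {
        while (!isStopRequested()) {
            // do something here...
        }
    }

    // Hypothetical helper: reports whether the worker ended within five seconds.
    static boolean runAndStop() {
        SameLockStopFlag worker = new SameLockStopFlag();
        Thread t = new Thread(worker, "neverending");
        t.setDaemon(true);
        t.start();
        try {
            TimeUnit.MILLISECONDS.sleep(200);
            worker.stop();
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stopped: " + runAndStop());
    }
}
```

For a single boolean flag, volatile is the lighter tool; the synchronized form is shown only because it is the case the JMM statement above describes.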
Yet the program below, which uses two different locks, does stop,
public class TestVolatile implements Runnable {
    private boolean stopRequested = false;
    private int i = 0;
    private final Object lock1 = new Object();
    private final Object lock2 = new Object();

    public void run() {
        while (!stopRequested) {
            synchronized (lock1) {}
            i++;
        }
    }

    public void stop() {
        stopRequested = true;
        synchronized (lock2) {}
    }

    // the rest
}
even though the two threads are entering synchronized blocks guarded by different locks. We can even remove the synchronized block guarded by lock2 in stop(), and the program still stops. But not the other way around (that is, if we remove lock1’s synchronized block and leave lock2’s there, the program will again run forever).
So it seems that, on this VM at least, the thread that reads a variable gets the up-to-date value when it executes a synchronized block, regardless of the lock used to guard the block, even if the variable is not volatile.
Which means it’s possible for code that used to work when you had System.out.println() calls sprinkled throughout to suddenly stop working properly when you remove those calls!
Does volatile Cascade to Member Variables? Array Elements? Items in Collections?
Now let’s say that instead of being a primitive boolean field, stopRequested is a member variable of another class, like this:
import java.util.concurrent.TimeUnit;

public class TestVolatile implements Runnable {
    private A wrapper = new A();

    public void run() {
        while (!wrapper.stopRequested) {
            // do something
        }
    }

    public void stop() {
        wrapper.stopRequested = true;
    }

    public static void main(String[] args) throws InterruptedException {
        TestVolatile tv = new TestVolatile();
        new Thread(tv, "neverending").start();
        // let main thread sleep for 1 second before requesting
        // neverending to stop
        TimeUnit.SECONDS.sleep(1);
        tv.stop();
    }
}

class A {
    boolean stopRequested = false;
}
Run with the -server flag, this one never stops either. But what if we mark wrapper as volatile:
private volatile A wrapper = new A();
Does the “volatility” of the reference cascade to the member variable stopRequested as well? As it turned out, the answer looks like yes. We can mark either wrapper or stopRequested as volatile, and the program will terminate in about a second.
I’m not surprised that the program terminates when I mark stopRequested as volatile. But why does marking wrapper as volatile work as well? The same thing happens when we use an ArrayList, like this:
import java.util.ArrayList;
import java.util.List;

public class TestVolatile implements Runnable {
    List<Boolean> wrapper = new ArrayList<Boolean>();

    public TestVolatile() {
        wrapper.add(Boolean.FALSE);
    }

    public void run() {
        while (wrapper.get(0) == Boolean.FALSE) {
            // do something
        }
    }

    public void stop() {
        wrapper.set(0, Boolean.TRUE);
    }

    // the rest of the code...
}
Run as it is, it never stops. If we mark the List<Boolean> wrapper as volatile, it stops pretty fast.
I wonder why? The object reference to that list instance itself never changes. We’re only changing an element within the list, not the list itself. There’s no hidden synchronized block. Why does the neverending thread see the up-to-date value of stopRequested?
Above and Beyond the Call of Duty
Before JSR 133, if thread A wrote to volatile field f, thread B was guaranteed to see the new value of f only. Nothing else was guaranteed. With JSR 133, though, volatile is closer to synchronization than it was. Reading/writing a volatile field is now like acquiring/releasing a monitor in terms of visibility. As the excellent FAQ says: “… anything that was visible to thread A when it writes to volatile field f becomes visible to thread B when it reads f.”
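The quoted rule can be sketched as follows: a plain field is written first, then a volatile flag; a reader that observes the flag is guaranteed to observe the plain field too. The class VolatilePiggyback and its members payload, ready, publish(), awaitPayload(), and demo() are hypothetical names chosen for this illustration.

```java
class VolatilePiggyback {
    private int payload = 0;                // plain field, not volatile
    private volatile boolean ready = false; // the volatile field "f"

    // Called by the writer (thread A)
    void publish(int value) {
        payload = value; // ordinary write, happens before the volatile write
        ready = true;    // volatile write: publishes everything written so far
    }

    // Called by the reader (thread B)
    int awaitPayload() {
        while (!ready) {
            // spin until the volatile read observes true
        }
        // After reading ready == true, the earlier write to payload
        // is guaranteed to be visible here under JSR 133.
        return payload;
    }

    // Hypothetical driver: one thread publishes, the caller reads.
    static int demo() {
        VolatilePiggyback box = new VolatilePiggyback();
        Thread writer = new Thread(() -> box.publish(42), "writer");
        writer.start();
        return box.awaitPayload();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the guarantee comes from the pairing of a volatile write in one thread with a volatile read of the same field in the other. Reading a volatile field in both threads, without a write, establishes nothing.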
But still, all these new guarantees for volatile don’t answer the question:
public class TestVolatile implements Runnable {
    private volatile A wrapper = new A();

    public void run() {
        while (!wrapper.stopRequested) {
            // do something
        }
    }

    public void stop() {
        wrapper.stopRequested = true;
    }

    // the rest
}
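Why does this work? In stop() we’re not exactly writing to wrapper. We’re just using it to change the value of stopRequested. In JMM terms, both threads only read the volatile reference, so there is no volatile write for the reading thread to pair with. So why does the other thread see the change?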
Unfortunately, in the examples I’ve seen so far in books and countless articles, a volatile field is always a primitive, so it’s kinda hard to find the answer to my question. So I did the only thing left that I knew to do: ask the concurrency expert himself. And I was pleasantly surprised to find that he replied very quickly! Here’s what Brian Goetz said:
The Java Memory Model sets the minimum requirements for visibility, but all VMs and CPUs generally provide more visibility than the minimum. There’s a difference between “what do I observe on machine XYZ in casual use” and “what is guaranteed.”
So there. The VM in this case just goes above and beyond what it is supposed to do. But there’s no guarantee that the same thing will happen on another VM and another CPU.
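If we want the wrapper variant to rely on what is guaranteed instead of on what happens to be observed, the flag itself has to carry the visibility semantics. One way, sketched here under my own naming (GuaranteedWrapperStop and runAndStop() are hypothetical), is to hold the flag in a java.util.concurrent.atomic.AtomicBoolean, whose get() and set() have the memory effects of a volatile read and write. Declaring the stopRequested field of class A volatile would serve the same purpose.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class GuaranteedWrapperStop implements Runnable {
    // The reference never changes; the visibility guarantee comes from
    // AtomicBoolean.get()/set(), not from the field holding the reference.
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);

    @Override
    public void run() {
        while (!stopRequested.get()) {
            // do something
        }
    }

    public void stop() {
        stopRequested.set(true);
    }

    // Hypothetical helper: reports whether the worker ended within five seconds.
    static boolean runAndStop() {
        GuaranteedWrapperStop worker = new GuaranteedWrapperStop();
        Thread t = new Thread(worker, "neverending");
        t.setDaemon(true);
        t.start();
        try {
            TimeUnit.MILLISECONDS.sleep(200);
            worker.stop();
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stopped: " + runAndStop());
    }
}
```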
Conclusion
So here’s what I’ve learned from this little experiment:
- If your application is a server application, or will run on a server-class machine, remember to use the -server flag during development and testing as well, to uncover potential problems as early as possible. The server VM performs some optimizations that can expose bugs that do not manifest on the client VM.
- Just because it works on your machine doesn’t mean that it’ll work on other machines running other VMs too. It’s important to know exactly what is guaranteed, and to code with those minimal guarantees in mind, instead of assuming that other VMs and OSes will be as forgiving as the ones you’re using for development.
- (This is closely related to the previous point.) Because VMs and CPUs generally provide more visibility than the minimum guaranteed by JSR 133, it’s good to know the extra things they do that may mask a potential bug. For example, at least in some VMs, calling System.out.println() forces the change to a non-volatile variable to be visible to other threads because it has a synchronized block inside. This can explain a bug that appears after you’ve made a seemingly unrelated change (one that removes a synchronized block from the execution path, for instance).