JVM Advent Calendar: Project Loom
One of the drivers behind streams in Java 8 was concurrent programming.
In your stream pipeline, you specify what you want to have done, and your tasks are automatically distributed onto the available processors:
Parallel streams work great when the data structure is cheap to split into parts and the operations keep the processors busy. That's what they were designed for.
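As a minimal sketch of the kind of pipeline described here (the class and method names are illustrative assumptions, not taken from the article), a numeric range is cheap to split and squaring keeps the cores busy:

```java
import java.util.stream.LongStream;

class ParallelSumDemo {
    // Sums the squares of 1..n. The range splits cheaply and the work is pure
    // computation, which is the case parallel streams handle well.
    static long sumOfSquares(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()        // distribute the work over the common fork-join pool
                .map(x -> x * x)   // what to do with each element
                .sum();            // combine the partial results
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000));
    }
}
```

Nothing in the pipeline says how the work is divided; the `parallel()` call leaves that to the library.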
But this doesn't help you if your workload consists of tasks that mostly block. That's your typical web application, serving many requests, with each request spending much of its time waiting for the result of a REST service, a database query, and so on.
In 1998, it was amazing that the Sun Java Web Server (the precursor of Tomcat) ran each request in a separate thread, and not an OS process. It was able to serve thousands of concurrent requests this way! Nowadays, that's not so amazing. Each thread takes up a significant amount of memory, and you can't have millions of threads on a typical server.
That's why the modern mantra of server-side programming is: "Never block!" Instead, you specify what should happen once the data is available.
This asynchronous programming style is great for servers, allowing them to handily support millions of concurrent requests. It isn't so great for programmers.
Here is an asynchronous request with the
What we would normally achieve with statements is now encoded as method calls. If we loved this style of programming, we would not have statements in our programming language and would merrily code in Lisp.
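To make the "statements become method calls" point concrete, here is a small sketch using `CompletableFuture`. The `fetch` method is a hypothetical stand-in for a slow remote call, not a real HTTP request, and all names are assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;

class AsyncStyleDemo {
    // Hypothetical stand-in for a slow remote call; nothing is sent over the network.
    static String fetch(String url) {
        if (url.isEmpty()) {
            throw new IllegalArgumentException("empty URL");
        }
        return "<html>body of " + url + "</html>";
    }

    // Sequencing, transformation, and error handling are all expressed as method calls.
    static CompletableFuture<Integer> bodyLength(String url) {
        return CompletableFuture
                .supplyAsync(() -> fetch(url))   // instead of: String body = fetch(url);
                .thenApply(String::length)       // instead of: int n = body.length();
                .exceptionally(ex -> -1);        // instead of: catch (...) { n = -1; }
    }

    public static void main(String[] args) {
        bodyLength("https://example.com")
                .thenAccept(n -> System.out.println("length: " + n))
                .join(); // block here only so that the demo does not exit early
    }
}
```

The comments show the plain statements that each call replaces.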
Project Loom takes its guidance from languages such as Erlang and Go, where blocking isn't a big deal. You run tasks in "fibers" or "lightweight threads" or "virtual threads". The name is up for discussion, but I prefer "fiber" since it nicely denotes the fact that multiple fibers execute in a carrier thread. Fibers are parked when a blocking operation occurs, such as waiting for a lock or for I/O. Parking is relatively cheap. A carrier thread can support a thousand fibers if each of them is parked much of the time.
Keep in mind that Project Loom does not solve all concurrency woes. It does nothing for you if you have computationally intensive tasks and want to keep all processor cores busy. It doesn't help you with user interfaces that use a single thread (for serializing access to data structures that aren't thread-safe). Keep using
for that use case. Project Loom is useful when you have lots of tasks that spend much of their time blocking.
NB: If you have been around for a very long time, you may remember that early versions of Java had "green threads" that were mapped to OS threads. However, there is a crucial difference. When a green thread blocked, its carrier thread was also blocked, preventing all other green threads on the same carrier thread from making progress.
At this point, Project Loom is still very much exploratory. The API keeps changing, so be prepared to adapt to the latest API version when you try out the code after the holiday season.
You can download binaries of Project Loom at http://jdk.java.net/loom/ , but they are updated infrequently. However, on a Linux machine or VM, it is easy to build the most current version yourself:
Depending on what you have already installed, you may have a couple of failures in
, but the messages tell you what packages you need to install so that you can proceed.
In the current version of the API, a fiber or, as it is called right now, virtual thread, is represented as an object of the
class. Here are three ways of producing fibers. First, there is a new factory method that can construct OS threads or virtual threads:
However, manually creating threads has been considered poor practice for some time, so you probably shouldn't do either of these. Instead, use an executor with a thread factory:
Now, the familiar fixed thread pool will schedule virtual threads from the factory, in the same way as it has always done. Of course, there will also be OS-level carrier threads to run those virtual threads, but that's internal to the virtual thread implementation.
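The executor-plus-factory pattern is not specific to Loom. The sketch below uses only long-standing `java.util.concurrent` calls with an ordinary OS-thread factory. The factory, the class name, and the task are illustrative assumptions, and the prototype's virtual thread factory is deliberately not reproduced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class FactoryExecutorDemo {
    // Runs nTasks tasks on a fixed pool whose threads come from a custom factory.
    static List<String> runTasks(int nThreads, int nTasks) {
        AtomicInteger counter = new AtomicInteger();
        // Illustrative factory: names each worker and marks it as a daemon thread.
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable, "worker-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        };
        ExecutorService pool = Executors.newFixedThreadPool(nThreads, factory);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < nTasks; i++) {
                int id = i;
                futures.add(pool.submit(
                        () -> "task " + id + " on " + Thread.currentThread().getName()));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // blocks until that task has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        runTasks(2, 5).forEach(System.out::println);
    }
}
```

The code that submits tasks never mentions what kind of thread runs them; only the factory would change.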
The fixed thread pool will limit the total number of concurrent virtual threads. By default, the mapping from virtual threads to carrier threads is done with a fork-join pool that uses as many cores as given by the system property
, or by default,
. You can supply your own scheduler in the thread factory:
I don't know if this is something that one would want to do. Why have more carrier threads than cores?
Back to our executor service. You execute tasks on virtual threads just like you used to execute tasks on OS-level threads:
As a simple test, we can just sleep in each task.
If you now set
and comment out the
in the factory builder, the program will fail with an out of memory error. A million OS-level threads take a lot of memory. But with virtual threads, it works.
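For readers on a recent JDK, a sleep test of this kind can be sketched with the virtual thread API that was later finalized in JDK 21. That is an assumption about your toolchain: this is not the 2019 prototype API discussed in this article, and the class name and task count are illustrative.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class VirtualSleepDemo {
    // Assumes JDK 21 or later. Starts one virtual thread per task; each task just sleeps.
    static int runSleepingTasks(int nTasks, Duration nap) {
        AtomicInteger completed = new AtomicInteger();
        // close() at the end of the try block waits for all submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < nTasks; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(nap); // parks the virtual thread, not an OS thread
                        completed.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runSleepingTasks(100_000, Duration.ofMillis(100)));
    }
}
```

Raising the task count is the interesting experiment; the same loop with one OS thread per task runs out of resources far sooner.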
At least, it should work, and it did work for me with previous builds of Loom. Unfortunately, with the build I downloaded on December 5, I got a core dump. That has happened to me on and off as I experimented with Loom. Hopefully, it will be fixed by the time you try this.
Now you are ready to try something more complex. Heinz Kabutz recently presented a puzzler with a program that loaded thousands of Dilbert cartoon images. For each calendar day, there is a page such as https://dilbert.com/strip/2011-06-05. The program read those pages, located the URL of the cartoon image on each page, and loaded each image. It was a mess of completable futures, somewhat like:
With fibers, the code is much clearer:
Sure, each call to
blocks, but with fibers, we don't care.
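The contrast between the two styles can be sketched without touching the network. Everything below is a hypothetical stand-in: `loadPage` and `loadImage` only simulate the blocking reads, and the `img.invalid` URL is made up. This is not the code from the puzzler.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;

class TwoStepFetchDemo {
    // Hypothetical stand-in for a blocking page download.
    static String loadPage(String date) {
        return "<img src=\"https://img.invalid/" + date + ".gif\">";
    }

    // Hypothetical stand-in for a blocking image download.
    static byte[] loadImage(String imageUrl) {
        return imageUrl.getBytes(StandardCharsets.UTF_8);
    }

    // Extracts the value of the first src="..." attribute in the page.
    static String imageUrlIn(String page) {
        int start = page.indexOf("src=\"") + 5;
        return page.substring(start, page.indexOf('"', start));
    }

    // Asynchronous style: each step is chained onto the previous one.
    static CompletableFuture<byte[]> imageAsync(String date) {
        return CompletableFuture.supplyAsync(() -> loadPage(date))
                .thenApply(TwoStepFetchDemo::imageUrlIn)
                .thenCompose(url -> CompletableFuture.supplyAsync(() -> loadImage(url)));
    }

    // Blocking style: ordinary statements, meant to run inside a cheap thread.
    static byte[] imageBlocking(String date) {
        String page = loadPage(date);
        String url = imageUrlIn(page);
        return loadImage(url);
    }

    public static void main(String[] args) {
        System.out.println(imageBlocking("2011-06-05").length);
        System.out.println(imageAsync("2011-06-05").join().length);
    }
}
```

Both methods compute the same result; the blocking one reads top to bottom and only becomes affordable at scale when blocking is cheap.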
Try this out with something you care about. Read a large number of web pages, process them, do more blocking reads, and enjoy the fact that blocking is cheap with fibers.
The initial motivation for Project Loom was to implement fibers, but earlier this year, the project embarked on an experimental API for structured concurrency. In this highly recommended article (from which the images below are taken), Nathaniel Smith proposes structured forms of concurrency. Here is his central argument. Launching a task in a new thread is really no better than programming with goto, i.e. harmful:
When multiple threads run without coordination, it's spaghetti code all over again. In the 1960s, structured programming replaced
with branches, loops, and functions:
Now the time has come for structured concurrency. When launching concurrent tasks, we should know, from reading the program text, when they have all finished.
That way, we can control the resources that the tasks use.
By summer 2019, Project Loom had an API to express structured concurrency. Unfortunately, that API is currently in tatters because of the more recent experiment in unifying the thread and fiber APIs, but you can try it with the prototype at http://jdk.java.net/loom/ .
Here, we schedule a number of tasks:
blocks until all fibers finish. Remember: blocking is not a problem with fibers. Once the scope is closed, you know for sure that the fibers have finished.
is auto-closeable, so you can use a
But what if one of the tasks never finishes?
You can create a scope with a deadline (
) or timeout (
All fibers that haven't finished by the deadline/timeout are canceled. How? Read on.
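The idea of a scope that blocks on close and cancels on timeout can be approximated with ordinary executors. `TaskScope` below is a hypothetical helper written for this sketch; it is not the Project Loom scope API, and it uses OS threads.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; NOT the Project Loom scope API.
final class TaskScope implements AutoCloseable {
    private final ExecutorService pool = Executors.newCachedThreadPool();
    private final long timeoutMillis;

    TaskScope(Duration timeout) {
        this.timeoutMillis = timeout.toMillis();
    }

    void schedule(Runnable task) {
        pool.execute(task);
    }

    // Blocks until every task has finished; on timeout, interrupts the stragglers.
    @Override
    public void close() {
        pool.shutdown(); // accept no new tasks
        try {
            if (!pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                pool.shutdownNow(); // cancel by interruption
                // Best effort: a task that ignores interruption can outlive this wait.
                pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}

class TaskScopeDemo {
    // Returns false if the sleep was cut short by an interrupt.
    static boolean sleep(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            return false;
        }
    }

    static int runAll(int nTasks) {
        AtomicInteger done = new AtomicInteger();
        try (TaskScope scope = new TaskScope(Duration.ofSeconds(5))) {
            for (int i = 0; i < nTasks; i++) {
                scope.schedule(() -> { sleep(50); done.incrementAndGet(); });
            }
        } // close() has returned, so the tasks are finished
        return done.get();
    }

    static boolean timedOutTaskWasInterrupted() {
        AtomicBoolean interrupted = new AtomicBoolean();
        try (TaskScope scope = new TaskScope(Duration.ofMillis(100))) {
            scope.schedule(() -> interrupted.set(!sleep(10_000)));
        }
        return interrupted.get();
    }

    public static void main(String[] args) {
        System.out.println(runAll(3));
        System.out.println(timedOutTaskWasInterrupted());
    }
}
```

The try-with-resources block makes the lifetime of the tasks visible in the program text, which is the point of the structured approach.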
Cancellation has always been a pain in Java. By convention, you cancel a thread by interrupting it. If the thread is blocking, the blocking operation terminates with an
. Otherwise, the interrupted status flag is set. Getting the checks right is tedious. It is not helpful that the interrupted status can be reset, or that
is a checked exception.
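The two ways a thread learns of an interrupt can be seen side by side in a small sketch that uses only the standard `Thread` API. The class and method names are illustrative.

```java
import java.util.concurrent.atomic.AtomicReference;

class InterruptDemo {
    // A thread blocked in sleep() finds out through an exception.
    static String interruptBlockedThread() {
        AtomicReference<String> outcome = new AtomicReference<>("not interrupted");
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // Throwing the exception clears the interrupted status flag.
                outcome.set("exception, flag=" + Thread.currentThread().isInterrupted());
            }
        });
        t.start();
        t.interrupt();
        join(t);
        return outcome.get();
    }

    // A thread that is busy computing has to poll the flag itself.
    static String interruptBusyThread() {
        AtomicReference<String> outcome = new AtomicReference<>("not interrupted");
        Thread t = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                Thread.onSpinWait(); // stands in for real work
            }
            // Thread.interrupted() reports the flag and resets it as a side effect.
            outcome.set("flag=" + Thread.interrupted()
                    + ", after reset=" + Thread.currentThread().isInterrupted());
        });
        t.start();
        t.interrupt();
        join(t);
        return outcome.get();
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(interruptBlockedThread());
        System.out.println(interruptBusyThread());
    }
}
```

Both the exception path and the flag path need handling in any task that wants to be cancelable, which is where the tedium comes from.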
Treatment of cancellation in
has been inconsistent. Consider
. If any task yields a result, the others are canceled. But
lets all tasks run to completion, even though their results will be ignored.
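One place where both behaviors can be observed is `ExecutorService`, whose `invokeAny` cancels unfinished tasks once one succeeds, while `invokeAll` waits for every task. Picking these two methods is an assumption made for this sketch, as are the class name, the latches, and the delays.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class InvokeDemo {
    // invokeAny: once one task succeeds, the tasks still running are interrupted.
    static boolean invokeAnyInterruptsLoser() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountDownLatch slowStarted = new CountDownLatch(1);
        CountDownLatch slowFinished = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean();
        Callable<String> slow = () -> {
            slowStarted.countDown();
            try {
                Thread.sleep(60_000);
                return "slow";
            } catch (InterruptedException e) {
                interrupted.set(true);
                throw e;
            } finally {
                slowFinished.countDown();
            }
        };
        Callable<String> fast = () -> {
            slowStarted.await(); // make sure the slow task is really running
            return "fast";
        };
        try {
            String winner = pool.invokeAny(List.of(slow, fast));
            slowFinished.await(5, TimeUnit.SECONDS);
            return winner.equals("fast") && interrupted.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    // invokeAll: returns only after every task has run to completion.
    static int invokeAllRunsEverything(int nTasks) {
        ExecutorService pool = Executors.newFixedThreadPool(nTasks);
        AtomicInteger completed = new AtomicInteger();
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < nTasks; i++) {
            int delay = 50 * (i + 1);
            tasks.add(() -> {
                Thread.sleep(delay);
                return completed.incrementAndGet();
            });
        }
        try {
            pool.invokeAll(tasks);
            return completed.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(invokeAnyInterruptsLoser());
        System.out.println(invokeAllRunsEverything(4));
    }
}
```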
The summer 2019 Project Loom API tackled cancellation. In that version, fibers have a
operation, similar to
, but cancellation is irrevocable. The static
if the current fiber has been canceled.
When a scope times out, its fibers get canceled.
Cancellation can be controlled by the following options in the
cancel_at_close: closing the scope cancels all scheduled fibers instead of blocking
propagate_cancel: if the owning fiber is canceled, any newly scheduled fibers are automatically canceled
ignore_cancel: scheduled fibers can't be canceled
All these options are unset at the top level. The
options are inherited from the parent scope.
As you can see, there was a fair amount of tweakability. We'll have to see what comes back when this issue is revisited. For structured concurrency, it must be automatic to cancel all fibers in the scope when the scope times out or is forcibly closed.
It came as a surprise to me that one of the pain points for the Project Loom implementors is
variables, as well as more esoteric things: context class loaders,
. I had no idea so much was riding along on threads.
If you have a data structure that isn't safe for concurrent access, you can sometimes use an instance per thread. The classic example is
. Sure, you could keep constructing new formatter objects, but that's not efficient. So you want to share one. But a global
won't work. If two threads access it concurrently, the formatting can get mangled.
So, it makes sense to have one of them per thread:
To access an actual formatter, call:
The first time you call
in a given thread, the lambda in the constructor is called. From then on, the get method returns the instance belonging to the current thread.
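The per-thread behavior can be checked with a short sketch. It assumes `SimpleDateFormat` as the formatter that is not thread-safe and uses the `ThreadLocal.withInitial` factory method; both choices, and all names, are assumptions made for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

class ThreadLocalFormatterDemo {
    // Each of nThreads threads calls get() twice.
    // Returns { number of formatters created, number of distinct instances seen }.
    static int[] run(int nThreads) {
        AtomicInteger created = new AtomicInteger();
        // Usually a static final field; local here so that every run starts fresh.
        ThreadLocal<SimpleDateFormat> format = ThreadLocal.withInitial(() -> {
            created.incrementAndGet(); // runs once per thread, on the first get()
            return new SimpleDateFormat("yyyy-MM-dd");
        });
        // Identity-based set, because SimpleDateFormat.equals compares patterns.
        Set<SimpleDateFormat> instances =
                Collections.synchronizedSet(Collections.newSetFromMap(new IdentityHashMap<>()));
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < nThreads; i++) {
            Thread t = new Thread(() -> {
                instances.add(format.get()); // first call: the supplier runs
                instances.add(format.get()); // second call: same instance is returned
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { created.get(), instances.size() };
    }

    public static void main(String[] args) {
        int[] result = run(4);
        System.out.println(result[0] + " created, " + result[1] + " distinct");
    }
}
```

The number of instances grows with the number of threads, which is exactly the cost that becomes questionable with a million fibers.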
For threads, that is accepted practice. But do you really want to have a million instances when there are a million fibers?
This hasn't been an issue for me because it seems easier to use something threadsafe, like a
formatter. But Project Loom has been pondering "scope local" objects: one of those
Thread locals have also been used as an approximation for processor locality, in situations where there are about as many threads as processors. This could be supported with an API that actually models user intent.
State of the Project
Developers who want to use Project Loom are naturally preoccupied with the API which, as you have seen, is not settled. However, a lot of the implementation work is under the hood.
A crucial part is to enable parking of fibers when an operation blocks. This has been done for networking, so you can connect to web sites, databases and so on, within fibers. Parking when local file operations block is not currently supported.
In fact, reimplementations of these libraries are already in JDK 11, 12, and 13, a tribute to the utility of frequent releases.
Blocking on monitors (
blocks and methods) is not yet supported, but it needs to be eventually.
is OK now.
If a fiber blocks in a native method, that will "pin" the thread, and none of its fibers will make progress. There is nothing that Project Loom can do about that.
needs more work to be supported.
Work on debugging and monitoring support is ongoing.
As already mentioned, stability is still an issue.
Most importantly, performance has a way to go. Parking and unparking fibers is not a free lunch. A section of the runtime stack needs to be replaced each time.
There has been a lot of progress in all these areas, so let's cycle back to what developers care about: the API. This is a really good time to look at Project Loom and think about how you want to use it.
Is it of value to you that the same class represents threads and fibers? Or would you prefer some of the baggage of
to be chucked out? Do you buy into the promise of structured concurrency?
Take Project Loom out for a spin and see how it works with your applications and frameworks, and provide feedback for the intrepid development team!
Published at DZone with permission of Cay Horstmann. See the original article here.