The Cost of Laziness
Recently I had a dispute with my colleagues regarding the performance penalty of lazy vals in Scala. It resulted in a set of microbenchmarks that compare the performance of lazy and non-lazy vals. All the sources can be found at http://git.io/g3WMzA.
But before going to the benchmark results, let's try to understand what could cause a performance penalty in the first place.
For my JMH benchmark I created a very simple Scala class with a lazy val in it:
@State(Scope.Benchmark)
class LazyValCounterProvider {
  lazy val counter = SlowInitializer.createCounter()
}
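The SlowInitializer helper isn't shown here (it lives in the sources linked above). As a minimal sketch, assuming createCounter just performs some deliberately slow setup and returns a small counter wrapper, it might look like this:
import java.util.concurrent.atomic.AtomicLong

// Hypothetical sketch only: the real SlowInitializer is in the linked sources.
object SlowInitializer {
  class Counter {
    private val underlying = new AtomicLong()
    def incrementAndGet(): Long = underlying.incrementAndGet()
  }

  def createCounter(): Counter = {
    Thread.sleep(100) // simulate an expensive, one-off initialization
    new Counter
  }
}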
Now let's take a look at what is hidden under the hood of the lazy keyword. First, we need to compile the code above with scalac, and then the resulting class file can be decompiled into the corresponding Java code. For this I used the JD decompiler, which produced the following code:
@State(Scope.Benchmark)
@ScalaSignature(bytes="...")
public class LazyValCounterProvider {
  private SlowInitializer.Counter counter;
  private volatile boolean bitmap$0;

  private SlowInitializer.Counter counter$lzycompute() {
    synchronized (this) {
      if (!this.bitmap$0) {
        this.counter = SlowInitializer.createCounter();
        this.bitmap$0 = true;
      }
      return this.counter;
    }
  }

  public SlowInitializer.Counter counter() {
    return this.bitmap$0 ? this.counter : counter$lzycompute();
  }
}
As you can see, the lazy keyword is translated into the classic double-checked locking idiom for delayed initialization.
Thus, once the value has been initialized, the only performance penalty comes from a single volatile read per lazy val access (plus the one-time cost of initializing the value on its very first use). Let's finally measure its impact in numbers.
My JMH-based microbenchmark is as simple as:
public class LazyValsBenchmarks {
  @Benchmark
  public long baseline(ValCounterProvider eagerProvider) {
    return eagerProvider.counter().incrementAndGet();
  }

  @Benchmark
  public long lazyValCounter(LazyValCounterProvider provider) {
    return provider.counter().incrementAndGet();
  }
}
The baseline method accesses an eagerly initialized final counter object and increments its value by calling incrementAndGet.
And, as we've just found out, the main benchmark method, lazyValCounter, does everything the baseline method does plus one volatile read.
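The eagerly initialized ValCounterProvider used by the baseline isn't shown in the article; assuming it simply mirrors the lazy provider with a plain val, a minimal sketch would be:
@State(Scope.Benchmark)
class ValCounterProvider {
  // Hypothetical sketch: a plain (eager) val instead of a lazy one.
  val counter = SlowInitializer.createCounter()
}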
Note: all measurements were performed on a MacBook Air with a 1.7 GHz Core i5 CPU.
All results were obtained by running JMH in throughput mode; both the Score and Score error columns show operations per second. Each JMH run made 10 iterations and took 50 seconds. I performed six measurements with different JVM and JMH options (a sketch of how such a run can be launched follows the results):
- client VM, 1 thread

  Benchmark         Score            Score error
  baseline          412277751.619    8116731.382
  lazyValCounter    352209296.485    6695318.185

- client VM, 2 threads

  Benchmark         Score            Score error
  baseline          542605885.932    15340285.497
  lazyValCounter    383013643.710    53639006.105

- client VM, 4 threads

  Benchmark         Score            Score error
  baseline          551105008.767    5085834.663
  lazyValCounter    394175424.898    3890422.327

- server VM, 1 thread

  Benchmark         Score            Score error
  baseline          407010942.139    9004641.910
  lazyValCounter    341478430.115    18183144.277

- server VM, 2 threads

  Benchmark         Score            Score error
  baseline          531472448.578    22779859.685
  lazyValCounter    428898429.124    24720626.198

- server VM, 4 threads

  Benchmark         Score            Score error
  baseline          549568334.970    12690164.639
  lazyValCounter    374460712.017    17742852.788
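For reference, here is a minimal sketch of how a run like the "server VM, 2 threads" case can be launched through JMH's programmatic Runner API; the exact options used for the original measurements may well have differed:
import org.openjdk.jmh.annotations.Mode
import org.openjdk.jmh.runner.Runner
import org.openjdk.jmh.runner.options.OptionsBuilder

object RunLazyValsBenchmarks {
  def main(args: Array[String]): Unit = {
    // Roughly reproduces the "server VM, 2 threads" configuration above.
    val opts = new OptionsBuilder()
      .include("LazyValsBenchmarks")
      .mode(Mode.Throughput)
      .threads(2)
      .measurementIterations(10)
      .jvmArgs("-server")
      .build()
    new Runner(opts).run()
  }
}
The same configurations can also be started from the JMH command-line jar with the corresponding -bm, -t, -i and -jvmArgs options.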
The numbers show that the performance penalty of lazy vals is quite small and can usually be ignored in practice.
For further reading on the subject, I would recommend SIP-20: Improved Lazy Vals Initialization, which contains a very interesting in-depth analysis of the issues with the existing lazy val initialization implementation in Scala.
Published at DZone with permission of Roman Gorodyshcher, DZone MVB. See the original article here.