Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

How Does Passing this::method Reference Within a Loop Affect Performance?

DZone's Guide to

How Does Passing this::method Reference Within a Loop Affect Performance?

Do you know about the performance issues associated with this::method references?

· Performance Zone ·
Free Resource

Sensu is an open source monitoring event pipeline. Try it today.

The problem I would like address affects performance when thethis::method reference is passed to another method (especially within a long-running loop). Hence, it is important to be aware and avoid this situation.

The Problem

This arises when a lambda or a method reference captures  this the reference (e.g. this::method), which is passed to another method (e.g. doCompute() method in my example) within a loop (especially a long-running loop). In this situation, there seems to be a considerable performance penalty, also spotted by benchmark tests.

This problematic pattern could be summarized as follows:

[sourcecode language="java"]
method() {
    loop() {     
        result = doCompute(this::method); // the trap!
        // use the result ...
    }
}
[/sourcecode]


What happens is that at run-time capturing this reference causes the lambda or the method reference to allocate an object every time it is passed the other method (i.e. doCompute() method), even though, in such case, this might not happen anyway (since lambda itself does not modify any state).

How to Avoid This Issue

One solution is to use a variable for the method reference outside of the loop and pass it to the method within the loop, as follows:

[sourcecode language="java"]
method() {
    methodReference = this::method; // local variable
    loop() {
        result = doCompute(methodReference);
        // use the result ...
    }
}
[/sourcecode]


However, please notice that this might not work in all cases since, during runtime, the JIT Compiler tries to optimize the loops by hoisting the loop iteration!

The second option is to move the method reference into a field outside of the method, which belongs to the class:

[sourcecode language="java"]
methodReference = this::method; // class field
// ...
method() { 
      loop() { 
           result = doCompute(methodReference);
           // use the result ...
     } 
} 
[/sourcecode]


Another possible solution might be to remove the method reference, have a class that implements the functional interface, and passes this as an argument (and, of course, other variations, but I will not focus on them in this article).

Microbenchmark

I have created a short benchmark corresponding to the first option (i.e. using a variable for the method reference) to test the performance for different loop sizes.

[sourcecode language="java" highlight="37,52"]
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Warmup(iterations = 5, time = 5, timeUnit = TimeUnit.MICROSECONDS)
@Measurement(iterations = 5,  time = 5, timeUnit = TimeUnit.MICROSECONDS)
@Fork(value = 3, warmups = 1, jvmArgsAppend = { })
@State(Scope.Benchmark)
public class LinkToTargetMethodVsLambdaReferenceBenchmark {

    @Param({"1000", "10000", "100000", "1000000", "10000000"})
    int size;

    @Param({ "1" })
    int factor;

    int[] array;
    int shiftResult;

    public static void main(String[] args) throws RunnerException {
        Options opt =
            new OptionsBuilder()
                .include(LinkToTargetMethodVsLambdaReferenceBenchmark.class.getSimpleName())
                .build();
        new Runner(opt).run();
    }

    @Setup
    public void setupList(){
        array = new int[size];
        for (int i = 0; i < size; i ++) {
            array[i] = i;
        }
    }

    @Benchmark
    public long linkToTargetMethod() {
        // Create a method reference variable outside of the loop
        ComputeFunctionalInterface fInterface = this::getValue;
        long result = 0;
        for (int i = 0; i < size; i++ ) {
            shiftResult = array[i] >> factor;
            result += doCompute(i, fInterface);
        }
        return result;
    }

    @Benchmark
    public long lambdaReference() {
        long result = 0;
        for (int i = 0; i < size; i++ ) {
            shiftResult = array[i] >> factor;
            // Pass the reference directly to the method within the loop
            result += doCompute(i, this::getValue);
        }
        return result;
    }

    public int doCompute(int i, ComputeFunctionalInterface fInterface) {
        return i * fInterface.compute(i);
    }

    private int getValue(int value) {
        // Irrelevant method computation ...
        // Capturing the state is all that matters!
        return (shiftResult % 3 == 0 || shiftResult % 5 == 0) ? value : 1;
    }

}

@FunctionalInterface
public interface ComputeFunctionalInterface {
    int compute(int i);
}
[/sourcecode]


Test output:

[sourcecode language="plain" highlight="6, 7"]
Benchmark                (size)    Mode   Cnt        Score     Error   Units

lambdaReference           1,000    avgt    15        2.584     0.047   us/op
lambdaReference          10,000    avgt    15       24.885     1.535   us/op
lambdaReference         100,000    avgt    15      237.756     3.130   us/op
lambdaReference       1,000,000    avgt    15    3,748.218    36.642  us/op
lambdaReference      10,000,000    avgt    15   37,597.158   477.473  us/op

linkToTargetMethod         1000    avgt    15        2.569     0.064  us/op
linkToTargetMethod       10,000    avgt    15       24.254     0.456  us/op
linkToTargetMethod      100,000    avgt    15      239.485     2.551  us/op
linkToTargetMethod    1,000,000    avgt    15    2,946.587   222.539  us/op
linkToTargetMethod   10,000,000    avgt    15   28,890.146   313.623  us/op
[/sourcecode]


Tests triggered using JDK 11.0.1 (latest JDK release at the moment) on my machine (CPU: Intel i7-6600U Skylake; MEMORY: 16GB DDR4 2133 MHz; OS: Windows 7 Enterprise)

Conclusions After This Test Scenario

  1.  lambdaReference seems to be significantly slower than linkToTargetMethod for bigger loop sizes (e.g. 1M and 10M iterations)
  2. The most significant difference between these two starts for more than 1M elements, when  linkToTargetMethod performs, on average, around 30 percent better

What Really Happens?

An easy way to profile these methods is by using IDEA v2018.3 Ultimate edition, which comes with Async Profiler integrated, and to have a look over generated flame charts.

 lambdaReference test case:

As pointed out, the issue here is the call to  Lambda$51.get$Lambda, which generates a new instance every time. Below is the VM anonymous class generated during startup, which reveals this:

[sourcecode language="java"]
final class LinkToTargetMethodVsLambdaReferenceBenchmark$$Lambda$51 implements ComputeFunctionalInterface {
    private final LinkToTargetMethodVsLambdaReferenceBenchmark arg$1;

    // Private constructor
    private  LinkToTargetMethodVsLambdaReferenceBenchmark$$Lambda$51(LinkToTargetMethodVsLambdaReferenceBenchmark var1) {
        this.arg$1 = var1;
}
    // Generates a new instance for every call !
    private static ComputeFunctionalInterface get$Lambda(LinkToTargetMethodVsLambdaReferenceBenchmark var0) {
        return new LinkToTargetMethodVsLambdaReferenceBenchmark$$Lambda$51(var0);
    }

    // Just dispatches the call back to LinkToTargetMethodVsLambdaReferenceBenchmark
    @Hidden
    public int compute(int var1) {
        return this.arg$1.getValue(var1);
    }
[/sourcecode]


To understand how lambda works under the hood and how to get the VM anonymous classes during startup, please check my slides from a previous conference.

 linkToTargetMethod test case:

As opposed to the previous flame graphs, in this case, there is no captured Lambda$51.get$Lambda call by the graph; hence, there substantially fewer heap allocations.

Final Conclusions

  • Avoid passing  this::method to a method within a loop since it really affects performance
  • To overcome this issue, replace  this::method by (one of below options):
    • A variable outside of the loop (inside the method)
    • A field variable belonging to the class (outside of the method)
    • Create a class that implements the functional interface and passes this as an argument

If you are interested in other Java performance topics, I would kindly encourage you to subscribe to my Java Performance and Tuning course, or follow me on Twitter and check out my website.

Sensu: workflow automation for monitoring. Learn more—download the whitepaper.

Topics:
java ,performance ,benchmarking ,tutorial ,loops ,this ,method ,loop ,lambda

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}