
Measuring the Serialization Performance of Lambdas with JMH

Learn all about measuring Java Lambda serialization performance, complete with a breakdown of why serializing lambdas is important, and Lambdas in distributed systems.


A much anticipated addition to Java 8 was Lambdas. They have a number of uses such as:

Reducing boilerplate code you would have needed for anonymous inner classes.

Reducing the scope of values captured. Lambdas do not implicitly include the this of the outer class, reducing memory leaks.

Easy integration with existing APIs and the new Streams API.

A lesser-known feature of lambdas is that they can be serialized.

Why Serialize a Lambda?

Serialization is useful for persisting state, and passing objects over the network. Lambdas should be as stateless as possible, so while you can save Lambdas, this is not an obvious use case. 

Lambdas are designed to pass snippets of code to a library to provide interaction with what the library does. But what if the library supports a distributed system, like Chronicle Engine?

What is Chronicle Engine?

Chronicle Engine is a library which allows you to access the data structures in your application remotely either as a Java or C# client, or as an NFS file system. It also supports storing and persisting the data off heap, as well as replication.

Lambdas in Distributed Systems

Lambdas can be a simple way to execute an operation that may or may not be run locally. You can perform operations like:

MapView<String, Long> map = acquireMap("map-name",
                                      String.class, Long.class);
map.put("key", 1L);
long ret = map.applyToKey("key", v -> v + 1); // ret == 2

I don't need to know where the data is stored. If it is on a remote server, the lambda is serialized and executed on that server, with the result returned to me.


Capturing Lambdas

A lambda that captures no fields is handled by Java more efficiently. It doesn't need to create a new object each time as all instances would be the same. However, for a lambda which captures a value that is not known at compile time, a new object can be created and this will hold the value captured.

Non-capturing Lambda

Function<String, String> appendStar = s -> s + "*";

Capturing Lambda

String star = "*";
Function<String, String> appendStar = s -> s + star;
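The difference between the two kinds of lambda can be observed with a small experiment. The sketch below is only an illustration: the class and method names are made up for this example, and the instance reuse it prints is typical of OpenJDK/HotSpot rather than something the Java Language Specification guarantees.

```java
import java.util.function.Function;

class LambdaCaptureDemo {
    // Non-capturing: the body only uses its own parameter.
    static Function<String, String> nonCapturing() {
        return s -> s + "*";
    }

    // Capturing: the body reads 'suffix', whose value is only known at run time.
    static Function<String, String> capturing(String suffix) {
        return s -> s + suffix;
    }

    public static void main(String[] args) {
        // Whether the same object comes back each time is a JVM implementation detail.
        System.out.println("non-capturing reused: " + (nonCapturing() == nonCapturing()));
        System.out.println("capturing reused:     " + (capturing("*") == capturing("*")));

        // Both forms behave identically when applied.
        System.out.println(nonCapturing().apply("a")); // prints "a*"
        System.out.println(capturing("*").apply("a")); // prints "a*"
    }
}
```

On a typical HotSpot JVM the first comparison tends to report reuse and the second does not, which matches the description above.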

Serializable Lambdas

Lambdas are not serializable by default. They have to implement an interface that is Serializable. You can provide a hint with something which looks like a cast but is actually a way of forcing the type inference to use the type you give it.

Function<String, String> appendStar = 
     (Function<String, String> & Serializable) (s -> s + star);

I don't like having to do this, because it rather defeats the objective of reducing boilerplate code. A way around this is to define your own interface which is Serializable.

@FunctionalInterface
public interface SerializableFunction<I, O>
       extends Function<I, O>, Serializable {
}

This allows you to write:

SerializableFunction<String, String> appendStar = s -> s + star;

Or if you have a method like:

<R> R applyToKey(K key, @NotNull SerializableFunction<E, R> function) {

The caller of your library can write:

String result = map.applyToKey("key", s -> s + "*");

without any boilerplate code.
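To make the idea concrete, here is a minimal, self-contained sketch of a lambda being shipped as bytes and executed on the other side. It uses plain Java serialization in a single JVM as a stand-in for a real network hop. The class name and the applyRemotely helper are hypothetical and are not part of Chronicle Engine; SerializableFunction is redeclared locally so the example compiles on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Function;

class SerializableLambdaDemo {
    @FunctionalInterface
    interface SerializableFunction<I, O> extends Function<I, O>, Serializable {
    }

    // Serialize any object to a byte array with standard Java serialization.
    static byte[] toBytes(Object o) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuild an object from bytes produced by toBytes.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical stand-in for a remote call: the function is sent as bytes,
    // rebuilt, and only then applied to the value.
    @SuppressWarnings("unchecked")
    static <I, O> O applyRemotely(SerializableFunction<I, O> function, I value) {
        byte[] wire = toBytes(function);
        SerializableFunction<I, O> copy = (SerializableFunction<I, O>) fromBytes(wire);
        return copy.apply(value);
    }

    public static void main(String[] args) {
        String star = "*";
        // No cast is needed: the parameter type already makes the lambda Serializable.
        System.out.println(applyRemotely((String s) -> s + star, "a")); // prints "a*"
    }
}
```

Deserializing a lambda only works if the class that declared it is available on the receiving side, so in a real distributed setup the server needs the same code on its classpath.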

Live Queries with Lambdas

With serializable lambdas, you can use them in live queries like this:

// print the last name of all the people in NYC
acquireMap("people", String.class, Person.class).query()
  .filter(p -> p.getCity().equals("NYC")) // executed on the server
  .map(p -> p.getLastName())  // executed on the server
  .subscribe(System.out::println); // executed on the client.

The Queryable interface is required so that the filter Predicate and the map Function are implicitly Serializable. If you used the Streams API, you would have to use the complex cast used earlier.
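Why the parameter type matters can be shown in a few lines. In this sketch the names SerializablePredicate, viaPlainPredicate and viaSerializablePredicate are invented for illustration and are not the actual Queryable API. The same lambda text is Serializable or not depending only on the type of the parameter that receives it.

```java
import java.io.Serializable;
import java.util.function.Predicate;

class ImplicitSerializableDemo {
    @FunctionalInterface
    interface SerializablePredicate<T> extends Predicate<T>, Serializable {
    }

    // Declared like Stream.filter: the lambda is only a Predicate.
    static boolean viaPlainPredicate(Predicate<String> p) {
        return p instanceof Serializable;
    }

    // Declared like a query API that wants to ship the lambda elsewhere.
    static boolean viaSerializablePredicate(SerializablePredicate<String> p) {
        return p instanceof Serializable;
    }

    public static void main(String[] args) {
        System.out.println(viaPlainPredicate(s -> s.equals("NYC")));        // prints "false"
        System.out.println(viaSerializablePredicate(s -> s.equals("NYC"))); // prints "true"
    }
}
```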

Performance of Serializing Lambdas

I used JMH to sample the latency of both serializing and deserializing a simple lambda that appends a "*" to a string. I compared non-capturing and capturing lambdas, and also looked at how they compare with sending an enum to do the same thing. The code and results are here:

The 99.99% latency means that 99.99% of the tests were under this latency. Times are in microseconds:

| Test | Typical latency | 99.99% latency |
|---|---|---|
| Java Serialization, non-capturing | 33.9 µs | 215 µs |
| Java Serialization, capturing | 36.3 µs | 704 µs |
| Java Serialization, with an enum | 7.9 µs | 134 µs |
| Chronicle Wire (Text), non-capturing | 20.4 µs | 147 µs |
| Chronicle Wire (Text), capturing | 22.5 µs | 148 µs |
| Chronicle Wire (Text), with an enum | 1.2 µs | 5.9 µs |
| Chronicle Wire (Binary), non-capturing | 11.7 µs | 103 µs |
| Chronicle Wire (Binary), capturing | 12.7 µs | 135 µs |
| Chronicle Wire (Binary), with an enum | 1.0 µs | 1.2 µs |
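For readers unfamiliar with percentile latencies, the following sketch shows one common way to compute such a figure from raw samples, using the nearest-rank definition. It is not how JMH computes its own statistics; the class and method names are hypothetical, and the samples are synthetic values rather than benchmark results.

```java
import java.util.Arrays;

class LatencyPercentiles {
    // Nearest-rank percentile: the smallest sample such that at least
    // numerator/denominator of all samples are less than or equal to it.
    // The fraction is passed as two integers to avoid floating-point rounding.
    static long percentile(long[] samples, long numerator, long denominator) {
        if (samples.length == 0 || numerator <= 0 || numerator > denominator) {
            throw new IllegalArgumentException("need samples and 0 < fraction <= 1");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        // rank = ceil(n * numerator / denominator), computed in integer arithmetic
        long rank = (sorted.length * numerator + denominator - 1) / denominator;
        return sorted[(int) (rank - 1)];
    }

    public static void main(String[] args) {
        // Synthetic samples 1..10000, standing in for measured latencies.
        long[] samples = new long[10_000];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = i + 1;
        }
        System.out.println(percentile(samples, 1, 2));         // prints "5000"
        System.out.println(percentile(samples, 9999, 10_000)); // prints "9999"
    }
}
```

With 10,000 samples, the 99.99% figure is decided by the single slowest sample left above it, which is why it is much noisier than the typical latency.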


What Does it Mean to Use an enum?

While using a lambda is simple, it is not as efficient, so you need an alternative should it appear that using a lambda is causing a performance problem.

enum Functions implements SerializableFunction<String, String> {
    APPEND_STAR {
        @Override
        public String apply(String s) {
            return s + '*';
        }
    }
}

To see why using an enum makes a difference, you can compare how much data needs to be sent to the server. You can see all the serialized data here.

This is what the non-capturing lambda looks like when serialized with TextWire (based on YAML):

!SerializedLambda {
  cc: !type lambda.LambdaSerialization,
  fic: net/openhft/chronicle/core/util/SerializableFunction,
  fimn: apply,
  fims: (Ljava/lang/Object;)Ljava/lang/Object;,
  imk: 6,
  ic: lambda/LambdaSerialization,
  imn: lambda$textWireSerialization$ea1ad110$1,
  ims: (Ljava/lang/String;)Ljava/lang/String;,
  imt: (Ljava/lang/String;)Ljava/lang/String;,
  ca: [
  ]
}

and the enum serialization looks like this:

!Functions APPEND_STAR
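The same size difference shows up with plain Java serialization. The sketch below is an illustration under that assumption: it does not use Chronicle Wire, the class name is hypothetical, and the exact byte counts depend on the class names involved, so it only compares the two sizes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Function;

class SerializedSizeDemo {
    interface SerializableFunction<I, O> extends Function<I, O>, Serializable {
    }

    enum Functions implements SerializableFunction<String, String> {
        APPEND_STAR;

        @Override
        public String apply(String s) {
            return s + '*';
        }
    }

    // Number of bytes produced by standard Java serialization for one object.
    static int serializedSize(Object o) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
            out.flush();
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static int lambdaSize() {
        SerializableFunction<String, String> appendStar = s -> s + "*";
        return serializedSize(appendStar);
    }

    static int enumSize() {
        return serializedSize(Functions.APPEND_STAR);
    }

    public static void main(String[] args) {
        System.out.println("lambda bytes: " + lambdaSize());
        System.out.println("enum bytes:   " + enumSize());
    }
}
```

The lambda has to carry the full SerializedLambda description (class names, method names and signatures), while the enum only needs its type and constant name.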

Note: you can't use an enum if you need to capture some values. What we do is allow you to pass an enum with additional arguments to get the most efficient combination.
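One way to picture an enum with additional arguments is a two-argument function whose second argument plays the role of the captured value. This is a sketch only: SerializableBiFunction, StringFunctions and applyWithArg are hypothetical names chosen for this example and may not match what Chronicle Engine actually provides.

```java
import java.io.Serializable;
import java.util.function.BiFunction;

class EnumWithArgsDemo {
    interface SerializableBiFunction<T, U, R> extends BiFunction<T, U, R>, Serializable {
    }

    enum StringFunctions implements SerializableBiFunction<String, String, String> {
        APPEND {
            @Override
            public String apply(String s, String suffix) {
                return s + suffix;
            }
        }
    }

    // Hypothetical API shape: the enum names the operation, and the value that a
    // lambda would have captured travels next to it as an ordinary argument.
    static <V, A, R> R applyWithArg(V current, SerializableBiFunction<V, A, R> function, A arg) {
        return function.apply(current, arg);
    }

    public static void main(String[] args) {
        String star = "*"; // only known at run time
        System.out.println(applyWithArg("a", StringFunctions.APPEND, star)); // prints "a*"
    }
}
```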

Using enums Like Stored Procedures

One benefit of using enums instead of lambdas is that you can track all the functions clients are performing in a consolidated manner. This makes it easier to fix bugs in any individual function that might be used in many places. We use enums in some places for this reason. An example is MapFunction, which was originally lots of different lambdas that have now been grouped into one class. Note: enums are not as clean to implement.

Conclusion

You can use lambdas for distributed applications cleanly if the API you use supports it. You can also switch to using enums if you need the extra performance.


