Did You Know the Fastest Way of Serializing a Java Field Is Not Serializing It at All?

This article elaborates on different ways of serializing Java objects and benchmarks performance for the variants.

By Per-Åke Minborg · Jan. 20, 22 · Tutorial
Learn how to apply C++'s trivially copyable scheme in Java and get blazing serialization speed using Unsafe and memcpy to directly copy the fields in one single sweep to memory or to a memory-mapped file.

In a previous article about open-source Chronicle Queue, there was some benchmarking and method profiling indicating that the speed of serialization had a significant impact on execution performance. After all, this is only to be expected as Chronicle Queue (and other persisted queue libraries) must convert Java objects located on the heap to binary data which is subsequently stored in files. Even for the most internally efficient libraries, this inevitable serialization procedure will largely dictate performance.

Data Transfer Object

In this article, we will use a Data Transfer Object (hereafter DTO) named MarketData which contains financial information with a relatively large number of fields. The same principles apply to other DTOs in any other business area.

Java
 
import net.openhft.chronicle.wire.SelfDescribingMarshallable;

abstract class MarketData extends SelfDescribingMarshallable {

    long securityId;
    long time;

    // bid and ask quantities
    double bidQty0, bidQty1, bidQty2, bidQty3;
    double askQty0, askQty1, askQty2, askQty3;

    // bid and ask prices
    double bidPrice0, bidPrice1, bidPrice2, bidPrice3;
    double askPrice0, askPrice1, askPrice2, askPrice3;

    // Getters and setters not shown for clarity

}


Default Serialization

Java’s Serializable marker interface provides a default way to serialize Java objects to and from a binary format, usually via the ObjectOutputStream and ObjectInputStream classes. The default way (whereby the magic methods writeObject() and readObject() are not explicitly declared) entails reflecting over an object's non-transient fields and reading/writing them one by one, which can be a relatively costly operation.

Chronicle Queue can work with Serializable objects but also provides a similar, but faster and more space-efficient way to serialize data via the abstract class SelfDescribingMarshallable. Akin to Serializable objects, this relies on reflection but comes with substantially less overhead in terms of payload, CPU cycles, and garbage.

Default serialization often comprises the steps of:

  • Identifying the non-transient fields using reflection
  • Reading/writing the identified non-transient field values using reflection
  • Writing/reading the field values to/from a target format (e.g., a binary format)

The identification of non-transient fields can be cached, eliminating this step to improve performance.
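The three steps above, including the cached field lookup, can be sketched with plain JDK reflection. This is only an illustration of the general idea and not how Serializable or SelfDescribingMarshallable are actually implemented; the Quote class, the ReflectiveSerializer class, and the restriction to long and double fields are assumptions made for this example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical DTO used only for this illustration.
final class Quote {
    long securityId;
    double bidPrice;
    transient double scratch; // transient, so the serializer skips it
}

final class ReflectiveSerializer {

    // Step 1 is cached per class so the field scan happens only once.
    private static final Map<Class<?>, Field[]> FIELD_CACHE = new ConcurrentHashMap<>();

    private static Field[] fieldsOf(Class<?> type) {
        return FIELD_CACHE.computeIfAbsent(type, t -> {
            List<Field> result = new ArrayList<>();
            for (Field f : t.getDeclaredFields()) {
                int mods = f.getModifiers();
                if (!Modifier.isTransient(mods) && !Modifier.isStatic(mods)) {
                    f.setAccessible(true);
                    result.add(f);
                }
            }
            // Sort by name so the field order is deterministic.
            result.sort(Comparator.comparing(Field::getName));
            return result.toArray(new Field[0]);
        });
    }

    // Steps 2 and 3 when writing: read each value reflectively, then encode it.
    static void write(Object source, ByteBuffer out) {
        try {
            for (Field f : fieldsOf(source.getClass())) {
                Class<?> t = f.getType();
                if (t == long.class) out.putLong(f.getLong(source));
                else if (t == double.class) out.putDouble(f.getDouble(source));
                else throw new IllegalArgumentException("Unsupported field type: " + t);
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // Steps 2 and 3 when reading: decode each value, then set it reflectively.
    static void read(Object target, ByteBuffer in) {
        try {
            for (Field f : fieldsOf(target.getClass())) {
                Class<?> t = f.getType();
                if (t == long.class) f.setLong(target, in.getLong());
                else if (t == double.class) f.setDouble(target, in.getDouble());
                else throw new IllegalArgumentException("Unsupported field type: " + t);
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Every field access goes through a Field object and a type check, which hints at why a reflective approach tends to cost more than direct field access.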

Here is an example of a class using default serialization:

Java
 
public final class DefaultMarketData extends MarketData {}


As can be seen, the class does not add anything over its base class and so it will use default serialization as transitively provided by SelfDescribingMarshallable.

Explicit Serialization

Classes implementing Serializable can elect to implement two magic private (sic!) methods, writeObject() and readObject(), which are then invoked instead of default serialization.

This provides full control of the serialization process and allows fields to be read using custom code rather than via reflection, which will improve performance. A drawback with this method is that if a field is added to the class, then the corresponding logic must be added in the two magic methods above or else the new field will not participate in serialization. Another problem is that private methods are invoked by external classes. This is a fundamental violation of encapsulation.
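As a minimal sketch of the two magic methods using only the JDK, the hypothetical Tick class below writes and reads its fields explicitly. The class and helper names are invented for this illustration, and marking the fields transient is one possible design choice so that the default mechanism does not write them as well.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical DTO used only for this illustration.
final class Tick implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: these fields are handled by the custom code below.
    transient long securityId;
    transient double bidPrice;

    // Invoked by ObjectOutputStream even though it is private.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // no non-transient fields, so this writes no field data
        out.writeLong(securityId);
        out.writeDouble(bidPrice);
    }

    // Invoked by ObjectInputStream; must read in the order used by writeObject.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        securityId = in.readLong();
        bidPrice = in.readDouble();
    }
}

final class MagicMethodsDemo {
    // Serializes the given object to a byte array and deserializes a copy.
    static Tick roundTrip(Tick original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Tick) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If a third field were added to Tick without touching the two private methods, it would silently be left out of the stream, which is the maintenance drawback described above.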

SelfDescribingMarshallable classes work in a similar fashion but, thankfully, they do not rely on magic methods or on private methods being invoked externally. A SelfDescribingMarshallable class provides two fundamentally different concepts of serializing: one via the intermediary open-source Chronicle Wire (which can be binary, text, YAML, JSON, etc.), providing flexibility, and one implicitly binary, providing high performance. We will take a closer look at the latter in the sections below.

Here is an example of a class using explicit serialization, whereby the public methods of the implemented interfaces are explicitly declared:

Java
 
import net.openhft.chronicle.bytes.BytesIn;
import net.openhft.chronicle.bytes.BytesOut;

public final class ExplicitMarketData extends MarketData {

    @Override
    public void readMarshallable(BytesIn bytes) {
        securityId = bytes.readLong();
        time = bytes.readLong();
        bidQty0 = bytes.readDouble();
        bidQty1 = bytes.readDouble();
        bidQty2 = bytes.readDouble();
        bidQty3 = bytes.readDouble();
        askQty0 = bytes.readDouble();
        askQty1 = bytes.readDouble();
        askQty2 = bytes.readDouble();
        askQty3 = bytes.readDouble();
        bidPrice0 = bytes.readDouble();
        bidPrice1 = bytes.readDouble();
        bidPrice2 = bytes.readDouble();
        bidPrice3 = bytes.readDouble();
        askPrice0 = bytes.readDouble();
        askPrice1 = bytes.readDouble();
        askPrice2 = bytes.readDouble();
        askPrice3 = bytes.readDouble();
    }

    @Override
    public void writeMarshallable(BytesOut bytes) {
        bytes.writeLong(securityId);
        bytes.writeLong(time);
        bytes.writeDouble(bidQty0);
        bytes.writeDouble(bidQty1);
        bytes.writeDouble(bidQty2);
        bytes.writeDouble(bidQty3);
        bytes.writeDouble(askQty0);
        bytes.writeDouble(askQty1);
        bytes.writeDouble(askQty2);
        bytes.writeDouble(askQty3);
        bytes.writeDouble(bidPrice0);
        bytes.writeDouble(bidPrice1);
        bytes.writeDouble(bidPrice2);
        bytes.writeDouble(bidPrice3);
        bytes.writeDouble(askPrice0);
        bytes.writeDouble(askPrice1);
        bytes.writeDouble(askPrice2);
        bytes.writeDouble(askPrice3);
    }

}


It can be concluded that this scheme relies on reading or writing each field explicitly and directly, eliminating the need to resort to slower reflection. Care must be taken to ensure fields are referenced in a consistent order, and any field added to the class must also be added to both methods above.
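To illustrate why the field order matters, here is a small sketch that uses a plain java.nio.ByteBuffer instead of Chronicle Bytes. The OrderSensitiveDto class and its deliberately wrong readInWrongOrder method are hypothetical and exist only for this example.

```java
import java.nio.ByteBuffer;

// Hypothetical two-field DTO used only for this illustration.
final class OrderSensitiveDto {
    double bidQty;
    double bidPrice;

    // Writes the fields in a fixed order: bidQty first, then bidPrice.
    void write(ByteBuffer out) {
        out.putDouble(bidQty);
        out.putDouble(bidPrice);
    }

    // Correct: reads the fields in the same order as write().
    void read(ByteBuffer in) {
        bidQty = in.getDouble();
        bidPrice = in.getDouble();
    }

    // Deliberately wrong: the order is reversed, so the values end up swapped.
    void readInWrongOrder(ByteBuffer in) {
        bidPrice = in.getDouble();
        bidQty = in.getDouble();
    }
}
```

Because both fields have the same type, the wrong order raises no error at all. The quantity simply ends up in the price field and vice versa, which is why such mistakes can be hard to spot.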

Trivially Copyable Serialization

The concept of Trivially Copyable Java Objects is derived from and inspired by C++. 

As can be seen, the MarketData class above contains only primitive fields. In other words, there are no reference fields like String, List or the like. This means that when the JVM lays out the fields in memory, field values can be put adjacent to one another. The way fields are laid out is not specified in the Java standard which allows for individual JVM implementation optimizations. 

Many JVMs will sort primitive class fields in descending field size order and lay them out in succession. This has the advantage that read and write operations can be performed on even primitive type boundaries. Applying this scheme to ExplicitMarketData, for example, results in the 64-bit long and double fields being laid out first; assuming the initial field space is 64-bit aligned, each of them can then be accessed on an even 64-bit boundary. Had the class also declared 32-bit fields such as an int, these would typically be laid out next, allowing them to be accessed on even 32-bit boundaries.

If, instead, a byte field were laid out first, subsequent larger fields would have to be accessed on uneven field boundaries. This would add a performance overhead for some operations, and would indeed prevent a small set of operations from being performed at all (e.g., unaligned CAS operations on the ARM architecture).
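The layout chosen by a particular JVM can be inspected with the unsupported sun.misc.Unsafe class. The sketch below is an assumption-laden illustration: MixedFields and FieldLayout are hypothetical names, Unsafe is deprecated in recent JDK releases and may be restricted or print warnings, and the reported offsets depend on the JVM and its settings.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import sun.misc.Unsafe;

// Hypothetical class with fields of different sizes, declared smallest first.
final class MixedFields {
    byte flag;
    int count;
    long time;
    double price;
}

final class FieldLayout {

    private static final Unsafe UNSAFE = loadUnsafe();

    // Obtains the Unsafe singleton reflectively (unsupported API).
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    // Returns the offset of the named instance field within its object.
    static long offsetOf(Class<?> type, String fieldName) {
        try {
            return UNSAFE.objectFieldOffset(type.getDeclaredField(fieldName));
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Lists every instance field together with its offset, one per line.
    static String describe(Class<?> type) {
        StringBuilder sb = new StringBuilder();
        for (Field f : type.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers())) continue;
            sb.append(f.getType().getSimpleName()).append(' ').append(f.getName())
              .append(" @ ").append(UNSAFE.objectFieldOffset(f)).append('\n');
        }
        return sb.toString();
    }
}
```

Printing FieldLayout.describe(MixedFields.class) on a HotSpot JVM would typically show the long and double fields on 8-byte boundaries regardless of declaration order, though the exact numbers are not guaranteed.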

How is this relevant to high-performance serialization? Well, as it turns out, it is possible to access an object’s field memory region directly via Unsafe and use memcpy to directly copy the fields in one single sweep to memory or to a memory-mapped file. This effectively bypasses individual field access and replaces, in the example above, the many individual field accesses with a single bulk operation. 
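The difference between many individual accesses and one bulk operation can be illustrated with supported JDK APIs only. Note that this is an analogy and not the Chronicle mechanism: a double[] stands in for an object's contiguous field region, and BulkCopySketch is a hypothetical name chosen for this example.

```java
import java.nio.ByteBuffer;

final class BulkCopySketch {

    // One write per value, comparable to explicit field-by-field serialization.
    static ByteBuffer writeOneByOne(double[] values) {
        ByteBuffer out = ByteBuffer.allocateDirect(values.length * Double.BYTES);
        for (double v : values) {
            out.putDouble(v);
        }
        out.flip(); // make the written bytes readable from position 0
        return out;
    }

    // A single bulk transfer of the whole block into off-heap memory.
    static ByteBuffer writeInBulk(double[] values) {
        ByteBuffer out = ByteBuffer.allocateDirect(values.length * Double.BYTES);
        // The view shares the buffer's memory; put(double[]) copies all values at once
        // and leaves the position of 'out' itself at 0.
        out.asDoubleBuffer().put(values);
        return out;
    }
}
```

Both methods produce the same bytes; the bulk variant merely expresses the copy as one operation, which is the property that trivially copyable serialization exploits for whole objects.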

The way this can be done in a correct, convenient, reasonably portable, and safe way is outside the scope of this article. Luckily, this feature is readily available in Chronicle Queue, open-source Chronicle Bytes, and other similar products out-of-the-box.

Here is an example of a class using trivially copyable serialization:

Java
 
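import net.openhft.chronicle.bytes.BytesIn;
import net.openhft.chronicle.bytes.BytesOut;

import static net.openhft.chronicle.bytes.BytesUtil.*;

public final class TriviallyCopyableMarketData extends MarketData {

    // Offset of the trivially copyable field region within the object
    static final int START = 
            triviallyCopyableStart(TriviallyCopyableMarketData.class);

    // Length in bytes of the trivially copyable field region
    static final int LENGTH = 
            triviallyCopyableLength(TriviallyCopyableMarketData.class);

    @Override
    public void readMarshallable(BytesIn bytes) {
        // Copy LENGTH bytes from the input directly into this object's fields
        bytes.unsafeReadObject(this, START, LENGTH);
    }

    @Override
    public void writeMarshallable(BytesOut bytes) {
        // Copy this object's field region to the output in one bulk operation
        bytes.unsafeWriteObject(this, START, LENGTH);
    }

}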


This pattern lends itself well to scenarios where the DTO is reused. Fundamentally, it relies on invoking Unsafe under the covers for improved performance.

Benchmarks

Using JMH, serialization performance was assessed for the various serialization alternatives above using this class:

Java
 
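import net.openhft.chronicle.bytes.Bytes;
import org.openjdk.jmh.annotations.*;

import static java.util.concurrent.TimeUnit.MILLISECONDS;
import static java.util.concurrent.TimeUnit.NANOSECONDS;

@State(Scope.Benchmark)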
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(NANOSECONDS)
@Fork(value = 1, warmups = 1)
@Warmup(iterations = 5, time = 200, timeUnit = MILLISECONDS)
@Measurement(iterations = 5, time = 500, timeUnit = MILLISECONDS)
public class BenchmarkRunner {

    private final MarketData defaultMarketData = new DefaultMarketData();
    private final MarketData explicitMarketData = new ExplicitMarketData();
    private final MarketData triviallyCopyableMarketData = new TriviallyCopyableMarketData();
    private final Bytes<Void> toBytes = Bytes.allocateElasticDirect();
    private final Bytes<Void> fromBytesDefault = Bytes.allocateElasticDirect();
    private final Bytes<Void> fromBytesExplicit = Bytes.allocateElasticDirect();
    private final Bytes<Void> fromBytesTriviallyCopyable = Bytes.allocateElasticDirect();

    public BenchmarkRunner() {
        defaultMarketData.writeMarshallable(fromBytesDefault);
        explicitMarketData.writeMarshallable(fromBytesExplicit);
        triviallyCopyableMarketData.writeMarshallable(fromBytesTriviallyCopyable);
    }

    public static void main(String[] args) throws Exception {
        org.openjdk.jmh.Main.main(args);
    }

    @Benchmark
    public void defaultWrite() {
        toBytes.writePosition(0);
        defaultMarketData.writeMarshallable(toBytes);
    }

    @Benchmark
    public void defaultRead() {
        fromBytesDefault.readPosition(0);
        defaultMarketData.readMarshallable(fromBytesDefault);
    }

    @Benchmark
    public void explicitWrite() {
        toBytes.writePosition(0);
        explicitMarketData.writeMarshallable(toBytes);
    }

    @Benchmark
    public void explicitRead() {
        fromBytesExplicit.readPosition(0);
        explicitMarketData.readMarshallable(fromBytesExplicit);
    }

    @Benchmark
    public void trivialWrite() {
        toBytes.writePosition(0);
        triviallyCopyableMarketData.writeMarshallable(toBytes);
    }

    @Benchmark
    public void trivialRead() {
        fromBytesTriviallyCopyable.readPosition(0);
        triviallyCopyableMarketData.readMarshallable(fromBytesTriviallyCopyable);
    }

}


This produced the following output on a MacBook Pro (16-inch, 2019) with 2.3 GHz 8-Core Intel Core i9 CPU under JDK 1.8.0_312, OpenJDK 64-Bit Server VM, 25.312-b07:

Plain Text
 
Benchmark                      Mode  Cnt   Score   Error  Units
BenchmarkRunner.defaultRead    avgt    5  88.772 ± 1.766  ns/op
BenchmarkRunner.defaultWrite   avgt    5  90.679 ± 2.923  ns/op
BenchmarkRunner.explicitRead   avgt    5  32.419 ± 2.673  ns/op
BenchmarkRunner.explicitWrite  avgt    5  38.048 ± 0.778  ns/op
BenchmarkRunner.trivialRead    avgt    5   7.437 ± 0.339  ns/op
BenchmarkRunner.trivialWrite   avgt    5   7.911 ± 0.431  ns/op


Using the various MarketData variants, explicit serialization is more than two times faster than default serialization. Trivially copyable serialization is more than four times faster than explicit serialization and more than ten times faster than default serialization, as illustrated in the graph below (lower is better):

[Graph: trivially copyable serialization vs. explicit serialization vs. default serialization]
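The speed-up factors quoted above follow directly from the JMH scores. The small helper below merely redoes that arithmetic; SpeedupRatios is a hypothetical name, and the constants are the ns/op scores copied from the benchmark output.

```java
import java.util.Locale;

final class SpeedupRatios {

    // Average times in ns/op, copied from the JMH output.
    static final double DEFAULT_READ = 88.772, DEFAULT_WRITE = 90.679;
    static final double EXPLICIT_READ = 32.419, EXPLICIT_WRITE = 38.048;
    static final double TRIVIAL_READ = 7.437, TRIVIAL_WRITE = 7.911;

    // How many times faster the candidate is compared with the baseline.
    static double speedup(double baselineNsPerOp, double candidateNsPerOp) {
        return baselineNsPerOp / candidateNsPerOp;
    }

    // Formats all read/write ratios with one decimal place.
    static String report() {
        return String.format(Locale.ROOT,
                "explicit vs default: read %.1fx, write %.1fx%n"
              + "trivial vs explicit: read %.1fx, write %.1fx%n"
              + "trivial vs default:  read %.1fx, write %.1fx%n",
                speedup(DEFAULT_READ, EXPLICIT_READ), speedup(DEFAULT_WRITE, EXPLICIT_WRITE),
                speedup(EXPLICIT_READ, TRIVIAL_READ), speedup(EXPLICIT_WRITE, TRIVIAL_WRITE),
                speedup(DEFAULT_READ, TRIVIAL_READ), speedup(DEFAULT_WRITE, TRIVIAL_WRITE));
    }
}
```

The ratios come out at roughly 2.7 and 2.4 (explicit vs. default), 4.4 and 4.8 (trivially copyable vs. explicit), and 11.9 and 11.5 (trivially copyable vs. default) for reads and writes respectively.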

More fields generally favor trivially copyable serialization over explicit serialization. Experience shows break-even is reached at around six fields in many cases. 

Interestingly, the concept of trivially copyable can be extended to hold data normally stored in reference fields such as a String or an array field. This will provide an even greater relative performance increase for such classes. Contact the Chronicle team if you want to learn more as this, again, is outside the scope of this article.

Why Does it Matter?

Serialization is a fundamental feature of externalizing DTOs to persistent queues, sending them over the wire or putting them in an off-heap Map, and otherwise handling DTOs outside the Java heap. Such data-intensive applications will almost always gain performance and experience reduced latencies when the underlying serialization performance is improved.

Resources

Chronicle Queue (open-source)

GitHub Chronicle Bytes (open-source)
