DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Using Python Libraries in Java
  • Designing a Java Connector for Software Integrations
  • How to Convert XLS to XLSX in Java
  • Recurrent Workflows With Cloud Native Dapr Jobs

Trending

  • Streamlining Event Data in Event-Driven Ansible
  • Mastering Fluent Bit: Installing and Configuring Fluent Bit on Kubernetes (Part 3)
  • Apache Doris vs Elasticsearch: An In-Depth Comparative Analysis
  • The Modern Data Stack Is Overrated — Here’s What Works
  1. DZone
  2. Coding
  3. Java
  4. Java: Creating Terabyte Sized Queues with Low-Latency

Java: Creating Terabyte Sized Queues with Low-Latency

Learn how to leverage an open-source library to create large, persisted queues with high performance and low latency leveraging object reuse and memory mapping.

By 
Per-Åke Minborg user avatar
Per-Åke Minborg
·
Dec. 02, 21 · Tutorial
Likes (8)
Comment
Save
Tweet
Share
7.9K Views

Join the DZone community and get the full member experience.

Join For Free

Queues are often fundamental components in software design patterns. But, what if there are millions of messages received every second and multi-process consumer need to be able to read the complete ledger of all messages? Java can only hold so much information before the heap becomes a limiting factor with high-impacting garbage collections as a result, potentially preventing us from fulfilling targeted SLAs or even halting the JVM for seconds or even minutes.

This article covers how to create huge persisted queues while retaining predictable and consistent low latency using open-source Chronicle Queue.

The Application

In this article, the objective is to maintain a queue of objects from market data feeds (e.g. the latest price for securities traded on an exchange). Other business areas such as sensory input from IoT devices or reading crash-recording information within the automotive industry could have been chosen as well. The principle is the same.

To start with, a class holding market data is defined:

Java
 
public class MarketData extends SelfDescribingMarshallable {

    int securityId;
    long time;
    float last;
    float high;
    float low;

    // Getters and setters not shown for brevity
}


Note: In real-world scenarios, great care must be taken when using float and double for holding monetary values as this could otherwise cause rounding problems [Bloch18, Item 60]. However, in this introductory article, I want to keep things simple.

There is also a small utility function MarketDataUtil::create that will create and return a new random MarketData object when invoked:

Java
 
static MarketData create() {

    MarketData marketData = new MarketData();
    int id = ThreadLocalRandom.current().nextInt(1000);
    marketData.setSecurityId(id);
    float nextFloat = ThreadLocalRandom.current().nextFloat();
    float last = 20 + 100 * nextFloat;

    marketData.setLast(last);
    marketData.setHigh(last * 1.1f);
    marketData.setLow(last * 0.9f);
    marketData.setTime(System.currentTimeMillis());

    return marketData;
}


Now, the objective is to create a queue that is durable, concurrent, low-latency, accessible from several processes and that can hold billions of objects.

The Naïve Approach

Armed with these classes, the naïve approach of using a ConcurrentLinkedQueue can be explored:

Java
 
public static void main(String[] args) {

    final Queue<MarketData> queue = new ConcurrentLinkedQueue<>();
    for (long i = 0; i < 1e9; i++) {
        queue.add(MarketDataUtil.create());
    }
  
}


This will fail for several reasons:

  1. The ConcurrentLinkedQueue will create a wrapping Node for each element added to the queue. This will effectively double the number of objects created.

  2. Objects are placed on the Java heap, contributing to heap memory pressure and garbage collection problems. On my machine, this led to my entire JVM becoming unresponsive and the only way forward was to kill it forcibly using “kill -9”.

  3. The queue cannot be read from other processes (i.e. other JVMs).

  4. Once the JVM terminates, the content of the queue is lost. Hence, the queue is not durable.

Looking at various other standard Java classes, it can be concluded that there is no support for large persisted queues.

Using Chronicle Queue

Chronicle Queue is an open-source library and is designed to meet the requirements set forth above.  Here is one way to set it up and use it:

Java
 
public static void main(String[] args) {

    final MarketData marketData = new MarketData();
    final ChronicleQueue q = ChronicleQueue
            .single("market-data");

    final ExcerptAppender appender = q.acquireAppender();

    for (long i = 0; i < 1e9; i++) {
        try (final DocumentContext document =
                     appender.acquireWritingDocument(false)) {

             document
                    .wire()
                    .bytes()
                    .writeObject(MarketData.class, 
                            MarketDataUtil.recycle(marketData));
        }
    }
}


Using a MacBook Pro 2019 with a 2.3 GHz 8-Core Intel Core i9, north of 3,000,000 messages per second could be inserted using only a single thread. The queue is persisted via a memory-mapped file in the given directory “market-data”. One would expect a MarketData object to occupy 4 (int securityId) + 8 (long time) + 4*3 (float last, high and low) = 24 bytes at the very least.

In the example above, 1 billion objects were appended causing the mapped file to occupy 30,148,657,152 bytes which translates to about 30 bytes per message. In my opinion, this is very efficient indeed.

As can be seen, a single MarketData instance can be reused over and over again because Chronicle Queue will flatten out the content of the current object onto the memory-mapped file, allowing object reuse. This reduces memory pressure even more. This is how the recycling method works:

Java
 
static MarketData recycle(MarketData marketData) {

    final int id = ThreadLocalRandom.current().nextInt(1000);
    marketData.setSecurityId(id);
    final float nextFloat = ThreadLocalRandom.current().nextFloat();
    final float last = 20 + 100 * nextFloat;

    marketData.setLast(last);
    marketData.setHigh(last * 1.1f);
    marketData.setLow(last * 0.9f);
    marketData.setTime(System.currentTimeMillis());

    return marketData;
}


Reading From a Chronicle Queue

Reading from a Chronicle Queue is straightforward. Continuing the example from above, the following shows how the first two MarketData objects can be read from the queue:

Java
 
public static void main(String[] args) {

    final ChronicleQueue q = ChronicleQueue
            .single("market-data");
  
    final ExcerptTailer tailer = q.createTailer();

    for (long i = 0; i < 2; i++) {

        try (final DocumentContext document =
                     tailer.readingDocument()) {

            MarketData marketData = document
                    .wire()
                    .bytes()
                    .readObject(MarketData.class);

            System.out.println(marketData);
        }
    }
}


This might produce the following output:

Plain Text
 
!software.chronicle.sandbox.queuedemo.MarketData {
  securityId: 202,
  time: 1634646488837,
  last: 45.8673,
  high: 50.454,
  low: 41.2806
}

!software.chronicle.sandbox.queuedemo.MarketData {
  securityId: 117,
  time: 1634646488842,
  last: 34.7567,
  high: 38.2323,
  low: 31.281
}


There are provisions to efficiently seek the tailer’s position, for example, to the end of the queue or to a certain index.

What’s Next?

There are many other features that are out of scope for this article. For example, queue files can be set to roll at certain intervals (such as each day, hour or minute) effectively creating a decomposition of information so that older data may be cleaned over time. There are also provisions to be able to isolate CPUs and lock Java threads to these isolated CPUs, substantially reducing application jitter.

Finally, there is an enterprise version with replication of queues across server clusters paving the way towards high availability and improved performance in distributed architectures. The enterprise version also includes a variety of other features such as encryption, time zone rolling and asynchronous appenders.

Java (programming language) Terabyte

Published at DZone with permission of Per-Åke Minborg, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Using Python Libraries in Java
  • Designing a Java Connector for Software Integrations
  • How to Convert XLS to XLSX in Java
  • Recurrent Workflows With Cloud Native Dapr Jobs

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!