DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
View Events Video Library
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Integrating PostgreSQL Databases with ANF: Join this workshop to learn how to create a PostgreSQL server using Instaclustr’s managed service

Mobile Database Essentials: Assess data needs, storage requirements, and more when leveraging databases for cloud and edge applications.

Monitoring and Observability for LLMs: Datadog and Google Cloud discuss how to achieve optimal AI model performance.

Automated Testing: The latest on architecture, TDD, and the benefits of AI and low-code tools.

Related

  • Resilient Kafka Consumers With Reactor Kafka
  • Event-Driven Order Processing Program
  • Implement a Distributed Database to Your Java Application
  • Kafka-Streams - Tips on How to Decrease Re-Balancing Impact for Real-Time Event Processing On Highly Loaded Topics

Trending

  • The Systemic Process of Debugging
  • Mastering Persistence: Why the Persistence Layer Is Crucial for Modern Java Applications
  • Deploy a Session Recording Solution Using Ansible and Audit Your Bastion Host
  • Exploring Sorting Algorithms: A Comprehensive Guide

Optimizing event processing

Oren Eini user avatar by
Oren Eini
·
Oct. 07, 14 · Interview
Like (0)
Save
Tweet
Share
3.77K Views

Join the DZone community and get the full member experience.

Join For Free

During the RavenDB Days conference, I got a lot of questions from customers. Here is one of them.

There is a migration process that deals with event sourcing system. So we have 10,000,000 commits with 5 – 50 events per commit. Each event result in a property update to an entity.

That gives us roughly 300,000,000 events to process. The trivial way to solve this would be:

foreach(var commit in YieldAllCommits())

{

    using(var session = docStore.OpenSession())

    {

        foreach(var evnt in commit.Events)

        {

            var entity = evnt.Load<Customer>(evnt.EntityId);

            evnt.Apply(entity);

        }

        session.SaveChanges();

    }

}

That works, but it tends to be slow. Worse case here would result in 310,000,000 requests to the server.

Note that this has the nice property that all the changes in a commit are saved in a single commit. We’re going to relax this behavior, and use something better here.

We’ll take the implementation of this LRU cache and add an event for dropping from the cache and iteration.

usging(var bulk = docStore.BulkInsert(allowUpdates: true))

{

    var cache = new LeastRecentlyUsedCache<string, Customer>(capacity: 10 * 1000);

    cache.OnEvict = c => bulk.Store(c);

    foreach(var commit in YieldAllCommits())

    {

        using(var session = docStore.OpenSession())

        {

            foreach(var evnt in commit.Events)

            {

                Customer entity;

                if(cache.TryGetValue(evnt.EventId, out entity) == false)

                {

                    using(var session = docStore.OpenSession())

                    {

                        entity = session.Load<Customer>(evnt.EventId);

                        cache.Set(evnt.EventId, entity);

                    }

                }

                evnt.Apply(evnt);

            }

        }

    }

    foreach(var kvp in cache){

        bulk.Store(kvp.Value);

    }

}

Here we are using a cache of 10,000 items. With the assumption that we are going to have clustering for events on entities, so a lot of changes on an entity will happen on roughly the same time. We take advantage of that to try to only load each document once. We use bulk insert to flush those changes to the server when needed. This code will handle the case where we flushed out a document from the cache then we get events for it again, but he assumption is that this scenario is much lower.

Event Processing

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Resilient Kafka Consumers With Reactor Kafka
  • Event-Driven Order Processing Program
  • Implement a Distributed Database to Your Java Application
  • Kafka-Streams - Tips on How to Decrease Re-Balancing Impact for Real-Time Event Processing On Highly Loaded Topics

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends: