
Optimizing event processing


During the RavenDB Days conference, I got a lot of questions from customers. Here is one of them.

There is a migration process that deals with an event sourcing system. We have 10,000,000 commits with 5–50 events per commit, and each event results in a property update to an entity.

That gives us roughly 300,000,000 events to process. The trivial way to solve this would be:

foreach(var commit in YieldAllCommits())
{
    using(var session = docStore.OpenSession())
    {
        foreach(var evnt in commit.Events)
        {
            var entity = session.Load<Customer>(evnt.EntityId);
            evnt.Apply(entity);
        }
        session.SaveChanges();
    }
}

That works, but it tends to be slow. The worst case here would result in 310,000,000 requests to the server: one load per event, plus one SaveChanges per commit.
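The arithmetic behind that worst case can be checked directly (a back-of-the-envelope sketch; the figures come straight from the text above):

```python
# Worst-case request count for the naive loop
commits = 10_000_000
events = 300_000_000          # roughly 30 events per commit on average
loads = events                # one Load request per event
saves = commits               # one SaveChanges request per commit
total_requests = loads + saves
print(total_requests)         # 310000000
```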

Note that this has the nice property that all the changes in a commit are saved together. We're going to relax that behavior and use something better here.

We'll take an LRU cache implementation and add an event that fires when an item is evicted from the cache, plus support for iterating over the cached items.
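To make the shape of that cache concrete, here is a minimal sketch in Python (the C# `LeastRecentlyUsedCache` used below is analogous; the method names here are assumptions for illustration):

```python
from collections import OrderedDict

class LeastRecentlyUsedCache:
    """Minimal LRU cache with an eviction callback and iteration support.
    A sketch mirroring the C# cache used in the article, not its actual code."""
    def __init__(self, capacity, on_evict=None):
        self.capacity = capacity
        self.on_evict = on_evict      # called with each value dropped from the cache
        self._items = OrderedDict()

    def try_get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            _, evicted = self._items.popitem(last=False)  # drop least recently used
            if self.on_evict is not None:
                self.on_evict(evicted)

    def __iter__(self):
        # Iterate over what is still cached, so leftovers can be flushed at the end
        return iter(self._items.items())

# Usage: a capacity-2 cache whose evictions are collected into a list
flushed = []
cache = LeastRecentlyUsedCache(capacity=2, on_evict=flushed.append)
cache.set("a", 1)
cache.set("b", 2)
cache.set("c", 3)   # evicts "a", so its value lands in `flushed`
```

The eviction callback is the key design point: flushing happens as a side effect of normal cache pressure, so the processing loop never has to decide when to write.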

using(var bulk = docStore.BulkInsert(allowUpdates: true))
{
    var cache = new LeastRecentlyUsedCache<string, Customer>(capacity: 10 * 1000);
    cache.OnEvict = c => bulk.Store(c);

    foreach(var commit in YieldAllCommits())
    {
        using(var session = docStore.OpenSession())
        {
            foreach(var evnt in commit.Events)
            {
                Customer entity;
                if(cache.TryGetValue(evnt.EntityId, out entity) == false)
                {
                    // Cache miss: load the document once and remember it
                    entity = session.Load<Customer>(evnt.EntityId);
                    cache.Set(evnt.EntityId, entity);
                }
                evnt.Apply(entity);
            }
        }
    }

    // Flush whatever is still in the cache at the end of the run
    foreach(var kvp in cache)
    {
        bulk.Store(kvp.Value);
    }
}

Here we are using a cache of 10,000 items, on the assumption that events cluster by entity, so most changes to a given entity happen at roughly the same time. We take advantage of that to load each document only once, and we use bulk insert to flush changes to the server as documents are evicted. This code also handles the case where a document is flushed out of the cache and then receives more events, but the assumption is that this scenario is rare.
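The clustering assumption is what makes the cache pay off. A toy simulation (Python, with hypothetical event streams, not the RavenDB client) contrasts clustered and scattered events for the same 10,000 events and a 100-entry cache:

```python
from collections import OrderedDict

def count_loads(entity_ids, capacity):
    """Count the server round trips a simple LRU cache would issue
    for a stream of per-event entity ids (a toy model)."""
    cache, loads = OrderedDict(), 0
    for eid in entity_ids:
        if eid in cache:
            cache.move_to_end(eid)        # cache hit: no request needed
        else:
            loads += 1                    # cache miss: one Load request
            cache[eid] = True
            if len(cache) > capacity:
                cache.popitem(last=False) # evict least recently used
    return loads

# Clustered stream: each entity's 10 events arrive back to back
clustered = [e for e in range(1000) for _ in range(10)]
# Scattered stream: the same 10,000 events spread evenly across the run
scattered = [e for _ in range(10) for e in range(1000)]

print(count_loads(clustered, capacity=100))  # 1000: one load per entity
print(count_loads(scattered, capacity=100))  # 10000: every event misses
```

With clustered events, the cache turns 10,000 potential loads into 1,000; with fully scattered events it buys nothing, which is why the clustering assumption matters.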


Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.
