Batch Processing With Subscriptions in RavenDB 4.0
This is the kind of thing that can really make the operations team happy because they can do targeted jobs with very little friction.
Subscriptions are a somewhat neglected feature in RavenDB. They were created to handle a specific customer need and grew from there, but they had relatively little traction and were a bit of a pain to use. When we looked at the things we wanted to do in the RavenDB 4.0 reworking, how people use subscriptions was high enough on the list that it got a dedicated dev for about a year.
Here is how a subscription looks in RavenDB 3.x:
    var orders = store.Subscriptions.Open<Order>(ordersSubscription, new SubscriptionConnectionOptions());

    orders.Subscribe(order =>
    {
        GenerateInvoice(order);
    });

    orders.Subscribe(order =>
    {
        if (order.State == OrderState.Invalid)
            MarkForManualProcessingBy(order.Employee);
    });
It is only available from code, and the model used is heavily influenced by Reactive Extensions. It gives you a reliable subscription to data: even if the client or the server goes down, the subscription recovers upon restart. But it was complex to do the more advanced things. There are events that you can register to respond to things that are happening, but there isn’t a complete story. Other things, such as automatic failover or responding to deletes, were flat-out impossible.
With RavenDB 4.0, we decided to do things differently. I have talked about this several times before, but recently, we completed a major restructuring and simplification of the user-visible behavior that I’m really happy about. To start with, we ditched the Reactive Extensions and observable model. This is just not the right fit for the kind of things we want to do. Instead, we are going with full-blown batch processing.
    await subscription.Run(batch =>
    {
        foreach (var item in batch.Items)
        {
            Order order = item.Result;
            // do something with this order
        }
    });
Instead of being called once per item, your code is called once per batch. This is actually how the data goes over the wire, and exposing it directly to the user makes our life a lot easier. It also means that you have a much better model for actually doing things in batch mode, such as applying modifications to all the items in the batch and saving them back in a single operation.
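To make the difference between the two callback styles concrete, here is a minimal, self-contained Python sketch. It does not use the RavenDB client at all: `FakeSession`, `Batch`, `run_subscription`, and `mark_processed` are hypothetical stand-ins invented for illustration. The only point is that the handler sees a whole batch and persists its changes with one save call.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSession:
    """Hypothetical stand-in for a database session; counts save calls."""
    pending: list = field(default_factory=list)
    saved: list = field(default_factory=list)
    save_calls: int = 0

    def store(self, doc):
        self.pending.append(doc)

    def save_changes(self):
        # One call persists everything stored since the last save.
        self.saved.extend(self.pending)
        self.pending.clear()
        self.save_calls += 1


@dataclass
class Batch:
    items: list
    session: FakeSession


def run_subscription(docs, batch_size, handler, session):
    """Invoke the handler once per batch, not once per document."""
    calls = 0
    for start in range(0, len(docs), batch_size):
        handler(Batch(docs[start:start + batch_size], session))
        calls += 1
    return calls


def mark_processed(batch):
    for order in batch.items:
        order["processed"] = True
        batch.session.store(order)
    batch.session.save_changes()  # single save for the whole batch


orders = [{"id": f"orders/{i}", "processed": False} for i in range(1, 8)]
session = FakeSession()
handler_calls = run_subscription(orders, 3, mark_processed, session)
print(handler_calls, session.save_calls, len(session.saved))
```

With seven documents and a batch size of three, the handler runs three times and issues three saves rather than seven, which is the saving the batch model is after.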
Subscriptions in RavenDB 4.0 are also fault-tolerant and highly available (on both the client and the server), allow access to versioned and deleted snapshots, allow complex filtering and transformations to be applied on the server side, and are in general a lot more suitable for the task we intend them for.
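One practical consequence of a subscription that recovers after a failure is that a handler may see the same documents again. The toy model below is a sketch under an assumption, not a description of RavenDB's internals: it assumes progress is recorded only after a batch handler returns successfully, so a crash mid-batch causes that batch to be delivered again. `ToySubscription` and `send_welcome` are hypothetical names, and mutating the document in place stands in for persisting the flag.

```python
class ToySubscription:
    """Hypothetical in-memory model: remembers the last acknowledged position."""

    def __init__(self, docs, batch_size):
        self.docs = docs
        self.batch_size = batch_size
        self.acked = 0  # index of the first document not yet acknowledged

    def run(self, handler):
        """Deliver batches until the handler raises or no documents remain."""
        while self.acked < len(self.docs):
            batch = self.docs[self.acked:self.acked + self.batch_size]
            handler(batch)            # an exception here leaves acked untouched
            self.acked += len(batch)  # acknowledge only after success


sent = []
fail_once = {"armed": True}


def send_welcome(batch):
    for customer in batch:
        if customer["welcome_sent"]:
            continue  # idempotency guard: skip work done before the crash
        sent.append(customer["id"])
        customer["welcome_sent"] = True
        if customer["id"] == "customers/3" and fail_once["armed"]:
            fail_once["armed"] = False
            raise ConnectionError("simulated client crash mid-batch")


customers = [{"id": f"customers/{i}", "welcome_sent": False} for i in range(1, 5)]
sub = ToySubscription(customers, batch_size=2)

try:
    sub.run(send_welcome)
except ConnectionError:
    pass  # the client "restarts" below

sub.run(send_welcome)  # resumes from the last acknowledged batch
print(sent)
```

The second batch is delivered twice, but the guard on `welcome_sent` keeps each customer from being emailed more than once. That is the same reason a batch handler would check a flag such as `WelcomeEmailSent` before doing its work.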
Perhaps what is more exciting is that subscriptions are available to all the clients, and in some cases, it just makes more sense to write them as a batch processing script. Consider:
    #!/usr/bin/env python3

    from pyravendb import RavenDB

    def process_batch(batch):
        with batch.open_session() as session:
            for item in batch.items:
                customer = item.result
                if customer.WelcomeEmailSent:
                    continue
                send_welcome_email(customer)
                customer.WelcomeEmailSent = True
                session.store(customer)
            session.save_changes()

    store = document_store(
        urls = ["http://rvn-srv-2:8080"],
        database = "HelpDesk"
    )
    store.initialize()

    with store.subscriptions.open(
            subscription_connection_options(
                id = "new-customers-welcome-email"
            )) as subscription:
        subscription.run(process_batch)
This is the kind of thing that can really make the operations team happy because they can do targeted jobs with very little friction. I spend the whole of Chapter 5 talking about subscriptions, and I think it is well worth it.
Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.