Transactional Patterns: Conversation vs. Batch

This post talks about the difference between the two styles and how that impacts your work.

Oren Eini · Aug. 23, 18 · Analysis

When I designed RavenDB, I had a very particular use case at the forefront of my mind. That scenario was a business application talking to a database, usually as a web application.

These kinds of applications have a particular style of communication with the user. As you can see below, there are two very distinct operations: showing the user the data and acting on it, separated by some “think time” (seconds at minimum, but it can be much longer).

[Image: show data, then “think time”, then an action]

This shouldn’t really be a surprise to anyone who has developed any kind of application in the last decade or two, so why do I mention it explicitly? Because of the nature of the communication between the application and the database.

Some databases have a conversation pattern with the application. In terms of API, this will look something like this:

  • BeginTransaction()
  • Update()
  • Insert()
  • Commit()

This is a very natural model and should be quite familiar to most developers. The alternative is to use batches:

  • SaveChanges( [Update, Insert] )

I want to use this post to talk about the difference between the two styles and how that impacts your work. Relational databases use the conversation style while RavenDB uses batch style. On the surface, it looks like it would be more complex to use RavenDB to achieve the same task, but there is very little difference in the API as far as the user is concerned. In both cases, the code looks very much the same:

// RavenDB
using (var session = documentStore.OpenSession())
{
    session.Store(user); // update
    session.Store(comment); // insert
    session.SaveChanges();
}

// EF Core
using (var context = new MyContext())
{
    context.Update(user);
    context.Add(comment); // insert
    context.SaveChanges();
}

Behind the scenes, however, the RavenDB code will send just a single request to the server, while a relational database will need four separate commands to execute the transaction. In many cases, you can send all of these commands to the server in a single roundtrip, but that is an optimization that doesn’t always work and often isn’t applied even when it is possible.

Sidebar: Reducing server roundtrips

Why is the reduction in server roundtrips so important? Because it has a lot of implications for the overall performance of the system. In many cases, the cost of making a remote query from the application to the database far outstrips the cost of actually executing the query. This ties closely to the Fallacies of Distributed Computing: latency isn’t zero, even though it certainly seems that way when you develop locally.
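To put rough numbers on this, here is a minimal sketch that only counts requests. `FakeConnection` is a hypothetical stand-in for a network connection, not part of any database client, and the 500 microsecond latency is an assumed figure rather than a measurement.

```csharp
using System;

// Hypothetical stand-in for a network connection: it only counts requests.
class FakeConnection
{
    public int Roundtrips { get; private set; }

    // Each call models one request/response pair, however many commands it carries.
    public void Send(params string[] commands)
    {
        Roundtrips++;
    }
}

static class Program
{
    static void Main()
    {
        const int assumedLatencyMicroseconds = 500; // an assumption, not a measurement

        // Conversation style: every command is its own request.
        var conversation = new FakeConnection();
        foreach (var command in new[] { "BeginTransaction", "Update", "Insert", "Commit" })
        {
            conversation.Send(command);
        }

        // Batch style: all the operations travel in a single request.
        var batch = new FakeConnection();
        batch.Send("Update", "Insert");

        Report("conversation", conversation.Roundtrips, assumedLatencyMicroseconds);
        Report("batch", batch.Roundtrips, assumedLatencyMicroseconds);
    }

    static void Report(string style, int roundtrips, int latencyMicroseconds)
    {
        Console.WriteLine($"{style}: {roundtrips} roundtrip(s), {roundtrips * latencyMicroseconds} microseconds on the wire");
    }
}
```

With these assumed numbers, the conversation pays the network latency four times and the batch pays it once, before the database has done any actual work.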

The primary goal of this design in RavenDB was to reduce the number of network roundtrips that your application must endure. In the vast majority of cases, your application is going to perform “show data” and “modify data” as two separate operations (often separated by a long idle time), so there is a lot of value in having the database interaction model match what you will actually be doing.

As it turns out, there are some additional advantages (and disadvantages, which I’ll cover a bit later) to this approach, beyond just the obvious reduction in the number of server roundtrips.

When the server gets all the operations that need to be done in a single request, it can apply all of them at once. For that matter, it can choose how to apply them in the most optimal order. This gives the database server a lot more chances for optimization. It is similar to going to the supermarket with a list of items to purchase vs. a treasure hunt. When you have the full list, you can decide to pick things up based on how close they are on the shelves. If you only get the next instruction after you complete the previous one, you have no option for optimization.
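The shopping-list idea can be illustrated with a small sketch. It assumes, purely for illustration, that "optimal order" means visiting documents in key order; `Operation` and `Plan` are hypothetical names, and a real engine would use its own cost model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical description of one write in a batch.
class Operation
{
    public string Kind { get; }
    public string DocumentId { get; }

    public Operation(string kind, string documentId)
    {
        Kind = kind;
        DocumentId = documentId;
    }
}

static class Program
{
    // Because the whole batch is known up front, it can be reordered before any work starts.
    // OrderBy is a stable sort, so operations on the same document keep their relative order.
    static List<Operation> Plan(IEnumerable<Operation> batch)
    {
        return batch.OrderBy(op => op.DocumentId, StringComparer.Ordinal).ToList();
    }

    static void Main()
    {
        var batch = new[]
        {
            new Operation("Insert", "users/9"),
            new Operation("Update", "orders/3"),
            new Operation("Update", "users/2"),
            new Operation("Delete", "orders/3"),
        };

        foreach (var op in Plan(batch))
        {
            Console.WriteLine($"{op.Kind} {op.DocumentId}");
        }
    }
}
```

The planned order groups both `orders/3` operations together (update first, then delete) before moving on to `users/2` and `users/9`. With the conversation style the server sees one command at a time and has no such opportunity.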

When using the conversation style, durability and state management become more complex as well. Relational databases typically use some variation of ARIES for their journals, because they need to record information about ongoing transactions that haven’t yet been committed. This adds significant complexity to the database engine. Furthermore, when running in a distributed system, you need to share this transaction state (which hasn’t yet been committed!) across the nodes to allow failover of the transaction if the server fails. With the conversation style, you also need to support concurrent transactions, all operating at the same time and potentially reading and modifying the same data. This leads to a great deal of code that is required to properly manage locking and latching inside the database engine.

On the other hand, batch mode gives the server all the operations in the transaction in a single go. This means that failover can simply be sending the batch of operations to another node, without the need to share complex state between them. It means that the database server has all the required information and can make decisions based on it. For example, if there are no data dependencies, it can execute the operations in the transaction in whatever order it desires, leading to more optimal execution time. The database can also mix & match operations from different transactions into a single batch (as long as it keeps the externally visible behavior consistent, of course) to optimize things even further.
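A minimal sketch of the failover point, using a hypothetical in-memory `Node` class rather than a real cluster. It assumes a node either applies the whole batch or fails before touching anything, and it ignores the harder case of a request that times out after it was applied.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical node: applies a whole batch, or fails before applying anything.
class Node
{
    public string Name { get; }
    public bool IsDown { get; set; }
    public Dictionary<string, string> Documents { get; } = new Dictionary<string, string>();

    public Node(string name)
    {
        Name = name;
    }

    public void Apply(IReadOnlyList<KeyValuePair<string, string>> batch)
    {
        if (IsDown)
        {
            throw new InvalidOperationException(Name + " is unreachable");
        }

        foreach (var write in batch)
        {
            Documents[write.Key] = write.Value;
        }
    }
}

static class Program
{
    // The sender still holds the complete batch, so failing over is just resending it.
    static Node SendWithFailover(IReadOnlyList<KeyValuePair<string, string>> batch, params Node[] nodes)
    {
        foreach (var node in nodes)
        {
            try
            {
                node.Apply(batch);
                return node;
            }
            catch (InvalidOperationException)
            {
                // The failed node holds no half-finished transaction state; try the next one.
            }
        }

        throw new InvalidOperationException("no node accepted the batch");
    }

    static void Main()
    {
        var batch = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("users/1", "updated user"),
            new KeyValuePair<string, string>("comments/1", "new comment"),
        };

        var a = new Node("A") { IsDown = true };
        var b = new Node("B");

        var acceptedBy = SendWithFailover(batch, a, b);
        Console.WriteLine($"{acceptedBy.Name} applied the batch; it now holds {acceptedBy.Documents.Count} documents");
    }
}
```

Nothing had to be copied from node A to node B for this to work, which is the contrast with an open conversation-style transaction.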

There are two major disadvantages to batch mode. The first is that there is usually a strict separation of reads from writes. That means that you usually can’t get a single consistent read/modify operation that stays in the same transaction. The second issue is similar: because you need to generate all the operations ahead of time, you can’t make decisions about what operations to execute based on the data you read, at least not in the same transaction. The typical solution is to send a script in the batch. This script can then read and modify data in the same context, apply logic, etc. The important thing here is that this script runs inside the server, already inside the transaction. This means that you don’t pay for network roundtrips to perform such operations.

On the other hand, it means that you need to write potentially complex logic in the database’s scripting language rather than in your own platform’s language, which you’d likely prefer.
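The shape of the script approach can be sketched as follows. This is an illustration under stated assumptions: `Server` is a hypothetical in-memory class, and a C# delegate stands in for a script that a real database would receive in its own scripting language.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory "server"; a C# delegate stands in for the server-side script.
class Server
{
    private readonly Dictionary<string, int> _stock =
        new Dictionary<string, int> { ["products/1"] = 3 };

    public int Requests { get; private set; }

    // One request: every script reads and writes inside the same transaction.
    public void ExecuteBatch(params Action<Dictionary<string, int>>[] scripts)
    {
        Requests++;

        // Work on a transaction-local copy, then publish all changes together.
        var working = new Dictionary<string, int>(_stock);
        foreach (var script in scripts)
        {
            script(working);
        }

        foreach (var entry in working)
        {
            _stock[entry.Key] = entry.Value;
        }
    }

    // Inspection helper for this demo only; it is not counted as a request.
    public int StockOf(string id)
    {
        return _stock[id];
    }
}

static class Program
{
    static void Main()
    {
        var server = new Server();

        server.ExecuteBatch(docs =>
        {
            // The decision depends on data read inside the transaction.
            if (docs["products/1"] > 0)
            {
                docs["products/1"]--;
            }
        });

        int remaining = server.StockOf("products/1");
        Console.WriteLine($"remaining stock: {remaining}, requests: {server.Requests}");
    }
}
```

The read, the decision, and the write all happen in a single request. The cost is the one described above: the conditional logic lives in the script rather than in ordinary application code.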

Luckily, for most scenarios, especially with web applications, you don’t need to execute complex logic on the server side. You can usually just send the commands you need in a single batch and be done with it. Often, just having optimistic concurrency is enough to get you the consistency you want, with scripting reserved for more exceptional cases.
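A minimal sketch of optimistic concurrency, assuming each document carries a version number that the write must match. `VersionedStore` and `ConcurrencyException` are hypothetical names for this illustration, not RavenDB’s actual types.

```csharp
using System;
using System.Collections.Generic;

class ConcurrencyException : Exception
{
    public ConcurrencyException(string message) : base(message) { }
}

// Hypothetical store: each document carries a version that increases on every write.
class VersionedStore
{
    private readonly Dictionary<string, (string Body, long Version)> _docs =
        new Dictionary<string, (string Body, long Version)>();

    public (string Body, long Version) Load(string id)
    {
        return _docs[id];
    }

    // Unconditional write, used here only to seed the store.
    public void Put(string id, string body)
    {
        long next = _docs.TryGetValue(id, out var existing) ? existing.Version + 1 : 1;
        _docs[id] = (body, next);
    }

    // The write carries the version the client saw when it showed the data.
    public void PutIfUnchanged(string id, string body, long expectedVersion)
    {
        if (_docs[id].Version != expectedVersion)
        {
            throw new ConcurrencyException(id + " changed since it was read");
        }

        _docs[id] = (body, expectedVersion + 1);
    }
}

static class Program
{
    static void Main()
    {
        var store = new VersionedStore();
        store.Put("users/1", "name: Alice");

        // Two users open the same page and see the same version.
        var seenByFirst = store.Load("users/1");
        var seenBySecond = store.Load("users/1");

        // The first save succeeds and bumps the version.
        store.PutIfUnchanged("users/1", "name: Alice B.", seenByFirst.Version);

        try
        {
            // The second save is based on stale data and is rejected.
            store.PutIfUnchanged("users/1", "name: Alicia", seenBySecond.Version);
        }
        catch (ConcurrencyException e)
        {
            Console.WriteLine("rejected: " + e.Message);
        }
    }
}
```

The check travels with the write, so no transaction has to stay open during the user’s think time. The application only needs to decide what to do when a save is rejected, typically by reloading and retrying.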

RavenDB’s usage scenario was meant to make the common operations easy and the hard stuff possible. I think that we got it right and ended up with an API that is functional, highly performant, and one that has withstood the test of time very well.

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.
