What You Didn't Know About JDBC Batch

In our previous blog post, “10 Common Mistakes Java Developers Make When Writing SQL”, we made the point that batching is important when inserting large data sets. In most databases and with most JDBC drivers, you can get a significant performance improvement by running a single prepared statement in batch mode, like this:

// One prepared statement, reused for every row
PreparedStatement s = connection.prepareStatement(
    "INSERT INTO author(id, first_name, last_name)"
  + "  VALUES (?, ?, ?)");

// Bind one row, then queue it with addBatch() instead of executing it
s.setInt(1, 1);
s.setString(2, "Erich");
s.setString(3, "Gamma");
s.addBatch();

s.setInt(1, 2);
s.setString(2, "Richard");
s.setString(3, "Helm");
s.addBatch();

s.setInt(1, 3);
s.setString(2, "Ralph");
s.setString(3, "Johnson");
s.addBatch();

s.setInt(1, 4);
s.setString(2, "John");
s.setString(3, "Vlissides");
s.addBatch();

// Send all queued rows at once; one update count per queued row
int[] result = s.executeBatch();
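
For larger data sets, the hand-unrolled calls above are usually written as a loop that flushes the batch every so often. The following is a minimal sketch of that idea, not a definitive implementation: the `Author` holder, the `insertAll` and `countRows` helpers, and the `chunkSize` parameter are hypothetical names introduced for illustration. It also shows one way to read the `int[]` returned by `executeBatch()`, where a driver may report `Statement.SUCCESS_NO_INFO` instead of a row count. Counting such an entry as one row is an assumption that only makes sense for single-row inserts.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

final class BatchInsertSketch {

    /** Hypothetical holder for one row of the author table. */
    static final class Author {
        final int id;
        final String firstName;
        final String lastName;

        Author(int id, String firstName, String lastName) {
            this.id = id;
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    /**
     * Queues every author with addBatch() and calls executeBatch()
     * each time chunkSize rows are pending, plus once for the remainder.
     * Returns the number of rows reported as inserted.
     */
    static long insertAll(PreparedStatement s, List<Author> authors, int chunkSize)
            throws SQLException {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        long rows = 0;
        int pending = 0;
        for (Author a : authors) {
            s.setInt(1, a.id);
            s.setString(2, a.firstName);
            s.setString(3, a.lastName);
            s.addBatch();
            if (++pending == chunkSize) {
                rows += countRows(s.executeBatch());
                pending = 0;
            }
        }
        if (pending > 0) {
            rows += countRows(s.executeBatch());
        }
        return rows;
    }

    /**
     * Sums the update counts returned by executeBatch().
     * Assumption: a SUCCESS_NO_INFO entry (statement succeeded, count unknown)
     * is counted as one row, which fits single-row INSERT statements only.
     */
    static long countRows(int[] result) {
        long rows = 0;
        for (int r : result) {
            rows += (r == Statement.SUCCESS_NO_INFO) ? 1 : r;
        }
        return rows;
    }
}
```

With the prepared statement from the example above, a call such as `BatchInsertSketch.insertAll(s, authors, 1000)` would replace the repeated `set...`/`addBatch()` lines. The chunk size of 1000 is an arbitrary illustration, not a recommendation from the benchmark.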

Or with jOOQ:

create.batch(
        insertInto(AUTHOR, ID, FIRST_NAME, LAST_NAME)
        .values((Integer) null, null, null))
      .bind(1, "Erich", "Gamma")
      .bind(2, "Richard", "Helm")
      .bind(3, "Ralph", "Johnson")
      .bind(4, "John", "Vlissides")
      .execute();

What you probably didn’t know, however, is how dramatic the improvement really is, that JDBC drivers like MySQL’s don’t really batch unless rewriteBatchedStatements=true is set, and that Derby, H2, and HSQLDB don’t seem to benefit much from batching. James Sutherland has assembled a very interesting benchmark on his Java Persistence Performance blog, which can be summarised as follows:

DATABASE      PERFORMANCE GAIN WHEN BATCHED
DB2           503%
Derby         7%
H2            20%
HSQLDB        25%
MySQL         5%
MySQL         332% (with rewriteBatchedStatements=true)
Oracle        503%
PostgreSQL    325%
SQL Server    325%

The above table shows the INSERT improvement of each database compared against itself, not databases compared against each other. Whatever the exact numbers, batching was never slower than not batching for the data set sizes used in the benchmark.
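
The rewriteBatchedStatements=true entry in the MySQL row is a connection property of MySQL's JDBC driver (Connector/J). As a rough sketch, it can be set either in the JDBC URL or through a `Properties` object passed to `DriverManager.getConnection`. The class and method names below, as well as any host, port, schema, and credentials passed in, are placeholders chosen for illustration.

```java
import java.util.Properties;

final class MySqlBatchConfigSketch {

    /** Builds a MySQL JDBC URL with batch rewriting enabled as a URL parameter. */
    static String jdbcUrl(String host, int port, String schema) {
        return "jdbc:mysql://" + host + ":" + port + "/" + schema
             + "?rewriteBatchedStatements=true";
    }

    /** Alternative: pass the same setting as a connection property. */
    static Properties connectionProperties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("rewriteBatchedStatements", "true");
        return props;
    }
}
```

Either form would then be handed to `DriverManager.getConnection(...)`. The JDBC batch code itself stays unchanged; only the driver configuration differs.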

See the full article for a more detailed interpretation of these benchmark results, as well as results for UPDATE statements:

http://java-persistence-performance.blogspot.ch/2013/05/batch-writing-and-dynamic-vs.html

Published at DZone with permission of Lukas Eder, DZone MVB. See the original article here.