Some APIs are set in stone: the JDK’s, for instance, or public APIs like the one between a database and a database client (e.g. JDBC).
This makes designing such APIs rather difficult, as a lot of thinking needs to be done before publishing them. Being defensive when designing the API is therefore a good choice.
One defensive API design strategy is to always work with parameter objects and return objects. We’ve already blogged about parameter objects before. Let’s have a look at an API that doesn’t use return objects, and why that’s so terrible:
Database Updatable Statements
When fetching data from a database, we get back a convenient API type, the JDBC ResultSet. Languages other than Java have similar types to model database results. While the ResultSet mainly models a set of tuples, it also contains various additional useful features, like ResultSet.getWarnings(), which are clever backdoors for passing arbitrary, additional information with the ResultSet.
What’s best about these result types is that they can be extended backwards-compatibly. New methods and features can be added to these result types, without modifying:
- Any existing contracts
- Any existing client code
The only thing that might break is JDBC drivers, but since Java 8, JDBC 4.2, and default methods, this is a thing of the past as well.
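To illustrate why default methods remove this last source of breakage, here is a minimal sketch. The types QueryResult and LegacyQueryResult are hypothetical stand-ins, not JDBC types; the point is only that a method added later with a default body does not break an implementation written before it existed.

```java
import java.util.List;

// Hypothetical result type, loosely standing in for something like ResultSet.
interface QueryResult {
    List<String> rows();

    // Imagine this was added in a later version of the API. Because it has
    // a default body, older implementations still compile and run.
    default List<String> warnings() {
        return List.of();
    }
}

// An "old" implementation (think: a driver) that knows nothing about warnings().
class LegacyQueryResult implements QueryResult {
    @Override
    public List<String> rows() {
        return List.of("row 1", "row 2");
    }
}

class DefaultMethodDemo {
    public static void main(String[] args) {
        QueryResult result = new LegacyQueryResult();
        System.out.println(result.rows());
        // Falls back to the default body declared in the interface.
        System.out.println(result.warnings());
    }
}
```

Client code that only calls rows() is unaffected as well, which is exactly the kind of compatible growth a return object allows.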
Things look quite different when calling an update statement in the database:
int count = stmt.executeUpdate();
A count value. That’s it? What about any trigger-generated information? What about warnings? (I know, they’re available from the statement, which was modified by the call.)
Interestingly enough, this count value being an int seems to have bothered some people long enough for the method to have been de-facto overloaded in JDBC 4.2:
long count = stmt.executeLargeUpdate();
I’m saying “de-facto overloaded” because it is conceptually an overload, but since Java doesn’t support overloading by return type, the name had to change as well. (Well, the JVM does support it, but the language doesn’t.)
When you read the Javadoc of the executeUpdate() method, you will notice that different states are encoded in this single primitive value:
Returns: either (1) the row count for SQL Data Manipulation Language (DML) statements or (2) 0 for SQL statements that return nothing
What’s more, there’s a similar method called getUpdateCount(), which encodes even more complex state into a single primitive:
the current result as an update count; -1 if the current result is a ResultSet object or there are no more results
And as if this wasn’t bad enough, a very peculiar workaround for the above limitation was implemented by the MySQL database, which encodes different states for UPSERT statements as such:
With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row and 2 if an existing row is updated. (From the MySQL documentation.)
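To see what this encoding asks of client code, here is a minimal sketch of decoding the single-row case quoted above. The names UpsertOutcome and UpsertDecoder are hypothetical; neither JDBC nor MySQL ships such a type, and any value other than the two quoted ones is simply treated as unexpected here.

```java
// Hypothetical names, used for illustration only.
enum UpsertOutcome { INSERTED, UPDATED, UNEXPECTED }

final class UpsertDecoder {
    private UpsertDecoder() {}

    // Interprets the int returned by executeUpdate() for a single-row
    // INSERT ... ON DUPLICATE KEY UPDATE, following the rule quoted above.
    static UpsertOutcome decodeSingleRow(int affectedRows) {
        switch (affectedRows) {
            case 1:
                return UpsertOutcome.INSERTED;
            case 2:
                return UpsertOutcome.UPDATED;
            default:
                // Anything else is not covered by the quoted rule.
                return UpsertOutcome.UNEXPECTED;
        }
    }
}
```

The decoding only works because the caller knows the statement touches one row. Since the quote describes a per-row value and the API hands back a single number, a total for several rows presumably cannot be decoded unambiguously.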
If Performance Doesn’t Matter, Always Return a Reference Type!
This is really bad. The call runs over the wire against a database. It is inherently slow. We wouldn’t lose anything if we had an UpdateResult data type as a result of executeUpdate(). A different example is String.indexOf(...) which encodes “not found” as -1 for performance reasons.
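A minimal sketch of what such an UpdateResult could look like follows. The type and the Updates.executeUpdate() wrapper are hypothetical, since nothing like them exists in JDBC; only PreparedStatement.executeUpdate() and Statement.getWarnings() are real JDBC calls.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLWarning;
import java.util.Optional;

// Hypothetical return type: nothing like this exists in JDBC.
final class UpdateResult {
    private final long rowCount;
    private final SQLWarning firstWarning; // null when there are no warnings

    UpdateResult(long rowCount, SQLWarning firstWarning) {
        this.rowCount = rowCount;
        this.firstWarning = firstWarning;
    }

    long rowCount() {
        return rowCount;
    }

    Optional<SQLWarning> firstWarning() {
        return Optional.ofNullable(firstWarning);
    }

    // Later versions could add accessors here (e.g. for trigger output)
    // without touching the signature of the method that returns this type.
}

final class Updates {
    private Updates() {}

    // Hypothetical wrapper that bundles what JDBC exposes separately today.
    static UpdateResult executeUpdate(PreparedStatement stmt) throws SQLException {
        int count = stmt.executeUpdate();
        return new UpdateResult(count, stmt.getWarnings());
    }
}
```

Compared to a bare int, the one extra allocation is negligible next to a network round trip, and the count could have been a long from the start without needing a second method name.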
The mistake doesn’t only happen in these old APIs that pre-date object-oriented programming. It is repeated in newer APIs in many applications, when the first thing that comes to mind as a useful method result is a primitive value (or worse: void).
If you’re writing a fluent API (like the Java 8 Stream API, or jOOQ), this will not be an issue, as the API always returns the type itself to allow users to chain method calls.
In other situations, the return type is very clear, because you’re not implementing any side-effectful operation. But if you do, please, think again whether you really want to return just a primitive. If you have to maintain the API for a long time, you might just regret it some years later.