Designing APIs on the NetBeans Platform (Part 0): The Metadata API
With this article I'm starting a new series of posts about the NetBeans Platform. It's about how I designed (and am still designing) a specific API of the blueMarine project that is completely self-contained (which makes it good teaching material) and was thought through in the context of a highly modular design. This series will run in parallel with the one about "Idioms for the NetBeans Platform" and, of course, it's complementary to it. I will deal with the NetBeans Platform, but many things could probably be reused outside of it. Also, we're talking about design, which is a general topic.
For the record, all the API sources can be checked out with:
svn co https://bluemarine.dev.java.net/svn/bluemarine/trunk/src/Metadata --username guest
The most up-to-date javadoc (automatically created by Hudson) is available at https://bluemarine.dev.java.net/nonav/javadoc/Metadata/it-tidalwave-metadata/index.html
The project is built under continuous integration (http://www.tidalwave.it/hudson/job/blueMarine%20Metadata%20-%20nightly%20build/) and has good test coverage (70+% at the moment), so it is ready for inclusion in other projects.
To keep this first post simple, today I'm just introducing the API and its specifications and describing the first class.
blueMarine is, among other things, a DAM - a Digital Asset Manager, which includes the capability of manipulating metadata. Manipulating means extracting them from files and optionally storing them in a database for fast queries. Please note that "optionally": the idea is that blueMarine is not only a desktop application, but also a set of modules that can be recombined and reused for other purposes (for instance, blueOcean is a server-side version of it); requirements can differ between scenarios and the database might be in or out. Also, the database could be a traditional RDBMS or "something else".
Also, there are many different kinds of data to manage. blueMarine started with photos (= rasterized still images), but the incubator contains support for PDF files and some types of movies; in the future more types will be added, such as sounds and possibly vector images. To keep the design general, the Metadata API (and indeed the whole of blueMarine) doesn't depend on a specific class, but just on the generic DataObject that comes from the NetBeans Platform (DataSystems API). A DataObject represents a datum described by a file (or a set of files) with an associated MIME type: it's exactly the kind of abstraction I need.
Given the wide range of media types that can be supported, we have many different metadata types. For instance, photos have EXIF, TIFF, IPTC, plus some proprietary stuff (metadata for the "camera raw" formats). PDF files and movies have different metadata. Since the thing must be modular, in some usages I might want to support all the metadata types, in others only a subset. This means that the support for a specific kind of metadata must be modular too: it needs to be implemented in a self-contained module that can be added to the project or left out. Since beans binding is a good thing, the Metadata API requires that each metadata item is modeled by a fully featured JavaBean, that is a class with getters, setters and property change support. Furthermore, the Metadata API should accommodate existing implementations from third-party libraries (e.g. an EXIF class), so it must be possible to automatically add property change support if the original library lacks it.
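To make the "fully featured JavaBean" requirement concrete, here is a minimal sketch of a bean with a bound property, written with the standard java.beans classes. ExposureInfo and its "iso" property are hypothetical names made up for this illustration; they are not part of the Metadata API or of Mistral.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

// Hypothetical metadata item, for illustration only (not a real blueMarine class).
class ExposureInfo {
    public static final String PROP_ISO = "iso";

    private final PropertyChangeSupport pcs = new PropertyChangeSupport(this);
    private int iso;

    public int getIso() {
        return iso;
    }

    public void setIso(int iso) {
        int oldIso = this.iso;
        this.iso = iso;
        // PropertyChangeSupport fires no event when old and new values are equal.
        pcs.firePropertyChange(PROP_ISO, oldIso, iso);
    }

    public void addPropertyChangeListener(PropertyChangeListener listener) {
        pcs.addPropertyChangeListener(listener);
    }

    public void removePropertyChangeListener(PropertyChangeListener listener) {
        pcs.removePropertyChangeListener(listener);
    }
}

class ExposureInfoDemo {
    public static void main(String[] args) {
        ExposureInfo info = new ExposureInfo();
        List<String> events = new ArrayList<>();
        info.addPropertyChangeListener(e ->
            events.add(e.getPropertyName() + ": " + e.getOldValue() + " -> " + e.getNewValue()));
        info.setIso(400);
        info.setIso(400); // same value again: no second event
        System.out.println(events); // prints [iso: 0 -> 400]
    }
}
```

A binding framework only needs the add/removePropertyChangeListener pair and the fired events, which is why a library class lacking them has to be decorated before it can be bound.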
This is a rough but decent picture of the requirements; we can now briefly introduce the first classes of the API.
The main module is the "Metadata" one (it.tidalwave.metadata) - it contains just interfaces and mostly abstract classes, and it's the starting point for everything related to metadata. The most important thing is the Metadata interface - an instance of it must be associated with every DataObject instance, so to start manipulating the metadata of an object you write:
DataObject dataObject = ...
Metadata metadata = dataObject.getLookup().lookup(Metadata.class);
Metadata acts as a global factory / storage for individual metadata items, which can be retrieved with code such as:
EXIF exif = metadata.findOrCreateItem(EXIF.class).get();
IPTC iptc = metadata.findOrCreateItem(IPTC.class).get();
Classes such as EXIF or IPTC (called "metadata items") are not part of the Metadata API, but come directly from a third-party library. For instance, EXIF and IPTC come from Mistral, which is a plain Java SE project of mine. The Metadata API guarantees that metadata items support bound properties, so exif and iptc can be used directly with Beans Binding.
If the Metadata API is installed in your project, the lookup is guaranteed to always return a non-null Metadata object, even if there is no metadata for that DataObject; the same holds true for any metadata item (that is, the exif and iptc instances in the above example are always guaranteed to be non-null). This approach is known as the Null Object or Special Case pattern (they aren't the same thing, but serve the same purpose) and I've adopted it as a best practice for the whole API, since it makes it possible to get rid of tons of "if (something != null)" tests: in a few words, the Metadata API never returns null, but rather a special object instance that provides a "null" behaviour.
The findOrCreateItem() method supports a number of optional parameters; for instance, it is possible to specify where the metadata should be retrieved from:
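As a generic illustration of the Null Object idea (not the actual blueMarine implementation), the sketch below uses made-up types: Caption, RealCaption, MissingCaption and CaptionStore are all hypothetical. The store hands out a shared "missing" instance instead of null, so callers need no null checks.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical item type, for illustration only.
interface Caption {
    boolean isAvailable();
    String getText();
}

final class RealCaption implements Caption {
    private final String text;

    RealCaption(String text) {
        this.text = text;
    }

    @Override public boolean isAvailable() { return true; }
    @Override public String getText() { return text; }
}

// The Null Object: a valid instance that provides neutral behaviour.
final class MissingCaption implements Caption {
    static final Caption INSTANCE = new MissingCaption();

    private MissingCaption() {
    }

    @Override public boolean isAvailable() { return false; }
    @Override public String getText() { return ""; }
}

final class CaptionStore {
    private final Map<String, Caption> captions = new HashMap<>();

    void put(String fileName, String text) {
        captions.put(fileName, new RealCaption(text));
    }

    // Never returns null: unknown files get the shared MissingCaption instance.
    Caption find(String fileName) {
        return captions.getOrDefault(fileName, MissingCaption.INSTANCE);
    }
}

class CaptionDemo {
    public static void main(String[] args) {
        CaptionStore store = new CaptionStore();
        store.put("sunset.jpg", "Sunset over the bay");
        // No null check is needed for either call.
        System.out.println(store.find("sunset.jpg").getText().length());
        System.out.println(store.find("unknown.jpg").getText().length());
    }
}
```

The price of this choice is that "no data" must be asked for explicitly (here via isAvailable()) when the caller actually cares about the difference.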
import static it.tidalwave.metadata.Metadata.StorageType.*;
EXIF exif = metadata.findOrCreateItem(EXIF.class, EXTERNAL).get();
IPTC iptc = metadata.findOrCreateItem(IPTC.class, INTERNAL).get();
The Metadata API doesn't strictly define what "EXTERNAL" and "INTERNAL" mean, leaving that responsibility to concrete implementations; in general, "EXTERNAL" would mean an external file (e.g. you want to extract the data from a photo), while "INTERNAL" would mean an internal database (which in this case would offer the capability to retrieve the data even when the file is not available, for instance because it's on an off-line disk). There are some other optional parameters that will be discussed in the next part. As you can see, they are not implemented with the classic Fluent Interface pattern (which I appreciate in other cases, but which is too verbose for simple things like this), but with a different approach that I described on my blog.
A variant of the find method supports multiple instances (there can be multiple instances of the same metadata item in a single file, and you could ask for both EXTERNAL and INTERNAL items - this is actually what happens if you don't specify a StorageType):
List<EXIF> exifs = metadata.findOrCreateItems(EXIF.class).get();
List<IPTC> iptcs = metadata.findOrCreateItems(IPTC.class).get();
It is possible to dynamically query the supported metadata items by calling:
Set<Class<?>> itemClasses = metadata.getItemClasses();
For instance, itemClasses might contain classes such as EXIF.class or IPTC.class. In general, you get a different set of classes depending on the type of DataObject (photo, movie, etc.) and on the set of implementation modules on the classpath. You can query whether a metadata item is available without retrieving it by calling:
if (metadata.isItemAvailable(EXIF.class, EXTERNAL)) { ... }
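The way the result of getItemClasses() depends on the kind of object and on the installed modules can be pictured with a toy registry. This is only a sketch under my own assumptions: ToyItemRegistry, its register() method, the MIME-type keys and the Toy* item classes are invented for the example and say nothing about how the Metadata API is really implemented.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins for metadata item classes.
final class ToyExif {}
final class ToyIptc {}
final class ToyPdfInfo {}

// Toy registry: each installed module registers the item classes
// it can handle for a given MIME type (an assumption of this sketch).
final class ToyItemRegistry {
    private final Map<String, Set<Class<?>>> classesByMimeType = new HashMap<>();

    void register(String mimeType, Class<?> itemClass) {
        classesByMimeType
            .computeIfAbsent(mimeType, key -> new LinkedHashSet<>())
            .add(itemClass);
    }

    // Returns an empty set, never null, for unsupported MIME types.
    Set<Class<?>> getItemClasses(String mimeType) {
        Set<Class<?>> classes = classesByMimeType.get(mimeType);
        return (classes == null)
            ? Collections.<Class<?>>emptySet()
            : Collections.unmodifiableSet(classes);
    }
}

class ToyItemRegistryDemo {
    public static void main(String[] args) {
        ToyItemRegistry registry = new ToyItemRegistry();
        // A "photo" module and a "PDF" module contribute their item classes.
        registry.register("image/jpeg", ToyExif.class);
        registry.register("image/jpeg", ToyIptc.class);
        registry.register("application/pdf", ToyPdfInfo.class);

        System.out.println(registry.getItemClasses("image/jpeg").size());
        System.out.println(registry.getItemClasses("video/mp4").isEmpty());
    }
}
```

Leaving out the "PDF" module in this toy model simply means that nothing registers ToyPdfInfo, so client code that iterates over the returned set adapts without changes.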
And that seems enough for today. Next time I'll introduce some more advanced ways to manipulate metadata instances, such as support for multiple items and modification/creation of items.