I've actually experienced this during blueMarine development in the hard way, something that now I can't afford any longer as I get closer to a real release. My beta testers have been pretty clear about their performance expectations of some features and one of the real, "political" disadvantages of Java is that the persisting myth "Java is slow", while false, can give you additional problems if you do something wrong in this area. Just be inefficient once in your code, and somebody will soon poo-poo your desktop application since it's "Java crap" (that is, even if the error is yours, and you can actually fix it, you risk a perpetual refusal of your application because of the technology it is made with).
So I've started doing some periodical, manual measurements of performance in the most important areas of blueMarine. So far I'm running some special load tests, getting the numbers out of it and putting them into an OpenOffice spreadsheet where I can track the progress, see trends and eventually make comparisons between figures etc... (the very first example has been talked about on my blog a couple of months ago).
Now, in perspective, I see the usual problem of "continuous" tasks: it's taking longer and longer, and the more performance tests I add, the worse. So this must be automated in some way, and possibly integrated with Hudson.
Of the many problems about continuous profiling, the very first is how to collect data. Now, NetBeans has got an excellent profiler (apart from a few bugs that are causing me troubles in my specific RCP scenario), but in my opinion this is a different thing:
- The Java profiler is a specific development tool, while I want to be possible that some profiling data can be collected at any time with the production code, even by some (beta) tester. While it's true that the Java 6 profiler is much easier to set up, it's still a JDK tool that would scare a casual user; what I want is a specific UI option pane where the user can click on a checkbox and possibly select the functional area to profile.
- The Java profiler is not "semantic aware". What I mean is that it can tell me how many times and how long a certain method—say loadImage()—has been executed, while I need e.g. to segment the collected data in function of the parameters (for instance, the image type and so on). Please read the example below for a better clarification of the point.
- The Java profiler collects too many data for a "continuous profiling" approach, and this of course also impacts on the performance. Of course, with the Java profiler it's perfectly possible to fine tune the profiling points, but again this is a task for a programmer, not for a (beta) user.
Of the previous points, I'd say that the stronger is the number #2, considering that improvements in the Java Profiler may weaken the #1 and #3 as time goes on.
For these reasons I'm thinking of a very simple API. Here it is an example: I have a class which is responsible for importing a piece of metadata, represented by its class, from an image; and the following is the decorator which triggers the stop watches:
public class ImageBridgeStopWatchDecorator implements ImageBridge
private final ImageBridgeImpl delegate = new ImageBridgeImpl();
public <T> T readMetadataItem (final File file, final Class<T> itemClass)
final StopWatch stopWatch = StopWatch.create(delegate.getClass(), String.format("read/%s/%s", itemClass.getName(), extension(file)));
return delegate.readMetadataItem(file, itemClass);
As you can see, a StopWatch is just created (and automatically started) and defined with a name, which is composed by the class it refers to and a hierarchical name, part of which is context dependant (the extension of the file name in this case). This is an example of the data that can be collected:
STATS: it/tidalwave/bluemarine/metadata/importer/MetadataImporter/process/ALL [ 155 samples]: acc: 1.38/ 0.01/ 0.06/ 0.01 msec, ela: 96549.59/ 225.20/ 2781.93/ 622.90 msec (val/min/max/avg)
STATS: it/tidalwave/bluemarine/metadata/importer/MetadataImporter/process/image/crw [ 8 samples]: acc: 3151.33/ 102.31/ 1134.09/ 393.92 msec, ela: 4396.07/ 225.18/ 1336.43/ 549.51 msec (val/min/max/avg)
STATS: it/tidalwave/bluemarine/metadata/importer/MetadataImporter/process/image/dng [ 10 samples]: acc: 6413.12/ 345.50/ 1541.62/ 641.31 msec, ela: 18840.89/ 996.86/ 2781.86/ 1884.09 msec (val/min/max/avg)
STATS: it/tidalwave/bluemarine/metadata/importer/MetadataImporter/process/image/jpeg [ 63 samples]: acc: 18608.76/ 104.39/ 583.07/ 295.38 msec, ela: 30141.81/ 309.16/ 915.03/ 478.44 msec (val/min/max/avg)
STATS: it/tidalwave/bluemarine/metadata/importer/MetadataImporter/process/image/mrw [ 10 samples]: acc: 3076.36/ 272.67/ 419.63/ 307.64 msec, ela: 4089.75/ 370.50/ 518.07/ 408.97 msec (val/min/max/avg)
STATS: it/tidalwave/bluemarine/metadata/importer/MetadataImporter/process/image/nef [ 59 samples]: acc: 26909.63/ 293.88/ 759.67/ 456.10 msec, ela: 37054.66/ 440.88/ 1118.92/ 628.05 msec (val/min/max/avg)
STATS: it/tidalwave/bluemarine/metadata/importer/MetadataImporter/process/image/srf [ 2 samples]: acc: 709.45/ 247.23/ 462.23/ 354.73 msec, ela: 1120.44/ 445.31/ 675.14/ 560.22 msec (val/min/max/avg)
STATS: it/tidalwave/bluemarine/metadata/importer/MetadataImporter/process/image/tiff [ 3 samples]: acc: 355.54/ 109.47/ 134.61/ 118.51 msec, ela: 902.77/ 258.22/ 333.88/ 300.92 msec (val/min/max/avg)
As you can see, the data has been segmented by the MIME type of the image, which is very important for me since different codecs are triggered and very different performance are shown (for instance, look at the awful figures about the DNG format). The StopWatch collects the number of invocations, the elapsed and accumulated times, as well as max/min and average. The precision is nanoseconds (by means of System.nanoTime()). At the moment the data are just collected in memory and dumped at the end of the test, but there's also a specific API to access them as JavaBeans, so I can put them in test assertions. If a function all of a sudden becomes slower, a test will fail (of course there are problems here, such as being sure that the CI server is not overloaded by other tasks, otherwise collected times will be artificially too long).
Enhancements could be to make the data available through JMX, so a JConsole could easily track them in real time, if needed. So far I've seen that the impact of StopWatches is very small in my tests, and in any case they could be disabled by configuration (in this case a dummy StopWatch class would be returned every time by the create() method).
What do you think? I'd like to hear from you about this, since continuous profiling is something new for me. In particular, I'd like to know whether there is an existing simple StopWatch API with the features described in the example above, before I go on and develop my own one. Also, any suggestion and best practice would be appreciated.