Asynchronous Testing of Swing Applications

In this article I'll share some experiences with specific issues I ran into while testing Swing applications (they arose during blueMarine development, but the article is focused on Swing, not on the NetBeans Platform). I'd like to hear your opinion about a solution I've found and, before I write more code for it, to learn whether you know of an existing framework that works in a similar way.

As many have said many times, testing is one of the most important activities for the success of a project. This is probably even more true for desktop applications than for regular web applications, since they have more sophisticated behaviours and interactivity with the user. Indeed, many Web 2.0 applications are getting rich in interactivity too, but most of them don't have sophisticated asynchronous models, either because of limitations of the technology or because they are kept simple on purpose. After all, the Web is still modelled on network transactions, even if you're using AJAX; this means that you still have plenty of probing points for integration testing along the communication channel between the client and the server.

Consider instead a Swing-based desktop application with the following behaviour:

  1. you have a file system explorer and you select folders;
  2. upon each selection, the application scans the folders (potentially in a recursive fashion), searching for the media they contain;
  3. when the scan has completed, some thumbnails appear in a viewer.

Of course, this must be done with background threads so that the application stays responsive, that is the user is able to keep on selecting other folders while the previous scan has not completed yet; in this case, the still-running thread is cancelled.
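Just to fix the idea, here is a minimal sketch of this pattern with SwingWorker; MediaScanner and ThumbnailViewer are hypothetical collaborators invented for the example, not actual blueMarine classes:

import java.io.File;
import java.util.List;
import javax.swing.SwingWorker;

public class FolderSelectionController
{
    // Hypothetical collaborators, sketched as minimal interfaces.
    public interface MediaScanner { List<File> findMedia(File folder, boolean recursive); }
    public interface ThumbnailViewer { void setFiles(List<File> files); }

    private final MediaScanner scanner;
    private final ThumbnailViewer viewer;
    private SwingWorker<List<File>, Void> currentScan;

    public FolderSelectionController (final MediaScanner scanner, final ThumbnailViewer viewer)
    {
        this.scanner = scanner;
        this.viewer = viewer;
    }

    // Invoked on the EDT whenever the user selects a folder.
    public void onFolderSelected (final File folder, final boolean recursive)
    {
        if (currentScan != null)
        {
            currentScan.cancel(true); // cancel the still-running scan, if any
        }

        currentScan = new SwingWorker<List<File>, Void>()
        {
            @Override
            protected List<File> doInBackground()
            {
                return scanner.findMedia(folder, recursive); // background thread
            }

            @Override
            protected void done()
            {
                try
                {
                    viewer.setFiles(get()); // back on the EDT
                }
                catch (Exception e)
                {
                    // cancelled or failed: leave the viewer unchanged
                }
            }
        };

        currentScan.execute();
    }
}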

Now, let's suppose we want to write a high-level integration test for this scenario. The probing point is just the user interface: you perform a selection and want to assert that "some time later" another part of the UI has been updated. We could think of a test code sketch like this:

package it.tidalwave.bluemarine.filesystemexplorer.test;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class FolderSelectionTest extends AutomatedTest
{
    private static final int THUMBNAIL_SELECTION_TIMEOUT = 4000;

    private FileSystemExplorerTestHelper f;
    private ThumbnailViewerTestHelper t;

    ...

    @Before
    public void prepare()
    {
        f = new FileSystemExplorerTestHelper();
        t = new ThumbnailViewerTestHelper();
    }

    @After
    public void shutDown()
    {
        f.dispose();
        t.dispose();
    }

    @Test
    public void run()
        throws Exception
    {
        activate(f.fileSystemExplorerTopComponent);
        f.resetSelection();

        selectNode(f.upperExplorer, f.view.findUpperNodeByPath(BaseTestSet.getPath()));
        select(f.cbSubfolders, true);
        // WAIT FOR THE COMPUTATION TO COMPLETE

        assertActivated(f.fileSystemExplorerTopComponent.getClass().getName());
        assertOpened(t.thumbnailViewerTopComponent.getClass().getName());
        ThumbnailViewerTestHelper.assertShownFiles(t.thumbnailListView, BaseTestSet.getPath(), BaseTestSet.getAllFiles());
        ...
    }

    ...
}

This is actually real code from blueMarine's tests, and I think it's pretty readable (at least if you know the basic concepts of the NetBeans Platform). It activates a TopComponent, resets the folder selection, selects a node in an explorer, selects a checkbox (cbSubfolders is a checkbox that enables recursive scanning), and then asserts that the TopComponent for selecting files is still active (i.e. it has the focus), that the thumbnail viewer TopComponent is opened, and that its view is populated with all the files of the test set. The TestHelpers are facility classes that cooperate with some parts of the UI classes, offering references to the relevant UI components as well as specific assertion methods.
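To give an idea of what such an assertion method could look like, here is a much-simplified, purely illustrative sketch (it assumes the list model elements are the rendered Files and omits the base path argument, which is not necessarily how blueMarine works):

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import java.io.File;
import java.util.Collection;
import javax.swing.JList;
import javax.swing.ListModel;

public class ThumbnailViewerAssertions
{
    // Domain-specific assertion: verifies that the thumbnail view renders
    // exactly the expected files.
    public static void assertShownFiles (final JList thumbnailListView,
                                         final Collection<File> expectedFiles)
    {
        final ListModel model = thumbnailListView.getModel();
        assertEquals("number of thumbnails", expectedFiles.size(), model.getSize());

        for (int i = 0; i < model.getSize(); i++)
        {
            final File shownFile = (File)model.getElementAt(i); // assumption: the model holds Files
            assertTrue("unexpected file: " + shownFile, expectedFiles.contains(shownFile));
        }
    }
}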

The various methods in the test body are statically imported from a utility class and perform the proper UI manipulation in the Event Dispatch Thread (EDT) - this is a common solution in many Swing-related testing frameworks.
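For instance, a select() utility along these lines would do the trick (a hypothetical sketch, not the actual blueMarine code):

import java.awt.EventQueue;
import javax.swing.AbstractButton;

public final class UITestUtils
{
    private UITestUtils()
    {
    }

    // Selects or deselects a checkbox (or any toggle button) on the EDT,
    // returning only after the event has been processed.
    public static void select (final AbstractButton button, final boolean selected)
        throws Exception
    {
        EventQueue.invokeAndWait(new Runnable()
        {
            public void run()
            {
                if (button.isSelected() != selected)
                {
                    button.doClick(); // fires the same events as a real user click
                }
            }
        });
    }
}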

blueMarine at the moment doesn't use any of the existing frameworks, such as Abbot and Costello or Jemmy/Jelly. This is for historical reasons (the initial tests were written quite a few years ago, before the conversion to NetBeans, even though most of them have been "lost" in the conversion), and the tests will eventually converge to the standard framework used by NetBeans. But this is not relevant to the problem I'm talking about.

The tough point is that "WAIT FOR THE COMPUTATION TO COMPLETE". How to implement it? A simple solution could be a delay of an appropriate length, but this is really a bad idea: first, it will unnecessarily slow down tests; second, it will prevent you from measuring performance while testing; third, you'll discover that, sooner or later, a specific run of the test will randomly take much longer than expected (e.g. because the operating system is swapping memory) and the test will fail.

This had been troubling me until today, especially if you consider that I've set up this kind of test to be executed by users (I call these tests "Acceptance Tests"). This is a very important point, especially for an open source project that wants to take advantage of its community, and specifically for finding problems in a context that you can't reproduce (for instance, the specific computer of a troubled user). A subtle point is that my users aren't computer engineers, but end users: you can't ask them to download and compile code and run tests with Ant. In fact, I made the tests available as plugins that can be installed into blueMarine by means of an update center. Nor can you expect users to get into technical details so they can understand whether a test failed because of a spike or because of a real bug. Indeed, what I expect is that users just press a couple of buttons and either tell me "all tests passed" or report a failure by just sending a log file (a thing that could even happen automatically).

Add to this another point: once you've added an option for repeating the test suite a potentially high number of times, you are able to run load tests (for instance, to find problems in the long run, such as memory leaks and so on).

[Image: The Automated Test pane in blueMarine]

This means that you can't tolerate false positives in tests, and that the wait must be as short as possible and, at the same time, effective.

As I've said, until today my Acceptance Tests were plagued by spikes and were not suitable for execution by non-technical people. I've spent days and days trying to define a good way to guess whether a certain asynchronous process has completed, but trouble arose even in a scenario as apparently simple as the three steps I introduced at the beginning of this article.

The idea I pursued until yesterday was to have a very small facility for detecting events. Look at the following code variation, and in particular at the use of the Waitable object:

    @Test
    public void run()
        throws Exception
    {
        activate(f.fileSystemExplorerTopComponent);
        f.resetSelection();

        selectNode(f.upperExplorer, f.view.findUpperNodeByPath(BaseTestSet.getPath()));
        final Waitable selectionChanged = t.thumbnailViewSelectionChanged();
        select(f.cbSubfolders, true);
        selectionChanged.waitForOccurrences(2, DEFAULT_TIMEOUT);

        assertActivated(f.fileSystemExplorerTopComponent.getClass().getName());
        assertOpened(t.thumbnailViewerTopComponent.getClass().getName());
        ThumbnailViewerTestHelper.assertShownFiles(t.thumbnailListView, BaseTestSet.getPath(), BaseTestSet.getAllFiles());
        ...
    }

The Waitable encapsulates the logic for detecting that a certain event has been triggered. TestHelpers provide factory methods for different Waitables, representing a number of interesting events that you might want to wait for. In this specific case, the selectionChanged Waitable listens for changes in the view component that renders the thumbnails. The timeout here is quite large and is useful only to prevent the test from lingering indefinitely in case something goes wrong.

So far, it's straightforward. The hard part I faced is that blueMarine might generate multiple events even in simple cases. For instance, every selection usually ends up in two events: first, all views are notified with a special empty result that stands for "Please wait, search in progress" (and at the same time immediately clears the results of a previous search); later, the real result follows. That's why I added a parameter for specifying that you might want to wait for multiple occurrences of the same event (2 in this case). Every kind of event has a progressive counter; when you invoke the Waitable factory method, the current value of the counter is copied into the Waitable (e.g. 10), and when waitForOccurrences(2, ...) is called, it waits for the counter to reach 10 + 2 = 12. So you're able to wait for a certain number of occurrences of an event starting from a known point.
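A counter-based Waitable along these lines could be sketched as follows (a simplification of the idea, assuming that the instrumented application code calls eventOccurred() whenever an interesting event fires; it is not the actual blueMarine implementation):

import java.util.concurrent.TimeoutException;

public class EventCounter
{
    private int count;

    // Called by the instrumented application code when the event fires.
    public synchronized void eventOccurred()
    {
        count++;
        notifyAll();
    }

    // Factory method: the Waitable remembers the current counter value.
    public synchronized Waitable createWaitable()
    {
        return new Waitable(this, count);
    }

    private synchronized void waitForCount (final int target, final long timeout)
        throws InterruptedException, TimeoutException
    {
        final long deadline = System.currentTimeMillis() + timeout;

        while (count < target)
        {
            final long remaining = deadline - System.currentTimeMillis();

            if (remaining <= 0)
            {
                throw new TimeoutException("counter at " + count + ", expected " + target);
            }

            wait(remaining);
        }
    }

    public static class Waitable
    {
        private final EventCounter source;
        private final int baseline; // counter value when the Waitable was created

        private Waitable (final EventCounter source, final int baseline)
        {
            this.source = source;
            this.baseline = baseline;
        }

        // E.g. with a baseline of 10, waitForOccurrences(2, t) blocks until
        // the counter reaches 12, or the timeout expires.
        public void waitForOccurrences (final int occurrences, final long timeout)
            throws InterruptedException, TimeoutException
        {
            source.waitForCount(baseline + occurrences, timeout);
        }
    }
}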

But this solution was not enough. Consider the above sample: both the selectNode(...) and select(f.cbSubfolders, true) operations *might* trigger a scan, if they change the current state of the UI (i.e. the specified node was not already selected, or the checkbox was not already checked); they won't in the opposite case. Since a test must be reproducible, at the very beginning of the test you are sure about what the initial state is; you aren't any more a few steps after the beginning. Please bear in mind that these are NOT unit tests, which are usually pretty simple, but integration tests: they mimic a real user interaction with the application, and they can be complex and pretty long.

Having to consider preconditions every time you code a wait for the completion of a process proved extremely frustrating, as even minor changes caused the code to break. Furthermore, the advanced asynchronicity of the application makes things even worse: sometimes a scan triggered by previous calls starts with a great delay, possibly *after* the Waitable has been created. This of course breaks the mechanism of counters. Furthermore, sometimes pending operations are cancelled because they are replaced by others; other times they are allowed to finish before they are cancelled. Summing up, the number of occurrences of any event can change in very complex and unexpected ways. Strange things usually happen during spikes, for instance when the computer is busy with other tasks, thus in very unpredictable scenarios.

I've spent a lot of time trying to write more effective syncing points: to this end, I added more and more probes in the code and eventually used more complex conditionals (for instance: ignore events when this and that happen). Not only did this prove insufficient for the more complex sequences, but it started polluting the implementation code of the application in an unbearable way. Furthermore, this approach coupled the test code more and more to the implementation, making tests even more fragile.

When I'm spending too much time on a problem and I'm not satisfied with the elegance of the solution, I think it's high time to stop and think of something different. So I reverted all the latest changes and looked in a different direction. When you feel you're getting into a complexity trap, a good approach is to think as close to the real world as you can. What's happening in my scenario? Well:

1a. I press a button
1b. a sequence of things happens
1c. a result is made available
2a. I press another button
2b. another sequence of things happens
2c. another result is made available

and the 1* and 2* sequences can possibly be intermixed like this:

1a. I press a button
1b. a sequence of things happens
2a. I press another button
2b. another sequence of things happens
1c. a result is made available
2c. another result is made available

Still, looking at the above pseudo-code, we are able to track what's happening thanks to the 1* and 2* identifiers. That is, we have "tagged" the different operation sequences. Since any operation is carried out by a sequence of threads, why don't we just tag the threads? Bingo.

Look at the following code:

    @Test
    public void run()
        throws Exception
    {
        activate(f.fileSystemExplorerTopComponent);
        f.resetSelection();

        assertActivated(f.fileSystemExplorerTopComponent.getClass().getName());
        select(f.cbSubfolders, false);
        final Tag tag1 = selectNode(f.upperExplorer, f.view.findUpperNodeByPath(BaseTestSet.getPath()));
        t.thumbnailSelectionChanged().waitForNextOccurrence(tag1, THUMBNAIL_SELECTION_TIMEOUT);
        delay(200); // give the AWT thread time to work, so changes reach the UI

        assertActivated(f.fileSystemExplorerTopComponent.getClass().getName());
        assertOpened(t.thumbnailViewerTopComponent.getClass().getName());
        ...
    }

Now the point is that every method that initiates an interaction with the UI creates a new instance of Tag. The tag is attached to the EDT (by means of a ThreadLocal) before calling Swing, and propagated to related threads (for instance, typically the EDT starts a background thread by means of a java.util.concurrent.Executor and, when things are ready, the result is passed again to the EDT for refreshing the UI). This means that instead of waiting for countable event occurrences, we wait for tagged event occurrences; in the last code example, waitForNextOccurrence(tag1, ...) blocks until the specified event happens in a properly tagged thread.
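The core of the mechanism could be sketched like this (again, an illustrative simplification, assuming that background tasks are submitted through a decorated Executor; it is not the actual blueMarine code):

import java.util.concurrent.Executor;

public final class Tag
{
    private static final ThreadLocal<Tag> CURRENT = new ThreadLocal<Tag>();

    // Creates a new tag and attaches it to the calling thread (e.g. the EDT).
    public static Tag create()
    {
        final Tag tag = new Tag();
        CURRENT.set(tag);
        return tag;
    }

    public static Tag getCurrent()
    {
        return CURRENT.get();
    }

    public static void setCurrent (final Tag tag)
    {
        CURRENT.set(tag);
    }

    // Decorates an Executor so that each task runs with the tag that was
    // attached to the submitting thread.
    public static Executor propagating (final Executor delegate)
    {
        return new Executor()
        {
            public void execute (final Runnable task)
            {
                final Tag tag = getCurrent(); // captured on the submitting thread

                delegate.execute(new Runnable()
                {
                    public void run()
                    {
                        final Tag previous = getCurrent();
                        setCurrent(tag); // re-attach on the worker thread

                        try
                        {
                            task.run();
                        }
                        finally
                        {
                            setCurrent(previous);
                        }
                    }
                });
            }
        };
    }
}

A tag-aware Waitable would then record Tag.getCurrent() when the event fires, and waitForNextOccurrence(tag, ...) would only match events carrying that tag.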

The tricky part is how to propagate the tag from thread to thread, but in the end I made it work without too much hassle. The most annoying part is the custom code needed in place of EventQueue.invokeLater(); but I think that the AWT event queue is customizable, so I could make tagging work with the standard method calls.
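Indeed, EventQueue.push() allows replacing the system queue with a subclass; an untested sketch of how tagging could hook into the standard calls (reusing the hypothetical Tag class above) might be:

import java.awt.AWTEvent;
import java.awt.EventQueue;
import java.awt.Toolkit;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

public class TaggingEventQueue extends EventQueue
{
    // Remembers which tag was current when each event was posted.
    private final Map<AWTEvent, Tag> tags =
            Collections.synchronizedMap(new WeakHashMap<AWTEvent, Tag>());

    public static void install()
    {
        Toolkit.getDefaultToolkit().getSystemEventQueue().push(new TaggingEventQueue());
    }

    @Override
    public void postEvent (final AWTEvent event)
    {
        final Tag tag = Tag.getCurrent(); // the tag of the posting thread

        if (tag != null)
        {
            tags.put(event, tag);
        }

        super.postEvent(event);
    }

    @Override
    protected void dispatchEvent (final AWTEvent event)
    {
        final Tag previous = Tag.getCurrent();
        final Tag tag = tags.remove(event);

        if (tag != null)
        {
            Tag.setCurrent(tag); // re-attach for the duration of the dispatch
        }

        try
        {
            super.dispatchEvent(event);
        }
        finally
        {
            Tag.setCurrent(previous);
        }
    }
}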

Enough for today. First, I'd like to know what you think about it; second, whether you know of an existing framework that works in this way. If one doesn't exist, I'll show you the underlying code next time.
