Software is still mostly stuck in the simulation stage. The original Agile book, Kent Beck's Extreme Programming, was about the making of a payroll program. That is kind of the perfect example of a pure simulation: the application is merely aping a process that was established to distribute compensation. It stands to reason that simulations live and die by how well they model the thing they are simulating. Most people never talk about the models they are working on in their projects, but with the advent of free software, we have seen that modeling is usually either not done or done badly, and when applications start to fail to meet expectations, it's very rare that the solution is to take a fresh look at the model and see where it's falling short. Case in point: testing.
Testing is pretty devoid of a model. Most people see it only as a stage in the build model, but treating it as an amorphous jumble of vitamins has led to what we have now: most projects either have no tests, or a bunch of dead tests, or a bundle of tests so vast that starting the build fills people on the team with dread. Some of the worst projects I have seen have been the ones that, on the surface, seemed so adamant about their dedication to testing. Here's a clue: if you sound like you just logged in from the testing storm troopers, you had better have a build that isn't a complete crapshoot and tests that aren't stuffed to the gills with things like this:
assert that I called 'this'
Otherwise, you might be a test poser.
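To make the complaint concrete, here is a minimal sketch of the difference between asserting that a call happened and asserting something about the outcome. The Ledger interface, the RecordingLedger fake, and runPayroll are all hypothetical names invented for illustration, and the hand-rolled fake stands in for whatever mock library a real project would use.

```java
import java.util.ArrayList;
import java.util.List;

public class InteractionVsOutcome {

    /** Hypothetical collaborator; the name is an assumption for illustration. */
    interface Ledger {
        void record(String employee, long cents);
    }

    /** Hand-rolled recording fake standing in for a mock library. */
    static class RecordingLedger implements Ledger {
        final List<String> calls = new ArrayList<>();
        long total = 0;

        @Override
        public void record(String employee, long cents) {
            calls.add("record(" + employee + "," + cents + ")");
            total += cents;
        }
    }

    /** Hypothetical code under test: pays each employee the same amount. */
    static void runPayroll(Ledger ledger, List<String> employees, long cents) {
        for (String employee : employees) {
            ledger.record(employee, cents);
        }
    }

    public static void main(String[] args) {
        RecordingLedger ledger = new RecordingLedger();
        runPayroll(ledger, List.of("ann", "bob"), 1000);

        // Interaction-style check: only shows that a particular call was made.
        boolean calledThis = ledger.calls.contains("record(ann,1000)");

        // Outcome-style check: says something about the result of the run.
        boolean totalIsRight = ledger.total == 2000;

        System.out.println("called=" + calledThis + " total=" + totalIsRight);
    }
}
```

The first check would keep passing even if the payroll logic paid the wrong total through extra calls, which is roughly why a suite made only of such checks proves so little.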
Lately, I have had my head in one iOS project and a Java one. Both have a lot of tests, but the Java one has descended into test chaos: many of the tests call out to URLs that the application depends on. Sure, we have pulled some of the REST services and the HTML they process into local files and written tests against them, but there is still the need at some point to know that the APIs you are depending upon are still there and functioning in the expected ways.
So I have two additional places to go with this:
What a Test Model Might Look Like
So imagine a project with outside content sources that we regularly sync with. We should probably make components (one jar) out of all the things doing the various syncing work that are not domain-specific. Then we should have a domain-specific project that uses those components to do things like relocate documents into domain-controlled storage and onboard their content (index it, etc.). But that project should probably be split between the outside interfaces it depends upon and the internal implementations. Here is the main reason (probably not what you were thinking): if you have a bunch of tests that go against those outside resources, your build is going to become a nightmare, and frankly, you don't want to be hammering those locations, but you do need to know if something changes or goes down. Keep in mind that if you split these things, you will probably want to make them peers under a common parent project, which means that to keep the interface tests from running each time you change code on the implementation side, you should build ONLY that jar. Frankly, this is one of the ways this model works best, because if you are using Jenkins or its like, this should happen automatically: you will end up framing up the interfaces, with tests, and probably almost never changing them; then, as you modify implementations, those will be rebuilt automatically and deposited into your Nexus repo (or wherever).
Modularity in Java is still such a mess. Not clear that any other language has done this better, but it probably would make sense to consider in this case that literally every content source, or perhaps type, is its own pair of jars!
Ok, so now we have:
synchronizer [jar] – (generic code for synchronizing remote content)
outside-content [pom project]
+ outside-content-apis [jar]
+ outside-content-impls [jar]
The key is, if all these things were in one big project, we'd rapidly have a 30-minute build that keeps verifying interfaces that almost never change.
Of course, another thing Jenkins offers in this situation is we can create a job that just runs the API tests at some fixed interval, whether our code changed or not.
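Here is a rough sketch of how the api/impl split could look in code. Everything named here (ContentSource, FixtureContentSource, honorsContract) is hypothetical and not taken from the real project: the interface is the sort of thing that would sit in outside-content-apis, the fixture is what implementation-side tests would lean on, and the contract check is what a scheduled job could point at a live source.

```java
import java.util.Map;
import java.util.Optional;

public class ContentSourceContract {

    /** Would live in outside-content-apis; the interface name is hypothetical. */
    interface ContentSource {
        Optional<String> fetch(String documentId);
    }

    /** Local stand-in backed by canned data, for implementation-side tests. */
    static class FixtureContentSource implements ContentSource {
        private final Map<String, String> documents;

        FixtureContentSource(Map<String, String> documents) {
            this.documents = documents;
        }

        @Override
        public Optional<String> fetch(String documentId) {
            return Optional.ofNullable(documents.get(documentId));
        }
    }

    /**
     * Contract check that any ContentSource is expected to satisfy: a known id
     * yields a non-empty body and an unknown id yields nothing.
     */
    static boolean honorsContract(ContentSource source, String knownId, String missingId) {
        boolean knownIsPresent =
                source.fetch(knownId).filter(body -> !body.isEmpty()).isPresent();
        boolean missingIsAbsent = !source.fetch(missingId).isPresent();
        return knownIsPresent && missingIsAbsent;
    }

    public static void main(String[] args) {
        // The regular build only ever touches the fixture, never the network.
        ContentSource fixture =
                new FixtureContentSource(Map.of("doc-1", "<html>hello</html>"));
        System.out.println(honorsContract(fixture, "doc-1", "doc-404"));
    }
}
```

The point of the design is that the same honorsContract check can be run against a network-backed implementation on a timer, while the day-to-day build stays fast and offline.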
Testing Modalities in an Unstructured World
I have blogged a lot about testing. In the last year, mainly about Arquillian and the book Growing Object-Oriented Software, Guided by Tests. This week, while I was working through some of these issues, I realized that the camps or test clans that have cropped up are really outgrowths of the lack of a more comprehensive model behind testing. For instance, I read a good article about the NoMock movement. Having just done a tour on a codebase that was, as Fielding Mellish says in Bananas, 'A mockery of a sham of two shams of a mockery,' I was kind of partial to the arguments. Arquillian is clearly in the NoMock camp (if not its spiritual leader). The Growing team is in the other camp. I would argue that both camps are settling on a solution that is too simple and doesn't address the real problem. For instance, in my review of the GOOSwT, I pointed out that the example app was one where 90% of the functionality was protocol-based (which lends itself to mocks). On the other hand, while Arquillian does address the issue of making sure that your code really does run against the target platform, it doesn't really offer much for problems like the above. That's to say: if you have such problems, you are on your own. Should you be? That's the real question.
What would a model-based Testing World look like?
Well, consider most things that went from infancy to something more mature. Usually, there was either no model at first and then some model, or a wimpy one followed by a better one. The build world is a good example: Maven's main contribution is the POM, which has model in its name. When it first appeared, the hordes of dug-in Paleos wailed about how their project required its own unique structure and model, but alas, Maven won. Even established products have to go through model evolution. iTunes was unable to deal with things like tracks that had different personnel than the album (the norm in Jazz). The rewrite of iTunes was obviously incredibly painful.
I want a test model that has a bunch of notions built into it that are utterly absent now:
- inside v. outside – some of the above illustrates this, but it would be cool to forbid reaching outside from certain areas
- protocol verification as independent of chosen tool – this could take all kinds of forms; it's not likely to be a simple DSL now that BDD has exploded and Hamcrest and the like are here, but at least we could have some notion somewhere that a test was only verifying a protocol (this is a lead-in to a bigger point: the need for QUALITATIVE, not QUANTITATIVE analysis)
- some notion of case expansion – the idea here would be that given certain types of tests, we could have a set of cases that were automatically expanded and would have to be filled in, e.g. how cool would it be to say that you are testing something like a range query and put some constraints on data for it to generate? It is frankly absurd that we have one type of test. (Utter typelessness is the key feature of reductive anemia.)
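As a small taste of what case expansion might mean, here is a minimal sketch that takes the bounds of an inclusive range and generates the boundary cases a range-query test would have to cover. RangeCase, expand, and inRange are hypothetical names, the five-case scheme is just one assumption about what "expanded" could mean, and the sketch ignores integer overflow at the extremes.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeCaseExpansion {

    /** One generated case: a probe value and whether the range should match it. */
    static final class RangeCase {
        final String label;
        final int value;
        final boolean expectedInRange;

        RangeCase(String label, int value, boolean expectedInRange) {
            this.label = label;
            this.value = value;
            this.expectedInRange = expectedInRange;
        }
    }

    /** Expands an inclusive range [low, high] into five boundary cases. */
    static List<RangeCase> expand(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        List<RangeCase> cases = new ArrayList<>();
        cases.add(new RangeCase("below low", low - 1, false));
        cases.add(new RangeCase("at low", low, true));
        cases.add(new RangeCase("midpoint", low + (high - low) / 2, true));
        cases.add(new RangeCase("at high", high, true));
        cases.add(new RangeCase("above high", high + 1, false));
        return cases;
    }

    /** Hypothetical code under test: an inclusive range query. */
    static boolean inRange(int low, int high, int value) {
        return value >= low && value <= high;
    }

    public static void main(String[] args) {
        for (RangeCase c : expand(10, 20)) {
            boolean actual = inRange(10, 20, c.value);
            System.out.println(c.label + " value=" + c.value
                    + " ok=" + (actual == c.expectedInRange));
        }
    }
}
```

A real test model would presumably let the test declare "this is a range query" and have the framework insist that every generated case be filled in, rather than leaving the expansion to a helper like this one.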
Ultimately, the test world is rife with anemia and its partner in crime: infantile assertions of total devotion. The linchpin of the argument is the fact that the whole coverage argument was advanced by the cretins as the badge of honor when everyone who's ever done a lick of TDD knows it's nearly meaningless. It doesn't prove anything. It is funny that Meyer was so obsessed with provability and the pendulum has swung away from his brand toward test-based provability, but it's so clearly a case of blind faith: we think we are proving that our app works, but we have no way to prove our means of proof (the tests themselves)! Nor will we ever until we start to think about a test model, which would open the door to all kinds of things: richer project semantics, meaning less of the 'here goes the hideous build train, let's push the button and all go to Starbucks' and more room for potentially qualitative evolution, so people really could know whether their crash capsule is AAA safety rated or a coffin with a tombstone on top.