This post comes from David Nalley at the Citrix blog.
I recently brought up the idea of release criteria for Apache CloudStack on the development mailing list. I thought I'd pontificate on why I think this is a good idea, and what it really entails. But first a bit of background:
Time-based releases
CloudStack has chosen to embark on time-based releases. I personally think this is a good choice. Feature-based releases tend to have scope and time creep. Additionally I think that our user community is reassured when we have regular timely releases. Communities hate surprise, and the corollary to that is that they love knowing what they can expect and when they can expect it, and I think time-based releases do that well, especially for rapidly evolving projects. For projects that are operating in maintenance mode, this is less of a concern.
This doesn't mean that time-based releases are without downsides. One of those is accepting the fact that things may not be perfect, and not letting the perfect be the enemy of the good. I think that Ubuntu has an excellent analogy for this that resonates well. In their page on time-based releases they note that shipping time-based releases of software is a bit like putting on a play in a theatre. There is a tremendous amount of effort expended in getting ready for a play, and the audience wants to see it: they've paid for their tickets and are waiting for the doors to open. The show must go on, even if things aren't completely perfect.
Admitting that things aren't completely perfect does worry people. We all want perfection. The reality is that we aren't going to eliminate all problems. No software is free from bugs. Our software needs to deliver value, and hopefully keep the time to achieve value to a minimum. This means that we must very cautiously guard and protect a known good state. In my mind this means verifying that proposed changes don't deleteriously affect CloudStack. We have literally thousands of tests that verify various operations of CloudStack - why aren't we using them? Well, in some respects we are and have been. The problem I perceive is that the testing cycle time is relatively long, and that we lack a way of 'enforcing' that.
Alex Huang and Prasanna Santhanam have been working on putting together some automated testing and proposing how we should be using it. I've mentioned before that I think we need to treat failures in testing as an Andon cord for CloudStack. Commits that break CloudStack aren't just bugs; they shouldn't be there in the first place, especially when we already have a test that would catch the breakage.
Because 'the show must go on', we must keep CloudStack in a known good state at all times. Features and bugfixes that wish to come in should demonstrate that they don't harm the state of the system. Having a robust testing system that continuously proves that CloudStack is in good shape is certainly a way to ensure that we really are shipping quality software, and that we don't have days or weeks of cleanup and bugfixing AFTER we discover a problem.