I recently had an IRC conversation about
. The main question from the person I was chatting with was “How far out is the 4.0 release?” The answer, as with almost any open source project, is “when it’s released.”
Naturally, that answer doesn’t really help get to the crux of what most IT teams who either use or are considering
need to figure out, which is whether 4.0 is stable enough to deploy in a live environment.
, even in unreleased versions, has historically been pretty stable. So, if a new version, in this case 4.0, has the functions that you’re looking for (in this conversation, it was function queries like idf() or termfreq()), then unless you’re comfortable with compiling a previous version of
and creating your own code on top of it, you’re probably going to want to go with the latest version.
Of course, this approach does come with risk. I have only heard of one actual “bug” that led to incorrect results sneaking into the
code base in an unreleased project, and it was quickly found and fixed. But you are working on a code base that may still change. If you are building indexes that you cannot easily rebuild (for example, you are indexing the Internet and can’t recrawl to regenerate the data), meaning that
is your “system of record”, then be aware that the index file format may change over time because Lucene is changing under the covers, and periodically there is an email telling you that you need to rebuild your indexes. But if you are basically taking a download of
4.0 as it is today, and only going to update a) when a killer new feature is added or b) when 4.0 comes out, then reindexing shouldn’t be a problem.
The other aspect of deploying
is your testing environment. If you have strong system and functional testing, then you can be fairly sure that things are working appropriately. If you’re not certain about testing, check out my presentation on
Better Search Engine Testing
from this year’s Software Test and Performance Conference.
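As a rough illustration of the kind of functional test I mean, here is a minimal sketch that pins the expected top result for a handful of queries and reports any query whose top hit changes after an upgrade. Everything in it is hypothetical: `search` is a toy stand-in for whatever client call your application really makes, and the in-memory `DOCS` corpus and `EXPECTED_TOP` table exist only so the example runs on its own.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical in-memory corpus standing in for a real index.
DOCS: Dict[str, str] = {
    "doc1": "open source search server release notes",
    "doc2": "index file format changes and reindexing",
    "doc3": "search engine testing presentation",
}


def search(query: str, limit: int = 10) -> List[str]:
    """Toy search: rank documents by how many query-term matches they contain.

    Replace this with a call to your real search client (an assumption;
    the original post does not prescribe one).
    """
    terms = query.lower().split()
    scored = []
    for doc_id, text in DOCS.items():
        words = text.split()
        score = sum(words.count(term) for term in terms)
        if score > 0:
            # Negative score so that sorting puts the best match first;
            # ties fall back to doc id order, which keeps results stable.
            scored.append((-score, doc_id))
    return [doc_id for _, doc_id in sorted(scored)][:limit]


# Queries whose top hit we have verified by hand and want to keep stable.
EXPECTED_TOP: Dict[str, str] = {
    "reindexing": "doc2",
    "search engine testing": "doc3",
}


def check_expected_results(
    search_fn: Callable[[str], List[str]],
    expected: Dict[str, str],
) -> List[str]:
    """Return one failure message per query whose top hit is not the expected one."""
    failures = []
    for query, want in expected.items():
        results = search_fn(query)
        got: Optional[str] = results[0] if results else None
        if got != want:
            failures.append(f"{query!r}: expected {want}, got {got}")
    return failures


if __name__ == "__main__":
    problems = check_expected_results(search, EXPECTED_TOP)
    print("all expected results present" if not problems else "\n".join(problems))
```

Running a check like this before and after pulling in a newer build gives you an early warning if an upgrade changes results you care about; the list of pinned queries would normally come from your own query logs.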