I recently had an IRC conversation about Solr. The main question the person I was chatting with had was
“How far out is the 4.0 release?” The answer, as with almost any open
source project, is “when it’s released.”
Naturally, that answer doesn’t get to the crux of what most IT teams who either use or are considering Solr
need to figure out: whether 4.0 is stable enough to deploy in a live environment.
Solr, even in unreleased versions, has historically been pretty stable. So if
a new version, in this case 4.0, has the functions that you’re looking
for (in this conversation, it was function queries like idf() or
termfreq()), then unless you’re comfortable compiling a previous
version of Solr and writing your own code on top of it, you’re probably going to want to go with the latest version.
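To make the function queries mentioned above a bit more concrete, here is a minimal sketch that builds a Solr select URL asking for termfreq() and idf() values alongside each document. The host and path, the field name `text`, the term `solr`, and the helper name `build_function_query_url` are all assumptions made for illustration; check them against your own schema and the Solr build you are running.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_function_query_url(base_url, field, term, query="*:*", rows=10):
    """Build a select URL that requests termfreq() and idf() per document.

    base_url, field, and term are placeholders (assumptions), not values
    taken from any real deployment. The term is assumed to contain no quotes.
    """
    quoted = "'{}'".format(term)
    # Ask for the id plus two function values, aliased as "tf" and "idf".
    fl = ",".join([
        "id",
        "tf:termfreq({},{})".format(field, quoted),
        "idf:idf({},{})".format(field, quoted),
    ])
    params = {"q": query, "fl": fl, "rows": rows, "wt": "json"}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical local Solr instance.
url = build_function_query_url("http://localhost:8983/solr", "text", "solr")
# Decode the field list back out of the URL to see what will be requested.
print(parse_qs(urlsplit(url).query)["fl"][0])
```

If the build you are evaluating rejects these functions, that is a quick signal that it predates the features you need.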
Of course, this approach does come with risk. I have
only heard of one actual “bug” that led to incorrect results
sneaking into the Solr code base in an unreleased version, and it was quickly found and fixed.
But since you’re working on a code base that may change somewhat, be
careful if you are building indexes that you cannot easily rebuild. For example,
if you are indexing the Internet and can’t recrawl to regenerate the data, meaning
Solr is your “system of record”, then be aware that over time the index file
format may change because Lucene is changing under the covers.
Periodically an email goes out telling you that you need to rebuild
your indexes. But if you are basically taking a download of Solr
4.0 as it is today, and are only going to update a) when a new killer
feature is added or b) when 4.0 comes out, then reindexing
shouldn’t be a problem.
The other aspect of deploying Solr is your testing environment. If you have strong system and
functional testing, then you can be fairly sure that things are working
appropriately. If you’re not certain about testing, check out my
presentation on Better Search Engine Testing from this year’s Software Test and Performance Conference.
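As one small example of what a functional test might look like, the sketch below checks that a set of expected document IDs shows up in the top results of a query. It assumes the response has already been parsed from Solr’s JSON output into a dictionary; the helper names and the sample data are hypothetical and hand-written for illustration.

```python
def top_ids(solr_response, k=3, id_field="id"):
    """Return the IDs of the first k documents in a parsed JSON response."""
    docs = solr_response.get("response", {}).get("docs", [])
    return [doc[id_field] for doc in docs[:k]]


def check_expected_results(solr_response, expected_ids, k=3):
    """Return the expected IDs missing from the top k (empty list means pass)."""
    found = set(top_ids(solr_response, k))
    return [doc_id for doc_id in expected_ids if doc_id not in found]


# Hand-written sample in the general shape of a Solr JSON response.
sample = {
    "response": {
        "numFound": 3,
        "start": 0,
        "docs": [{"id": "doc1"}, {"id": "doc7"}, {"id": "doc3"}],
    }
}

missing = check_expected_results(sample, ["doc1", "doc3"], k=3)
print("missing from top results:", missing)
```

Running a handful of checks like this against a known index before and after upgrading gives a cheap signal that a new build still returns the results you rely on.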