Flexible indexing is one of the new features in Lucene's next major release, 4.0. It brings big changes to several parts of Lucene: a new, higher-performance postings iteration API; terms as arbitrary opaque bytes (not chars); direct visibility and control of deleted documents; and a low-level, pluggable codec API that gives applications full control over the postings data. Several interesting codecs have already been created, including the default "standard" codec, which enables a sizable RAM reduction for searchers, and a "pulsing" codec, which inlines postings data directly into the terms dictionary and provides a solid performance boost for primary key fields. In this talk, Michael McCandless presents an overview of all of these exciting changes, as well as several concrete, real-world examples of how applications can tap into these new features.
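To make the "pulsing" idea concrete, here is a minimal toy sketch in Python. It is not Lucene's API or its on-disk format; the names `ToyIndex`, `TermEntry`, and `INLINE_CUTOFF` are hypothetical and exist only for illustration. It assumes that terms matching very few documents (as primary key terms do) keep their postings inside the terms dictionary entry, so a lookup never has to touch the separate postings store.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical cutoff: terms matching at most this many documents are inlined.
INLINE_CUTOFF = 1


@dataclass
class TermEntry:
    doc_freq: int
    # Postings stored directly in the dictionary entry (the "pulsed" case).
    inline_docs: Optional[List[int]] = None
    # Otherwise, a position in the separate postings store.
    postings_offset: Optional[int] = None


class ToyIndex:
    """Toy terms dictionary; terms are opaque bytes, not strings."""

    def __init__(self) -> None:
        self.terms: Dict[bytes, TermEntry] = {}
        self.postings_store: List[List[int]] = []
        self.postings_reads = 0  # counts trips to the separate store

    def add_term(self, term: bytes, doc_ids: List[int]) -> None:
        docs = sorted(doc_ids)
        if len(docs) <= INLINE_CUTOFF:
            # Rare term: inline the postings into the dictionary entry.
            self.terms[term] = TermEntry(len(docs), inline_docs=docs)
        else:
            # Common term: write postings separately and keep a pointer.
            self.postings_store.append(docs)
            offset = len(self.postings_store) - 1
            self.terms[term] = TermEntry(len(docs), postings_offset=offset)

    def docs(self, term: bytes) -> List[int]:
        entry = self.terms.get(term)
        if entry is None:
            return []
        if entry.inline_docs is not None:
            return entry.inline_docs  # answered from the dictionary alone
        self.postings_reads += 1  # second lookup in the postings store
        return self.postings_store[entry.postings_offset]


index = ToyIndex()
index.add_term(b"id:42", [7])              # primary key: one document
index.add_term(b"body:lucene", [3, 1, 7])  # ordinary term: several documents
print(index.docs(b"id:42"), index.postings_reads)
print(index.docs(b"body:lucene"), index.postings_reads)
```

In this sketch the primary key lookup is answered without any read from the postings store, which is the intuition behind the speedup mentioned above. The real codec works on encoded on-disk data rather than in-memory lists.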
Michael McCandless, IBM at Lucene Revolution: "Fun With Flex" from Lucene Revolution on Vimeo.