Over a million developers have joined DZone.

2X faster PhraseQuery with Lucene using C++ via JNI

DZone's Guide to

2X faster PhraseQuery with Lucene using C++ via JNI

· Performance Zone ·
Free Resource

Learn how error monitoring with Sentry closes the gap between the product team and your customers. With Sentry, you can focus on what you do best: building and scaling software that makes your users’ lives better.

I recently described the new lucene-c-boost github project, which provides amazing speedups (up to 7.8X faster) for common Lucene query types using specialized C++ implementations via JNI.

The code works with a stock Lucene 4.3.0 JAR and default codec, and has a trivial API: just call NativeSearch.search instead of IndexSearcher.search.

Now, a quick update: I've optimized PhraseQuery now as well:

Task QPS base  StdDev base  QPS opt  StdDev opt  % change
HighPhrase 3.5 (2.7%) 6.5 (0.4%) 1.9 X
MedPhrase 27.1 (1.4%) 51.9 (0.3%) 1.9 X
LowPhrase 7.6 (1.7%) 16.4 (0.3%) 2.2 X

~2X speedup (~90% - ~119%) is nice!

Again, it's great to see a reduced variance on the runtimes since hotspot is mostly not an issue. It's odd that LowPhrase gets slower QPS than MedPhrase: these queries look mis-labelled (I see the LowPhrase queries getting more hits than MedPhrase!).

All changes have been pushed to lucene-c-boost; next I'd like to figure out how to get facets working.

What’s the best way to boost the efficiency of your product team and ship with confidence? Check out this ebook to learn how Sentry's real-time error monitoring helps developers stay in their workflow to fix bugs before the user even knows there’s a problem.


Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}