Over a million developers have joined DZone.

2X faster PhraseQuery with Lucene using C++ via JNI

DZone's Guide to

2X faster PhraseQuery with Lucene using C++ via JNI

Free Resource

Transform incident management with machine learning and analytics to help you maintain optimal performance and availability while keeping pace with the growing demands of digital business with this eBook, brought to you in partnership with BMC.

I recently described the new lucene-c-boost github project, which provides amazing speedups (up to 7.8X faster) for common Lucene query types using specialized C++ implementations via JNI.

The code works with a stock Lucene 4.3.0 JAR and default codec, and has a trivial API: just call NativeSearch.search instead of IndexSearcher.search.

Now, a quick update: I've optimized PhraseQuery now as well:

Task QPS base  StdDev base  QPS opt  StdDev opt  % change
HighPhrase 3.5 (2.7%) 6.5 (0.4%) 1.9 X
MedPhrase 27.1 (1.4%) 51.9 (0.3%) 1.9 X
LowPhrase 7.6 (1.7%) 16.4 (0.3%) 2.2 X

~2X speedup (~90% - ~119%) is nice!

Again, it's great to see a reduced variance on the runtimes since hotspot is mostly not an issue. It's odd that LowPhrase gets slower QPS than MedPhrase: these queries look mis-labelled (I see the LowPhrase queries getting more hits than MedPhrase!).

All changes have been pushed to lucene-c-boost; next I'd like to figure out how to get facets working.

Evolve your approach to Application Performance Monitoring by adopting five best practices that are outlined and explored in this e-book, brought to you in partnership with BMC.


Published at DZone with permission of Michael Mccandless, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.


Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.


{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}