Over a million developers have joined DZone.

Moving readVInt to C

· Java Zone

Microservices! They are everywhere, or at least, the term is. When should you use a microservice architecture? What factors should be considered when making that decision? Do the benefits outweigh the costs? Why is everyone so excited about them, anyway?  Brought to you in partnership with IBM.

By far the hottest spot in Lucene during searching is the method (DataInput.readVInt) that decodes Lucene's variable-length integer representation (vInt). This method is called an insane number of times, while iterating the postings lists (docs, freqs, positions), during query execution.

This representation is not CPU friendly: for every single byte read, the CPU hits a hard-to-predict if statement (testing the high bit). Alternative block-based formats do a good job reducing branches while decoding, such as these encodings (PFOR-DELTA is currently an experimental codec attached as a patch on LUCENE-1410) or Google's Group Varint.

Anyway, I decided to test whether porting readVInt to C would gives us a performance boost. I had known the overhead of JNI was highish in the past, but I was hoping in modern JREs this had been improved.

Unfortunately, it wasn't: I see a ~10%-~40% slowdown, depending on the query. Likely hotspot's inability to inline native methods is also a factor here. Perhaps, if/when we switch to a block-based codec, we'll see a gain with a native implementation, since the amortized overhead of invoking JNI will be lower.

Discover how the Watson team is further developing SDKs in Java, Node.js, Python, iOS, and Android to access these services and make programming easy. Brought to you in partnership with IBM.

Topics:

Published at DZone with permission of Michael Mccandless, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}