A Phrase-based, Out-of-order Solr Autocomplete Suggester

Solr has a number of Autocomplete implementations that are great for most purposes. However, a client of mine recently had some fairly specific requirements for Autocomplete:

1. Phrase-based substring matching
2. Out-of-order matches ('foo bar' should match 'the bar is foo')
3. Fallback matching to a secondary field when substring matching on the primary field fails, e.g., 'windstopper jac' doesn't match anything on the 'title' field, but matches on the 'category' field

The most direct way to model this would probably have been to create a separate Solr core and use n-gram plus shingles indexing, along with Solr queries, to obtain results. However, because the index was fairly small, I decided to go with an in-memory approach.

The general strategy was:

1. For each entry in the primary field, create n-gram tokens and add them to a Guava Table, where the row key is the n-gram, the column key is the original string, and the value is a distance score.
2. For each entry in the secondary field, create n-gram tokens and add them to a Guava Multimap, where the key is the n-gram and the value is the term.
3. When an Autocomplete query is received, split it on spaces, then look up each token in the primary table.
4. If no matches are found, look up the tokens in the secondary Multimap.
5. Return results.
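
The steps above can be sketched roughly as follows. This is a minimal illustration, not the original implementation: the class name `AutocompleteSketch`, its method names, the minimum n-gram length of 2, and the placeholder score are all assumptions. Plain nested maps stand in for Guava's Table and Multimap so the example needs nothing beyond the JDK, and the sketch assumes that every query token must match the same entry, which is one way to get out-of-order matching.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch; nested maps stand in for Guava's Table and Multimap. */
final class AutocompleteSketch {
    private static final int MIN_GRAM = 2; // assumed minimum n-gram length

    // n-gram -> (primary entry -> score); plays the role of Table<String, String, Double>
    private final Map<String, Map<String, Double>> primary = new HashMap<>();
    // n-gram -> secondary terms; plays the role of Multimap<String, String>
    private final Map<String, Set<String>> secondary = new HashMap<>();

    /** All substrings of a word that are at least MIN_GRAM characters long. */
    public static Set<String> ngrams(String word) {
        Set<String> grams = new LinkedHashSet<>();
        for (int start = 0; start < word.length(); start++) {
            for (int end = start + MIN_GRAM; end <= word.length(); end++) {
                grams.add(word.substring(start, end));
            }
        }
        return grams;
    }

    public void addPrimary(String entry) {
        String[] words = entry.trim().toLowerCase().split("\\s+");
        for (int i = 0; i < words.length; i++) {
            double score = 1.0 / (1 + i); // placeholder: earlier words score higher
            for (String gram : ngrams(words[i])) {
                // keep the best score if the same n-gram occurs more than once in an entry
                primary.computeIfAbsent(gram, k -> new HashMap<>()).merge(entry, score, Math::max);
            }
        }
    }

    public void addSecondary(String term) {
        for (String word : term.trim().toLowerCase().split("\\s+")) {
            for (String gram : ngrams(word)) {
                secondary.computeIfAbsent(gram, k -> new TreeSet<>()).add(term);
            }
        }
    }

    public List<String> suggest(String query) {
        String[] tokens = query.trim().toLowerCase().split("\\s+");

        // Primary lookup: keep only entries matched by every token, summing scores.
        Map<String, Double> hits =
                new HashMap<>(primary.getOrDefault(tokens[0], Collections.emptyMap()));
        for (int i = 1; i < tokens.length; i++) {
            Map<String, Double> row = primary.getOrDefault(tokens[i], Collections.emptyMap());
            hits.keySet().retainAll(row.keySet());
            for (Map.Entry<String, Double> e : hits.entrySet()) {
                e.setValue(e.getValue() + row.get(e.getKey()));
            }
        }
        if (!hits.isEmpty()) {
            List<String> ranked = new ArrayList<>(hits.keySet());
            // highest score first, ties broken alphabetically
            ranked.sort((a, b) -> {
                int byScore = Double.compare(hits.get(b), hits.get(a));
                return byScore != 0 ? byScore : a.compareTo(b);
            });
            return ranked;
        }

        // Fallback: the same intersection against the secondary index, unscored.
        Set<String> fallback =
                new TreeSet<>(secondary.getOrDefault(tokens[0], Collections.emptySet()));
        for (int i = 1; i < tokens.length; i++) {
            fallback.retainAll(secondary.getOrDefault(tokens[i], Collections.emptySet()));
        }
        return new ArrayList<>(fallback);
    }
}
```

With 'the bar is foo' indexed as a primary entry and 'windstopper jackets' as a secondary term, `suggest("foo bar")` finds the primary entry regardless of word order, while `suggest("windstopper jac")` finds nothing in the primary index and falls through to the secondary one. Indexing every substring is memory-hungry, which is acceptable only because the index is small.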

Scoring for the primary table was simple: it was based on the length of the word and the distance of the token from the start of the string.
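
The article does not give the exact formula, so the following is only one plausible way to combine those two inputs. The class `ScoreSketch`, the method `score`, and the formula itself are assumptions: the score is the fraction of the word covered by the n-gram, discounted by the character offset of the word within the string, with higher values treated as better matches.

```java
/** Hypothetical scoring sketch; the formula is an assumption, not the original one. */
final class ScoreSketch {
    /**
     * @param ngram      the n-gram token being indexed
     * @param word       the word the n-gram was taken from
     * @param wordOffset character offset of the word from the start of the string
     * @return a score in (0, 1]; higher means a better match
     */
    public static double score(String ngram, String word, int wordOffset) {
        // n-grams that cover more of the word count for more
        double coverage = (double) ngram.length() / word.length();
        // words nearer the start of the string count for more
        return coverage / (1 + wordOffset);
    }
}
```

Under this formula a full-word match at the very start of the string scores 1.0, and the score falls as the n-gram covers less of its word or as the word sits further into the string.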


Published at DZone with permission of Kelvin Tan. See the original article here.


