
Solr Autocomplete with Document Suggestions



Solr 3.5 comes with a nice autocomplete/typeahead component, the Suggester, which is built on Solr's SpellCheckComponent.

You provide it a query and a field, and the Suggester returns a list of suggestions based on the query. For example:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="ac">
        <int name="numFound">2</int>
        <int name="startOffset">0</int>
        <int name="endOffset">2</int>
        <arr name="suggestion">
          <str>acquire</str>
          <str>accommodate</str>
        </arr>
      </lst>
      <str name="collation">acquire</str>
    </lst>
  </lst>
</response>
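
On the client side, a response like the one above can be reduced to a map from each query token to its suggested terms. The following is a minimal sketch using only the JDK's DOM parser; the class name SuggestionParser is hypothetical, and it assumes the plain (non-extended) response layout shown above, where each token's suggestions are bare str elements.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Solr or SolrJ.
final class SuggestionParser {

  /** Maps each query token (e.g. "ac") to its suggested terms, in response order. */
  static Map<String, List<String>> parse(String xml) {
    Document doc;
    try {
      doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    } catch (Exception e) {
      throw new IllegalArgumentException("Unparseable suggester response", e);
    }
    Map<String, List<String>> result = new LinkedHashMap<>();
    NodeList lists = doc.getElementsByTagName("lst");
    for (int i = 0; i < lists.getLength(); i++) {
      Element lst = (Element) lists.item(i);
      // Token entries are the <lst> elements directly under <lst name="suggestions">.
      if (!(lst.getParentNode() instanceof Element)) continue;
      Element parent = (Element) lst.getParentNode();
      if (!"lst".equals(parent.getTagName())) continue;
      if (!"suggestions".equals(parent.getAttribute("name"))) continue;

      List<String> terms = new ArrayList<>();
      NodeList strs = lst.getElementsByTagName("str");
      for (int j = 0; j < strs.getLength(); j++) {
        terms.add(strs.item(j).getTextContent());
      }
      result.put(lst.getAttribute("name"), terms);
    }
    return result;
  }
}
```

The collation entry is skipped automatically because it is a str element, not a lst, directly under the suggestions list.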

Nice.

Now what if, as part of the autocomplete request, you needed a list of documents that contain the suggested terms for the given field? That's what I'm about to cover here.

TermDocs is your friend

The basic idea is to call reader.termDocs(), seek to each suggested term, collect the matching document ids, and use them as the basis of a DocSlice. Here are the relevant bits of code.

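First, AND the doc ids for the various suggestions into a single docset. Terms suggested for the same query token are ORed together inside collectDocs; the per-token sets are then ANDed.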

// field is the name of the field the suggestions apply to.
NamedList spellcheck = (NamedList) rb.rsp.getValues().get("spellcheck");
NamedList suggestions = (NamedList) spellcheck.get("suggestions");
final SolrIndexReader reader = rb.req.getSearcher().getReader();
OpenBitSet docset = null;
for (int i = 0; i < suggestions.size(); ++i) {
  String name = suggestions.getName(i);
  // The collation entry is a plain string, not a per-token suggestion list.
  if ("collation".equals(name)) continue;
  NamedList query = (NamedList) suggestions.getVal(i);
  Set<String> suggestion = (Set<String>) query.get("suggestion");

  // Docs containing any of the terms suggested for this token.
  OpenBitSet docs = collectDocs(field, reader, suggestion);
  if (docset == null) docset = docs;
  else docset.and(docs);
}
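
To make the set logic concrete, here is a small self-contained sketch that swaps Lucene's OpenBitSet for java.util.BitSet and the index for a plain map. The postings map and the class name SuggestionDocSets are hypothetical stand-ins, and the document ids are made up for illustration.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; java.util.BitSet stands in for OpenBitSet.
final class SuggestionDocSets {

  /**
   * ORs the postings of each token's suggested terms, then ANDs the per-token sets.
   * postings is a stand-in for the index: term -> ids of documents containing it.
   */
  static BitSet intersect(Map<String, List<String>> suggestionsByToken,
                          Map<String, int[]> postings) {
    BitSet docset = null;
    for (List<String> terms : suggestionsByToken.values()) {
      BitSet docs = new BitSet();
      for (String term : terms) {
        int[] ids = postings.get(term);
        if (ids == null) continue; // term not in the index
        for (int id : ids) docs.set(id);
      }
      if (docset == null) docset = docs;
      else docset.and(docs);
    }
    return docset == null ? new BitSet() : docset;
  }

  public static void main(String[] args) {
    Map<String, int[]> postings = new LinkedHashMap<>();
    postings.put("acquire", new int[] {1, 4});
    postings.put("accommodate", new int[] {2, 4, 7});
    postings.put("company", new int[] {2, 4, 9});

    Map<String, List<String>> suggestions = new LinkedHashMap<>();
    suggestions.put("ac", Arrays.asList("acquire", "accommodate"));
    suggestions.put("co", Arrays.asList("company"));

    System.out.println(intersect(suggestions, postings)); // prints {2, 4}
  }
}
```

With a single query token, as in the "ac" response earlier, the result is simply the union of the documents containing any suggested term; the AND only narrows things when the query has more than one token.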
 

collectDocs is implemented here:

private OpenBitSet collectDocs(String field, SolrIndexReader reader, Set<String> terms) throws IOException {
  OpenBitSet docset = new OpenBitSet();
  TermDocs te = reader.termDocs();
  try {
    for (String s : terms) {
      // Position the enumerator on the term, then walk its postings.
      te.seek(new Term(field, s));
      while (te.next()) {
        docset.set(te.doc());
      }
    }
  } finally {
    te.close(); // release the enumerator even if seek/next throws
  }
  return docset;
}

Now with the OpenBitSet of document ids matching the suggested terms, you can return a list of documents.

One problem is that you don't have document scores, since no search was actually performed. Ideally, you'd return the documents sorted by some field and use the field value as the score.
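
One way to sketch that ordering step, assuming the field's values are available as an array indexed by document id (something a field cache could provide): walk the set bits, pair each doc id with its field value, and sort descending. The names FieldScoredDocs, ScoredDoc, and fieldValues are hypothetical, and java.util.BitSet again stands in for OpenBitSet.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration of using a field value as a pseudo-score.
final class FieldScoredDocs {

  static final class ScoredDoc {
    final int doc;
    final float score;

    ScoredDoc(int doc, float score) {
      this.doc = doc;
      this.score = score;
    }
  }

  /**
   * Returns up to limit matching docs, highest field value first.
   * Ties are broken by ascending doc id. fieldValues[doc] is assumed
   * to hold the sort field's value for every doc id in docset.
   */
  static List<ScoredDoc> topDocs(BitSet docset, float[] fieldValues, int limit) {
    List<ScoredDoc> docs = new ArrayList<>();
    for (int doc = docset.nextSetBit(0); doc >= 0; doc = docset.nextSetBit(doc + 1)) {
      docs.add(new ScoredDoc(doc, fieldValues[doc]));
    }
    docs.sort((a, b) -> {
      int byScore = Float.compare(b.score, a.score);
      return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc);
    });
    int end = Math.max(0, Math.min(limit, docs.size()));
    return new ArrayList<>(docs.subList(0, end));
  }
}
```

Sorting the full match list is fine for small result sets; for large ones a bounded priority queue of size limit would avoid sorting documents that will never be returned.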



Published at DZone with permission of Kelvin Tan. See the original article here.

Opinions expressed by DZone contributors are their own.
