Updating a Solr Analysis Plugin from 1.4.1 (Lucene 2.9) to Solr / Lucene 4.0 (current trunk)

By Mats Lindh · Jul. 05, 11 · Java Zone

Three years and a couple of weeks ago I wrote a post about how to get started writing a simple Solr Analysis Plugin that handles incoming tokens and modifies them in place when an update is requested.

Since then the whole version number structure of Solr has changed (and is now in sync with the underlying Lucene version), and not surprisingly, the current API has also been updated. This means that a few small changes are required to get your analysis plugins running on the current trunk of Lucene and Solr.

The main change is that the interface previously named TermAttribute is now named CharTermAttribute, so any imports have to change:

    - import org.apache.lucene.analysis.tokenattributes.TermAttribute;
    + import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

Any declarations of TermAttributes will need to be CharTermAttributes instead:

    - private TermAttribute termAtt;
    + private CharTermAttribute termAtt;

      public NorwegianNameFilter(TokenStream input)
      {
          super(input);
    -     termAtt = (TermAttribute) addAttribute(TermAttribute.class);
    +     termAtt = input.getAttribute(CharTermAttribute.class);
      }

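We now fetch the attribute from the incoming TokenStream. I'm not sure whether the old way has been deprecated, but this seems to be the suggested way now. Note that getAttribute() throws an IllegalArgumentException if the stream does not already carry the attribute, whereas addAttribute(CharTermAttribute.class) creates it when it is missing and, being generic, no longer needs the cast. Any other references to TermAttribute.class also change to CharTermAttribute.class.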

The methods on the attribute interface have also been renamed, so a few of the old method calls have to change:

    - termAtt.setTermLength(this.parseBuffer(termAtt.termBuffer(), termAtt.termLength()));
    + termAtt.setLength(this.parseBuffer(termAtt.buffer(), termAtt.length()));

.setTermLength() => .setLength()
.termBuffer() => .buffer()
.termLength() => .length()

The methods behave the same way as in the previous API. .buffer() returns a char array (char[]) holding the current term, which you can modify in place. .length() returns the number of characters currently in use (the buffer can be larger than the part used), and .setLength() sets the new length, for example after you have collapsed characters.
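To make the buffer/length contract concrete, here is a minimal sketch in plain Java that does not depend on Lucene at all. The class name BufferCollapseSketch and the method collapseRepeats are hypothetical, chosen only for illustration, and collapsing repeated characters is an arbitrary stand-in for whatever rewriting a real filter would do. It shows the shape a parseBuffer-style method could take: read only the used part of the buffer, modify it in place, and return the new length.

```java
/**
 * Stand-alone illustration of the buffer/length contract; no Lucene classes are used.
 * BufferCollapseSketch and collapseRepeats are hypothetical names for this sketch.
 */
class BufferCollapseSketch
{
    /**
     * Collapses runs of identical adjacent characters in place ("aabbbc" becomes "abc").
     * Only buffer[0..length) is read; anything beyond that is unused capacity.
     * Returns the new used length, which the caller would pass on to setLength().
     */
    static int collapseRepeats(char[] buffer, int length)
    {
        if (length == 0)
        {
            return 0;
        }

        int write = 1; // buffer[0] is always kept
        for (int read = 1; read < length; read++)
        {
            // Keep a character only if it differs from the last one kept.
            if (buffer[read] != buffer[write - 1])
            {
                buffer[write++] = buffer[read];
            }
        }

        return write;
    }

    public static void main(String[] args)
    {
        // Simulate a term buffer that is larger than the term it holds.
        char[] buffer = new char[16];
        String term = "aabbbc";
        term.getChars(0, term.length(), buffer, 0);

        int newLength = collapseRepeats(buffer, term.length());
        System.out.println(new String(buffer, 0, newLength)); // prints "abc"
    }
}
```

In a real filter the array would come from termAtt.buffer(), the initial length from termAtt.length(), and the returned value would go to termAtt.setLength().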

The new implementation of our analysis filter skeleton:

    package no.derdubor.solr.analysis;

    import java.io.IOException;
    import org.apache.lucene.analysis.TokenFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    public class NorwegianNameFilter extends TokenFilter
    {
        private CharTermAttribute termAtt;

        public NorwegianNameFilter(TokenStream input)
        {
            super(input);
            // Reuse the term attribute already registered on the incoming stream.
            termAtt = input.getAttribute(CharTermAttribute.class);
        }

        // Lucene expects incrementToken() to be final (or the class itself to be final).
        @Override
        public final boolean incrementToken() throws IOException
        {
            if (this.input.incrementToken())
            {
                // Rewrite the term in place and store its new length.
                termAtt.setLength(this.parseBuffer(termAtt.buffer(), termAtt.length()));
                return true;
            }

            return false;
        }

        protected int parseBuffer(char[] buffer, int bufferLength)
        {
            // Placeholder: modify buffer in place here and return the new length.
            return bufferLength;
        }
    }

From http://e-mats.org/2011/07/updating-a-solr-analysis-plugin-from-1-4-1-lucene-2-9-to-solr-lucene-4-0-current-trunk/
