Lucene.net: Your First Application

by Simone Chiaretta · Sep. 14, 09 · News

In the first two posts of this tutorial you learnt how to get the latest version of Lucene.net, where to find the (little) documentation available, what the main concepts of Lucene.net are, and what its main development steps are.

In this third post I’m going to put into practice all the concepts explained in the previous post by writing a simple console application that indexes the text entered in the console.

I’ll refer to the steps I outlined in my previous post, so if you haven’t read it already, I recommend you go back and do so.

Step 1 – Initialize the Directory and the IndexWriter

As I said in my previous post, there are two Directory implementations you can use: one based on the file system and one based on RAM. You’d usually want the file-system-based one: it’s pretty fast anyway, and you don’t need to constantly dump it to the filesystem. The RAM-based one is probably more of a test fake than something to use for real in production.

And once you have instantiated the Directory you have to open an IndexWriter on it.

Directory directory = FSDirectory.GetDirectory("LuceneIndex");
Analyzer analyzer = new StandardAnalyzer();
IndexWriter writer = new IndexWriter(directory, analyzer);

If you don’t need a reference to the Directory (you don’t want to call additional methods on it) and an FSDirectory is all you want, you can use the short version and create the IndexWriter with just one line of code.

IndexWriter writer = new IndexWriter("LuceneIndex", analyzer);
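
As a rough sketch of the two Directory options, the choice can be isolated in a small helper so that tests use RAM and real runs use the file system. This assumes the Lucene.net 2.x API used in this post; the IndexFactory class and its method names are hypothetical and not part of Lucene.net.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Store;

// Hypothetical helper class, shown only to illustrate the two Directory choices.
static class IndexFactory
{
    // Returns an in-memory Directory (handy for tests) or a file-system one.
    public static Directory OpenDirectory(bool inMemory)
    {
        if (inMemory)
        {
            return new RAMDirectory();
        }
        return FSDirectory.GetDirectory("LuceneIndex");
    }

    // Opens an IndexWriter on the given Directory using the StandardAnalyzer.
    public static IndexWriter OpenWriter(Directory directory)
    {
        Analyzer analyzer = new StandardAnalyzer();
        return new IndexWriter(directory, analyzer);
    }
}
```

The rest of the code does not need to know which implementation it received, since both are used through the Directory base class.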

Step 2 – Add Documents to the index

I’ll cover this topic in more depth in a subsequent post, but the basic code for adding a document to the index is pretty straightforward: create a document, add some fields to it, and then add the document to the index.

Document doc = new Document();
doc.Add(new Field("id", i.ToString(), Field.Store.YES, Field.Index.NO));
doc.Add(new Field("postBody", text, Field.Store.YES, Field.Index.TOKENIZED));
writer.AddDocument(doc);

When you are done adding all the documents you need, you might call the Optimize method, “priming the index for the fastest available search”. After that, either call Flush to commit all the updates to the Directory or, if you don’t need to add anything else to the index, call the Close method to flush and then close all the files in the Directory.

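//Optional: prime the index for faster searches
writer.Optimize();
//Commit pending updates (Close would also flush them)
writer.Flush();
//Close the writer and release the index files
writer.Close();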
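
To show where the i and text variables in the snippet above might come from, here is a minimal sketch that indexes an array of strings. It assumes the Lucene.net 2.x API used in this post; the PostIndexer class and the IndexAll method are hypothetical names.

```csharp
using Lucene.Net.Documents;
using Lucene.Net.Index;

// Hypothetical helper, not part of Lucene.net.
static class PostIndexer
{
    // Adds one document per string, using the array position as the id.
    public static void IndexAll(IndexWriter writer, string[] texts)
    {
        for (int i = 0; i < texts.Length; i++)
        {
            Document doc = new Document();
            // The id is stored so it can be read back, but it is not searchable.
            doc.Add(new Field("id", i.ToString(), Field.Store.YES, Field.Index.NO));
            // The body is stored and tokenized by the writer's analyzer.
            doc.Add(new Field("postBody", texts[i], Field.Store.YES, Field.Index.TOKENIZED));
            writer.AddDocument(doc);
        }
        // Optimizing once after the whole batch, rather than per document.
        writer.Optimize();
    }
}
```

The caller still owns the writer and is expected to Close it when no more documents will be added.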

Step 3 – Create the Query

The Query can be created either via the API or by parsing Lucene query syntax with the QueryParser.

QueryParser parser = new QueryParser("postBody", analyzer);
Query query = parser.Parse("text");

or

Query query = new TermQuery(new Term("postBody", "text"));

The two snippets are functionally the same, so when should you use the API and when the QueryParser? Personally, I would use the QueryParser when the search string is supplied by the user, and the API directly when the query is generated by your code.
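
The split between user-supplied and code-generated queries could be sketched as follows, assuming the Lucene.net 2.x API. The QueryFactory class and its method names are hypothetical. User input is parsed defensively because QueryParser throws a ParseException on malformed syntax.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

// Hypothetical helper, not part of Lucene.net.
static class QueryFactory
{
    // For search strings typed by a user: returns false if the syntax is invalid.
    public static bool TryParseUserInput(string input, Analyzer analyzer, out Query query)
    {
        QueryParser parser = new QueryParser("postBody", analyzer);
        try
        {
            query = parser.Parse(input);
            return true;
        }
        catch (ParseException)
        {
            query = null;
            return false;
        }
    }

    // For queries built by code. A TermQuery is not analyzed, so the term
    // must match the indexed token exactly (StandardAnalyzer lowercases tokens).
    public static Query ForExactTerm(string term)
    {
        return new TermQuery(new Term("postBody", term));
    }
}
```

One practical difference to keep in mind: the QueryParser runs the text through the analyzer, while the TermQuery uses the term as given.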

Step 4 – Pass the Query to the IndexSearcher

Once you have your Query, all you need to do is pass it to the Search method of the IndexSearcher.

//Setup searcher
IndexSearcher searcher = new IndexSearcher(directory);
//Do the search
Hits hits = searcher.Search(query);

The Searcher must be instantiated before use and, for performance reasons, it’s recommended that only one Searcher is open at a time. So open one and use it for all your searches. This might pose some issues in multi-threaded environments (like web applications), but we’ll come to this topic in a future post.

Step 5 – Iterate over the Results

The Search method returns a Hits object, which contains all the documents matched by the query. To list the results, just loop through them.

int results = hits.Length();
Console.WriteLine("Found {0} results", results);
for (int i = 0; i < results; i++)
{
    Document doc = hits.Doc(i);
    float score = hits.Score(i);
    Console.WriteLine("Result num {0}, score {1}", i + 1, score);
    Console.WriteLine("ID: {0}", doc.Get("id"));
    Console.WriteLine("Text found: {0}" + Environment.NewLine, doc.Get("postBody"));
}

You get the current Document using the Doc(num) method, and the score (which is an unbounded float) using the Score(num) method. You might notice that this is a pretty strange API compared to what we are used to in .NET: I would have expected to do a foreach over the returned Hits object. This is probably because the API is a class-by-class port of the Java version, so it follows the API design conventions that are typical of the Java world. We can debate this purist-port approach versus a more idiomatic one for ages, but that’s the way it is.
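
If you prefer the foreach style, one possible workaround is a thin wrapper over the Length, Doc and Score methods. This is only a sketch assuming the Lucene.net 2.x Hits class; ScoredDocument and HitsEnumeration are hypothetical names and not part of Lucene.net.

```csharp
using System.Collections.Generic;
using Lucene.Net.Documents;
using Lucene.Net.Search;

// Hypothetical type pairing a document with its score.
public class ScoredDocument
{
    public Document Document;
    public float Score;
}

// Hypothetical helper, not part of Lucene.net.
public static class HitsEnumeration
{
    // Lazily yields each hit in rank order so callers can use foreach.
    public static IEnumerable<ScoredDocument> Enumerate(Hits hits)
    {
        int count = hits.Length();
        for (int i = 0; i < count; i++)
        {
            ScoredDocument result = new ScoredDocument();
            result.Document = hits.Doc(i);
            result.Score = hits.Score(i);
            yield return result;
        }
    }
}
```

With this in place the loop becomes foreach (ScoredDocument r in HitsEnumeration.Enumerate(hits)) { ... }, reading r.Document.Get("postBody") and r.Score inside the body. The enumeration has to finish before the searcher is closed, because documents are loaded on demand.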

Step 6 – Close everything

Once you are done with everything, you need to close all the resources: Directory and IndexSearcher.

searcher.Close();
directory.Close();
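
To make sure the resources are closed even when the search throws, the calls can be wrapped in try/finally blocks. This is a minimal sketch assuming the Lucene.net 2.x API; the SearchRunner class and the CountMatches method are hypothetical names.

```csharp
using Lucene.Net.Search;
using Lucene.Net.Store;

// Hypothetical helper, not part of Lucene.net.
static class SearchRunner
{
    // Runs a query against the index at indexPath and returns the number of hits.
    public static int CountMatches(string indexPath, Query query)
    {
        Directory directory = FSDirectory.GetDirectory(indexPath);
        try
        {
            IndexSearcher searcher = new IndexSearcher(directory);
            try
            {
                Hits hits = searcher.Search(query);
                return hits.Length();
            }
            finally
            {
                // Closed first, because it was opened last.
                searcher.Close();
            }
        }
        finally
        {
            directory.Close();
        }
    }
}
```

Opening a searcher per call like this goes against the single-searcher advice from Step 4, so treat it purely as an illustration of the closing order.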

Get the code

You can download a short sample application that stitches all that code together into a console application that lets you index any text you enter and later search for it.

Download the sample code

What’s next

This was a very simple application: it was single-threaded and had both the indexing and searching phases in the same piece of code. But before going into the details of the implementation I’m doing for Subtext, in the next post I’ll cover the concepts of documents and fields in more depth.

Published at DZone with permission of Simone Chiaretta, DZone MVB. See the original article here.