Lucene.net: Your First Application

by Simone Chiaretta · Sep. 14, 09 · News

In the first two posts of this tutorial you learnt how to get the latest version of Lucene.net, where to find the (little) documentation available, what the main concepts of Lucene.net are, and the main steps in developing with it.

In this third post I’m going to put into practice all the concepts explained in the previous post by writing a simple console application that indexes the text entered in the console.

I’ll refer to the steps I outlined in my previous post, so if you haven’t already, I recommend you go back and read it.

Step 1 – Initialize the Directory and the IndexWriter

As I said in my previous post, there are two Directory implementations you can use: one based on the file system and one based on RAM. You’d usually want the file-system-based one: it’s pretty fast anyway, and you don’t need to constantly dump it to the file system yourself. The RAM-based one is probably more of a test fake than something to use for real in production.

Once you have instantiated the Directory, you have to open an IndexWriter on it.

Directory directory = FSDirectory.GetDirectory("LuceneIndex");
Analyzer analyzer = new StandardAnalyzer();
IndexWriter writer = new IndexWriter(directory, analyzer);

If you don’t need a reference to the Directory (you don’t want to call additional methods on it) and a file-system-based FSDirectory is all you want, you can use the short version and create the IndexWriter with just one line of code.

IndexWriter writer = new IndexWriter("LuceneIndex", analyzer);
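
The choice between the two Directory types can be isolated in one small helper, which makes it easy to swap in the RAM-based one for tests. This is only a sketch against the Lucene.net 2.x API used in this post; the IndexFactory class, its OpenDirectory method and the inMemory flag are names made up for illustration.

```csharp
using Lucene.Net.Store;
// Alias avoids a clash with System.IO.Directory.
using LuceneDirectory = Lucene.Net.Store.Directory;

// Hypothetical helper, not part of Lucene.net.
public static class IndexFactory
{
    public static LuceneDirectory OpenDirectory(bool inMemory)
    {
        if (inMemory)
        {
            // Lives only as long as the process: handy for unit tests.
            return new RAMDirectory();
        }
        // Same relative path as in the snippets above.
        return FSDirectory.GetDirectory("LuceneIndex");
    }
}
```

The rest of the code only sees the Directory base type, so it does not need to know which of the two it was given.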

Step 2 – Add Documents to the index

I’ll cover this topic in more depth in a subsequent post, but the basic code for adding a document to the index is pretty straightforward: create a document, add some fields to it, and then add the document to the index.

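Document doc = new Document();
// "id" is stored so it can be read back from a result, but it is not searchable (Index.NO)
doc.Add(new Field("id", i.ToString(), Field.Store.YES, Field.Index.NO));
// "postBody" is stored and also tokenized by the analyzer, so it is searchable
doc.Add(new Field("postBody", text, Field.Store.YES, Field.Index.TOKENIZED));
writer.AddDocument(doc);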

When you are done adding all the documents you need, you might call the Optimize method (“priming the index for the fastest available search”). After that, either call Flush to commit all the updates to the Directory or, if you don’t need to add to the index any more, call the Close method to flush and then close all the files in the Directory.

writer.Optimize();
// Close the writer; Close() also flushes pending updates, so no separate Flush() call is needed
writer.Close();
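
Putting steps 1 and 2 together, a minimal sketch of the indexing phase might look like the following. It assumes the Lucene.net 2.x API shown above; the Indexer class and the IndexTexts method are hypothetical names, and the try/finally is an addition so that the writer is closed even if adding a document throws.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using LuceneDirectory = Lucene.Net.Store.Directory;

// Hypothetical helper, not part of Lucene.net.
public static class Indexer
{
    public static void IndexTexts(LuceneDirectory directory, IList<string> texts)
    {
        // 'true' creates a new index, replacing whatever the Directory held before.
        IndexWriter writer = new IndexWriter(directory, new StandardAnalyzer(), true);
        try
        {
            for (int i = 0; i < texts.Count; i++)
            {
                Document doc = new Document();
                doc.Add(new Field("id", i.ToString(), Field.Store.YES, Field.Index.NO));
                doc.Add(new Field("postBody", texts[i], Field.Store.YES, Field.Index.TOKENIZED));
                writer.AddDocument(doc);
            }
            writer.Optimize();
        }
        finally
        {
            // Flushes pending updates and releases the index files.
            writer.Close();
        }
    }
}
```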

Step 3 – Create the Query

The Query can be created either via the API or by parsing Lucene query syntax with the QueryParser.

QueryParser parser = new QueryParser("postBody", analyzer);
Query query = parser.Parse("text");

or

Query query = new TermQuery(new Term("postBody", "text"));

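For this simple single-term search the two snippets are functionally the same (although the QueryParser also runs the search string through the analyzer, which the API does not do), so when is it good to use the API and when to use the QueryParser? I personally would use the QueryParser when the search string is supplied by the user, and I’d use the API directly when the query is generated by your code.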
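
The difference between the two approaches becomes visible as soon as the search string is not already in the form the analyzer produces. The sketch below, assuming the Lucene.net 2.x API and the StandardAnalyzer used above, builds the same search both ways; QueryComparison and its two methods are hypothetical names.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

// Hypothetical helper, not part of Lucene.net.
public static class QueryComparison
{
    // The parser passes the input through the analyzer, which lowercases it.
    public static Query FromParser(string input)
    {
        QueryParser parser = new QueryParser("postBody", new StandardAnalyzer());
        return parser.Parse(input);
    }

    // The API takes the term verbatim: no analysis, no query syntax.
    public static Query FromApi(string input)
    {
        return new TermQuery(new Term("postBody", input));
    }
}
```

With the input "Lucene", FromParser yields a query on the term lucene, while FromApi looks for the exact term Lucene, which an index built with the StandardAnalyzer will not contain.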

Step 4 – Pass the Query to the IndexSearcher

Once you have your Query, all you need to do is pass it to the Search method of the IndexSearcher.

//Setup searcher
IndexSearcher searcher = new IndexSearcher(directory);
//Do the search
Hits hits = searcher.Search(query);

The Searcher must be instantiated before use and, for performance reasons, it’s recommended that only one Searcher be open. So open one and use it in all your searches. This might pose some issues in multi-threaded environments (like web applications), but we’ll come to this topic in a future post.
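
One way to follow the one-searcher advice is to hide the instance behind a small holder class. This is a sketch only, assuming the Lucene.net 2.x API; SearcherHolder is a hypothetical name, and the lock merely guards the creation of the instance: it does not settle the wider threading questions left for the future post.

```csharp
using Lucene.Net.Search;
using LuceneDirectory = Lucene.Net.Store.Directory;

// Hypothetical helper, not part of Lucene.net.
public static class SearcherHolder
{
    private static readonly object Sync = new object();
    private static IndexSearcher searcher;

    // Returns the shared searcher, opening it on first use.
    // The directory argument is only used by the call that creates the searcher.
    public static IndexSearcher Get(LuceneDirectory directory)
    {
        lock (Sync)
        {
            if (searcher == null)
            {
                searcher = new IndexSearcher(directory);
            }
            return searcher;
        }
    }
}
```

Keep in mind that a searcher only sees the index as it was when the searcher was opened, so documents added afterwards show up only once a new searcher is created.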

Step 5 – Iterate over the Results

The Search method returns a Hits object, which contains all the documents returned by the query. To list the results, just loop through them.

int results = hits.Length();
Console.WriteLine("Found {0} results", results);
for (int i = 0; i < results; i++)
{
    Document doc = hits.Doc(i);
    float score = hits.Score(i);
    Console.WriteLine("Result num {0}, score {1}", i + 1, score);
    Console.WriteLine("ID: {0}", doc.Get("id"));
    Console.WriteLine("Text found: {0}" + Environment.NewLine, doc.Get("postBody"));
}

You get the current Document using the Doc(num) method, and its score (a float) using the Score(num) method. You might notice that this is a pretty strange API compared to what we are used to in .NET: I would have expected to do a foreach over the returned Hits object. This is probably due to the API being a class-per-class port of the Java version, so it uses the API design conventions that are typical of the Java world. We can debate this purist-port approach vs a more idiomatic one for ages, but that’s the way it is.
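
If you prefer the foreach style, a thin wrapper is enough. The sketch below assumes the Lucene.net 2.x Hits class used above and C# 3 extension methods; HitsExtensions and Enumerate are made-up names, not part of Lucene.net.

```csharp
using System.Collections.Generic;
using Lucene.Net.Documents;
using Lucene.Net.Search;

// Hypothetical helper, not part of Lucene.net.
public static class HitsExtensions
{
    // Yields each hit as a (document, score) pair, in the order Hits returns them.
    public static IEnumerable<KeyValuePair<Document, float>> Enumerate(this Hits hits)
    {
        int count = hits.Length();
        for (int i = 0; i < count; i++)
        {
            yield return new KeyValuePair<Document, float>(hits.Doc(i), hits.Score(i));
        }
    }
}
```

With it, the loop can be written as foreach (var hit in hits.Enumerate()), where hit.Key is the Document and hit.Value is the score. The enumeration still has to happen while the searcher is open.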

Step 6 – Close everything

Once you are done with everything, you need to close all the resources: Directory and IndexSearcher.

searcher.Close();
directory.Close();
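
To make sure the cleanup happens even when the search throws, the close calls can go in a finally block. This is a minimal sketch suited to a one-shot console run, assuming the Lucene.net 2.x API; SearchRunner and CountMatches are hypothetical names.

```csharp
using Lucene.Net.Search;
using LuceneDirectory = Lucene.Net.Store.Directory;

// Hypothetical helper, not part of Lucene.net.
public static class SearchRunner
{
    // Runs one search and returns the number of hits, closing everything afterwards.
    public static int CountMatches(LuceneDirectory directory, Query query)
    {
        IndexSearcher searcher = null;
        try
        {
            searcher = new IndexSearcher(directory);
            // Read what you need from Hits before the searcher is closed.
            return searcher.Search(query).Length();
        }
        finally
        {
            if (searcher != null)
            {
                searcher.Close();
            }
            directory.Close();
        }
    }
}
```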

Get the code

You can download a short sample application that stitches together all that code into a console application that lets you index any text you enter, and later search for it.

Download the sample code

What’s next

This was a very simple application: it was single-threaded and had both the indexing and searching phases in the same piece of code. But before going into the details of the implementation I’m doing for Subtext, in the next post I’ll cover the concept of document and fields more in depth.

Published at DZone with permission of Simone Chiaretta, DZone MVB. See the original article here.
