
Getting Real-Time Field Values in Lucene

by Michael Mccandless · Jan. 28, 13

We know Lucene's near-real-time search is very fast: you can easily refresh your searcher once per second, even at high indexing rates, so that any change to the index is available for searching or faceting at most one second later. For most applications this is plenty fast. 

But what if you sometimes need even better than near-real-time? What if you need to look up truly live or real-time values, so for any document id you can retrieve the very last value indexed? 

Just use the newly committed LiveFieldValues class!

It's simple to use: when you instantiate it you provide it with your SearcherManager or NRTManager, so that it can subscribe to the RefreshListener to be notified when new searchers are opened, and then whenever you add, update or delete a document, you notify the LiveFieldValues instance. Finally, call the get method to get the last indexed value for a given document id.

This class is simple inside: it holds the values of recently indexed documents in a ConcurrentHashMap, keyed by the document id, covering documents that were just indexed but are not yet visible through the near-real-time searcher. Whenever a new near-real-time searcher is successfully opened, it clears the map of all entries that are now included in that searcher. It carefully handles the transition time from when the reopen started to when it finished by checking two maps for the possible value, and failing that, it falls back to the current searcher.
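To make the two-map handoff concrete, here is a minimal, Lucene-free sketch of the idea. It is not the actual LiveFieldValues source: the class name LiveValueCacheSketch, the lookupFromSnapshot method, and the use of a "missing value" sentinel to record deletes are all assumptions made for illustration, and a plain HashMap stands in for the searcher.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of the two-map scheme; not Lucene's code.
abstract class LiveValueCacheSketch<T> {
    // Values added since the most recent refresh began.
    private volatile Map<String, T> current = new ConcurrentHashMap<>();
    // Values that were pending when the in-flight refresh began.
    private volatile Map<String, T> old = new ConcurrentHashMap<>();
    // Sentinel stored for deleted ids (assumption: never used as a real value).
    private final T missingValue;

    LiveValueCacheSketch(T missingValue) {
        this.missingValue = missingValue;
    }

    // Call just before a refresh starts: pending values move to "old".
    void beforeRefresh() {
        old = current;
        current = new ConcurrentHashMap<>();
    }

    // Call once the refreshed snapshot is visible: "old" is now redundant.
    void afterRefresh() {
        old = new ConcurrentHashMap<>();
    }

    void add(String id, T value) {
        current.put(id, value);
    }

    void delete(String id) {
        current.put(id, missingValue);
    }

    // Check the newest map, then the older one, then the snapshot.
    T get(String id) {
        T v = current.get(id);
        if (v == null) {
            v = old.get(id);
        }
        if (v == missingValue) {
            return null; // deleted, but the snapshot may not know yet
        }
        if (v != null) {
            return v;
        }
        return lookupFromSnapshot(id);
    }

    // Stand-in for looking the value up in the current searcher.
    protected abstract T lookupFromSnapshot(String id);

    public static void main(String[] args) {
        Map<String, Long> snapshot = new HashMap<>(); // plays the searcher
        LiveValueCacheSketch<Long> live = new LiveValueCacheSketch<Long>(-1L) {
            @Override
            protected Long lookupFromSnapshot(String id) {
                return snapshot.get(id);
            }
        };

        live.add("doc1", 7L);
        System.out.println(live.get("doc1")); // 7, served from the current map

        live.beforeRefresh();
        System.out.println(live.get("doc1")); // 7, served from the old map

        snapshot.put("doc1", 7L);             // the refresh makes it visible
        live.afterRefresh();
        System.out.println(live.get("doc1")); // 7, served from the snapshot

        live.delete("doc1");
        System.out.println(live.get("doc1")); // null, delete is seen live
    }
}
```

Swapping whole maps at the refresh boundaries, rather than removing entries one by one, is what lets a lookup during the reopen still find a value in one of the two maps.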

LiveFieldValues is abstract: you must subclass it and implement the lookupFromSearcher method to retrieve a document's value from an IndexSearcher, since how your application stores the values in the searcher is application dependent (stored fields, doc values or even postings, payloads or term vectors).

Note that this class only offers "live get", i.e. you can get the last indexed value for any document, but it does not offer "live search", i.e. you cannot search against the value until the searcher is reopened. Also, the internal maps are only pruned after a new searcher is opened, so RAM usage will grow unbounded if you never reopen! It's up to your application to ensure that the same document id is never updated simultaneously (in different threads) because in that case you cannot know which update "won" (Lucene does not expose this information, although LUCENE-3424 is one possible solution for this).

An example use-case is to store a version field per document so that you know the last version indexed for a given id; you can then use this to reject a later but out-of-order update for that same document whose version is older than the version already indexed. 
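The version check itself might look like the sketch below. It is only an illustration under assumptions: VersionGateSketch and tryUpdate are hypothetical names, a ConcurrentHashMap stands in for the live lookup, and the place where a real application would index the document and notify LiveFieldValues is marked with a comment. As noted above, it also assumes the application never updates the same id from two threads at once.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of rejecting out-of-order updates by version.
class VersionGateSketch {
    // Stand-in for the "live get" of the last indexed version per id.
    private final Map<String, Long> lastIndexedVersion = new ConcurrentHashMap<>();

    // Returns true if the update was applied, false if it was stale.
    // Assumes callers serialize updates for any single id.
    boolean tryUpdate(String id, long version) {
        Long indexed = lastIndexedVersion.get(id);
        if (indexed != null && version < indexed) {
            return false; // older than what is already indexed: reject
        }
        // A real application would index the document here and then
        // notify its live-values instance of the new version.
        lastIndexedVersion.put(id, version);
        return true;
    }

    public static void main(String[] args) {
        VersionGateSketch gate = new VersionGateSketch();
        System.out.println(gate.tryUpdate("doc1", 2)); // true
        System.out.println(gate.tryUpdate("doc1", 1)); // false, arrived late
        System.out.println(gate.tryUpdate("doc1", 3)); // true
    }
}
```

This sketch accepts an update whose version equals the indexed one, treating it as a harmless re-index; an application that wants strictly increasing versions would change the comparison to `<=`.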

LiveFieldValues will be available in the next Lucene release (4.2).

Published at DZone with permission of Michael Mccandless, DZone MVB. See the original article here.
