Solr Index – delete or update?

by Rafał Kuć · Aug. 03, 11 · News

From time to time, when working with Solr, the same problem comes up: the structure of the Solr index has to change. The reason may be new functional requirements, optimization, or anything else; it does not matter. What matters is the question that follows: should we delete the index, or simply change the structure and run a full re-indexing? Contrary to appearances, the answer depends on what exactly we changed in the structure of the index.

Personally, I am an advocate of solutions that have the smallest chance of causing problems – I just like to sleep at night. In my opinion, deleting the index after updating its structure and then running a full indexing of the data is one of those solutions. I am aware, however, that this approach is not always acceptable. So when can we get away without deleting the index, and when does skipping the deletion expose us to potential problems with Solr?
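
As a minimal sketch of the "delete everything, then re-index" approach, the snippet below builds the XML update messages that Solr's update handler accepts for a delete-by-query and a commit. The URL is a hypothetical local endpoint, and nothing is actually sent; whether a delete-by-query is enough in your setup, or the index directory should be removed while Solr is stopped, is something to verify for your version.

```python
from xml.etree import ElementTree as ET

# Hypothetical endpoint (assumption): adjust host, port, and path to your deployment.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def delete_all_message() -> str:
    """Build the XML update message that deletes every document (query *:*)."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "query").text = "*:*"
    return ET.tostring(delete, encoding="unicode")


def commit_message() -> str:
    """Build the XML update message that commits the pending changes."""
    return ET.tostring(ET.Element("commit"), encoding="unicode")


def wipe_plan() -> list:
    """Return (url, body) pairs in the order they would be POSTed."""
    return [
        (SOLR_UPDATE_URL, delete_all_message()),
        (SOLR_UPDATE_URL, commit_message()),
    ]


for url, body in wipe_plan():
    print(url, body)
```

Each body would be POSTed with an XML content type using whatever HTTP client you prefer, and the full indexing would start only after the commit has succeeded.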

Most changes that we make to the structure of the index fall into one of three groups:

  • Adding / removing a field
  • Similarity modification
  • Field modification


Adding / removing a field

The first type of modification is quite simple: if we add a field to schema.xml, or remove one from it, there is no need to delete the entire index before re-indexing. Solr can handle a new field being added to an existing index. Be aware, though, that documents already in the index are not re-indexed or updated automatically, so they will not contain the new field until they are indexed again.
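
The effect on existing documents can be illustrated with a toy model. This is not Solr code: the "index" is a plain dictionary, and `index_document`, the record contents, and the field names are all made up for the illustration.

```python
# Toy model (assumption): the index is a dict of doc_id -> stored fields.
index = {}


def index_document(doc_id, source_record, schema_fields):
    """Store only the fields the schema knows about at indexing time."""
    index[doc_id] = {
        name: source_record[name]
        for name in schema_fields
        if name in source_record
    }


records = {
    1: {"title": "Solr basics", "category": "search"},
    2: {"title": "Lucene internals", "category": "search"},
}

# Initial indexing: the schema only defines "title".
schema_fields = ["title"]
for doc_id, record in records.items():
    index_document(doc_id, record, schema_fields)

# The schema gains a new field, and the index is kept as it is.
schema_fields.append("category")

# Only document 2 is sent to the index again.
index_document(2, records[2], schema_fields)

# Documents that were not re-indexed still lack the new field.
missing = [doc_id for doc_id, doc in index.items() if "category" not in doc]
print(missing)
```

In this model document 1 stays without `category` until it is indexed again, which is the situation the paragraph above warns about.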

Similarity modification

In the second case, changing the class responsible for Similarity, we are also not forced to delete the index after the change. Unlike in the previous case, however, if we want Solr to calculate the score correctly, and thus sort results in the correct order, we have to re-index all documents that were already in the index.
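
One way to see why re-indexing is needed: part of what a Similarity computes, such as the length norm, is calculated when a document is indexed and then stored. The sketch below is a toy model under that assumption. `ToyIndex` is hypothetical; `default_length_norm` follows the 1/sqrt(number of terms) formula of Lucene's classic default similarity, and `flat_length_norm` stands in for a replacement class.

```python
import math


def default_length_norm(num_terms: int) -> float:
    """Length norm in the style of Lucene's classic default similarity."""
    return 1.0 / math.sqrt(num_terms)


def flat_length_norm(num_terms: int) -> float:
    """Hypothetical replacement similarity that ignores field length."""
    return 1.0


class ToyIndex:
    """Toy index (assumption): norms are computed once, at indexing time."""

    def __init__(self, length_norm):
        self.length_norm = length_norm
        self.stored_norms = {}

    def add(self, doc_id, text):
        self.stored_norms[doc_id] = self.length_norm(len(text.split()))


docs = {"a": "solr index delete update", "b": "solr"}

idx = ToyIndex(default_length_norm)
for doc_id, text in docs.items():
    idx.add(doc_id, text)
norms_before = dict(idx.stored_norms)

# The similarity class is swapped, but nothing is re-indexed yet.
idx.length_norm = flat_length_norm
norms_after_change = dict(idx.stored_norms)

# Full re-indexing recomputes every stored norm with the new similarity.
for doc_id, text in docs.items():
    idx.add(doc_id, text)
norms_after_reindex = dict(idx.stored_norms)

print(norms_before, norms_after_change, norms_after_reindex)
```

Until the full re-indexing runs, the stored values still reflect the old similarity, so scores mix old and new behaviour.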

Field modification

Let's stop for a minute on the third case. Suppose that we slightly modify a field in the index for a prosaic reason: we are no longer interested in the normalization of its length. We set omitNorms="true" (I assume that the previous setting was omitNorms="false"). If we now re-index all the documents without deleting the index, the Lucene index will, once its segments are merged, still contain length normalization information for the field. Something went wrong. This is precisely the case when it is necessary to delete the index after changing its structure and before the full indexing. At first glance this seems to be a very small change, but on closer inspection it has side effects. It is worth remembering that some field properties are overridden by others, as in the case of length normalization: if one segment has length normalization for the field and another does not, the segment created by merging them will have length normalization.
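
The merge behaviour described above can be modelled in a few lines. This is a sketch of the rule as the article states it, not Lucene's actual merge code; `Segment`, `merge`, and the segment names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Toy segment: records only whether the field stores length norms."""
    name: str
    field_has_norms: bool


def merge(segments, merged_name="merged"):
    """Merged segment keeps norms if any source segment stored them
    (the rule described in the article, modelled as an assumption)."""
    return Segment(merged_name, any(s.field_has_norms for s in segments))


# Segment written before the change (omitNorms="false").
old_segment = Segment("_0", field_has_norms=True)
# Segment written after the change (omitNorms="true").
new_segment = Segment("_1", field_has_norms=False)

# Index kept: old and new segments are merged, and norms survive.
kept_index = merge([old_segment, new_segment])

# Index deleted first: only new segments exist, so norms are gone.
fresh_index = merge([new_segment])

print(kept_index.field_has_norms, fresh_index.field_has_norms)
```

Under this model, the only way to end up with a field that has no norms is to make sure no segment written under the old setting takes part in a merge, which is what deleting the index before the full indexing achieves.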

Published at DZone with permission of Rafał Kuć, DZone MVB. See the original article here.
