
Fixing Solr Java Heap OOM with “OmitNorms=true”


I found another post today from the blogger who decided to clone Wikipedia and index it with Solr. This time he has a short commentary that can serve as useful advice to you search indexers out there.

Solr has been running out of heap memory while trying to add a *small* number of documents to my 11,000,000 document Wikipedia index. So, diving bravely into the world of Java heap memory …

Because I am indexing diverse types of data (Wikipedia and Nutch to begin with), I have a lot of fields: I count 31 without omitNorms values, which is false by default.

11,000,000 * 1 * 31 = 31 x 10M = 310MB RAM all by itself. So time to fire up the schema editor!  -- Fred Zimmerman
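The arithmetic in the quote can be checked with a small sketch. It assumes, as the quote's "* 1" factor does, that each field with norms enabled costs one byte per document; the class and method names here are hypothetical and are not part of Solr or Lucene.

```java
class NormsMemoryEstimate {

    // Assumption: one norm byte per document per field that keeps norms,
    // matching the "* 1" factor in the quoted calculation.
    static long estimateNormsBytes(long documentCount, int fieldsWithNorms, int bytesPerNorm) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping silently
        long perField = Math.multiplyExact(documentCount, (long) bytesPerNorm);
        return Math.multiplyExact(perField, (long) fieldsWithNorms);
    }

    public static void main(String[] args) {
        // Figures taken from the quote: 11,000,000 documents, 31 fields with norms
        long bytes = estimateNormsBytes(11_000_000L, 31, 1);
        System.out.printf("norms: %d bytes (about %.0f MB)%n", bytes, bytes / 1_000_000.0);
    }
}
```

Without rounding 11M down to 10M, the estimate comes to 341,000,000 bytes, or roughly 341 MB, slightly more than the 310MB quoted. The fix named in the title is to set `omitNorms="true"` on the relevant field definitions in the Solr schema and then reindex. Keep in mind that omitting norms also turns off length normalization and index-time boosts for those fields, so it fits best on fields where neither matters for scoring.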


I would also check out Fred's posts on Indexing Nutch Crawls and DataImportHandler Commands.

Source:  http://business.zimzaz.com/wordpress/2011/10/fixing-solr-heap-oom-with-omitnormstrue/


