Fixing Solr Java Heap OOM with “OmitNorms=true”

I found another post today from the blogger who decided to clone Wikipedia and index it with Solr. This time he has a short commentary that can serve as useful advice to you search indexers out there.

Solr has been running out of heap memory while trying to add a *small* number of documents to my 11,000,000 document Wikipedia index. So, diving bravely into the world of Java heap memory …

Because I am indexing diverse types of data (Wikipedia and Nutch to begin with), I have a lot of fields: I count 31 that do not set omitNorms, which is false by default, so each of them keeps a one-byte norm for every document.

11,000,000 * 1 * 31 ≈ 31 x 10M = 310MB of RAM for norms all by itself (341MB if the document count is not rounded down to 10M). So time to fire up the schema editor!  -- Fred Zimmerman
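
The estimate in the quote can be reproduced with a few lines of Python. This is only a rough sketch: the function name `norms_heap_bytes` is made up for illustration, and it assumes one byte per document per field with norms enabled, which is the factor of 1 in the quoted arithmetic. It ignores every other consumer of heap.

```python
def norms_heap_bytes(num_docs: int, fields_with_norms: int, bytes_per_norm: int = 1) -> int:
    """Rough heap cost of norms: one value per document for every field that keeps norms.

    bytes_per_norm=1 is an assumption taken from the quoted arithmetic.
    """
    return num_docs * fields_with_norms * bytes_per_norm


if __name__ == "__main__":
    exact = norms_heap_bytes(11_000_000, 31)    # full document count
    rounded = norms_heap_bytes(10_000_000, 31)  # document count rounded to 10M, as in the quote
    print(f"exact:   {exact:,} bytes (~{exact / 1e6:.0f} MB)")
    print(f"rounded: {rounded:,} bytes (~{rounded / 1e6:.0f} MB)")
```

With the full 11,000,000 documents the product is 341,000,000 bytes. Rounding the count to 10M gives the quoted figure of roughly 310 MB. Either way, setting omitNorms to true on fields that do not need norms removes those fields from the count.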

I would also check out Fred's posts on Indexing Nutch Crawls and DataImportHandler Commands.

Source:  http://business.zimzaz.com/wordpress/2011/10/fixing-solr-heap-oom-with-omitnormstrue/