I found another post today from the blogger who decided to clone Wikipedia and index it with Solr. This time he's got a short commentary that can serve as useful advice to you search indexers out there.
Solr has been running out of heap memory while trying to add a
*small* number of documents to my 11,000,000 document Wikipedia index.
So, diving bravely into the world of Java heap memory …
Because I am indexing diverse types of data (Wikipedia and Nutch to
begin with), I have a lot of fields: I count 31 without omitNorms
values, which is false by default.
11,000,000 * 1 * 31 ≈ 31 x 10M = 310MB RAM all by itself. So time to fire up the schema editor!
-- Fred Zimmerman
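The arithmetic in the quote can be checked with a short sketch. It assumes, as the quote does, that norms cost one byte per document for each field that has them enabled; the function name and its parameters are hypothetical, chosen for illustration only. The quote rounds 11,000,000 documents down to 10M, so the unrounded figure comes out somewhat higher than 310MB.

```python
def norms_heap_bytes(num_docs: int, fields_with_norms: int, bytes_per_norm: int = 1) -> int:
    """Rough estimate of heap used by norms.

    Assumes one norm value per document for every field that keeps norms,
    which is the model used in the quoted post (hypothetical helper).
    """
    return num_docs * fields_with_norms * bytes_per_norm


if __name__ == "__main__":
    # Figures taken from the quoted post: 11,000,000 documents, 31 fields with norms.
    total = norms_heap_bytes(11_000_000, 31)
    print(f"{total:,} bytes, about {total / 10**6:.0f} MB")

    # Hypothetical scenario: norms kept on only 5 fields after editing the schema.
    trimmed = norms_heap_bytes(11_000_000, 5)
    print(f"{trimmed:,} bytes after trimming, about {trimmed / 10**6:.0f} MB")
```

Under this model, each field switched to `omitNorms="true"` in the schema saves roughly one byte per document, or about 11MB for this index. This is only an estimate of the norms themselves, not of Solr's total heap use.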
I would also check out Fred's posts on Indexing Nutch Crawls and DataImportHandler Commands.