Solr has been running out of heap memory while trying to add a *small* number of documents to my 11,000,000 document Wikipedia index. So, diving bravely into the world of Java heap memory …
Because I am indexing diverse types of data (Wikipedia and Nutch to begin with), I have a lot of fields: I count 31 without omitNorms set. Since omitNorms defaults to false, Solr stores a one-byte norm per field per document.
11,000,000 docs × 31 fields × 1 byte ≈ 341 MB of RAM for norms alone. So time to fire up the schema editor! -- Fred Zimmerman
I would also check out Fred's posts on Indexing Nutch Crawls and DataImportHandler Commands.
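To trim that norms overhead, fields that don't need length normalization or index-time boosts can set omitNorms explicitly in schema.xml. A minimal sketch (the field name and type here are illustrative, not from the actual Wikipedia schema):

```xml
<!-- Hypothetical example field: omitNorms="true" skips the
     per-document norm byte for this field, at the cost of
     losing length normalization and index-time field boosts. -->
<field name="wiki_title" type="text" indexed="true" stored="true" omitNorms="true"/>
```

This is generally safe for fields used only for filtering or exact matching, where relevance scoring by field length doesn't matter.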