
Do I have to look for maxBooleanClauses when using filters?


One of the configuration variables we can find in the solrconfig.xml file is maxBooleanClauses, which specifies the maximum number of boolean clauses that can be combined in a single query. The question is, do I have to worry about it when using filters in Solr? Let’s try to answer that question without getting into Lucene and Solr source code.

Query

Let’s assume that we have the following query we want to change:

 
q=category:1 AND category:2 AND category:3 ... AND category:2000

Sending such a query to a Solr instance with the default configuration would result in the following exception: “too many boolean clauses”. Of course we could raise the maxBooleanClauses value in the solrconfig.xml file and get rid of the exception, but let’s try to do it a different way.
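To make the problem concrete, here is a minimal Python sketch that builds the query above and counts its clauses. The limit of 1024 is an assumption based on the value commonly shipped in solrconfig.xml (check your own file), and the helper names build_and_query and count_clauses are made up for this illustration.

```python
# Assumed limit; the real value comes from <maxBooleanClauses> in solrconfig.xml.
MAX_BOOLEAN_CLAUSES = 1024


def build_and_query(field, values):
    """Join one clause per value with AND, as in the query above."""
    return " AND ".join(f"{field}:{value}" for value in values)


def count_clauses(query):
    """Count the clauses of a flat AND query (nested groups are not handled)."""
    return len(query.split(" AND "))


query = build_and_query("category", range(1, 2001))
clauses = count_clauses(query)

# 2000 clauses in one boolean query is well above the assumed limit of 1024.
print(clauses, clauses > MAX_BOOLEAN_CLAUSES)
```

The count here is only a rough client-side check; Solr itself decides when the limit is exceeded.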

Let’s change the query to use filters

So, we change the above query to use filters – the fq parameter:

 
q=*:*&fq=category:(1 2 3 ... 2000)

We send the query to Solr and the same thing happens again: an exception with the “too many boolean clauses” message. It happens because Solr has to “calculate” the filter content, and to do that it runs the filter as a query, which here is still a single boolean query with 2000 clauses. So, let’s modify the query once again:

Final query change

After the final modification our query should look like this:

 
q=*:*&fq=category:1&fq=category:2&fq=category:3&....&fq=category:2000

After sending such a query we will get the search results (of course if there are documents matching the query in the index). This time Solr didn’t have to run a single, large boolean query. Each fq parameter is run as its own small, one-clause query, and that’s why we didn’t exceed the maxBooleanClauses limit.
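Nobody wants to type 2000 fq parameters by hand, so here is a small Python sketch of how the final query string could be generated. It only builds the parameters and does not talk to Solr; build_filter_params is a hypothetical helper name, and the field and value range are taken from the example above.

```python
from urllib.parse import parse_qs, urlencode


def build_filter_params(field, values):
    """Return q=*:* plus one separate fq parameter per value."""
    params = [("q", "*:*")]
    params.extend(("fq", f"{field}:{value}") for value in values)
    return params


params = build_filter_params("category", range(1, 2001))

# urlencode accepts a list of pairs, so the repeated fq key is preserved.
query_string = urlencode(params)

# Decode it again to confirm that every filter is its own fq parameter.
decoded = parse_qs(query_string)
print(len(decoded["fq"]), decoded["q"])
```

The resulting query_string can be appended to the select URL of your Solr core; the exact URL depends on your installation.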

To sum up

As you can see, the answer to the question asked at the beginning of the post depends on the query we want to run. If our query uses the AND boolean operator, we can use the fq parameter, because multiple fq parameters are combined using AND. However, if we have to use OR, we would have to change the limit defined by the maxBooleanClauses configuration variable. But please remember that raising this limit can have a negative effect on performance and memory usage.
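Why does the trick work for AND but not for OR? The toy Python model below may help. It is not how Solr is implemented; it only mimics the idea that every fq narrows the result set further, using a made-up three-document index.

```python
# Toy index with made-up data: document id -> categories it belongs to.
docs = {
    "doc1": {1, 2, 3},
    "doc2": {1, 2},
    "doc3": {3},
}


def matching(category):
    """Ids of the documents that have the given category."""
    return {doc_id for doc_id, cats in docs.items() if category in cats}


def apply_filters(categories):
    """Intersect one result set per filter, like separate fq parameters."""
    result = set(docs)
    for category in categories:
        result &= matching(category)
    return result


def any_of(categories):
    """Union of the result sets, which is what an OR query would return."""
    result = set()
    for category in categories:
        result |= matching(category)
    return result


and_result = apply_filters([1, 2, 3])
or_result = any_of([1, 2, 3])
print(sorted(and_result), sorted(or_result))
```

Stacking filters can only shrink the result, so it can express AND but never the union that OR needs. That is why the OR case still ends up as a single boolean query.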



Source:   http://solr.pl/en/2011/12/19/do-i-have-to-look-for-maxbooleanclauses-when-using-filters

