
Solr Cloud Performance - Windows Server 2012 R2


Hi all,

I would like to share with you the results and conclusions from our last Solr Cloud 4.8.1 performance tests at NICE.

Queries were generated randomly with JMeter and contained the following parameters:

  • Main query of 2-3 random words
  • Filter query on a multi-value field with 6-10 terms
  • 70% of the queries contained an additional simple filter on a string field
  • Highlight request with 50 snippets and fragsize of 50
  • 7 return fields - no textual data 
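To make the query shape concrete, here is a minimal sketch of how one such random request could be assembled. The tests themselves used JMeter, so this is not the actual test plan. The field names (`body`, `tags`, `status`), the vocabularies, and the choice to OR the filter terms together are all assumptions for illustration; only the Solr parameter names (`q`, `fq`, `fl`, `hl`, `hl.fl`, `hl.snippets`, `hl.fragsize`) are standard.

```python
import random
from urllib.parse import urlencode

# Hypothetical vocabularies and field names; the article does not list them.
WORDS = ["network", "billing", "upgrade", "cancel", "refund", "agent", "delay", "invoice"]
TAGS = [f"tag{i}" for i in range(20)]
STATUSES = ["open", "closed", "pending"]
RETURN_FIELDS = ["id", "num_0", "num_1", "num_2", "num_3", "flag", "created"]


def build_query_params(rng):
    """Build the request parameters for one random query."""
    main_terms = rng.sample(WORDS, rng.randint(2, 3))    # main query of 2-3 random words
    filter_terms = rng.sample(TAGS, rng.randint(6, 10))  # 6-10 terms on a multi-value field
    params = [
        ("q", " ".join(main_terms)),
        ("fq", "tags:(" + " OR ".join(filter_terms) + ")"),  # OR is an assumption
        ("fl", ",".join(RETURN_FIELDS)),                     # 7 return fields, no text
        ("hl", "true"),
        ("hl.fl", "body"),
        ("hl.snippets", "50"),
        ("hl.fragsize", "50"),
    ]
    if rng.random() < 0.7:  # roughly 70% of queries get an extra simple string filter
        params.append(("fq", "status:" + rng.choice(STATUSES)))
    return params


if __name__ == "__main__":
    print(urlencode(build_query_params(random.Random(42))))
```

The printed string can be appended to a `/select?` URL of a collection whose schema has matching fields.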

Document structure was:

  • 3 stored multi-value string fields with ~6 values
  • 8 numeric stored fields
  • 2 stored textual fields with ~400 words
  • 3 simple stored strings
  • 1 boolean field
  • 1 date field
  • All fields were indexed
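As an illustration of that structure, the following sketch generates one synthetic document with the same field counts. Every field name here is made up for the example, the vocabulary is random filler, and using the document id as one of the three simple strings is an assumption.

```python
import random
from datetime import datetime, timedelta, timezone

# Hypothetical vocabulary used to fill string and text fields.
VOCAB = [f"word{i}" for i in range(1000)]


def make_document(rng, doc_id):
    """Return a dict with the field mix described above (names are invented)."""
    doc = {"id": str(doc_id)}                       # simple string 1 of 3
    for i in range(3):                              # 3 multi-value string fields, ~6 values
        doc[f"multi_{i}"] = rng.sample(VOCAB, 6)
    for i in range(8):                              # 8 numeric fields
        doc[f"num_{i}"] = rng.randint(0, 10_000)
    for i in range(2):                              # 2 textual fields, ~400 words each
        doc[f"text_{i}"] = " ".join(rng.choices(VOCAB, k=400))
    for i in range(1, 3):                           # simple strings 2 and 3
        doc[f"str_{i}"] = rng.choice(VOCAB)
    doc["flag"] = rng.random() < 0.5                # 1 boolean field
    day = datetime(2014, 1, 1, tzinfo=timezone.utc) + timedelta(days=rng.randint(0, 364))
    doc["created"] = day.strftime("%Y-%m-%dT%H:%M:%SZ")  # 1 date field, Solr date format
    return doc
```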


General information:

  • Single machine: Xeon-2650 @ 2.00 GHz, 192 GB DDR3 RAM, 16 cores (32 logical processors), 8 Solr 4.8.1 instances on Jetty, 5x15K hard drives in RAID-0.
  • Total index size is 1.91 TB, containing 102,222,957 documents.
  • Single document size in the index is ~0.019 MB.
  • Queries were sent to a mediator service which forwards them to Solr without manipulating the data. The average latency of the service is 6 ms.
  • OS: Windows Server 2012 R2 Standard
  • Jetty JVM parameters: -server -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
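The per-document size follows directly from the two index figures above, and the same numbers give a feel for how much data each shard holds. A small worked calculation, assuming decimal units (1 TB = 1,000,000 MB) and an even spread of documents across shards, neither of which the post states:

```python
INDEX_SIZE_TB = 1.91
DOC_COUNT = 102_222_957
MB_PER_TB = 1_000_000  # decimal units; an assumption


def avg_doc_size_mb(index_size_tb, doc_count):
    """Average on-disk index size per document, in MB."""
    return index_size_tb * MB_PER_TB / doc_count


def per_shard(index_size_tb, doc_count, shards):
    """Documents and index size (GB) per shard, assuming an even split."""
    return doc_count // shards, index_size_tb * 1000 / shards


if __name__ == "__main__":
    print(round(avg_doc_size_mb(INDEX_SIZE_TB, DOC_COUNT), 3))  # 0.019
    for shards in (8, 16):
        docs, size_gb = per_shard(INDEX_SIZE_TB, DOC_COUNT, shards)
        print(shards, docs, round(size_gb, 1))
```

With 8 shards each one holds roughly 12.8 million documents and about 239 GB of index; with 16 shards that halves to roughly 6.4 million documents and about 119 GB.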


Main conclusions:

  1. Splitting the same index into more shards increases performance, even with the same number of nodes (Solr instances) and the same RAM. See the difference between test #1 and test #2: the average query time drops from 10 seconds to 3.3 seconds just by splitting the shards.
  2. The OS cache plays a significant role in performance. During the tests we saw the average query time improve by more than 10x as time went by (same behaviour on Linux and Windows). However, adding more RAM to Solr doesn't necessarily improve query performance. In test #4 (not detailed below) we increased each Solr node's RAM from 8 GB to 10 GB, and the average query time got worse, rising from 1.5 seconds to 4.1 seconds.
  3. Adding the 'cost' parameter to complex query filters makes a huge improvement in query performance. One of our filters was a complex query filter; once we set the filter cost to 110, the average query time dropped from 1502 ms in test #3 to 220 ms. By setting the cost to a value greater than 100, Solr executes the filter after the main query.

  4. Splitting the data into more shards decreases CPU usage but increases disk reads. This behavior is natural, as Solr has fewer lookups to make inside the index data structure (less CPU) but maintains more index files (more I/O). See how the CPU usage decreased from 30% in test #1 to 8% in test #3.

  5. Solr startup gets much faster as the number of shards increases. The elapsed time from the moment we started the cluster until we could get query results decreased significantly with 16 shards compared to 8 shards, because each shard holds less data (1/16 of the index rather than 1/8).
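For conclusion 3, the cost is set through Solr's local-parameter syntax on the filter query. The sketch below only builds the `fq` string. The helper name and the example filter are hypothetical, and pairing the cost with `cache=false` is an assumption based on the Solr documentation, which describes `cost` as ordering non-cached filters; the post does not say whether caching was disabled. As far as I understand that documentation, true post-filtering also requires a query type that supports it, so treat this as a starting point rather than the exact setup used in the tests.

```python
def with_cost(filter_query, cost=110, cache=False):
    """Prefix a filter query with Solr local params that set its cost.

    Hypothetical helper: the post only says the cost was set to 110.
    """
    cache_flag = "true" if cache else "false"
    return "{!cache=%s cost=%d}%s" % (cache_flag, cost, filter_query)


if __name__ == "__main__":
    # Invented example of a "complex" filter on made-up fields.
    expensive = "tags:(tag1 OR tag2 OR tag3) AND num_0:[10 TO 500]"
    print(with_cost(expensive))
```

The resulting string is passed as an ordinary `fq` parameter alongside the main query.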

Query Performance

| Metric                                    | Test 1                   | Test 2                                                                         | Test 3                    |
|-------------------------------------------|--------------------------|--------------------------------------------------------------------------------|---------------------------|
| Index size (#docs)                        | 102M                     | 102M                                                                           | 102M                      |
| Shards                                    | 8 shards (1 shard/node)  | 16 shards (2 shards/node)                                                      | 16 shards (2 shards/node) |
| Average time (incl. 6 ms network latency) | 10 sec                   | 3324 ms (5 threads); additional counters: 1950 ms (3 threads), 1222 ms (1 thread) | 1502 ms (5 threads)       |
| Median query time                         | 9.1 sec                  | 3323 ms                                                                        | 1526 ms                   |
| User/Sec                                  | 0.8/sec (8 threads)      | 1.5/sec (5 threads)                                                            | 3.3/sec (5 threads)       |
| DocCache hit                              | 0.15                     | 0.11                                                                           | 0.11                      |
| FilterCache hit                           | 0.44                     | 0.44                                                                           | 0.44                      |
| Solr heap (per single instance)           | ~5 GB (out of 6 GB)      | 4.36 GB (out of 6 GB)                                                          | 5.5 GB (out of 8 GB)      |
| Solr CPU average                          | ~30%                     | 15%                                                                            | ~8%                       |
| Machine CPU                               | 100%                     | 100%                                                                           | 60%                       |
| Machine RAM                               | 99% (cache)              | 100%                                                                           | 100%                      |
| Disk read rate                            | ~2 MB/sec                | ~12 MB/sec                                                                     | ~18 MB/sec                |
