
Solr New Metrics API: Quick Look at Solr 6.4


The new Solr metrics API helps provide a convenient method to fetch data in various formats and makes it possible to report metrics to external systems.


At Sematext, we love logs and metrics and we enjoy playing with them on a daily basis. We have our Logsene, which is all about logs and their analysis. We have our SPM, which can monitor half of the world when it comes to performance metrics, give you a perfect view of class level performance with transaction tracing, and give you a perfect view of your environment with components view. And that’s not all.

Because we enjoy working with metrics, we decided to take a look at the shiny, new Solr metrics API that came with the recently released Solr 6.4. It comes with the promise of providing a convenient method to fetch data in various formats. It also makes it possible to report metrics to external systems. As metrics aficionados and Solr users ourselves, we are thrilled to have this at our disposal and jumped on the opportunity to play around with Solr’s new metrics goodness. Let’s start with some history on how it used to be in the good old days.

Back in the Day...

A long, long time ago, in the Apache country ruled by king Solr and queen Lucene, there was a messenger called JMX. OK, wrong story. To make a long story short, until now, the only thing you could do to fetch metrics from Solr was to use JMX. If you wanted to see what was happening inside Solr, you had to either use the Solr admin panel and look at what you were interested in or use a Java tool that could read those metrics. Actually, any tool that can read data from JMX (for example, jconsole, which ships with the JDK) can be used to fetch metrics from Solr.


This gives us some visibility, but there are issues with such an approach. The main problem is that we don’t have a full view of the historical data. We see cumulative statistics or changes since the time we started looking into metrics, but if we want to see what was happening one hour ago, we can’t do that, at least not out of the box.

Of course, those obstacles can be overcome by using monitoring tools for Solr. We can either use something open source like Ganglia and configure everything from scratch using information available out in the wild places of the Internet or we can use something that does everything for us and we only need to install an agent, as is the case with SPM from Sematext.


However, if you are allergic to any kind of UI that helps you understand your metrics, you were forced to use JMX and tools that could fetch metrics from it. Until now.

Metrics in Solr: The New Kid on the Block

So, what can we do now? Well, with SOLR-4735, Solr metrics evolved. Metrics are no longer limited to being exposed for polling via JMX. Instead, Solr can now also send metrics to external destinations or to a file, and it exposes a nice REST API along with an extended set of metrics. What’s more, the extended metrics are not only about Solr components: you can also retrieve JVM level information or Jetty and HTTP related statistics. Sounds fun if you ask me!

However, this is not all. Solr can now export metrics to other systems, and that functionality is available out of the box. Yes, apart from the standard JMX reporting, we can save metrics to a log file using an SLF4J reporter or even send metrics to Ganglia or Graphite, all without any special plugins or additions.

Finally, one last thing: the API. Once you know what you want to fetch, it is as easy as pointing your browser or curl command to /admin/metrics and providing the type and group of metrics you would like to retrieve. Solr will return all the requested information.  

Next, let’s dig a bit deeper and see what we need to do to make all of that work and how to configure it.

Types of Meters

The Solr metrics improvements are built on Dropwizard Metrics, a Java library that makes metrics reporting easy and supports different kinds of meters. The ones used by Solr are:

  • Counters.
  • Gauges.
  • Histograms.
  • Meters.
  • Timers.

The counters are there to measure simple counts, for example, the number of queries processed by a request handler. The gauges allow one to check a current value, for example, the length of a queue or a number such as the number of pending documents. Histograms let Solr calculate the approximate distribution of certain values and provide statistics around them; for example, a histogram can give us percentiles for query response times. Meters are there to provide rates of events, like the average number of query requests per second over periods of 1, 5, and 15 minutes. Finally, the timers are there to measure the duration of certain events and provide both counts and histograms.
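To make the distinctions concrete, here is a minimal, hypothetical Python sketch of what each meter type tracks. Solr itself uses the Dropwizard Java implementations; the class names and percentile math below are illustrative only, not Solr's actual code:

```python
import time

class Counter:
    """Monotonic count of events, e.g. queries handled."""
    def __init__(self):
        self.count = 0
    def inc(self, n=1):
        self.count += n

class Gauge:
    """Reads a current value on demand, e.g. queue length."""
    def __init__(self, read_fn):
        self.read_fn = read_fn
    @property
    def value(self):
        return self.read_fn()

class Histogram:
    """Approximates the distribution of recorded values."""
    def __init__(self):
        self.samples = []
    def update(self, v):
        self.samples.append(v)
    def percentile(self, p):
        # Naive nearest-rank percentile; real implementations sample.
        ordered = sorted(self.samples)
        return ordered[int(p / 100.0 * (len(ordered) - 1))]

class Meter:
    """Rate of events per second since creation."""
    def __init__(self):
        self.start = time.monotonic()
        self.count = 0
    def mark(self):
        self.count += 1
    @property
    def mean_rate(self):
        elapsed = time.monotonic() - self.start
        return self.count / elapsed if elapsed else 0.0

class Timer:
    """Times events: a counter plus a histogram of durations."""
    def __init__(self):
        self.counter = Counter()
        self.histogram = Histogram()
    def record(self, duration_ms):
        self.counter.inc()
        self.histogram.update(duration_ms)

# A request handler would update a timer as queries come in:
t = Timer()
for ms in (12.0, 15.0, 80.0, 14.0):
    t.record(ms)
```

This is roughly the shape of the data you will see in the API responses later: counters expose `count`, meters expose rates, and timers combine both with percentile statistics.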

Solr Metric Groups

Solr uses the above-mentioned standardized meters to gather data from various parts of the system, categorizing the metrics into the following groups:

  • Core information (solr.core.<collection_name> group, each for one core).
  • HTTP information (solr.http group).
  • Jetty information (solr.jetty group).
  • JVM information (solr.jvm group).
  • Solr node information (solr.node group).
  • More metrics are coming in the future, like shard leader metrics, cluster node metrics, and aggregated per-core metrics; however, these are not yet available in Solr 6.4.

So, what can we get from those metrics? The core information group lets us get:

  • Metrics for request handlers (counters, timers, errors, timeouts, etc.).

  • Index-related events (meters and gauges for merges, both current and historical, deleted documents information).

  • Lucene Directory I/O information (read and write meters and histograms, split per file type, like doc values or the terms dictionary).

In the future, we will also see shard replication, transaction log information, and caches.

The HTTP group provides information regarding available, open and pending connections for shard handlers and update handlers.

The Jetty related group gives us a view of a few very important statistics related to the servlet container in which Solr runs. We can look at threads and thread pools usage, connection and request timers, and meters for various HTTP response codes — like the number of requests with 200 response code, the number of 400 or 500, and so on.

The very interesting JVM group gives us low-level information on what is happening inside Solr and inside the JVM itself. We have the direct and mapped buffer pool information, class loading and unloading related information, garbage collection meters and timers, operating system memory, CPU timer, file descriptor counts, system load, and swap usage (BTW, remember that the JVM hates having its memory swapped; it reacts with poor performance).

Finally, the node-related group provides information related to the Solr node, such as the number of cores that are loaded on the node, lazy, or unloaded, information about request handlers, collections, configuration sets, and so on.

Solr Metrics API

One of the possible methods of retrieving Solr metrics is using the Metrics API. The /admin/metrics endpoint provides access to all the metrics that we can get. We just send an HTTP GET request to that endpoint and let Solr know what we are interested in via the available parameters. The parameters of the endpoint are as follows:

  • Group: Comma-separated metric group names we are interested in, or all for all of them, for example, node,jvm.
  • Type: Comma-separated types of metrics to retrieve, or all for all types. Possible values are the meter types we already mentioned: counter, gauge, histogram, meter, and timer.
  • Prefix: A filter that returns only metrics whose names start with the given prefix; for example, prefix=INDEX will return only indexing-specific metrics.
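Putting the parameters together, a request is just a GET with a query string, so any HTTP client works. A small Python sketch; the host and port are assumptions for a default local Solr install:

```python
from urllib.parse import urlencode
from urllib.request import urlopen  # any HTTP client will do

SOLR = "http://localhost:8983/solr"  # assumed default local install

def metrics_url(group="all", type="all", prefix=None):
    """Build an /admin/metrics request from the parameters above."""
    params = {"group": group, "type": type}
    if prefix:
        params["prefix"] = prefix
    return SOLR + "/admin/metrics?" + urlencode(params)

# e.g. only counters from the node and JVM groups:
url = metrics_url(group="node,jvm", type="counter")
# body = urlopen(url).read()  # uncomment against a running Solr
```

Swapping in `prefix="INDEX"` would narrow the response to indexing metrics, as described above.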

Let’s now look at some example responses from Solr so we know what to expect.

For example, to retrieve minor merge related metrics, we could ask /admin/metrics for the core group with an INDEX.merge prefix.
The response looks as follows:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="metrics">
    <lst name="solr.jvm"/>
    <lst name="solr.jetty"/>
    <lst name="solr.node"/>
    <lst name="solr.core.data.shard1.replica1">
      <lst name="INDEX.merge.minor">
        <long name="count">2</long>
        <double name="meanRate">0.022729587366906834</double>
        <double name="1minRate">0.018095662944349354</double>
        <double name="5minRate">0.00588664962556148</double>
        <double name="15minRate">0.0021316664456582147</double>
        <double name="min_ms">1132.643211</double>
        <double name="max_ms">1415.985689</double>
        <double name="mean_ms">1253.221722818028</double>
        <double name="median_ms">1132.643211</double>
        <double name="stddev_ms">140.09224396740174</double>
        <double name="p75_ms">1415.985689</double>
        <double name="p95_ms">1415.985689</double>
        <double name="p99_ms">1415.985689</double>
        <double name="p999_ms">1415.985689</double>
      </lst>
      <lst name="INDEX.merge.minor.running">
        <int name="value">0</int>
      </lst>
      <lst name="INDEX.merge.minor.running.docs">
        <long name="value">0</long>
      </lst>
      <lst name="INDEX.merge.minor.running.segments">
        <int name="value">0</int>
      </lst>
    </lst>
  </lst>
</response>
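Responses like the one above are plain XML, so they are easy to post-process. A hedged Python sketch that pulls one metric's fields out of such a payload; the sample below is the response shown above, trimmed to two fields:

```python
import xml.etree.ElementTree as ET

SAMPLE = """<response>
  <lst name="metrics">
    <lst name="solr.core.data.shard1.replica1">
      <lst name="INDEX.merge.minor">
        <long name="count">2</long>
        <double name="meanRate">0.022729587366906834</double>
      </lst>
    </lst>
  </lst>
</response>"""

def metric_values(xml_text, registry, metric):
    """Return {field_name: value} for one metric in one registry."""
    root = ET.fromstring(xml_text)
    out = {}
    for reg in root.iter("lst"):
        if reg.get("name") != registry:
            continue
        for m in reg:
            if m.get("name") != metric:
                continue
            for field in m:
                v = field.text
                # int/long elements become ints, doubles become floats
                out[field.get("name")] = (
                    int(v) if field.tag in ("int", "long") else float(v)
                )
    return out

vals = metric_values(SAMPLE, "solr.core.data.shard1.replica1",
                     "INDEX.merge.minor")
```

Here `vals["count"]` recovers the merge count as an integer, which is handy when feeding these numbers into your own tooling.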

We can also look at directory related metrics by using the DIRECTORY prefix. The response, this time in JSON (because of wt=json in the request), should have this structure:

{
  "responseHeader": { ... },
  "metrics": [ ...,
    "DIRECTORY.total.reads", [ "count", 44502443, ... ],
    "DIRECTORY.total.writes", [ "count", 28269374, ... ],
  ... ]
}
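The name/value pairs in this flattened NamedList style JSON can be folded back into dictionaries for easier handling. A small hedged Python helper, assuming the pairwise `["name", value, ...]` layout shown above:

```python
def unflatten(pairs):
    """Turn ["name", value, "name", value, ...] into a dict,
    recursing into nested lists of the same shape."""
    out = {}
    for name, value in zip(pairs[::2], pairs[1::2]):
        out[name] = unflatten(value) if isinstance(value, list) else value
    return out

# The directory I/O fragment from the response above:
metrics = unflatten([
    "DIRECTORY.total.reads", ["count", 44502443],
    "DIRECTORY.total.writes", ["count", 28269374],
])
```

After this, `metrics["DIRECTORY.total.reads"]["count"]` yields the read count directly instead of requiring positional indexing.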


Finally, all the metrics that we discussed so far can be reported automatically by Solr. In Solr 6.4, the following out-of-the-box reporters are available:

  • JMX reporter (class org.apache.solr.metrics.reporters.SolrJmxReporter).
  • SLF4J reporter (class org.apache.solr.metrics.reporters.SolrSlf4jReporter).
  • Graphite reporter (class org.apache.solr.metrics.reporters.SolrGraphiteReporter).
  • Ganglia reporter (class org.apache.solr.metrics.reporters.SolrGangliaReporter).

So, instead of retrieving the data on demand, we can just let Solr report to an external place or log file. We include the appropriate configuration in the solr.xml file (in the metrics section) and that is all that needs to be done.

For example, to include all the Solr node query related metrics on the console output of Solr, we would include the following fragment in the solr.xml file:

<solr>
  <metrics>
    <reporter name="queryNodeReporter" registry="solr.node"
              class="org.apache.solr.metrics.reporters.SolrSlf4jReporter">
      <int name="period">60</int>
      <str name="prefix">example_query_stats</str>
      <str name="logger">CONSOLE</str>
      <str name="filter">QUERY</str>
    </reporter>
  </metrics>
</solr>

The above configuration tells Solr that we want a new reporter called queryNodeReporter to be registered (this name needs to be unique) and that it should report the solr.node registry using the SolrSlf4jReporter, so the metrics are written to a log. The data will be written every 60 seconds (the period, in seconds) and will include only query related metrics (the filter value of QUERY). The information written to the console looks as follows:

2017-01-14 20:23:30.940 INFO  (metrics-logger-reporter-1-thread-1) [   ] CONSOLE type=GAUGE, name=QUERY.httpShardHandler.availableConnections,
2017-01-14 20:23:30.940 INFO  (metrics-logger-reporter-1-thread-1) [   ] CONSOLE type=GAUGE, name=QUERY.httpShardHandler.leasedConnections, value=0
2017-01-14 20:23:30.941 INFO  (metrics-logger-reporter-1-thread-1) [   ] CONSOLE type=GAUGE, name=QUERY.httpShardHandler.maxConnections, value=10000
2017-01-14 20:23:30.941 INFO  (metrics-logger-reporter-1-thread-1) [   ] CONSOLE type=GAUGE, name=QUERY.httpShardHandler.pendingConnections, value=0
2017-01-14 20:23:30.941 INFO  (metrics-logger-reporter-1-thread-1) [   ] CONSOLE type=COUNTER, name=QUERY.httpShardHandler.threadPool.httpShardExecutor.running, count=0
2017-01-14 20:23:30.941 INFO  (metrics-logger-reporter-1-thread-1) [   ] CONSOLE type=METER, name=QUERY.httpShardHandler.threadPool.httpShardExecutor.completed, count=0, mean_rate=0.0, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second
2017-01-14 20:23:30.941 INFO  (metrics-logger-reporter-1-thread-1) [   ] CONSOLE type=METER, name=QUERY.httpShardHandler.threadPool.httpShardExecutor.submitted, count=0, mean_rate=0.0, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second
2017-01-14 20:23:30.941 INFO  (metrics-logger-reporter-1-thread-1) [   ] CONSOLE type=TIMER, name=QUERY.httpShardHandler.threadPool.httpShardExecutor.duration, count=0, min=0.0, max=0.0, mean=0.0, stddev=0.0, median=0.0, p75=0.0, p95=0.0, p98=0.0, p99=0.0, p999=0.0, mean_rate=0.0, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second, duration_unit=milliseconds

Of course, we can define numerous reporters in solr.xml and mix reporter types. For example, we can write node query metrics to the log file and, at the same time, report core related metrics for the data collection to Graphite:

<solr>
  <metrics>
    <reporter name="queryNodeReporter" registry="solr.node"
              class="org.apache.solr.metrics.reporters.SolrSlf4jReporter">
      <int name="period">60</int>
      <str name="prefix">example_query_stats</str>
      <str name="logger">CONSOLE</str>
      <str name="filter">QUERY</str>
    </reporter>
    <reporter name="exampleGraphiteReporter" group="solr.core.data"
              class="org.apache.solr.metrics.reporters.SolrGraphiteReporter">
      <str name="host">graphite.my.host.com</str>
      <int name="port">9999</int>
      <int name="period">60</int>
    </reporter>
  </metrics>
</solr>

Which Solr Metrics Reporter to Use

Now that we know that Solr 6.4 can report all these new metrics, you’ll need to decide where to send them. If you already have Ganglia or Graphite in your organization and use them for monitoring other services, you may choose to continue using them.

While Ganglia and Graphite were sysadmins’ darlings roughly a decade ago, they are showing their age. Thus, if you don’t use Ganglia or Graphite today, I strongly suggest you don’t set them up just to have Solr monitoring. There are better options. Similarly, if you need to set up any infrastructure or extra tooling just so you can monitor your Solr clusters, you may want to reconsider. That tends to be too costly in terms of people’s time and/or infrastructure, even though it rarely feels that way at first.

You may want to use Sematext SPM for Solr. It’s simple to install (you just need an agent to collect and ship metrics), pre-configured with various out-of-the-box dashboards and alerts and, in short, it lets you troubleshoot Solr performance issues very quickly. What’s more, SPM integrates with Logsene, a hosted log management and analysis service, which makes it even easier to have a single pane of glass for all your operations data: not just Solr, not just metrics, but also metrics for other parts of your infrastructure and other logs. SPM will soon be able to collect and show all the additional metrics introduced with Solr 6.4, so watch out for an announcement from us.


Published at DZone with permission of Radu Gheorghe, DZone MVB.
