
Monitoring Solr with Graphite and Carbon

This blog post requires Graphite, Carbon, and Python to be installed on your *nix machine; I'm running everything on Ubuntu.

http://graphite.wikidot.com/
https://launchpad.net/graphite/+download

To set up monitoring of the RAM usage of Solr instances (shards) with Graphite, you will need two things:

1. backend: carbon
2. frontend: graphite
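
How long Carbon keeps each data point, and at what resolution, is controlled by storage-schemas.conf. Below is a minimal sketch of a matching rule; the section name, file path, and retention periods are assumptions you should adapt, but since the cron job further down pushes a value every five minutes, a first retention of five minutes (or coarser) helps avoid gaps in the graphs:

# /opt/graphite/conf/storage-schemas.conf (path depends on how Carbon was installed)
# Matches the "system.<SHARD_NAME>" metrics sent by the Python script below.
# Older Carbon releases use the seconds:datapoints form instead, e.g. 300:8640.
[solr_ram]
pattern = ^system\.
retentions = 5m:30d,1h:1y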

The data can be pushed to Carbon using the simple Python script shown further down.

In my local crontab I have an entry that runs every five minutes:

1,6,11,16,21,26,31,36,41,46,51,56 * * * * /home/dmitry/Downloads/graphite-web-0.9.10/examples/update_ram_usage.sh

The shell script is a wrapper that fetches the data from the remote server and then pushes it to Carbon with a Python script:

scp -i /home/dmitry/keys/somekey.pem \
    user@remote_server:/path/memory.csv \
    /home/dmitry/Downloads/MemoryStats.csv
python /home/dmitry/Downloads/graphite-web-0.9.10/examples/solr_ram_usage.py

An example entry in MemoryStats.csv:

2013-09-06T07:56:02.000Z,SHARD_NAME,20756,33554432,10893512,32%,15.49%,SOLR/shard_name/tomcat

The command to produce a memory stat on Ubuntu (pidstat is part of the sysstat package):

ssh user@remote_server pidstat -r -l -C java | grep /path/to/shard
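
The post doesn't show how memory.csv itself is assembled on the remote server, so here is a rough, hypothetical sketch of one way to append a line in the format shown above. The column numbers ($3 = PID, $6 = VSZ, $7 = RSS, $8 = %MEM) depend on your sysstat version, and the two percentage fields are my guesses (RSS/VSZ ratio and %MEM), so verify against the real pidstat output first:

pidstat -r -l -C java | grep /path/to/shard | \
  awk -v ts="$(date -u +%Y-%m-%dT%H:%M:%S.000Z)" \
    '{ printf "%s,SHARD_NAME,%s,%s,%s,%.0f%%,%s%%,SOLR/shard_name/tomcat\n", ts, $3, $6, $7, ($7/$6)*100, $8 }' \
  >> /path/memory.csv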

The Python script parses the CSV file (you may want to define your own input file format; I'm giving this one as an example):

import sys
import time
import datetime
from socket import socket

# Carbon's plaintext listener.
CARBON_SERVER = '127.0.0.1'
CARBON_PORT = 2003

# Optional delay (in seconds) to wait after sending, taken from the command line.
delay = 60
if len(sys.argv) > 1:
  delay = int(sys.argv[1])

sock = socket()
try:
  sock.connect((CARBON_SERVER, CARBON_PORT))
except:
  print "Couldn't connect to %(server)s on port %(port)d, is carbon-agent.py running?" % {'server': CARBON_SERVER, 'port': CARBON_PORT}
  sys.exit(1)

# The CSV file copied over from the remote server by the shell wrapper.
filename = '/home/dmitry/Downloads/MemoryStats.csv'

lines = []
with open(filename, 'r') as f:
  for line in f:
    lines.append(line.strip())

print lines

# Build one plaintext-protocol line per CSV row: "metric value timestamp".
lines_to_send = []
for line in lines:
  if line.startswith("Time stamp"):
    continue  # skip the header row
  shard = line.split(',')
  # shard[0] is the ISO timestamp, shard[1] the shard name, shard[5] the percentage we plot.
  timestamp = int(time.mktime(datetime.datetime.strptime(shard[0], "%Y-%m-%dT%H:%M:%S.%fZ").timetuple()))
  value = shard[5].replace("%", "")
  lines_to_send.append("system.%s %s %d" % (shard[1], value, timestamp))

# All lines must end in a newline.
message = '\n'.join(lines_to_send) + '\n'
print "sending message\n"
print '-' * 80
print message
print
sock.sendall(message)
time.sleep(delay)
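
If the graphs stay empty, it helps to check that Carbon accepts data at all, independently of the script. Carbon's plaintext protocol is just "metric value unix-timestamp", one line per data point, so a quick hand-made test (the metric name and value are only examples) could be:

echo "system.SHARD_NAME 32 $(date +%s)" | nc 127.0.0.1 2003
# depending on your nc flavour you may need -q 1 or -w 1 so it exits after sending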

After the data has been pushed you can view it in the Graphite web UI. The good thing about Graphite compared to JConsole or JVisualVM is that it persists the data points, so you can view and analyze them later.
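
Besides the composer UI, the same data can be fetched straight from graphite-web's render API, which is handy for embedding graphs or post-processing the raw values. The host, port, and metric name below assume a default local install and are only examples:

# PNG graph of one shard's memory usage over the last 24 hours
http://localhost:8080/render?target=system.SHARD_NAME&from=-24h&format=png

# the same data points as JSON, for further processing
http://localhost:8080/render?target=system.SHARD_NAME&from=-24h&format=json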

For Amazon users, an alternative way of viewing the RAM usage graphs is CloudWatch, although at the time of this writing it only stores two weeks' worth of data.



Published at DZone with permission of Dmitry Kan, DZone MVB.
