
Feeding Solr With its own Logs


I have always been looking for a simple way to visualize our log data, e.g. from Solr. Originally I had a combination of gnuplot and some shell scripts in mind, but this session from Lucene Revolution changed my mind. (Look here for all videos from Lucene Revolution.)

I thought: “Hey, that’s it! Just put the logs into Solr!” So I coded something which simply reads the log files and named it Sogger. It works without sharding and without message queues, but it should run on real systems without any changes to your system (though probably with changes to Sogger).

I hope Sogger doesn’t suck, but it does not come with any warranty, so use it with care! And it is only a proof of concept, nothing comparable to what the guys from loggly.com offer.

To get your logs sogged:

  • Download the ‘Sogger’ code via:
    hg clone http://timefinder.hg.sourceforge.net/hgroot/timefinder/sogger sogger-code
  • Download Solr from trunk:
    svn co -r 1023329 https://svn.apache.org/repos/asf/lucene/dev/trunk solr-code

    Sogger doesn’t necessarily need the trunk version, but I haven’t tested it with other versions yet.

  • compile Solr and Sogger with ant
  • cd solr-code/solr/example/
  • copy solrconfig.xml and schema.xml from Sogger into solr/conf
  • copy the *.vm files from Sogger into solr/conf/velocity/
  • start solr
    java -jar start.jar
  • start feeding your logs
    cd sogger-code/
    java -jar dist/Sogger.jar url=http://localhost:8983/solr logFile=data/solr.2010-10-25.log.gz
  • to search your logs do:

    http://localhost:8983/solr/browse?q=twitter

Now you should see a results page listing the log entries that match your query.
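
To make the feeding step a bit more concrete, here is a minimal sketch of how a parsed log entry could be sent to Solr using Solr's XML update format (an `<add><doc>…</doc></add>` message POSTed to `/update`, followed by `<commit/>`). It is not Sogger's actual code. The class name `SolrXmlFeed` and the field names `id`, `level` and `message` are assumptions for illustration and would have to match whatever the schema.xml in use defines.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: builds a Solr XML update message for one log entry and POSTs it. */
final class SolrXmlFeed {

    /** Escapes the characters that must not appear verbatim in XML text or attribute values. */
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds <add><doc>...</doc></add> for one entry. The field names are hypothetical. */
    static String toAddXml(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("<field name=\"").append(escapeXml(e.getKey())).append("\">")
              .append(escapeXml(e.getValue())).append("</field>");
        }
        return sb.append("</doc></add>").toString();
    }

    /** POSTs an XML message to <solrUrl>/update and returns the HTTP status code. */
    static int post(String solrUrl, String xml) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(solrUrl + "/update").openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        try (OutputStream out = con.getOutputStream()) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        return con.getResponseCode();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "solr.log:42");   // hypothetical unique key
        doc.put("level", "INFO");       // hypothetical field
        doc.put("message", "webapp=/solr path=/select params={q=twitter}");
        String xml = toAddXml(doc);
        System.out.println(xml);
        if (args.length > 0) {          // e.g. http://localhost:8983/solr
            post(args[0], xml);
            post(args[0], "<commit/>"); // make the new document searchable
        }
    }
}
```

Without an argument the sketch only prints the XML message, so it can be tried without a running Solr instance.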

Sogger has several advantages over simply grepping or scripting your Solr logs:

  • full text search, near real time: ~1 min ;-)
  • performance: I hope committing every minute does not make Solr a lot slower
  • filtering by log level: quickly find warnings and exceptions
  • filtering by webapp: if you have multiple apps or Solr cores which are logging into the same file, filtering is really easy with Solr (with grep too, but you’ll have to re-grep the whole log …)
  • open source: you can change the feeding method I used and take care of your special needs. Tell me if you need assistance!
  • new log lines will be detected and committed à la tail -f
  • besides plain text files, Sogger accepts and detects compressed (zip, gzip/gz) files à la zgrep, so you don’t need to change your log handlers or preprocess the files.
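
How compressed files can be detected without looking at the file extension can be sketched as follows. This is one common approach and not necessarily what Sogger does: peek at the first two bytes of the stream and, if they are the gzip magic bytes 0x1f 0x8b, wrap the stream in a `GZIPInputStream`. The class name `LogOpener` is made up for this example, and zip archives are not handled here.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

/** Sketch only: opens a log stream and transparently decompresses it if it is gzip data. */
final class LogOpener {

    /** Wraps the stream in a GZIPInputStream if it starts with the gzip magic bytes 0x1f 0x8b. */
    static InputStream maybeGunzip(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(2);              // remember the start so the peeked bytes can be re-read
        int b1 = in.read();
        int b2 = in.read();
        in.reset();
        boolean gzip = b1 == 0x1f && b2 == 0x8b;
        return gzip ? new GZIPInputStream(in) : in;
    }

    /** Returns a line reader over plain or gzip-compressed log data (UTF-8 assumed). */
    static BufferedReader openLog(InputStream raw) throws IOException {
        return new BufferedReader(new InputStreamReader(maybeGunzip(raw), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java LogOpener <logfile>");
            return;
        }
        try (BufferedReader r = openLog(Files.newInputStream(Paths.get(args[0])))) {
            long n = 0;
            while (r.readLine() != null) {
                n++;             // a real feeder would parse and index each line here
            }
            System.out.println(n + " lines");
        }
    }
}
```

Checking the content instead of the file name means a rotated log such as solr.2010-10-25.log.gz and the live, uncompressed log can go through the same code path.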

To-dos:

  • make the log format customizable within a property file:
    line1=regular expression pattern1
    line2=regular expression pattern2
  • read and monitor multiple log files
  • make it a Solr plugin via a special UpdateHandler?
  • an xy plot (or bar chart) in Velocity for some facets or facet queries would be nice. Something like I did before with Wicket.
  • I don’t like Velocity … although it is sufficient for this … but should we use Wicket!?
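
The first to-do item, a log format that is customizable via a property file, could look roughly like the sketch below. It assumes that each log entry spans two lines, as in the default java.util.logging output (a date/logger/method line followed by a "LEVEL: message" line). The class name `LogFormat`, the sample patterns and the resulting field names are all hypothetical and not taken from Sogger.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: parses two-line log entries with patterns read from a property file. */
final class LogFormat {

    // Sample config. The patterns avoid backslashes on purpose, because a
    // backslash would have to be doubled inside a .properties file.
    static final String SAMPLE_CONFIG =
            "line1=^(.+ [AP]M) ([^ ]+) ([^ ]+)$\n" +
            "line2=^([A-Z]+): (.*)$\n";

    private final Pattern line1;
    private final Pattern line2;

    LogFormat(Properties props) {
        line1 = Pattern.compile(props.getProperty("line1"));
        line2 = Pattern.compile(props.getProperty("line2"));
    }

    /** Loads the patterns from property-file text. */
    static LogFormat fromString(String text) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(text));
        return new LogFormat(p);
    }

    /** Returns the parsed fields, or null if the two lines do not match the patterns. */
    Map<String, String> parse(String first, String second) {
        Matcher m1 = line1.matcher(first);
        Matcher m2 = line2.matcher(second);
        if (!m1.matches() || !m2.matches()) {
            return null;
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("date", m1.group(1));
        fields.put("logger", m1.group(2));
        fields.put("method", m1.group(3));
        fields.put("level", m2.group(1));
        fields.put("message", m2.group(2));
        return fields;
    }

    public static void main(String[] args) throws IOException {
        LogFormat format = fromString(SAMPLE_CONFIG);
        // Made-up entry in the assumed two-line format
        System.out.println(format.parse(
                "Oct 25, 2010 1:23:45 PM org.apache.solr.core.SolrCore execute",
                "INFO: [] webapp=/solr path=/select params={q=twitter} hits=10 status=0 QTime=3"));
    }
}
```

With such a setup, supporting another log layout would only mean editing the two patterns, at the cost of fixing the meaning of the capture groups in code.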

 

From http://karussell.wordpress.com/2010/10/27/feeding-solr-with-its-own-logs/
