Over a million developers have joined DZone.

Feeding Solr With its own Logs

· Java Zone

Check out this 8-step guide to see how you can increase your productivity by skipping slow application redeploys and by implementing application profiling, as you code! Brought to you in partnership with ZeroTurnaround.

I always looked for a simple way to visualize our log data e.g. from solr. At that time I had in mind a combination of gnuplot and some shellscripts but this session from the lucene revolution changed my idea. (Look here for all videos from lucene revolution.)

I thought: “hey thats it! Just put the logs into solr!” So I coded something which simply reads the log files and named it Sogger. Without sharding, without message queues, … but it should work on real systems without any changes to your system (but probably to sogger).

I hope Sogger doesn’t suck, but it does not come with any warranty, so use it with care! And: It is only a proof of concept – nothing comparable to the guys from loggly.com

To get your logs sogged:

  • Download the ‘Sogger’ code via:
    hg clone http://timefinder.hg.sourceforge.net/hgroot/timefinder/sogger sogger-code
  • Download the Solr from trunk.
    svn co -r  1023329 https://svn.apache.org/repos/asf/lucene/dev/trunk solr-code

    Sogger doesn’t necessarily need the trunk version but I didn’t tested it for others yet

  • compile solr and Sogger with ant
  • cd solr-code/solr/example/
  • copy solrconfig.xml, schema.xml from Sogger into solr/conf
  • copy the *.vm files from Sogger into the files at solr/conf/velocity/
  • start solr
    java -jar start.jar
  • start feeding your logs
    cd sogger-code/
    java -jar dist/Sogger.jar url=http://localhost:8983/solr logFile=data/solr.2010-10-25.log.gz
  • to search your logs do:


Now you should see something like this

Sogger has several advantages over simple “grep-ing” or scripting with your solr logs:

  • full text search. near real time: ~1min ;-)
  • performance. I hope commiting every minute does not make solr a lot slower
  • filtering by log level: Quickly find warnings and exceptions
  • filtering by webapp: If you have multiple apps or solr cores which are logging into the same file filtering is really easy with solr (with grep too, but you’ll have to re-grep the whole log …)
  • open source: you can change the feeding method I used and take care of your special needs. Tell me if you need assistance!
  • new log lines will be detected and commited ala tail -f
  • besides text files sogger accepts and detects compressed (zip, gzip/gz) files ala zgrep. So you don’t need to change your log handlers or preprocess the files.

to do’s:

  • make the log format customizable within a property file:
    line1=regular expression pattern1
    line2=regular expression pattern2
  • read and monitor multiple log files
  • make it a solr plugin via special UpdateHandler?
  • a xy plot (or barchart) in velocity for some facets or facet queries would be nice. Something like I had done before with wicket.
  • I don’t like velocity … althought it is sufficient for this … but should we use wicket!?


From http://karussell.wordpress.com/2010/10/27/feeding-solr-with-its-own-logs/

The Java Zone is brought to you in partnership with ZeroTurnaround. Check out this 8-step guide to see how you can increase your productivity by skipping slow application redeploys and by implementing application profiling, as you code!


The best of DZone straight to your inbox.

Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}