
Log Analytics Tools: Open Source vs. Commercial




Several good articles have been written on this subject, and we’ve assembled them in links at the end of this blog for your perusal. So rather than do another "bake-off", I’m simply going to relay my own experience, and that of other engineers at Search Technologies, with two log analytics tools: Splunk and the Elasticsearch, Logstash, and Kibana (ELK) components of the Elastic stack. As every article says, you’ll have to decide what works best for you.

Slings and Arrows of Fortune — My First Time Splunking

I’ve known about Splunk for a long time and have quite an appreciation for how easy it is to use. At my previous company, my website hosting provider switched the configuration of their VM and took my website down for three days. They fully denied that they caused the problem (of course). But I happened to be playing around with the free version of Splunk at the time, and in about five minutes I was able to point Splunk at my web server log, search for the word "error" and voila! A screen not unlike the example below showed that the problems obviously started on the day of the switchover to the new VM configuration for the web server.

[Image: example Splunk search results screen]

In the end, it turned out to be a PHP version incompatibility with my CMS and, needless to say, I changed hosting vendors soon after.

What this illustrated for me was how fast and easy it was in Splunk to go from installation to getting some useful information out of the tool. What’s more impressive is that I’m in marketing now (I haven’t coded in over 10 years), so I can see IT management types or semi-technical people being able to use it very easily.
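For readers who want to see what that "search for error, then look at when it started" step amounts to, here is a minimal Python sketch. It is not how Splunk works internally. It assumes Apache-style timestamps in the log, and the function name `count_errors_by_day` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: each log line carries an Apache-style timestamp such as
# [12/Mar/2016:06:25:24 +0000]; other log formats need a different pattern.
DATE_PATTERN = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def count_errors_by_day(lines, keyword="error"):
    """Count lines containing `keyword` (case-insensitive), grouped by date."""
    counts = Counter()
    for line in lines:
        if keyword.lower() not in line.lower():
            continue
        match = DATE_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines standing in for a real web server log.
sample_log = [
    '[11/Mar/2016:10:00:01 +0000] "GET /index.php" 200',
    '[12/Mar/2016:06:25:24 +0000] PHP Fatal error: call to undefined function',
    '[12/Mar/2016:06:25:30 +0000] PHP Fatal error: call to undefined function',
    '[13/Mar/2016:09:12:45 +0000] PHP Warning: Error while loading module',
]

for day, count in sorted(count_errors_by_day(sample_log).items()):
    print(day, count)
```

With a real log you would pass an open file object instead of `sample_log`. A sudden jump in the per-day count is the kind of signal that pointed me to the day of the VM switchover.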

User Experience: Elasticsearch, Logstash, and Kibana (ELK) vs. Splunk

One of the biggest challenges to Splunk is that there are some pretty darn good open source log analytics tools out there, the best known of which is the Elasticsearch, Logstash, and Kibana (ELK) stack available from Elastic (fair disclosure: Search Technologies’ business is based in part on building applications around open source technologies).

Similar to Splunk, Elasticsearch has roots in Lucene (though Splunk now has a proprietary implementation).  Logstash serves as a data ingestion engine, and Kibana serves as a dashboard / presentation layer.  While these are all separate open source projects, the fact that they are under one management "roof" at Elastic leads to a very cohesive roadmap for all the components.

As I mentioned, it’s been a long time since I’ve been a developer, so keep that in mind. But what took a user like me five minutes in Splunk might take more like one or two hours initially with the ELK stack. I had to set some Java environment variables and get comfortable with a command line interface again, and I also didn’t have tools like curl installed on my PC. You also have to install three components for the ELK stack versus just one application for Splunk.

But after a little work, your rudimentary starting point looks a lot like Splunk (see example screen below). Whatever log file you pointed the ELK stack at has been quickly indexed and you have a search interface and what amounts to facets to explore your indexed data.

[Image: example ELK search screen]

Developers Love Elasticsearch, Logstash, and Kibana (ELK) for Log Analytics

From the perspective of our technical staff at Search Technologies, however, getting up to speed on ELK was relatively easy. They commented that the developer experience is also very good.

At our last major company kickoff event, we had a hands-on workshop led by one of our architects where everyone was up and running on ELK in a couple of hours and crafting some cool dashboards using data from our website server logs. They were able to basically plot a map of our web traffic over several months, which looked something like this:

[Image: example map of web traffic]

Like data ingestion, basic data visualization is also becoming something of a commodity. Our engineers feel that, on the surface, Kibana’s latest visualization and dashboard capabilities are very similar to Splunk’s.

Splunk Still Rules the Log Analytics Market

With over $500 million in revenue, Splunk is still the undisputed market leader in log analytics applications.

Log analytics tools have been around for a while. But it was the creators of Splunk who took the time to understand what problems the traditional users (mostly system admins and developers) faced every day in doing their jobs. Often the job was to look for problems in security, application, or server logs, or as one person put it, "looking for a needle in a haystack of haystacks."

Splunk did three things well to become the current indisputable market leader in commercial log analytics tools:

  • They created an exceptional user experience.
  • They created a community of useful plug-ins to enhance their platform.
  • They toured relentlessly and promoted training to create a large number of disciples of the Splunk Search Processing Language (SPL™).

SPL is an incredible testament to how powerful a search interface can be for visualizing, exploring, and analyzing different types of data sources, and you can put just about anything in Splunk thanks to all the available third-party plug-ins. SPL also has a number of sophisticated analytics "commands" (like macros) that do some interesting time series analytics, like drawing regression lines through data and setting thresholds for alerting.
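To give a feel for what "drawing a regression line and setting a threshold" means, here is a minimal sketch in plain Python. It is not SPL and says nothing about how Splunk implements its commands. The function names `fit_trend_line` and `should_alert` and the hourly counts are hypothetical, chosen only to illustrate the idea.

```python
def fit_trend_line(values):
    """Least-squares line through the points (0, values[0]), (1, values[1]), ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    variance = sum((x - mean_x) ** 2 for x in range(n))
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def should_alert(values, threshold, steps_ahead=1):
    """True if the trend line's value `steps_ahead` intervals out exceeds `threshold`."""
    slope, intercept = fit_trend_line(values)
    projected = slope * (len(values) - 1 + steps_ahead) + intercept
    return projected > threshold


# Made-up hourly error counts, for illustration only.
errors_per_hour = [2, 4, 6, 8, 10]
slope, intercept = fit_trend_line(errors_per_hour)
print(slope, intercept)
print(should_alert(errors_per_hour, threshold=11))
```

Here the counts rise by two per hour, so the line projects 12 errors for the next hour and the alert fires against a threshold of 11.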

But despite all the big data hype, Splunk is still just a tool for analyzing your logs. As one Splunk marketer once admitted to me, "Splunk is only as good as the query the user submits," meaning that you have to be somewhat of a power user of a proprietary technology to really get the most out of Splunk.

How Long Will the Emperor Keep His Clothes?

So what challenges does Splunk face moving forward? What about Splunk bothers the market? And why are many companies that have been long-time users actually considering replacing Splunk with open source log analytics tools like the ELK stack?

Well, for one, the ELK stack delivers a very good developer experience, and the feature gap has narrowed significantly in the last 12 months. There is much more training available now for the Elastic stack's components, and a large and growing developer community that can help.

And then there’s the pricing model–aye, there’s the rub.

Despite Splunk having open source bones (Lucene, the search and indexing engine, is part of the core technology), and as much as the users I know love Splunk, the longer people use it, the more I get the sense that many of them feel held hostage by the company’s pricing model.

Splunk’s pricing is based on how much data you index, which is sort of like paying for your car based on how many miles you’ve driven. Many users find that once they point Splunk to data sources, those sources generate data at a faster and faster rate, and far more of it than anyone expected. Even Oracle doesn’t price its databases that way. And when was the last time you heard the market complain more about a vendor’s variable pricing than it complains about Oracle’s?

Many commercial and open source software solutions also have variable pricing. And with add-on products, it’s sometimes hard to distinguish a "pure" open source vendor from a commercial one. But these vendors typically charge by the node (loosely related to a virtual server instance), and for some reason this tends to be more palatable to the market.
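To make the difference between the two pricing models concrete, here is a minimal Python sketch. Every number in it is made up for illustration: none is an actual Splunk or Elastic price, and the growth rates are assumptions. The helper name `annual_cost` is hypothetical too.

```python
def annual_cost(units_by_year, price_per_unit):
    """Annual license cost for each year, given the billed units per year."""
    return [units * price_per_unit for units in units_by_year]


# ALL FIGURES BELOW ARE HYPOTHETICAL, chosen only to show the shape of the curves.
daily_gb_by_year = [10, 20, 40]   # assumed indexed volume, doubling every year
nodes_by_year = [3, 3, 4]         # assumed cluster size, growing more slowly
PRICE_PER_DAILY_GB = 1000         # made-up price per GB/day of indexed data
PRICE_PER_NODE = 5000             # made-up price per node

volume_costs = annual_cost(daily_gb_by_year, PRICE_PER_DAILY_GB)
node_costs = annual_cost(nodes_by_year, PRICE_PER_NODE)

for year, (by_volume, by_node) in enumerate(zip(volume_costs, node_costs), start=1):
    print(f"Year {year}: volume-based {by_volume}, node-based {by_node}")
```

Under these assumptions, the volume-based bill doubles along with the data while the node-based bill moves only when a node is added. That tracks the complaint, though real contracts will differ.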

To be fair to Splunk, not everyone experiences exponential data (and cost) growth, especially if they manage the volume of data indexed. And Splunk Light offers some cost relief, albeit with less functionality. Also, Splunk might offer some aggressive discounts to hang on to their key customers.

But it’s clear that Splunk feels the heat from very viable open source log analytics tools that are closing the functionality gap while exploiting the market’s continued distaste for Splunk’s pricing model.

What Others Are Saying

The articles below provide a number of additional perspectives on commercial vs. open source log analytics tools.  Most try to be fairly neutral but some are obviously a bit biased, one way or the other. I think you’ll enjoy even more insight found in the comments section for some of the articles.

The bottom line is that replacing Splunk with the ELK stack may or may not be the right thing to do for your organization. But you would do yourself a big disservice if you didn’t at least consider the question.

This article was first published on Search Technologies' Blog.



Published at DZone with permission of Graham Gillen, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
