I recently attended DevOpsDays Boston, which is a great way to get a pulse on the latest issues and technologies in the DevOps community. DevOpsDays is split between organized talks, open sessions and hanging around chatting with the other attendees. For me, the last of these is always the most valuable. Over the two days, I probably spoke to more than 100 different people. I’m always interested in how people are logging, what they are logging, and whether they are rolling their own or using a commercial log management solution, SaaS-based or on-premise.
Just about everyone I spoke with recognizes the value of their logs, and is either currently using a logging solution of some sort (open source, on-premise or SaaS) or is in the process of investigating one.
I learned that the main challenges people find with log management solutions today are:
- Too complex to use: Learning a complex log query language can be a pain. While some of these are VERY powerful, they’re not helpful if you need to reeducate yourself every time you want to search or analyse your log data.
- Too expensive: This was a common theme and was said to be especially the case for well-known on-premise vendors.
- Too time consuming: A lot of people, put off by the high costs, are looking at alternatives and trying out open source solutions. However, as their log volumes grow, managing and maintaining the environment can take up a few hours per week. That defeats the purpose of a log management and analytics solution, which should ultimately make you more efficient and better at your job.
The de facto open source tool of choice is ELK (a combination of Elasticsearch, Logstash and Kibana). Many of the people I spoke with were using or evaluating it alongside SaaS or on-premise commercial solutions as they think about managing their infrastructure strategically and cost-effectively.
In light of this, I thought I’d give a quick overview of some of the pros and cons of open source log management services.
What do you get?
ELK is a combination of Elasticsearch, Logstash and Kibana.
- Elasticsearch is a distributed, real-time search and analytics engine.
- Logstash is a technology for parsing your log data and streaming it into Elasticsearch.
- Kibana is a really nice UI layer that gives you the ability to search and graph your data.
In short, ELK gives you the basics of log management, i.e. a single place for all your data where you can easily search it and create some nice graphs from it.
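To make the "parsing" step a little more concrete, here is a rough Python sketch of the kind of transformation a parser like Logstash performs before data reaches a search engine: a raw text line goes in, a document with named fields comes out. This is not Logstash's implementation or configuration syntax. The log format, the field names and the `parse_line` helper are all hypothetical, chosen purely for illustration.

```python
import json
import re

# Hypothetical log format, assumed only for this example:
# "<ISO timestamp> <LEVEL> <service>: <message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>[\w-]+):\s+(?P<message>.*)$"
)


def parse_line(line):
    """Turn one raw log line into a dict of named fields, or None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None


# A made-up log line in the assumed format.
raw = "2015-10-01T12:34:56Z ERROR checkout-api: payment gateway timed out"
event = parse_line(raw)

# A structured document like this is what makes per-field search and graphing possible.
print(json.dumps(event, indent=2))
```

Once every line is a structured document, questions like "show me all ERROR events from one service" become simple field filters rather than free-text searches.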
What do you not get?
If you are looking for some of the powerful features that enable intelligent log management and analysis, such as real-time alerts, anomaly detection, dynamic search queries and functions, live tail and dynamic aggregation, you will likely want to look at a commercial SaaS or on-premise solution.
How much does it cost?
In terms of $$$ you need to put down up front? It’s free. This is often one of the key drivers, especially when faced with a potentially very high license fee. However, there are “costs” associated with open source tools. These costs, which I will cover next, include paying for the infrastructure to run it on, the time it takes to set it up, and the ongoing maintenance.
How much does it cost to run?
For low volumes of data, you can spin it up on a single, relatively small AWS instance, so something in the region of $100s per month will cover you. However, for larger volumes, and if you want high availability built in, it can get more expensive.
Check out this post on how to set up a high availability ELK cluster, where the setup consists of 6 server instances. Once you get into an Elasticsearch cluster, you will often be in the $1000s per month range. In a conversation I had recently with an old Ops friend of mine, he described his set-up, which was processing in the region of 250GB per day across an 8-node ELK cluster, and his AWS bill was coming in at $4k per month.
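For a rough sense of what those numbers mean per unit, here is a back-of-the-envelope calculation in Python using only the figures from the anecdote above. The 30-day month is my assumption, and the result covers the AWS bill only, not the time spent on setup or maintenance.

```python
# Figures quoted in the anecdote above.
DAILY_VOLUME_GB = 250
NODE_COUNT = 8
MONTHLY_AWS_BILL_USD = 4000

# Assumption for illustration: a 30-day month.
DAYS_PER_MONTH = 30

monthly_volume_gb = DAILY_VOLUME_GB * DAYS_PER_MONTH
cost_per_gb = MONTHLY_AWS_BILL_USD / monthly_volume_gb
cost_per_node = MONTHLY_AWS_BILL_USD / NODE_COUNT

print(f"Monthly volume: {monthly_volume_gb} GB")
print(f"Cost per GB ingested: ${cost_per_gb:.2f}")
print(f"Cost per node: ${cost_per_node:.0f} per month")
```

Under that assumption, the set-up works out to roughly $0.53 per GB ingested and $500 per node per month. Treat it as one data point rather than a benchmark, since instance types, retention and replication all change the picture.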
How do you set up and configure it?
Initial set-up and configuration on a single instance is relatively straightforward, especially if you have decent Ops skills. Here’s a good how-to which will take you an hour or two to fully run through.
Setting up a clustered, highly available environment, however, is probably not for the faint-hearted and requires a bit more effort. There’s a nicely put together guide here.
How much time will I spend maintaining this?
Setting up ELK can be a fun, once-off project for one of the Ops team. However, from what I have gathered, the real time sink is the ongoing maintenance of an Elasticsearch cluster. It’s not uncommon for the Ops team to spend a few hours per week maintaining it, or to run into issues and downtime when they need to grow the cluster as log volumes increase. That said, this is completely under your control and obviously depends on how well you designed your initial cluster to scale.
ELK has fast become the de facto open source logging solution and has been a great way for organizations to get set up without getting hit with a big upfront cost (provided you do not mind having to invest some time in setup and config). It is also a great way to get some basic searching and graphing capabilities and for companies to recognize the value in their log data.
However, if you are looking for some of the ‘power features’ (think: real-time notifications, anomaly detection, live tail, powerful search functions, dynamic correlation…) and want to do more with your logs, such as deeper analysis, building powerful dashboards and feeding them into your alerting and monitoring infrastructure, AND you do not want any of the hassle of setting up and maintaining a log management environment, you might want to check out other solutions available today.