Graphite is an open-source, real-time graphing system for time series metrics. Graphite does not collect metrics itself. Instead, like a database, it receives metrics through its backend and stores them, after which they can be queried, transformed and combined in real time. Graphite ships with a built-in web interface that lets users browse metrics and graphs.
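To make "receives metrics via its backend" a bit more concrete: Graphite's backend (Carbon) accepts a plaintext protocol with one "path value timestamp" line per sample. The sketch below assumes a Carbon listener on localhost at the conventional plaintext port 2003. The helper names and the metric path are made up for illustration and are not part of Graphite itself.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext-protocol line: path, value and Unix timestamp."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="localhost", port=2003, timestamp=None):
    """Send a single sample to a Carbon listener.

    host and port are assumptions; adjust them to your own setup.
    """
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
```

With a reachable Carbon listener, calling send_metric("servers.web01.cpu.load", 0.42) would record one sample under that (hypothetical) metric path.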
How Is Graphite Commonly Used?
Graphite is commonly used to monitor infrastructure-level metrics such as CPU, memory and I/O utilization, network throughput and latency, though it works just as well for application- and business-level metrics. Collectd (https://collectd.org/) is the main reason for this common use.
Collectd, a well-known and long-standing Linux project for collecting infrastructure-level metrics, has shipped with a Graphite "write-plugin" since 2012. Collectd comes with many collection plugins that capture anything from CPU and battery usage to Java and Redis metrics.
Typically, metrics are not sent directly to Graphite's backend, but are sent one metric, or sample, at a time to a metric collection service. StatsD (https://github.com/etsy/statsd), another open-source project, is one such popular service. It aggregates the samples it receives, calculates counts, averages, standard deviations and a few other statistics, and periodically (e.g. every 10s) flushes these to a metric database. Graphite is the default backend for StatsD.
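The aggregate-then-flush idea can be sketched in a few lines. This is not StatsD's actual code, only a minimal illustration of collecting raw samples during a flush interval and reducing them to a handful of statistics. The class name, method names and the choice of a population standard deviation are assumptions made for the example.

```python
import math
from collections import defaultdict


class FlushWindow:
    """Collects raw samples and summarises them once per flush interval."""

    def __init__(self):
        self.samples = defaultdict(list)

    def record(self, name, value):
        """Store one sample for the given metric name."""
        self.samples[name].append(float(value))

    def flush(self):
        """Return {name: statistics} for the current window, then reset it."""
        summary = {}
        for name, values in self.samples.items():
            count = len(values)
            mean = sum(values) / count
            # Population variance: divide by count, not count - 1.
            variance = sum((v - mean) ** 2 for v in values) / count
            summary[name] = {
                "count": count,
                "mean": mean,
                "std": math.sqrt(variance),
                "lower": min(values),
                "upper": max(values),
            }
        self.samples = defaultdict(list)
        return summary
```

In a real service, flush() would run on a timer (for example every 10 seconds) and each resulting statistic would be forwarded to the metric database as its own time series.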
For visualization, Grafana (http://grafana.org/) is today's popular choice alongside the built-in web interface. It's not hard to see why: slick-looking dashboards are easily created with a well-thought-out user interface. Grafana gets its information first and foremost from Graphite, but also works with several other popular metric databases such as InfluxDB, OpenTSDB and Prometheus.
Graphite itself does not provide the ability to alert when metrics go out of expected ranges. Again, there are several solutions that provide this feature. Cabot (https://github.com/arachnys/cabot) seems to be a popular choice, though you will not need it when using StackState, which provides the same functionality. The difference between Cabot and StackState's alerting capabilities is that StackState integrates with multiple monitoring solutions, which lets you run checks not only over Graphite data but over any kind of monitoring data in any combination, taking the entire stack into account.
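To illustrate what "out of expected ranges" means in practice, here is a minimal sketch of a range check. It assumes the JSON shape returned by Graphite's render API when called with format=json: a list of series, each with a "target" and a list of [value, timestamp] datapoints. The function name and thresholds are hypothetical, and this is not how Cabot or StackState implement their checks.

```python
import json


def out_of_range(render_json, lower, upper):
    """Return (target, timestamp, value) for each datapoint outside [lower, upper]."""
    violations = []
    for series in json.loads(render_json):
        for value, timestamp in series["datapoints"]:
            if value is None:  # missing samples are reported as null
                continue
            if value < lower or value > upper:
                violations.append((series["target"], timestamp, value))
    return violations
```

A real alerting tool would add scheduling, notification and de-duplication on top of a check like this.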
What's So Great About Graphite?
- It's very fast. Its architecture is modular and scales
- It's well known, has a big community and there is a lot of support
- There is a lot of open-source tooling to work with Graphite: http://graphite.readthedocs.org/en/latest/tools.html
- It does a single job and does it well
- It's Apache 2.0 licensed
What Is Less Great?
- It does not shard data on its own; a common solution is to run multiple instances of Graphite
- Installing Graphite can be a complex task, although nowadays there are complete Docker images that install Graphite and all its dependencies at once
In our next blog post we will cover the use of Graphite (or similar metric stores) in combination with StackState.