Gnocchi vs. Prometheus
One is probably not inherently better or worse than the other, but depending on your time series database use cases, you might prefer Gnocchi to Prometheus or vice versa.
The realm of the time series database has been expanding over the past few years. Now and then, a new contender appears from the fog. People keep asking me about the difference between Gnocchi and Prometheus. It's time to compare them!
Gnocchi and Prometheus are two open-source projects evolving in the same area of expertise: time series handling. Both are licensed under the Apache 2.0 license (see the Gnocchi license file and the Prometheus license file). And that's a good thing!
Both Gnocchi and Prometheus offer a bunch of features. Here's a table summary of the differences between the features they both offer, or not.
There's a lot of overlap between the two projects, but there are also some major differences.
First, Gnocchi does not try to solve the metric retrieval problem. Prometheus provides a pull mechanism and takes charge of getting the measurements. Gnocchi developers consider that plenty of tools, such as collectd, already do that job well.
Secondly, Prometheus offers an alerting engine that is statically configured with a YAML file. It is way better than Gnocchi, which offers nothing in comparison — for now. Gnocchi developers are discussing the feature, and while it's not on the roadmap yet, it will happen. It will, however, leverage a REST API to be controlled, as it seems important to us to be able to define alerts programmatically.
Then, there is a bunch of features where Gnocchi shines compared to Prometheus, and they are the core of its function: storing metrics. Gnocchi has a great storage engine that supports many storage backends (plain files, OpenStack Swift, Ceph, etc.). This helps Gnocchi scale horizontally and provide native high availability, whereas Prometheus stays a single point of failure.
Multi-tenancy and authentication are also supported by Gnocchi, allowing a single instance to be shared by multiple accounts. System administrators do not commonly use this kind of feature, but application developers usually need it.
That brings me to the usage and querying of Prometheus and Gnocchi. Prometheus has a small DSL (referred to as PromQL), whereas Gnocchi has a fully featured REST API that tries to expose proper semantics. It does not seem that there are major differences between the two in terms of features.
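To make the difference in query style concrete, here is a minimal sketch that only builds the request URLs for a "minimum per five-minute range" query and sends nothing over the network. The host names, the `cpu_usage` metric name, and the `METRIC_ID` placeholder are hypothetical. The Prometheus side uses the `/api/v1/query_range` endpoint with a PromQL expression; the Gnocchi parameters (`aggregation`, `granularity`, `start`) follow my reading of its REST API and should be checked against the documentation of the version you run.

```python
from urllib.parse import urlencode


def prometheus_range_query_url(base_url, promql, start, end, step):
    """Build a range-query URL; the aggregation lives in the PromQL expression."""
    params = urlencode({"query": promql, "start": start, "end": end, "step": step})
    return f"{base_url}/api/v1/query_range?{params}"


def gnocchi_measures_url(base_url, metric_id, aggregation, granularity, start):
    """Build a measures URL; the aggregation is selected with plain REST parameters."""
    params = urlencode(
        {"aggregation": aggregation, "granularity": granularity, "start": start}
    )
    return f"{base_url}/v1/metric/{metric_id}/measures?{params}"


# Hypothetical hosts and identifiers, for illustration only.
prom_url = prometheus_range_query_url(
    "http://prometheus.example:9090",
    "min_over_time(cpu_usage[5m])",  # minimum over each 5-minute window
    1500000000,
    1500086400,
    "5m",
)
gnocchi_url = gnocchi_measures_url(
    "http://gnocchi.example:8041",
    "METRIC_ID",  # placeholder for a real metric UUID
    "min",
    300,  # granularity in seconds
    "2017-07-14T00:00:00",
)

print(prom_url)
print(gnocchi_url)
```

In the first case the intent is encoded in a query language; in the second it is spread across resource paths and parameters, which is what makes the REST approach easy to drive programmatically.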
Both Prometheus and Gnocchi support retrieving values aggregated over time ranges ("give me the minimum value for every five-minute range over the last day"). The difference is when the aggregation happens: Gnocchi always aggregates metrics at write time, and never at query time (unless aggregating across metrics). This implies that Gnocchi needs a bit of CPU time at write time to pre-compute those aggregates, but it is blazingly fast at read time as it has nothing left to compute. Prometheus can do the same thing using recording rules.
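The trade-off between aggregating at write time and at query time can be sketched with a toy in-memory store. This is not how either project is implemented; the class and function names are made up for illustration, and only the "minimum per five-minute bucket" aggregate is handled.

```python
GRANULARITY = 300  # seconds, i.e. five-minute buckets


class WriteTimeAggregator:
    """Toy store that keeps only the pre-computed minimum of each bucket."""

    def __init__(self, granularity=GRANULARITY):
        self.granularity = granularity
        self.minimums = {}  # bucket start timestamp -> minimum value seen

    def write(self, timestamp, value):
        # The aggregation cost is paid here, once per incoming measure.
        bucket = timestamp - (timestamp % self.granularity)
        current = self.minimums.get(bucket)
        if current is None or value < current:
            self.minimums[bucket] = value

    def read(self, start, stop):
        # Nothing to compute: just return the stored aggregates in range.
        return sorted((b, v) for b, v in self.minimums.items() if start <= b < stop)


def query_time_minimums(raw_points, granularity, start, stop):
    """Alternative: keep raw points and aggregate them on every query."""
    buckets = {}
    for timestamp, value in raw_points:
        if start <= timestamp < stop:
            bucket = timestamp - (timestamp % granularity)
            buckets[bucket] = min(value, buckets.get(bucket, value))
    return sorted(buckets.items())


points = [(0, 5.0), (120, 3.0), (299, 4.0), (300, 7.0), (450, 6.0)]

store = WriteTimeAggregator()
for ts, val in points:
    store.write(ts, val)

write_time_result = store.read(0, 600)
query_time_result = query_time_minimums(points, GRANULARITY, 0, 600)
print(write_time_result)  # [(0, 3.0), (300, 6.0)]
print(query_time_result)  # [(0, 3.0), (300, 6.0)]
```

Both paths return the same answer. The write-time version does its work once per measure and makes reads a lookup, while the query-time version keeps raw data and repeats the computation on each request.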
Prometheus has some limitations inherent to time series databases designed around the notion of monitoring: they tend to compute everything relative to $NOW. For example, it seems impossible to inject data from the past. The timestamp for a value is the time at which Prometheus read that value. If Prometheus misses values for a few hours, don't think about importing them back.
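The timestamping difference can be illustrated with a toy model. It is a sketch of the two ingestion styles, not code from either project: `PullStore` and `PushStore` are hypothetical names, and the clock is faked so the example is deterministic.

```python
class PullStore:
    """Toy pull-based store: the timestamp is the moment the value is scraped."""

    def __init__(self, clock):
        self.clock = clock  # callable returning the current time
        self.points = []

    def scrape(self, read_value):
        # The caller cannot choose the timestamp; it is always "now".
        self.points.append((self.clock(), read_value()))


class PushStore:
    """Toy push-based store: the sender supplies the timestamp of each measure."""

    def __init__(self):
        self.points = []

    def push(self, timestamp, value):
        # Past timestamps are accepted, so missed data can be backfilled.
        self.points.append((timestamp, value))
        self.points.sort()


def fake_now():
    return 1000  # fixed fake clock for the example


pull = PullStore(clock=fake_now)
pull.scrape(lambda: 42.0)

push = PushStore()
push.push(1000, 42.0)
push.push(400, 40.0)  # a measure from the past, imported after the fact

print(pull.points)  # [(1000, 42.0)]
print(push.points)  # [(400, 40.0), (1000, 42.0)]
```

With the pull model there is simply no place to say "this value is from an hour ago", which is also why generating historical data for an ingestion benchmark is awkward.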
I'm noting this here as it makes it harder to benchmark Prometheus for ingestion. You need tons of fake metrics for it to poll in order to build up data. I did not find any reference on Prometheus performance online, though it is advertised to ingest "millions of measures from thousands of sources."
Query performance seems to vary on Prometheus, and I did not find any benchmark on that either. Gnocchi leverages a standard RDBMS (MySQL and PostgreSQL are supported) to query indexed data, and metric retrieval is always O(1), making it consistently fast.
If you look in different and older areas, there has never been only one HTTP server. Many people use Apache HTTP server, but you'll find plenty of users of Nginx, Tomcat, HAProxy, Node.js, or uwsgi, which are also common options nowadays. The same goes for RDBMS if you look at PostgreSQL, MySQL, and other database solutions. There will never be a single project winning all the market share.
It seems to me that time series storage and management are also growing in this category. There will probably be various projects that will enjoy some popularity and growth. Every project addresses the time series problem space with a different view and different trade-offs. There might never be a single project solving all problems at once.
Prometheus seems to be oriented toward monitoring live systems. Gnocchi is oriented toward highly available time series storage at massive scale. Leaving performance aside (I was not able to compare it, anyway), the two make different trade-offs in terms of features, philosophy, and orientation. Depending on your use cases, one might be a better fit than the other.
Published at DZone with permission of Julien Danjou. See the original article here.