4 Time Series Databases to Use In 2019
See a list of 4 time series databases to use in 2019.
When developing IoT, financial, or industrial applications, choosing the right time series database can be a headache, especially with more than 30 time series vendors in the industry.
When choosing a time series database, always look at what each one has to offer and how well it suits your needs. Do you prefer writing SQL directly, or are you open to a brand new processing language for your time series? Are you looking for a cloud-based solution, or do you have your own integration solutions? This article will help you compare your options.
Here is my list of the best time series databases to use in 2019.
Built by InfluxData in 2013, InfluxDB is an open-source time series database that runs on all major operating systems. InfluxDB has client libraries for a very large set of programming languages (yes, even Lisp and Clojure). It is optimized for heavy write loads and handles concurrency very well.
InfluxDB is schema-free: it follows a NoSQL approach and allows for quick changes to the shape of your data. Depending on what you are trying to build, this design choice may or may not suit your needs.
Why should you use InfluxDB?
Play with it in 5 minutes
Five minutes is all it takes from the moment you download it until you are able to play with it. Good technical documentation makes it super easy to install, configure, and launch InfluxDB. Because InfluxDB is schema-free, you don't have to define tables or columns up front. You just insert your data and you are good to go.
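To give a feel for what "just insert your data" looks like, here is a minimal sketch that formats a data point in InfluxDB's line protocol (measurement, optional tags, fields, timestamp). The helper names `_escape` and `to_line_protocol` and the sample values are made up for illustration; this only builds the text line and does not talk to a server.

```python
def _escape(value: str, chars: str) -> str:
    """Backslash-escape every character of `chars` found in `value`."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one line-protocol line (hypothetical helper, not a client library).

    tags:   dict of str -> str (indexed metadata)
    fields: dict of str -> bool | int | float | str (the measured values)
    """
    # Measurement names escape commas and spaces.
    head = _escape(measurement, ", ")
    # Tags are sorted by key, which is the recommended order for writes.
    for key in sorted(tags):
        head += "," + _escape(key, ",= ") + "=" + _escape(tags[key], ",= ")

    parts = []
    for key, value in fields.items():
        if isinstance(value, bool):          # check bool before int
            rendered = "true" if value else "false"
        elif isinstance(value, int):
            rendered = f"{value}i"           # integers carry an "i" suffix
        elif isinstance(value, float):
            rendered = repr(value)
        else:
            # String fields are double-quoted; escape backslashes, then quotes.
            rendered = '"' + _escape(str(value), '\\"') + '"'
        parts.append(_escape(key, ",= ") + "=" + rendered)

    return f"{head} {','.join(parts)} {timestamp_ns}"


line = to_line_protocol(
    "cpu",
    {"region": "us-west", "host": "server01"},
    {"usage": 0.64, "cores": 4},
    1556813561098000000,
)
print(line)
```

No table or column is declared anywhere: the measurement, tag keys, and field keys are created by the write itself.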
Integrated TICK stack
InfluxDB is part of the TICK stack: Telegraf, InfluxDB, Chronograf, and Kapacitor. InfluxData provides an out-of-the-box visualization tool (Chronograf, which can be compared to Grafana), a data processing engine that binds directly with InfluxDB (Kapacitor), and a set of more than 50 agents (Telegraf plugins) that can collect real-time metrics from a lot of different data sources.
Now let's be fair
InfluxDB is, most of the time, used with Grafana. Chronograf is not (at the moment) as good as Grafana, but InfluxData is trying to turn the ship around. By building Flux, a new processing language, and integrating it directly with Chronograf, they might offer some unique features in the coming months.
Ranked at #15 last year, TimescaleDB is making huge progress in the rankings this year.
Well, if you ask me, they provide a very solid and scalable alternative to InfluxDB. TimescaleDB is also open source, and it is built on SQL. It can be used from a very large set of programming languages (including Java and Python), so your applications can integrate directly with it.
TimescaleDB is directly tied to PostgreSQL: it extends the famous relational database with a set of time-series-related capabilities (such as fast ingest).
Why should you use TimescaleDB?
One of the greatest assets of TimescaleDB is that it supports SQL natively, so developers can jump on board quickly without having to learn a new language. This is, of course, great for developer productivity, as the SQL-experienced developers on your team can be effective with TimescaleDB immediately.
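As a rough illustration of why plain SQL is attractive here, the sketch below averages readings per five-minute bucket using nothing but standard SQL. It runs against SQLite (from Python's standard library) purely as a stand-in so that the example is self-contained; it is not TimescaleDB, and the `conditions` table, its columns, and the `average_per_bucket` helper are invented for the example. On TimescaleDB itself, the bucketing expression would typically be written with its `time_bucket` function instead of integer arithmetic.

```python
import sqlite3


def average_per_bucket(rows, bucket_seconds=300):
    """Average temperature per device per time bucket, using plain SQL.

    rows: iterable of (unix_timestamp_seconds, device, temperature).
    SQLite is only a stand-in here; the table and column names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE conditions (ts INTEGER, device TEXT, temperature REAL)"
    )
    conn.executemany("INSERT INTO conditions VALUES (?, ?, ?)", rows)

    # Integer division floors each timestamp to the start of its bucket.
    query = """
        SELECT (ts / :w) * :w AS bucket, device, AVG(temperature)
        FROM conditions
        GROUP BY bucket, device
        ORDER BY bucket, device
    """
    result = conn.execute(query, {"w": bucket_seconds}).fetchall()
    conn.close()
    return result


readings = [
    (0, "a", 20.0),
    (60, "a", 22.0),
    (300, "a", 30.0),
    (310, "b", 10.0),
]
for bucket, device, avg_temp in average_per_bucket(readings):
    print(bucket, device, avg_temp)
```

Anyone who has written a `GROUP BY` before can read this query, which is the productivity argument in a nutshell.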
The Guardian did a very nice piece explaining why they went from MongoDB to PostgreSQL in order to scale their architecture and encrypt their content at rest. As you can tell, big companies are relying on SQL-based systems (with a cloud architecture, of course) to ensure system reliability and accessibility. I believe that PostgreSQL will continue to grow and so will TimescaleDB. By belonging to the PostgreSQL ecosystem, TimescaleDB will inherit all the tools and plugins developed by this huge community.
Debatably better performance than InfluxDB
I must emphasize that this is a debatably better-performing system: it is pretty new to the market and has not been tested on all the different cases that the industry has to offer.
As a fair-minded writer, I must point out that Matvey Arye wrote a very good article comparing Flux to SQL and, in a way, InfluxDB to TimescaleDB. His points about query optimization, in particular, should be read carefully, and they provide a very solid explanation of why TimescaleDB could be more performant (at least in theory).
Matvey Arye article — SQL vs. Flux
OpenTSDB has been around for much longer than its competitors and is one of the first technologies to address the need to store time series data at a very large scale. OpenTSDB promises to be able to store hundreds of billions of data rows over distributed instances of TSD servers.
OpenTSDB is a schema-free database built on Apache HBase. For those who don't know, HBase is a non-relational database written to handle the storage of very large tables in an elegant and efficient way.
Why should you use OpenTSDB?
Ted Dunning (Chief Application Architect at MapR) gave a very instructive talk about how time series databases should be built and how arranging time ranges horizontally could scale a DBMS up to 20 to 30 million writes per second. This is a huge insertion rate, considering that a single InfluxDB node can handle up to one million writes per second.
You might want to give OpenTSDB a shot if you are dealing with such insertion rates in your system.
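One way to picture "arranging time ranges horizontally" is a wide-row layout: instead of one row per data point, every point of a metric that falls within the same hour lands in the same row, with one column per offset inside that hour. The sketch below is a simplified illustration loosely modeled on that idea; it is not OpenTSDB's actual storage code, it ignores tags and binary encoding, and `to_wide_rows` and `ROW_SPAN_SECONDS` are names made up for the example.

```python
from collections import defaultdict

# Assumption for this sketch: each row covers one hour of data.
ROW_SPAN_SECONDS = 3600


def to_wide_rows(points):
    """Group (metric, unix_timestamp, value) points into hour-wide rows.

    Returns a dict mapping a row key (metric, hour_start) to a dict of
    {offset_in_seconds: value}, i.e. one "column" per point in that hour.
    """
    rows = defaultdict(dict)
    for metric, timestamp, value in points:
        # Floor the timestamp to the start of its hour to get the row key.
        base = timestamp - (timestamp % ROW_SPAN_SECONDS)
        # The column name is the offset of the point inside that hour.
        rows[(metric, base)][timestamp - base] = value
    return dict(rows)


points = [
    ("sys.cpu.user", 3600, 1.0),
    ("sys.cpu.user", 3660, 2.0),
    ("sys.cpu.user", 7200, 3.0),
]
for row_key, columns in to_wide_rows(points).items():
    print(row_key, columns)
```

With this layout, reading an hour of data for a metric touches a single row, and consecutive writes for the same metric append columns to an existing row rather than creating new ones.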
According to the documentation, OpenTSDB integrates with a fair number of tools such as Cassandra, BigTable, CollectD, StatsD, Chef, and even Puppet for deployment management.
Ted Dunning on Time Series Database Architecture
Graphite is an even more established and very widely used time series database system. Graphite is a powerful monitoring tool that stores numeric time series data and displays them on demand via its Graphite-web interface at a fair speed. Graphite is, most of the time, used as a system, network, and application performance metric store. Big companies such as Booking.com, Reddit, and GitHub use it on a daily basis to easily detect outages on their architectures.
Why should you use Graphite?
Graphite does a few things, but it does them well
Graphite is built to deal with numeric data. While this can be a limitation if you are not dealing with numeric data, Graphite provides out-of-the-box tools that make it easy for developers to get started. Graphite Web provides a very nice interface for developers to monitor their applications.
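The numeric-only model shows in how data reaches Graphite: in its plaintext protocol, a data point is a single line made of a dotted metric path, a numeric value, and a Unix timestamp. Here is a minimal sketch of that format. The helper names `format_metric` and `send_metrics` are invented for the example, and the host and port are assumptions (2003 is the usual default for the Carbon plaintext listener, but check your own setup).

```python
import socket
import time


def format_metric(path, value, timestamp):
    """Render one data point as a Graphite plaintext line: "path value timestamp"."""
    # Graphite only stores numbers, so reject anything else up front.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"Graphite values must be numeric, got {value!r}")
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host="localhost", port=2003):
    """Send pre-formatted lines to a Carbon plaintext listener.

    host and port are assumptions for this sketch; adjust to your deployment.
    """
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


line = format_metric("servers.web01.cpu.load", 0.75, time.time())
print(line, end="")
# send_metrics([line])  # uncomment when a Graphite/Carbon instance is reachable
```

The dotted path doubles as the hierarchy you later browse in Graphite Web, so it is worth settling on a naming scheme early.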
A good integration ecosystem
Like OpenTSDB, Graphite connects with a lot of tools natively and makes it easy for developers to connect with their existing infrastructure. Graphite is able to easily connect with CollectD, Sensu, Riemann, Windows Server, Logstash, and many more.
Your turn to share!
Do you have experience with these time series databases? If so, which one would you recommend and why?
Also, if you find that a time series database should be ranked higher or lower, feel free to give your own rankings in the comment section.
Published at DZone with permission of Antoine SOLNICHKIN. See the original article here.