
An Introduction to OpenTSDB


This tutorial shows how to deploy OpenTSDB in clustered mode.


In our last three articles, we covered the HDFS, ZooKeeper, and HBase clusters needed to deploy OpenTSDB in clustered mode. Continuing the series, in this article we will finally deploy OpenTSDB.

OpenTSDB is a distributed, scalable, time series database built on top of Hadoop and HBase.

OpenTSDB can collect, store, and serve billions of data points without any loss of precision, which makes it a good fit for monitoring systems.

OpenTSDB uses HBase to store time series data and ZooKeeper to discover the HBase cluster. It consists of a Time Series Daemon (TSD) and a set of command line utilities. Interaction with OpenTSDB is primarily achieved by running one or more TSDs. Each TSD is independent: there is no master and no shared state, so we can run as many TSDs as required to handle whatever load you throw at them. Each TSD uses HBase, an open source database, to store and retrieve time series data.
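As a concrete example, a TSD accepts data points over a simple line-based telnet-style protocol on its listening port (4242 by default). A minimal sketch, assuming a TSD reachable on localhost; the metric and tag names are illustrative:

```shell
# Send one data point to a TSD over the telnet-style protocol.
# Line format: put <metric> <unix-timestamp> <value> <tag>=<value> [...]
# "sys.cpu.user" and "host=web01" are illustrative names.
echo "put sys.cpu.user 1356998400 42.5 host=web01" | nc -w 1 localhost 4242
```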

Properties of OpenTSDB:

  • Data is stored exactly as you give it
  • Writes with millisecond precision
  • Keeps raw data forever
  • Runs on Hadoop and HBase
  • Scales to millions of writes per second
  • Adds capacity by adding nodes
  • Generates graphs from the built-in GUI
  • Provides an HTTP API
  • Integrates with tools like Grafana for richer visualization
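For instance, once a TSD is running, the HTTP API mentioned above can be queried with curl. A hedged sketch, assuming a TSD on localhost and an illustrative metric name:

```shell
# Query the last hour of an illustrative metric via the HTTP API.
# "sum:sys.cpu.user" aggregates the metric across all tag combinations;
# the TSD returns the matching data points as JSON.
curl -s "http://localhost:4242/api/query?start=1h-ago&m=sum:sys.cpu.user"
```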

Why HBase:

The properties of HBase make it a perfect fit for OpenTSDB:

Scalable: HBase uses HDFS to store data, so if we want to store more data, we just add more DataNodes to the cluster.

Automatic replication: Your data is stored in HDFS, which by default keeps 3 replicas on 3 different machines. You can also enable region replication when creating the tsdb tables in HBase (our create_tbl.sh script sets REGION_REPLICATION => 2).

High write throughput: The Bigtable design, which HBase follows, uses LSM trees instead of, say, B-trees, to make writes cheaper.

Create Tables in HBase for OpenTSDB:

OpenTSDB uses the following tables in HBase:
tsdb, tsdb-uid, tsdb-tree, and tsdb-meta.

The time series data is stored in the tsdb table.

Copy the create_tbl.sh file into any of the HBase containers we deployed in our last article. We will do this in the hbase1 container.

#!/bin/sh


TSDB_TABLE=${TSDB_TABLE-'tsdb'}
UID_TABLE=${UID_TABLE-'tsdb-uid'}
TREE_TABLE=${TREE_TABLE-'tsdb-tree'}
META_TABLE=${META_TABLE-'tsdb-meta'}
BLOOMFILTER=${BLOOMFILTER-'ROW'}
COMPRESSION='GZ'
exec "hbase" shell <<EOF
create '$UID_TABLE', {REGION_REPLICATION => 2},
  {NAME => 'id', COMPRESSION => '$COMPRESSION', BLOOMFILTER => '$BLOOMFILTER'},
  {NAME => 'name', COMPRESSION => '$COMPRESSION', BLOOMFILTER => '$BLOOMFILTER'}
create '$TSDB_TABLE', {REGION_REPLICATION => 2},
  {NAME => 't', VERSIONS => 1, COMPRESSION => '$COMPRESSION', BLOOMFILTER => '$BLOOMFILTER'}
create '$TREE_TABLE', {REGION_REPLICATION => 2},
  {NAME => 't', VERSIONS => 1, COMPRESSION => '$COMPRESSION', BLOOMFILTER => '$BLOOMFILTER'}
create '$META_TABLE', {REGION_REPLICATION => 2},
  {NAME => 'name', COMPRESSION => '$COMPRESSION'}
EOF

docker cp create_tbl.sh hbase1:/

Log in to the container and run the create_tbl.sh script:

docker exec -it hbase1 bash
/create_tbl.sh

or

docker exec -it hbase1 /create_tbl.sh
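To sanity-check that the script worked, you can list the tables from the HBase shell. A sketch, assuming the hbase1 container from our last article:

```shell
# List HBase tables non-interactively and show the four tsdb tables
# (tsdb, tsdb-uid, tsdb-tree, tsdb-meta) created by the script.
docker exec hbase1 sh -c 'echo "list" | hbase shell' | grep "tsdb"
```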

Deploy OpenTSDB:

To deploy OpenTSDB, we will use the open source Docker image provided by Peter Grace.

opentsdb.conf

tsd.core.auto_create_metrics = true
tsd.core.auto_create_tagks = true
tsd.core.auto_create_tagvs = true
tsd.core.meta.enable_realtime_ts = false
tsd.core.meta.enable_realtime_uid = false
tsd.core.meta.enable_tsuid_incrementing = false
tsd.core.meta.enable_tsuid_tracking = false
tsd.core.plugin_path =
tsd.core.preload_uid_cache = false
tsd.core.preload_uid_cache.max_entries = 300000
tsd.core.socket.timeout = 0
tsd.core.storage_exception_handler.enable = false
tsd.core.timezone = Asia/Kolkata
tsd.core.tree.enable_processing = false
tsd.core.uid.random_metrics = false
tsd.http.cachedir = /tmp
tsd.http.query.allow_delete = true
tsd.http.request.cors_domains =
tsd.http.request.cors_headers = Authorization, Content-Type, Accept, Origin, User-Agent, DNT, Cache-Control, X-Mx-ReqToken, Keep-Alive, X-Requested-With, If-Modified-Since
tsd.http.request.enable_chunked = true
tsd.http.request.max_chunk = 40960
tsd.http.show_stack_trace = true
tsd.http.staticroot = /opt/opentsdb/opentsdb-2.2.0/build/staticroot
tsd.mode = rw
tsd.network.async_io = true
tsd.network.bind = 0.0.0.0
tsd.network.keep_alive = true
tsd.network.port = 4242
tsd.network.reuse_address = true
tsd.network.tcp_no_delay = true
tsd.network.worker_threads =
tsd.no_diediedie = false
tsd.query.allow_simultaneous_duplicates = true
tsd.query.filter.expansion_limit = 4096
tsd.query.skip_unresolved_tagvs = false
tsd.query.timeout = 0
tsd.rtpublisher.enable = false
tsd.rtpublisher.plugin =
tsd.search.enable = false
tsd.search.plugin =
tsd.stats.canonical = false
tsd.storage.compaction.flush_interval = 10
tsd.storage.compaction.flush_speed = 2
tsd.storage.compaction.max_concurrent_flushes = 10000
tsd.storage.compaction.min_flush_threshold = 100
tsd.storage.enable_appends = false
tsd.storage.enable_compaction = false
tsd.storage.fix_duplicates = true
tsd.storage.flush_interval = 1000
tsd.storage.hbase.data_table = tsdb
tsd.storage.hbase.meta_table = tsdb-meta
tsd.storage.hbase.prefetch_meta = false
tsd.storage.hbase.tree_table = tsdb-tree
tsd.storage.hbase.uid_table = tsdb-uid
tsd.storage.hbase.zk_basedir = /hbase
tsd.storage.hbase.zk_quorum = <zookeeper1 vm IP>,<zookeeper2 vm IP>,<zookeeper3 vm IP>
tsd.storage.repair_appends = false

Replace <zookeeper1 vm IP>, <zookeeper2 vm IP>, and <zookeeper3 vm IP> with the respective VM IPs. OpenTSDB uses ZooKeeper to get HBase cluster information; both HBase and OpenTSDB will use the same ZooKeeper cluster.
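For example, with three ZooKeeper VMs at hypothetical addresses, the quorum line would look like this (the IPs are illustrative, not from this deployment):

```
tsd.storage.hbase.zk_quorum = 192.168.1.11,192.168.1.12,192.168.1.13
```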

Run the OpenTSDB Docker container:

docker run -dit  --name opentsdb -p 4242:4242 -v /root/hadoop/opentsdb.conf:/etc/opentsdb.conf --network generic-class-net -h opentsdb petergrace/opentsdb-docker

You can open http://<vm1 IP>:4242/ in a browser and see OpenTSDB running.
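You can also verify the TSD from the command line by hitting its /api/version endpoint; a sketch (replace the host with your VM IP):

```shell
# Query the running TSD's version endpoint; it returns a small JSON
# document describing the build if the daemon is up.
TSD_HOST="localhost"   # replace with your VM IP
curl -s "http://${TSD_HOST}:4242/api/version"
```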

In a future article, we will discuss how to use the OpenTSDB HTTP API to push stats to OpenTSDB and how to use TCollector to push host-level and other service stats to OpenTSDB.

Thanks for reading!

Reference

http://opentsdb.net/



Published at DZone with permission of

