
Monitoring CoreOS Clusters


[This article was written by Mick Emmett]

In this post you’ll learn how to get operational insights (e.g. performance metrics and container events) from CoreOS, and how etcd, fleet, and SPM make that simple.

We’ll use:

  • SPM for Docker to run the monitoring agent as a Docker container and collect Docker metrics and events for all other containers on the same host, plus metrics for the host itself
  • fleet to distribute this container to all hosts in the CoreOS cluster, using the fleet unit file shown below
  • etcd to hold the SPM App Token for the whole cluster

The Big Picture

Before we get started, let’s take a step back and look at our end goal.  What do we want?  We want charts with Performance Metrics, we want Event Collection, we’d love integrated Anomaly Detection and Alerting, and we want that not only for containers, but also for hosts running containers.  CoreOS has no package manager and deploys services in containers, so we want to run the SPM agent in a Docker container, as shown in the following figure:

[Figure: SPM for Docker]

By the end of this post each of your Docker hosts could look like the above figure, with one or more of your own containers running your own apps, and a single SPM Docker Agent container that monitors all your containers and the underlying hosts.

3 Simple Steps

1)  Create a new SPM App of type “Docker” and copy the SPM App Token

2) Set the SPM App Token via etcd. This makes the token instantly available to all SPM agent instances in the cluster:

etcdctl set /sematext.com/myapp/spm/SPM_TOKEN YOUR_SPM_APP_TOKEN

Of course, you can change the “myapp” part to whatever you want. It simply acts as a namespace in etcd in case you have multiple SPM Apps (and thus multiple SPM App Tokens). Just make sure the key matches the one the fleet unit file reads with etcdctl get.
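Before moving on, it can help to confirm that the token is readable under the key the agent will look up. The following is a minimal sketch, not part of the SPM tooling: the function name check_spm_token is hypothetical, and the default key is assumed to be the one the fleet unit file in this post passes to etcdctl get.

```shell
#!/bin/sh
# Hypothetical helper: verify that an SPM App Token is stored in etcd.
# Assumption: the default key is the one read by the unit file in this post.
check_spm_token() {
  key="${1:-/sematext.com/myapp/spm/SPM_TOKEN}"
  # etcdctl get prints the value and exits non-zero if the key is missing
  if token=$(etcdctl get "$key" 2>/dev/null) && [ -n "$token" ]; then
    echo "token found at $key"
  else
    echo "no token at $key" >&2
    return 1
  fi
}
```

On any cluster node, calling check_spm_token with no arguments checks the default key. Pass a different key as the first argument if you changed the “myapp” namespace.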

3) Grab the spm-agent.service fleet unit file and submit it to fleet:

# download service file for spm-agent-docker
wget https://raw.githubusercontent.com/sematext/spm-agent-docker/master/coreos/spm-agent.service
# Load and start the service with
fleetctl load spm-agent.service
fleetctl start spm-agent.service
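Once the unit is started, you may want to check that it is active on every machine. Here is a small sketch of one way to do that. It assumes the default fleetctl list-units layout (columns UNIT, MACHINE, ACTIVE, SUB), and the function name count_inactive_agents is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count spm-agent.service instances that are not active.
# Assumption: fleetctl list-units prints the columns UNIT MACHINE ACTIVE SUB.
count_inactive_agents() {
  fleetctl list-units |
    awk '$1 == "spm-agent.service" && $3 != "active" { n++ } END { print n + 0 }'
}
```

A result of 0 suggests the agent is active on every machine that fleet lists. For anything else, fleetctl status spm-agent.service and fleetctl journal spm-agent.service are good places to look next.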

Fleet unit file

What’s this fleet unit file about?  It’s simple: it reads the SPM App Token from etcd and then starts the Docker container with spm-agent-docker inside. This is what it looks like:

[Unit]
Description=SPM Docker Agent
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
EnvironmentFile=/etc/environment
Restart=always
RestartSec=30s
ExecStartPre=-/usr/bin/docker kill spm-agent
ExecStartPre=-/usr/bin/docker rm spm-agent
ExecStartPre=/usr/bin/docker pull sematext/spm-agent-docker:latest
ExecStart=/bin/sh -c 'set -ex; /usr/bin/docker run --name spm-agent -e SPM_TOKEN=$(etcdctl get /sematext.com/myapp/spm/SPM_TOKEN) -e HOSTNAME=$HOSTNAME -v /var/run/docker.sock:/var/run/docker.sock sematext/spm-agent-docker'
ExecStop=/usr/bin/docker stop spm-agent

[Install]
WantedBy=multi-user.target

[X-Fleet]
Global=true

After about a minute, you should see Docker metrics and events in SPM.

[Screenshot, 2015-06-24]
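If nothing shows up, a quick check on an individual host is whether the agent container is actually running. The sketch below assumes a Docker version whose docker ps supports --filter and --format. The function name agent_running is hypothetical, while the container name spm-agent comes from the unit file above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a container named exactly "spm-agent" is running.
# The name filter matches substrings, so grep -x enforces an exact match.
agent_running() {
  docker ps --filter "name=spm-agent" --format '{{.Names}}' | grep -qx "spm-agent"
}
```

If agent_running fails, docker logs spm-agent on that host usually shows why the container exited, for example because the token could not be read from etcd.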

Open Sourced Everything

Everything described here is open-sourced:

Summary – What this gets you

What we get after this setup is the following:

Having this little setup lets you take full advantage of SPM and Logsene, e.g. by defining intelligent alerts for metrics and logs, delivered to channels like e-mail, PagerDuty, Slack, HipChat or any WebHook, as well as making correlations between performance metrics, events, logs, and alerts.

Running CoreOS? Need any help getting CoreOS metrics and/or logs into SPM & Logsene?  Let us know!  Oh, and if you’re a small startup — ping @sematext — you can get a good discount on both SPM and Logsene!
