Refcard #293

Scaling and Augmenting Prometheus

Prometheus is an open-source infrastructure and services monitoring system popular for Kubernetes and cloud-native services and apps. It can help make metric collection easier, correlate events and alerts, provide security, and support troubleshooting and tracing at scale. This Refcard will teach you how to pave the path for Prometheus adoption, what observability looks like beyond Prometheus, and how to give Prometheus scalability, high availability, and long-term storage.

Brought to you by

Sysdig

Written by

Jorge Salamero Sanz, Director of Technical Marketing, Sysdig
Section 1

Introduction

Prometheus is an open-source infrastructure and services monitoring system that has become popular for Kubernetes and cloud-native services and applications. Although Prometheus itself is just the metrics server, the name is often used for the entire monitoring stack. That stack is built from a few pieces:

  • Exporters: Sidecar containers that collect and expose container and service metrics
  • Prometheus server: Scrapes (pulls) data from the endpoints of all the exporters and other metrics sources
  • Alertmanager: Provides alerting capabilities on top of the metrics server
  • Grafana: Dashboarding interface for querying and displaying metrics
  • Prometheus metrics: The metric format used by Prometheus; many of the first libraries to implement this format were born within the Prometheus project (the format now has its own project)

While deploying Prometheus and getting started with monitoring your infrastructure and services is easy and straightforward, there are some areas where it falls short. Common challenges that organizations face once Prometheus is in production involve scalability, high availability, long-term storage, and day-2 operational drawbacks. In the last few months, several solutions have been made available. We will discuss the approaches that can be taken to run Prometheus at scale.
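To make the pull model in the component list above more concrete, here is a minimal sketch of what an exporter does: it renders its current values in the Prometheus text exposition format and serves them on a `/metrics` endpoint for the server to scrape. The metric names, the sample values, the port, and the helper names (`render_metrics`, `MetricsHandler`, `serve`) are all hypothetical and chosen only for illustration. A real exporter would normally use an official Prometheus client library rather than hand-rolling the format.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory samples: metric name -> (type, help text, value).
SAMPLES = {
    "demo_requests_total": ("counter", "Requests handled by the demo service.", 42),
    "demo_queue_depth": ("gauge", "Items currently waiting in the demo queue.", 3),
}


def render_metrics(samples):
    """Render samples in the Prometheus text exposition format."""
    lines = []
    for name, (metric_type, help_text, value) in sorted(samples.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {metric_type}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers scrape requests on /metrics and returns 404 for anything else."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(SAMPLES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving metrics on the given (assumed) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint. A Prometheus server configured to scrape that host and port would then pull these samples on every scrape interval, which is the opposite of a push model where the service sends its metrics to the monitoring system.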

While many cloud-native services are heavily instrumented with Prometheus metrics, other services and applications are not. Metrics are a precious commodity, but far richer and more powerful insights can be obtained from your infrastructure, services, and applications, and these can support essential workflows like troubleshooting or even security. This Refcard will introduce four ideas for gaining deeper visibility so we can fix issues faster, or even before they happen.

  1. Simplified metric collection

  2. Event correlation and alerts

  3. Troubleshooting and tracing

  4. Security

