Building a Simple AIOps Monitoring Dashboard With Prometheus and Grafana

Learn to build a simple AIOps dashboard using Prometheus, Grafana, and ML-based anomaly detection to monitor metrics, set alerts, and prevent failures.

By Balajee Asish Brahmandam · Aug. 05, 25 · Tutorial

AIOps (Artificial Intelligence for IT Operations) applies machine learning (ML) to detect problems, predict failures, and automate responses. This is changing how businesses manage their IT environments.

This guide shows how to build a simple monitoring dashboard that uses Prometheus to collect metrics and Grafana to visualize them. We'll then add basic AIOps tooling in the form of anomaly detection, so you can spot problems before they turn into failures.

Prerequisites

  • Docker and Docker Compose installed on your machine
  • Basic knowledge of monitoring metrics and alerting
  • Prometheus and Grafana Docker images (both are available via Docker Hub)

Step 1: Set Up Prometheus for Metrics Collection

Prometheus is an open-source monitoring and alerting tool. It scrapes metrics from configured targets at a regular interval. Here's how to configure it:

Create a Prometheus Docker Container

First, create a Docker Compose file to simplify the deployment. Here’s the docker-compose.yml file for Prometheus:

YAML
 
version: '3'
services:
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml


Configure Prometheus

Next, configure Prometheus to collect metrics. Create a prometheus.yml file with the following configuration to scrape data from a simple target:

YAML
 
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'simple_app'
    static_configs:
      - targets: ['host.docker.internal:8080']


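Replace host.docker.internal:8080 with the address of your target application, which must expose Prometheus-compatible metrics (by default, Prometheus scrapes the /metrics path). On Linux, host.docker.internal is not defined in containers by default; mapping it to host-gateway through an extra_hosts entry on the Prometheus service is one way to make it resolve.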

Start Prometheus

Run the following command to start Prometheus:

Shell
 
docker-compose up -d


You can access Prometheus at http://localhost:9090. To confirm that Prometheus is collecting data, query the up metric in the web UI. A value of 1 means the target was scraped successfully, and 0 means the scrape failed.

Step 2: Set Up Grafana for Visualization

Now, let’s set up Grafana to visualize the data collected by Prometheus.

Create a Grafana Docker Container

Add the following service under the existing services: key in your docker-compose.yml file to deploy Grafana:

YAML
 
  grafana:
    image: grafana/grafana
    container_name: grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin


This will start Grafana on port 3000.
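
By default, the Grafana container keeps its dashboards and settings inside the container, so they are lost when the container is removed. A minimal sketch of one way to persist them is shown below. The volume name grafana-data is a hypothetical choice, and the depends_on entry assumes the Prometheus service from Step 1 lives in the same Compose file.

```yaml
services:
  grafana:
    image: grafana/grafana
    container_name: grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      # Named volume (hypothetical name) mounted at Grafana's data directory
      - grafana-data:/var/lib/grafana
    depends_on:
      # Start Prometheus first; this does not wait for it to be ready
      - prometheus

volumes:
  grafana-data:
```

For a throwaway demo, the volume can be left out. It mainly matters once you have built dashboards you want to keep.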

Start Grafana

Run the following command to start Grafana:

Shell
 
docker-compose up -d grafana


You can access Grafana at http://localhost:3000. Log in with the default username admin and the password admin (or the password you specified in the docker-compose.yml file).

Add Prometheus as a Data Source

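  • Go to Configuration > Data Sources (in recent Grafana versions this is under Connections > Data sources).
  • Select Prometheus and enter the URL http://prometheus:9090. The service name prometheus resolves because both containers share the Docker Compose network; use the appropriate URL if your Prometheus runs elsewhere.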
  • Click Save & Test to verify the connection.
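
The same data source can also be declared in a file instead of through the UI, using Grafana's provisioning mechanism. The sketch below assumes a hypothetical host path, ./grafana/provisioning/datasources/prometheus.yml, and the Compose service name prometheus used earlier.

```yaml
# ./grafana/provisioning/datasources/prometheus.yml (hypothetical path)
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    # "proxy" means the Grafana backend makes the requests, not the browser
    access: proxy
    url: http://prometheus:9090
    isDefault: true
```

For Grafana to pick the file up, the provisioning directory would need to be mounted into the container, for example by adding ./grafana/provisioning:/etc/grafana/provisioning to the volumes of the grafana service.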

Create a Simple Dashboard

  • Go to Create > Dashboard and add a new panel.
  • Select Prometheus as the data source and query a metric like up or http_requests_total.
  • Visualize the data using graphs or tables.

Step 3: Implement Basic Alerting

Prometheus offers robust alerting capabilities. You can define alert rules that notify you when certain conditions are met.

Define Alert Rules

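Add the following sections to your prometheus.yml file. The alerting block tells Prometheus where to send alerts, and the top-level rule_files entry lists the files that contain alert rules. The rule itself, which triggers if the application is down (i.e., if up is 0), is defined in the next step:

YAML
 
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']
rule_files:
  - "alert.rules"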


Create an Alert Rule File

Create an alert.rules file with the following content. The rule fires when up has been 0 (the target could not be scraped) for five minutes:

YAML
 
groups:
- name: example
  rules:
  - alert: ApplicationDown
    expr: up == 0
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Application is down"


Configure Alert Notifications

To receive notifications, you can set up Alertmanager, which sends alerts to email, Slack, or other channels. However, for this simple tutorial, we’ll just observe alerts in the Prometheus web UI.
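
If you later decide to run Alertmanager alongside Prometheus, a service along these lines could be added to docker-compose.yml. This is a sketch and not part of the setup above: it assumes an alertmanager.yml file exists next to the Compose file, and it uses the service name alertmanager.

```yaml
services:
  alertmanager:
    image: prom/alertmanager
    container_name: alertmanager
    ports:
      - "9093:9093"
    volumes:
      # Default config location used by the prom/alertmanager image
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml
```

With this layout, the alertmanagers target in prometheus.yml would be alertmanager:9093 rather than localhost:9093, because localhost inside the Prometheus container refers to that container itself.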

Step 4: Integrating Basic AIOps for Anomaly Detection

To enhance the dashboard with AIOps features, we can introduce simple anomaly detection using machine learning. Grafana’s Machine Learning plugin allows you to detect outliers in your metrics.

Install Grafana Machine Learning Plugin

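In Grafana, go to Configuration > Plugins and search for Machine Learning. Install the plugin to use basic ML features like anomaly detection. Availability depends on your Grafana edition and version, so the plugin may not be listed on every installation.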

Use ML for Anomaly Detection

Once the plugin is installed, create a panel with Anomaly Detection as the visualization. This will help you automatically identify when a metric deviates from its expected range.
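
If the plugin is not available in your setup, a rough statistical stand-in can be written directly as Prometheus rules. The sketch below is not the Grafana plugin's method. It is a plain z-score check that flags a request rate more than three standard deviations away from its trailing one-hour average. The rule names, the one-hour window, and the threshold of 3 are illustrative assumptions, and http_requests_total is assumed to be exposed by your application.

```yaml
groups:
  - name: anomaly_sketch
    rules:
      # Per-job request rate over the last 5 minutes
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))

      # Z-score: distance of the current rate from its 1h mean,
      # measured in standard deviations over the same window
      - record: job:http_requests:zscore1h
        expr: |
          (
            job:http_requests:rate5m
            - avg_over_time(job:http_requests:rate5m[1h])
          )
          / stddev_over_time(job:http_requests:rate5m[1h])

      # Fire when the rate stays unusually high or low for 10 minutes
      - alert: RequestRateAnomaly
        expr: abs(job:http_requests:zscore1h) > 3
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Request rate is more than 3 standard deviations from its 1h average"
```

To load these rules, the file would have to be listed under rule_files in prometheus.yml and mounted into the container. The recorded z-score series can also be plotted in a regular Grafana time series panel.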

Step 5: Tips for Expanding the Dashboard

Add Node Exporter for System Metrics

Once you’ve successfully created your basic dashboard with Prometheus and Grafana, you can take it further by incorporating additional exporters and services. 

For example, the Node Exporter is a useful tool that exposes system-level metrics like CPU usage, memory consumption, and disk I/O. Simply add the Node Exporter service to your docker-compose.yml file and update your prometheus.yml configuration to scrape metrics from it.

YAML
 
- job_name: 'node_exporter'
  static_configs:
    - targets: ['node-exporter:9100']


This allows you to visualize the health of your infrastructure alongside application metrics.
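
A Node Exporter service matching the node-exporter:9100 target above could look like the following sketch. The service name is chosen to match that target. Host-level metrics typically need additional host mounts, which are omitted here.

```yaml
services:
  node-exporter:
    image: prom/node-exporter
    container_name: node-exporter
    ports:
      # Optional: only needed to reach the metrics page from the host
      - "9100:9100"
```

The scrape job shown earlier belongs under the existing scrape_configs key in prometheus.yml, next to the simple_app job.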

Using Alertmanager for Notifications

To make alerts actionable, configure Alertmanager to send notifications. You can integrate Alertmanager with various tools like Slack, PagerDuty, or even email. For instance, to set up Slack integration, define a slack_configs section in the alertmanager.yml configuration file.

YAML
 
receivers:
  - name: 'slack-notifications'
    slack_configs:
      - channel: '#alerts'
        send_resolved: true
        username: 'alertmanager'
        api_url: 'https://hooks.slack.com/services/your/webhook/url'


This setup delivers critical alerts to the right team quickly, which shortens incident response and closes the loop on your AIOps-driven observability stack.
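
Alertmanager also requires a top-level route that names a default receiver, so the receivers block above is not a complete alertmanager.yml on its own. A minimal sketch of a full file is shown below. The grouping and timing values are illustrative assumptions, and the webhook URL remains a placeholder.

```yaml
# alertmanager.yml (sketch): a route plus the Slack receiver shown above
route:
  receiver: 'slack-notifications'   # default receiver for all alerts
  group_by: ['alertname']           # batch alerts that share a name
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h               # assumed value; tune for your team

receivers:
  - name: 'slack-notifications'
    slack_configs:
      - channel: '#alerts'
        send_resolved: true
        username: 'alertmanager'
        api_url: 'https://hooks.slack.com/services/your/webhook/url'
```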

Conclusion

In this tutorial, we created a simple AIOps monitoring dashboard with Prometheus for metrics collection and Grafana for visualization. We then extended it with alerting and basic ML-based anomaly detection, so that unusual metric behavior is flagged automatically. This dashboard is a starting point for proactive IT operations, helping teams identify issues early and begin automating their responses.
