DORA Metrics: Tracking and Observability With Jenkins, Prometheus, and Observe

DORA metrics give you insight into software delivery performance, and the right observability tools make those metrics easy to track and act on.

Bhargavi Gorantla

Aug. 26, 24 · Tutorial


DORA (DevOps Research and Assessment) metrics, developed by the DORA team, have become a standard for measuring the efficiency and effectiveness of DevOps implementations. As organizations adopt DevOps practices to accelerate software delivery, tracking performance and reliability becomes critical. DORA metrics address this need by providing a framework for understanding how well teams deliver software and how quickly they recover from failures. This article explains the DORA metrics, demonstrates how to track them using Jenkins, and shows how to collect them with Prometheus and display them in Observe.

What Are DORA Metrics?

DORA metrics are a set of four key performance indicators (KPIs) that help organizations evaluate their software delivery performance. These metrics are:

Deployment Frequency (DF): Measures how often code is deployed to production
Lead Time for Changes (LT): Time taken from code commit to production deployment
Change Failure Rate (CFR): The percentage of changes that fail in production
Mean Time to Restore (MTTR): The average time it takes to recover from a failure in production

These metrics are valuable because they provide actionable insights into software development and deployment practices. High-performing teams tend to deploy more frequently and have shorter lead times, lower failure rates, and quicker recovery times, leading to more resilient and robust applications.
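
As a minimal sketch of how the four definitions translate into arithmetic, the Python snippet below computes them from a small in-memory list of deployment records. The `Deployment` record, its field names, and the sample data are hypothetical and chosen only for illustration. The sketch also assumes that a failure starts at deployment time, which is a simplification; real data would come from your CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: List[Deployment], period_days: int) -> dict:
    """Compute the four DORA metrics over a window of `period_days` days."""
    total = len(deployments)
    if total == 0:
        return {
            "deployment_frequency_per_day": 0.0,
            "lead_time_hours": None,
            "change_failure_rate_percent": None,
            "mttr_minutes": None,
        }

    # Lead time: commit to production, in hours
    lead_times = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deployments
    ]
    failures = [d for d in deployments if d.failed]
    # Restore time: assumed here to run from the failed deployment to restoration
    restore_minutes = [
        (d.restored_at - d.deployed_at).total_seconds() / 60
        for d in failures
        if d.restored_at is not None
    ]

    return {
        "deployment_frequency_per_day": total / period_days,
        "lead_time_hours": sum(lead_times) / total,
        "change_failure_rate_percent": 100 * len(failures) / total,
        "mttr_minutes": (
            sum(restore_minutes) / len(restore_minutes) if restore_minutes else None
        ),
    }


# Hypothetical sample: four deployments over two days, one of which failed
start = datetime(2024, 8, 1, 9, 0)
sample = [
    Deployment(start, start + timedelta(hours=2)),
    Deployment(start, start + timedelta(hours=4), failed=True,
               restored_at=start + timedelta(hours=4, minutes=30)),
    Deployment(start + timedelta(days=1), start + timedelta(days=1, hours=6)),
    Deployment(start + timedelta(days=1), start + timedelta(days=1, hours=8)),
]
metrics = dora_metrics(sample, period_days=2)
print(metrics)
```

With this sample, the result is 2 deployments per day, an average lead time of 5 hours, a change failure rate of 25%, and a mean time to restore of 30 minutes.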

Tracking DORA Metrics in Jenkins

Jenkins is a widely used automation server for continuous integration and continuous delivery (CI/CD). Below is an example Jenkins pipeline that uses shell commands and scripts to log deployments for deployment frequency, calculate lead time for changes, and monitor change failure rate. Mean time to restore is not computed in this pipeline, because it depends on incident data recorded after a deployment.

Groovy

pipeline {
    agent any

    environment {
        DEPLOY_LOG = 'deploy.log'
        FAIL_LOG = 'fail.log'
    }

    stages {
        // Build the application
        stage('Build') {
            steps {
                echo 'Building the application...'
                // Run required build commands
                sh 'make build'
            }
        }

        // Test the application
        stage('Test') {
            steps {
                echo 'Running tests...'
                // Run required test commands
                sh 'make test'
            }
        }

        // Deploy the application
        stage('Deploy') {
            steps {
                echo 'Deploying the application...'
                // Run the deployment steps
                sh 'make deploy'

                // Log the deployment to compute deployment frequency.
                // Single quotes stop Groovy from interpolating $(...); the shell expands it.
                sh 'echo "$(date +%F_%T)" >> "$DEPLOY_LOG"'
            }
        }
    }

    post {
        always {
            script {
                // Make sure both log files exist so that wc does not fail on early runs
                sh 'touch "$DEPLOY_LOG" "$FAIL_LOG"'

                // Write build failures into the fail log for computing CFR
                if (currentBuild.currentResult == 'FAILURE') {
                    sh 'echo "$(date +%F_%T)" >> "$FAIL_LOG"'
                }

                // Compute Deployment Frequency (DF) as the number of logged deployments
                def deploymentCount = sh(script: 'wc -l < "$DEPLOY_LOG"', returnStdout: true).trim().toInteger()
                echo "# of Deployments: ${deploymentCount}"

                // Compute Change Failure Rate (CFR), guarding against division by zero
                def failureCount = sh(script: 'wc -l < "$FAIL_LOG"', returnStdout: true).trim().toInteger()
                def cfr = deploymentCount > 0 ? (failureCount * 100) / deploymentCount : 0
                echo "Change Failure Rate: ${cfr}%"

                // Compute Lead Time for Changes (LT) from the last commit time and the current time
                def commitTime = sh(script: "git log -1 --pretty=format:'%ct'", returnStdout: true).trim()
                def currentTime = sh(script: "date +%s", returnStdout: true).trim()
                def leadTime = (currentTime.toLong() - commitTime.toLong()) / 3600
                echo "Lead Time for Changes: ${leadTime} hours"
            }
        }

        // End of pipeline
        success {
            echo 'Deployment Successful!'
        }

        failure {
            echo 'Deployment failed!'
            // Failure handling
        }
    }
}

  

In the above script, each deployment is logged as a timestamp in the deploy log file, which can be used to determine the deployment frequency over time. Similarly, failures are logged as timestamps in the fail log file, and both counts are used to compute the change failure rate. Additionally, the time difference between the last commit time and the current time provides the lead time for changes. Because both log files live in the job workspace, they need to persist between builds for the counts to accumulate.
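
The pipeline only reports a running total of deployments. As a rough sketch of how the timestamps could be turned into a per-day frequency, the Python snippet below parses log lines in the `date '+%F_%T'` format used above. The function names and the sample log contents are hypothetical, not part of the pipeline.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable

# Matches the output of the shell command: date '+%F_%T'
LOG_FORMAT = "%Y-%m-%d_%H:%M:%S"


def deployments_per_day(lines: Iterable[str]) -> Dict[str, int]:
    """Count log entries per calendar day, skipping blank lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        day = datetime.strptime(line, LOG_FORMAT).date().isoformat()
        counts[day] += 1
    return dict(counts)


def change_failure_rate(deploy_count: int, failure_count: int) -> float:
    """Failures as a percentage of deployments; 0.0 when nothing was deployed."""
    if deploy_count == 0:
        return 0.0
    return 100.0 * failure_count / deploy_count


# Hypothetical contents of deploy.log and fail.log
deploy_lines = [
    "2024-08-26_09:15:00",
    "2024-08-26_16:40:12",
    "2024-08-27_11:05:30",
    "2024-08-27_13:00:00",
]
fail_lines = ["2024-08-27_11:20:00"]

per_day = deployments_per_day(deploy_lines)
cfr = change_failure_rate(len(deploy_lines), len(fail_lines))
print(per_day, cfr)
```

To run this against the real files, the sample lists could be replaced with the lines read from `deploy.log` and `fail.log`, for example via `open("deploy.log").readlines()`.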

Monitoring DORA Metrics With Prometheus and Observe

Prometheus is an open-source monitoring and alerting toolkit commonly used for collecting metrics from applications. Combined with Observe, a modern observability platform, Prometheus can be used to visualize and monitor DORA metrics in real time.

Install Prometheus on the server: Download and install Prometheus from the official Prometheus download page.

Configure Prometheus: Set up the prometheus.yml configuration file to define the targets to scrape and the interval at which metrics are collected. Example configuration:

YAML

# Set the time interval at which metrics are collected
global:
  scrape_interval: 30s

# Configure Prometheus to collect metrics from Jenkins on a specific port
scrape_configs:
  - job_name: 'jenkins'
    static_configs:
      - targets: ['<JENKINS_SERVER>:<PORT>']

    

Expose Metrics in Jenkins: You can use either the Prometheus plugin for Jenkins or a custom script to expose metrics in a format that Prometheus can scrape. Example Python script:

Python

from prometheus_client import start_http_server, Gauge
import random
import time

# Create Prometheus gauges for the four DORA KPIs.
# Metric names may contain only letters, digits, underscores, and colons (no spaces).
DF = Gauge('deployment_frequency_per_day', 'No. of deployments in a day')
LT = Gauge('lead_time_for_changes_hours', 'Average lead time for changes in hours')
CFR = Gauge('change_failure_rate_percent', 'Percentage of changes that fail in production')
MTTR = Gauge('mean_time_to_restore_minutes', 'Mean time to restore service after failure in minutes')

# Start the HTTP server that exposes the metrics on port 8000
start_http_server(8000)

# Send random values to generate sample metrics for testing
while True:
    DF.set(random.randint(1, 9))
    LT.set(random.uniform(1, 18))
    CFR.set(random.uniform(0, 27))
    MTTR.set(random.uniform(1, 45))

    # Sleep for 30s
    time.sleep(30)
  

Save this script on the server where Jenkins is running and run it to expose the metrics on port 8000.
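
To see what Prometheus receives when it scrapes such an endpoint, it can help to look at the text exposition format itself. The sketch below renders gauge samples in that format using only the Python standard library. It illustrates the wire format and is not a replacement for `prometheus_client`; the function name, metric names, and values are hypothetical.

```python
from typing import Dict, Tuple


def render_gauges(gauges: Dict[str, Tuple[str, float]]) -> str:
    """Render {name: (help_text, value)} as Prometheus text exposition lines."""
    lines = []
    for name, (help_text, value) in gauges.items():
        lines.append(f"# HELP {name} {help_text}")  # human-readable description
        lines.append(f"# TYPE {name} gauge")        # metric type declaration
        lines.append(f"{name} {float(value)}")      # the sample itself
    return "\n".join(lines) + "\n"


# Hypothetical sample values
sample = {
    "deployment_frequency_per_day": ("Deployments per day", 4),
    "change_failure_rate_percent": ("Percentage of changes that fail in production", 12.5),
}
output = render_gauges(sample)
print(output)
```

Each metric appears as a `# HELP` line, a `# TYPE` line, and one sample line, which is the shape of the response a scrape of the exporter's metrics endpoint returns.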

Add Prometheus Data Source to Observe: Observe is an observability platform for monitoring, analyzing, and visualizing observability data. In Observe, you can add Prometheus as a data source by navigating to the integrations section and configuring Prometheus with the appropriate endpoint URL.
Set up Dashboards in Observe: Create dashboards with widgets that display graphs for each of these metrics.
Set up Monitoring: Configure alerts on set thresholds, and analyze trends and patterns by drilling down into specific metrics.

Conclusion

DORA metrics are essential for assessing the performance and efficiency of DevOps practices. By implementing tracking in Jenkins pipelines and leveraging monitoring tools like Prometheus and Observe, organizations can gain deep insights into their software delivery processes. These metrics help teams continuously improve, making data-driven decisions that enhance deployment frequency, reduce lead time, minimize failures, and accelerate recovery. Adopting a robust observability strategy ensures that these metrics are visible to stakeholders, fostering a culture of transparency and continuous improvement in software development and delivery.
