
Prometheus Metrics Autoscaling in Kubernetes

This set-up demonstrates how we can use the Prometheus adapter to autoscale deployments based on some custom metrics.


Introduction

One of the major advantages of using Kubernetes for container orchestration is that it makes it really easy to scale our application horizontally and account for increased load. Natively, horizontal pod autoscaling can scale a deployment based on CPU and memory usage, but in more complex scenarios we would want to account for other metrics before making scaling decisions.

Enter the Prometheus Adapter. Prometheus is the standard tool for monitoring deployed workloads and the Kubernetes cluster itself. The Prometheus adapter helps us leverage the metrics collected by Prometheus and use them to make scaling decisions. These metrics are exposed by an API service and can be readily consumed by our Horizontal Pod Autoscaler object.

Deployment

Architecture Overview

We will be using Prometheus adapter to pull custom metrics from our Prometheus installation and then let the Horizontal Pod Autoscaler (HPA) use it to scale the pods up or down.

Prerequisites

  • Basic knowledge of horizontal pod autoscaling
  • Prometheus deployed in-cluster or accessible via an endpoint

We will be using a highly available Prometheus-Thanos deployment.

Deploying the Sample Application

Let’s first deploy a sample app against which we will test our Prometheus metrics autoscaling. We can use the manifest below to do it:

YAML

apiVersion: v1
kind: Namespace
metadata:
  name: nginx
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: nginx
  name: nginx-deployment
spec:
  replicas: 1
  template:
    metadata:
      annotations:
        prometheus.io/path: "/status/format/prometheus"
        prometheus.io/scrape: "true"
        prometheus.io/port: "80"
      labels:
        app: nginx-server
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - nginx-server
              topologyKey: kubernetes.io/hostname
      containers:
      - name: nginx-demo
        image: vaibhavthakur/nginx-vts:1.0
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 2500m
          requests:
            cpu: 2000m
        ports:
        - containerPort: 80
          name: http
---
apiVersion: v1
kind: Service
metadata:
  namespace: nginx
  name: nginx-service
spec:
  ports:
  - port: 80
    targetPort: 80
    name: http
  selector:
    app: nginx-server
  type: LoadBalancer



This will create a namespace named nginx and deploy a sample Nginx application in it. The application can be accessed using the service and also exposes Nginx VTS metrics at the endpoint /status/format/prometheus on port 80. For the sake of our setup, we have created a DNS entry for the ExternalIP which maps to nginx.gotham.com.

Shell

root$ kubectl get deploy
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   1/1     1            1           43d

root$ kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-65d8df7488-c578v   1/1     Running   0          9h

root$ kubectl get svc
NAME            TYPE        CLUSTER-IP      EXTERNAL-IP    PORT(S)   AGE
nginx-service   ClusterIP   10.63.253.154   35.232.67.34   80/TCP    43d

root$ kubectl describe deploy nginx-deployment
Name:                   nginx-deployment
Namespace:              nginx
CreationTimestamp:      Tue, 08 Oct 2019 11:47:36 -0700
Labels:                 app=nginx-server
Annotations:            deployment.kubernetes.io/revision: 1
                        kubectl.kubernetes.io/last-applied-configuration:
                          {"apiVersion":"extensions/v1beta1","kind":"Deployment","metadata":{"annotations":{},"name":"nginx-deployment","namespace":"nginx"},"spec":...
Selector:               app=nginx-server
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  1 max unavailable, 1 max surge
Pod Template:
  Labels:       app=nginx-server
  Annotations:  prometheus.io/path: /status/format/prometheus
                prometheus.io/port: 80
                prometheus.io/scrape: true
  Containers:
   nginx-demo:
    Image:      vaibhavthakur/nginx-vts:v1.0
    Port:       80/TCP
    Host Port:  0/TCP
    Limits:
      cpu:  250m
    Requests:
      cpu:        200m
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   nginx-deployment-65d8df7488 (1/1 replicas created)
Events:          <none>

root$ curl nginx.gotham.com
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>



These are all the metrics currently exposed by the application:

Shell

$ curl nginx.gotham.com/status/format/prometheus
# HELP nginx_vts_info Nginx info
# TYPE nginx_vts_info gauge
nginx_vts_info{hostname="nginx-deployment-65d8df7488-c578v",version="1.13.12"} 1
# HELP nginx_vts_start_time_seconds Nginx start time
# TYPE nginx_vts_start_time_seconds gauge
nginx_vts_start_time_seconds 1574283147.043
# HELP nginx_vts_main_connections Nginx connections
# TYPE nginx_vts_main_connections gauge
nginx_vts_main_connections{status="accepted"} 215
nginx_vts_main_connections{status="active"} 4
nginx_vts_main_connections{status="handled"} 215
nginx_vts_main_connections{status="reading"} 0
nginx_vts_main_connections{status="requests"} 15577
nginx_vts_main_connections{status="waiting"} 3
nginx_vts_main_connections{status="writing"} 1
# HELP nginx_vts_main_shm_usage_bytes Shared memory [ngx_http_vhost_traffic_status] info
# TYPE nginx_vts_main_shm_usage_bytes gauge
nginx_vts_main_shm_usage_bytes{shared="max_size"} 1048575
nginx_vts_main_shm_usage_bytes{shared="used_size"} 3510
nginx_vts_main_shm_usage_bytes{shared="used_node"} 1
# HELP nginx_vts_server_bytes_total The request/response bytes
# TYPE nginx_vts_server_bytes_total counter
# HELP nginx_vts_server_requests_total The requests counter
# TYPE nginx_vts_server_requests_total counter
# HELP nginx_vts_server_request_seconds_total The request processing time in seconds
# TYPE nginx_vts_server_request_seconds_total counter
# HELP nginx_vts_server_request_seconds The average of request processing times in seconds
# TYPE nginx_vts_server_request_seconds gauge
# HELP nginx_vts_server_request_duration_seconds The histogram of request processing time
# TYPE nginx_vts_server_request_duration_seconds histogram
# HELP nginx_vts_server_cache_total The requests cache counter
# TYPE nginx_vts_server_cache_total counter
nginx_vts_server_bytes_total{host="_",direction="in"} 3303449
nginx_vts_server_bytes_total{host="_",direction="out"} 61641572
nginx_vts_server_requests_total{host="_",code="1xx"} 0
nginx_vts_server_requests_total{host="_",code="2xx"} 15574
nginx_vts_server_requests_total{host="_",code="3xx"} 0
nginx_vts_server_requests_total{host="_",code="4xx"} 2
nginx_vts_server_requests_total{host="_",code="5xx"} 0
nginx_vts_server_requests_total{host="_",code="total"} 15576
nginx_vts_server_request_seconds_total{host="_"} 0.000
nginx_vts_server_request_seconds{host="_"} 0.000
nginx_vts_server_cache_total{host="_",status="miss"} 0
nginx_vts_server_cache_total{host="_",status="bypass"} 0
nginx_vts_server_cache_total{host="_",status="expired"} 0
nginx_vts_server_cache_total{host="_",status="stale"} 0
nginx_vts_server_cache_total{host="_",status="updating"} 0
nginx_vts_server_cache_total{host="_",status="revalidated"} 0
nginx_vts_server_cache_total{host="_",status="hit"} 0
nginx_vts_server_cache_total{host="_",status="scarce"} 0
nginx_vts_server_bytes_total{host="*",direction="in"} 3303449
nginx_vts_server_bytes_total{host="*",direction="out"} 61641572
nginx_vts_server_requests_total{host="*",code="1xx"} 0
nginx_vts_server_requests_total{host="*",code="2xx"} 15574
nginx_vts_server_requests_total{host="*",code="3xx"} 0
nginx_vts_server_requests_total{host="*",code="4xx"} 2
nginx_vts_server_requests_total{host="*",code="5xx"} 0
nginx_vts_server_requests_total{host="*",code="total"} 15576
nginx_vts_server_request_seconds_total{host="*"} 0.000
nginx_vts_server_request_seconds{host="*"} 0.000
nginx_vts_server_cache_total{host="*",status="miss"} 0
nginx_vts_server_cache_total{host="*",status="bypass"} 0
nginx_vts_server_cache_total{host="*",status="expired"} 0
nginx_vts_server_cache_total{host="*",status="stale"} 0
nginx_vts_server_cache_total{host="*",status="updating"} 0
nginx_vts_server_cache_total{host="*",status="revalidated"} 0
nginx_vts_server_cache_total{host="*",status="hit"} 0
nginx_vts_server_cache_total{host="*",status="scarce"} 0



Among these, we are particularly interested in nginx_vts_server_requests_total. We will use the value of this metric to determine whether or not to scale our Nginx deployment.
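Note that this metric is a cumulative counter; what the autoscaler really needs is its per-second rate. As a rough illustration (not from the original article), the kind of PromQL the adapter will run on our behalf looks something like this, assuming the pod labels added by the standard Kubernetes scrape configuration:

```promql
sum(rate(nginx_vts_server_requests_total[2m])) by (kubernetes_pod_name)
```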

Create Prometheus Adapter ConfigMap

Use the manifest below to create the Prometheus adapter ConfigMap.

YAML
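A minimal sketch of what such a config could look like, assuming the standard prometheus-adapter rules format (the ConfigMap name `adapter-config` and the 2m rate window are illustrative assumptions; the label overrides match the `kubernetes_namespace`/`kubernetes_pod_name` labels visible in the adapter logs shown later):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config        # illustrative name
  namespace: monitoring
data:
  config.yaml: |
    rules:
    # Discover the VTS requests counter exposed by the Nginx pods
    - seriesQuery: 'nginx_vts_server_requests_total'
      resources:
        overrides:
          kubernetes_namespace:
            resource: namespace
          kubernetes_pod_name:
            resource: pod
      # Expose the counter as a per-second rate under a new name
      name:
        matches: "^(.*)_total$"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```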


This ConfigMap only specifies a single metric. However, we can always add more metrics.

Create Prometheus Adapter Deployment

Use the following manifest to deploy the Prometheus adapter.

YAML
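A hedged sketch of such a deployment, assuming the community k8s-prometheus-adapter image, an `adapter-config` ConfigMap, and a pre-created `custom-metrics-apiserver` ServiceAccount with the usual delegated-auth RBAC (all assumptions; adjust to your environment). The name, namespace, and port match the Service and APIService manifests later in this article:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: custom-metrics-apiserver
  namespace: monitoring
  labels:
    app: custom-metrics-apiserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: custom-metrics-apiserver
  template:
    metadata:
      labels:
        app: custom-metrics-apiserver
    spec:
      serviceAccountName: custom-metrics-apiserver   # assumed to exist with RBAC
      containers:
      - name: custom-metrics-apiserver
        image: directxman12/k8s-prometheus-adapter-amd64:v0.5.0  # illustrative version
        args:
        - --secure-port=6443
        - --prometheus-url=http://thanos-querier.monitoring:9090/
        - --metrics-relist-interval=30s
        - --config=/etc/adapter/config.yaml
        ports:
        - containerPort: 6443
        volumeMounts:
        - mountPath: /etc/adapter/
          name: config
          readOnly: true
      volumes:
      - name: config
        configMap:
          name: adapter-config
```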


This will create our deployment, which will spawn the Prometheus adapter pod to pull metrics from Prometheus. Note that we have set the argument --prometheus-url=http://thanos-querier.monitoring:9090/. This is because we have deployed a Prometheus-Thanos cluster in the monitoring namespace of the same Kubernetes cluster as the Prometheus adapter. You can change this argument to point to your own Prometheus deployment.

If you look at the logs of this container, you can see that it is fetching the metric defined in the config file:

Shell

I1122 00:26:53.228394       1 api.go:74] GET http://thanos-querier.monitoring:9090/api/v1/series?match%5B%5D=nginx_vts_server_requests_total&start=1574381213.217 200 OK
I1122 00:26:53.234234       1 api.go:93] Response Body: {"status":"success","data":[{"__name__":"nginx_vts_server_requests_total","app":"nginx-server","cluster":"prometheus-ha","code":"1xx","host":"*","instance":"10.60.64.39:80","job":"kubernetes-pods","kubernetes_namespace":"nginx","kubernetes_pod_name":"nginx-deployment-65d8df7488-sbp95","pod_template_hash":"65d8df7488"},{"__name__":"nginx_vts_server_requests_total","app":"nginx-server","cluster":"prometheus-ha","code":"1xx","host":"*","instance":"10.60.64.8:80","job":"kubernetes-pods","kubernetes_namespace":"nginx","kubernetes_pod_name":"nginx-deployment-65d8df7488-mwzxg","pod_template_hash":"65d8df7488"}



Create Prometheus Adapter API Service

The manifest below will create an API service so that our Prometheus adapter is accessible to the Kubernetes API, allowing the metrics to be fetched by our Horizontal Pod Autoscaler.

YAML

apiVersion: v1
kind: Service
metadata:
  name: custom-metrics-apiserver
  namespace: monitoring
spec:
  ports:
  - port: 443
    targetPort: 6443
  selector:
    app: custom-metrics-apiserver
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  service:
    name: custom-metrics-apiserver
    namespace: monitoring
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100



Testing the Set-Up

Let’s check which custom metrics are available:

Shell

root$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "pods/nginx_vts_server_requests_per_second",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "namespaces/nginx_vts_server_requests_per_second",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}



We can see that the nginx_vts_server_requests_per_second metric is available. Now, let’s check the current value of this metric:

Shell

root$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/nginx/pods/*/nginx_vts_server_requests_per_second" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/nginx/pods/%2A/nginx_vts_server_requests_per_second"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "nginx",
        "name": "nginx-deployment-65d8df7488-v575j",
        "apiVersion": "/v1"
      },
      "metricName": "nginx_vts_server_requests_per_second",
      "timestamp": "2019-11-19T18:38:21Z",
      "value": "1236m"
    }
  ]
}



Note that the value 1236m above is expressed in milli-units, i.e. 1.236 requests per second. Now let's create an HPA which will utilize these metrics. We can use the manifest below to do it (its targetAverageValue of 4000m likewise means 4 requests per second per pod).

YAML

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-custom-hpa
  namespace: nginx
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: nginx_vts_server_requests_per_second
      targetAverageValue: 4000m



Once you have applied this manifest, you can check the current status of HPA as follows:

Shell

root$ kubectl describe hpa
Name:               nginx-custom-hpa
Namespace:          nginx
Labels:             <none>
Annotations:        autoscaling.alpha.kubernetes.io/metrics:
                      [{"type":"Pods","pods":{"metricName":"nginx_vts_server_requests_per_second","targetAverageValue":"4"}}]
                    kubectl.kubernetes.io/last-applied-configuration:
                      {"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx-custom-hpa","namespace":"n...
CreationTimestamp:  Thu, 21 Nov 2019 11:11:05 -0800
Reference:          Deployment/nginx-deployment
Min replicas:       2
Max replicas:       10
Deployment pods:    0 current / 0 desired
Events:             <none>



Now, let's generate some load on our service. We will use a utility called Vegeta for this.
In a separate terminal, run the following command:

Shell

echo "GET http://nginx.gotham.com/" | vegeta attack -rate=5 -duration=0 | vegeta report



Meanwhile, monitor the Nginx pods and the horizontal pod autoscaler; you should see something like this:

Shell

root$ kubectl get -w pods
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-65d8df7488-mwzxg   1/1     Running   0          9h
nginx-deployment-65d8df7488-sbp95   1/1     Running   0          4m9s
NAME                                AGE
nginx-deployment-65d8df7488-pwjzm   0s
nginx-deployment-65d8df7488-pwjzm   0s
nginx-deployment-65d8df7488-pwjzm   0s
nginx-deployment-65d8df7488-pwjzm   2s
nginx-deployment-65d8df7488-pwjzm   4s
nginx-deployment-65d8df7488-jvbvp   0s
nginx-deployment-65d8df7488-jvbvp   0s
nginx-deployment-65d8df7488-jvbvp   1s
nginx-deployment-65d8df7488-jvbvp   4s
nginx-deployment-65d8df7488-jvbvp   7s
nginx-deployment-65d8df7488-skjkm   0s
nginx-deployment-65d8df7488-skjkm   0s
nginx-deployment-65d8df7488-jh5vw   0s
nginx-deployment-65d8df7488-skjkm   0s
nginx-deployment-65d8df7488-jh5vw   0s
nginx-deployment-65d8df7488-jh5vw   1s
nginx-deployment-65d8df7488-skjkm   2s
nginx-deployment-65d8df7488-jh5vw   2s
nginx-deployment-65d8df7488-skjkm   3s
nginx-deployment-65d8df7488-jh5vw   4s

root$ kubectl get hpa
NAME               REFERENCE                     TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-custom-hpa   Deployment/nginx-deployment   5223m/4   2         10        3          5m5s



We can clearly see that the HPA scaled up our pods as required. When we interrupted the Vegeta command, we got the Vegeta report, which shows that all of our requests were served by the application:

Shell

root$ echo "GET http://nginx.gotham.com/" | vegeta attack -rate=5 -duration=0 | vegeta report
^CRequests      [total, rate, throughput]  224, 5.02, 5.02
Duration      [total, attack, wait]      44.663806863s, 44.601823883s, 61.98298ms
Latencies     [mean, 50, 95, 99, max]    63.3879ms, 60.867241ms, 79.414139ms, 111.981619ms, 229.310088ms
Bytes In      [total, mean]              137088, 612.00
Bytes Out     [total, mean]              0, 0.00
Success       [ratio]                    100.00%
Status Codes  [code:count]               200:224
Error Set:



Conclusion

This set-up demonstrates how we can use the Prometheus adapter to autoscale deployments based on custom metrics. For the sake of simplicity, we have only fetched one metric from our Prometheus server. However, the adapter ConfigMap can be extended to fetch some or all of the available metrics and use them for autoscaling.

If the Prometheus installation is outside of our Kubernetes cluster, we just need to make sure that the query endpoint is accessible from the cluster and update it in the adapter deployment manifest. In more complex scenarios, multiple metrics can be fetched and used in combination to make scaling decisions.
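For instance, pointing the adapter at an external Prometheus is just a change to the container argument in the adapter deployment (the URL below is a placeholder, not a real endpoint):

```yaml
args:
# Replace with the externally reachable query endpoint of your Prometheus
- --prometheus-url=https://prometheus.example.com:9090/
```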

Feel free to reach out should you have any questions around the set-up and we would be happy to assist you.

This article was originally published on https://appfleet.com/blog/prometheus-metrics-based-autoscaling-in-kubernetes/.

Topics:
cloud native, docker, kubernetes, prometheus, scaling, serverless

Published at DZone with permission of Sudip Sengupta . See the original article here.

