
Resource Management in Kubernetes

This article explores how resource management works on Kubernetes and walks through technical details and examples.

By Harsha Patil · Jun. 18, 24 · Tutorial

Kubernetes is a container orchestration platform that automates application management tasks such as deployment and scaling. Among its many benefits, one key feature is resource management. This blog walks through how resource management works in Kubernetes, with technical details and examples.

Primary Resource Types

Kubernetes manages two primary resource types:

  • Central processing unit (CPU): Measured in cores or millicores (1000m = 1 core).
  • Memory (RAM): Measured in bytes, typically expressed with suffixes such as Mi and Gi.

Requests and Limits

Kubernetes lets you specify requests and limits for each resource on every container within a pod.

  • Requests: The minimum amount of CPU or memory guaranteed to the container. The scheduler uses these values to place pods onto nodes.
  • Limits: The upper bound a container may consume. A container exceeding its CPU limit is throttled; one exceeding its memory limit is terminated.

Example Specification for Requests and Limits

Resource requests and limits are declared in the pod specification. Below is a YAML example of a pod spec with resources set.

YAML

apiVersion: v1
kind: Pod
metadata:
  name: test-app
spec:
  containers:
  - name: test-app-container
    image: nginx
    resources:
      requests:
        memory: "100Mi"
        cpu: "500m"
      limits:
        memory: "512Mi"
        cpu: "1"
From the above pod spec, the test-app-container requests 500 millicores of CPU and 100 MiB of memory, with limits of 1 core and 512 MiB.

How Pod Scheduling Works

Kubernetes sums the requests of all containers in a pod and schedules the pod only onto a node with enough unallocated capacity to satisfy that sum. In other words, the total requests of all pods placed on a node must not exceed the node's allocatable capacity; a pod that cannot fit on any node stays pending.
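The scheduling check described above can be sketched as a simple sum over container requests. This is a simplified model, not the real scheduler, which also evaluates taints, affinity, and other predicates; CPU is in cores and memory in MiB here for readability.

```python
def can_schedule(pod_requests, node_allocatable, node_allocated):
    """Return True if the node has room for the pod's summed requests.

    pod_requests: list of per-container dicts, e.g. {"cpu": 0.5, "memory": 100}
    (CPU in cores, memory in MiB). A simplified model of the scheduler's
    resource-fit check.
    """
    total = {"cpu": 0.0, "memory": 0.0}
    for c in pod_requests:
        total["cpu"] += c.get("cpu", 0.0)
        total["memory"] += c.get("memory", 0.0)
    return (node_allocated["cpu"] + total["cpu"] <= node_allocatable["cpu"]
            and node_allocated["memory"] + total["memory"] <= node_allocatable["memory"])

# A node with 2 cores / 4096 MiB, where 1 core / 2048 MiB is already requested:
node_cap = {"cpu": 2.0, "memory": 4096}
node_used = {"cpu": 1.0, "memory": 2048}
print(can_schedule([{"cpu": 0.5, "memory": 100}], node_cap, node_used))  # True
print(can_schedule([{"cpu": 1.5, "memory": 100}], node_cap, node_used))  # False
```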

Overcommitting Resources

A user may overcommit a node: the sum of resource requests stays below the node's capacity while the sum of resource limits exceeds (or reaches) it. This is permitted because not every container on a node will need its maximum resources at the same time.

Example of Overcommitment

Here is an example of the resources on a node:

CPU: 2 

Memory: 4Gi

You can run multiple pods with the following:

YAML

# Pod 1
resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "2Gi"
    cpu: "1"

# Pod 2
resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "2Gi"
    cpu: "1"
The total requests (1 CPU and 2Gi of memory) fit comfortably within the node's capacity, while the total limits (2 CPU and 4Gi) reach it, so both pods can burst up to the node's full resources.
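To verify sums like these yourself, you can parse Kubernetes quantity strings and add them up. The minimal parser below is an illustration that handles only the m, Mi, and Gi suffixes used in this article, not the full Kubernetes quantity grammar.

```python
def parse_cpu(q):
    """Parse a CPU quantity: '500m' -> 0.5 cores, '1' -> 1.0 cores."""
    return int(q[:-1]) / 1000 if q.endswith("m") else float(q)

def parse_memory_mib(q):
    """Parse a memory quantity into MiB: '512Mi' -> 512, '2Gi' -> 2048."""
    if q.endswith("Gi"):
        return int(q[:-2]) * 1024
    if q.endswith("Mi"):
        return int(q[:-2])
    raise ValueError(f"unsupported suffix: {q}")

# Pod 1 and Pod 2 from the example above share the same spec.
pods = [{"requests": {"cpu": "500m", "memory": "1Gi"},
         "limits": {"cpu": "1", "memory": "2Gi"}}] * 2

req_cpu = sum(parse_cpu(p["requests"]["cpu"]) for p in pods)          # 1.0 core
lim_mem = sum(parse_memory_mib(p["limits"]["memory"]) for p in pods)  # 4096 MiB
print(req_cpu, lim_mem)  # 1.0 4096
```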

Quality of Services (QoS) Classes

Pods are given QoS classes by Kubernetes based on their resource requests and limits:

  • Guaranteed: These pods get the highest priority; every container's requests equal its limits.
  • Burstable: Assigned to pods that set requests or limits but do not qualify as Guaranteed (for example, limits greater than requests).
  • BestEffort: Pods with no requests or limits at all fall into this class and are the first to be evicted under pressure.

Example of QoS Classes

YAML

# Guaranteed QoS
resources:
  requests:
    memory: "512Mi"
    cpu: "1"
  limits:
    memory: "512Mi"
    cpu: "1"

# Burstable QoS
resources:
  requests:
    memory: "128Mi"
    cpu: "500m"
  limits:
    memory: "512Mi"
    cpu: "1"

# BestEffort QoS
# No requests or limits specified

Requests vs. Limits: Memory and CPU

To use Kubernetes resource management effectively, it is important to understand how requests and limits behave differently for CPU versus memory.

Requests and Limits for CPU

  • CPU requests: The minimum CPU a container is guaranteed. The scheduler considers this value when placing the pod on a node.
  • CPU limits: The maximum CPU a container can use. If a container tries to use more CPU than its limit, Kubernetes throttles the CPU rather than terminating the container.

Memory Requests vs. Limits

  • Memory requests: The amount of memory a container is guaranteed. The scheduler uses this value to choose a node for the pod.
  • Memory limits: The maximum amount of memory a container can use. A container exceeding its memory limit is killed (OOMKilled) and then potentially restarted.

Example: CPU Throttling and Memory Termination

Let’s say we have a container running a web server with the following:

YAML

apiVersion: v1
kind: Pod
metadata:
  name: test-demo
spec:
  containers:
  - name: test-app-container
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
CPU Requests and Limit Behavior

  • If the container uses more than its requested 250 millicores, Kubernetes will give it more CPU when available. The container can burst above its request without being throttled, up to its limit.
  • If the container tries to exceed 500 millicores, Kubernetes throttles the CPU: usage is capped at 500 millicores, so the application may slow down but won't take more than its share of CPU.

Exceeding Memory Requests and Limits

  • If the container asks for more than its requested 64MiB, Kubernetes will try to give it more memory. This is not guaranteed; under high memory pressure, the container might get less than it needs.
  • If memory usage exceeds 128MiB, Kubernetes kills and restarts the container. This is called an OOMKill (out-of-memory kill). The application crashes, and you will have downtime or reduced functionality until the container restarts.
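Both enforcement behaviors can be summarized in one sketch. The function below is a hypothetical model for illustration, not a real kubelet API: CPU overage is clamped, while memory overage is fatal.

```python
def enforce(usage_cpu, usage_mem, limit_cpu, limit_mem):
    """Model limit enforcement for the pod above (CPU in millicores, memory in MiB).

    CPU beyond the limit is throttled (clamped); memory beyond the limit
    triggers an OOMKill.
    """
    if usage_mem > limit_mem:
        return "OOMKilled"                      # container is killed and restarted
    if usage_cpu > limit_cpu:
        return "throttled to %dm" % limit_cpu   # CPU is capped; pod keeps running
    return "ok"

print(enforce(300, 100, 500, 128))   # ok: within both limits
print(enforce(700, 100, 500, 128))   # throttled to 500m
print(enforce(300, 200, 500, 128))   # OOMKilled
```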

Resource Management and Monitoring

There are several tools available to monitor and manage resources:

  • kubectl top: Shows CPU and memory usage for nodes and pods.
  • Metrics Server: Gathers resource metrics from the kubelets; these metrics feed kubectl top and the Horizontal and Vertical Pod Autoscalers (HPA and VPA).

Example of kubectl top Usage

  • kubectl top node <node_name> 
  • kubectl top pod <pod_name>

ResourceQuotas and Limits

Kubernetes administrators can limit resource usage by setting resource quotas and limits at the namespace level.

  • ResourceQuota: A guardrail that caps the total resources (requests and limits) all objects in a namespace can consume.
  • LimitRange: A policy that sets default and minimum/maximum resource values for individual objects (such as containers) in a namespace.

Example of ResourceQuota and LimitRange

YAML

apiVersion: v1
kind: ResourceQuota
metadata:
  name: resource-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: limit-range
spec:
  limits:
  - default:
      cpu: "500m"
      memory: "512Mi"
    defaultRequest:
      cpu: "200m"
      memory: "256Mi"
    type: Container
Conclusion

Resource management is key to running applications reliably in Kubernetes. By understanding and configuring resource requests, limits, and QoS classes, you can make sure your applications perform well under load and use cluster resources efficiently. Using kubectl top, ResourceQuota, and LimitRange to monitor and enforce resource usage policies will give you a balanced and optimal Kubernetes environment.

Opinions expressed by DZone contributors are their own.