DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Related

  • Azure, AWS, and GCP: A Multicloud Service Cheat Sheet
  • AWS to Azure Migration: A Cloudy Journey of Challenges and Triumphs
  • 12 Expert Tips for Secure Cloud Deployments
  • Leveraging Apache Airflow on AWS EKS (Part 1): Foundations of Data Orchestration in the Cloud

Trending

  • Building Reliable LLM-Powered Microservices With Kubernetes on AWS
  • Data Quality: A Novel Perspective for 2025
  • Next Evolution in Integration: Architecting With Intent Using Model Context Protocol
  • Endpoint Security Controls: Designing a Secure Endpoint Architecture, Part 2
  1. DZone
  2. Software Design and Architecture
  3. Cloud Architecture
  4. How To Optimize Kubernetes Costs?

How To Optimize Kubernetes Costs?

This article shows the challenges and a few best practices to optimize Kubernetes costs. Examples: 1. QoS for Pods, 2. LimitRanges, 3, ResoureQuotas.

By 
Sindhu Priya user avatar
Sindhu Priya
·
Jul. 22, 23 · Tutorial
Likes (1)
Comment
Save
Tweet
Share
6.9K Views

Join the DZone community and get the full member experience.

Join For Free

Cloud containers come with the flexibility to lift and shift applications to any environment, cloud or virtual or bare metal without worrying about the virtual OS, hypervisors, etc.  Simplified management, paced-up delivery and agility compel cloud developers to hail containerization. Kubernetes aka k8s (if you are wondering what k8 means, it’s just a replacement of 8 letters “Kubernetes”) is a popular open-source containerization platform cloud developers adopt widely. According to a recent report by CNCF, there is a 67% increase in Kubernetes developers worldwide which manifests the popularity.

The sad news is the surge in adoption and usage comes with a compromise in the IT infrastructure budget. Enterprises could be wasting nearly 80% of Kubernetes's expenses on unintentional resources that are not helping organizations to hit their goals as planned. Let’s see in this blog what are the challenges and ways to optimize.

Challenges in Kubernetes Cost Optimization

Charged as a Whole

A cluster contains multiple nodes within. Each node will in turn contain a varying number of containers. Every node that is present inside one cluster is not necessarily part of the same application as others. Each node may be handled by 20 different teams for different applications. But the cluster is charged as a whole or clusters together by the cloud service providers. Billing starts right when a container is deployed into a node. The additional cost of $2.4/day is charged for Kubernetes cluster maintenance, software license, disaster recovery, etc.

Showback and Chargeback – Near Impossible

These two processes are vital for enterprises to bring financial accountability. (Showback is the process that gives visibility to a business unit’s expenses on cloud resources usage for a particular period but not charged. Whereas chargeback is where the unit is informed as well as charged based on its utilization.) Tagging, which can aid the engineers in cost tracking is not possible in Kubernetes clusters.

Every penny we invest becomes worthy when it checks the list of features/output we planned. When they just stay at the expense side mapped to no production, then the ultimate source for the anomalies should be identified. But it is not as simple as it sounds for Kubernetes clusters. Spotting the team responsible for most of the expenses is daunting as each container may be utilized by different teams in the enterprise working on different deliverables. Each team’s Infrastructure budget and resource cost allocation varies on the other hand.

Multi-Cloud Adoption

In Gartner’s recent survey, 81% of the respondents stated that they use 2 or more cloud service providers to avail of various benefits like overcoming vendor- lock-in, resource discounts, disaster recovery, etc. Kubernetes clusters will contain workloads from different cloud service providers like AWS, Azure, GCP, etc. which further twists the cost anomalies detection and Kubernetes cost optimization process.

Dynamic Autoscaling

One of the key reasons for cloud engineers to choose the Kubernetes cluster is its autoscaling feature. Depending on the usage demand, Kubernetes scales up or down so the resources suffice the compute requirement during peaks and valleys. Further in horizontal autoscaling, containers scale out which may reach from 2 to 20 within a day. Kubernetes cost optimization turns complicated because of this unpredicted autoscaling.

Kubernetes Cost Optimization Best Practices

Let’s look at the optimal ways to trim the inadvertent expenses.

Quality of Services for Pods (QOS)

Kubernetes cluster offers cloud practitioners the flexibility to set different QoS classes to Pods. Based on this the Pods’ scheduling and removal become easier for cloud practitioners. It is of 3 types – Guaranteed, Burstable, and Best-Effort.

Guaranteed: When cloud engineers need to have a Pod, whose instances are sufficient to handle a highly critical app, they can configure it to be a guaranteed QoS class. So, both the CPU & memory limits and requests are the same and set. As the name suggests it guarantees minimal resources (Requests are the minimum number of resources and limits are the maximum resources it can use)

Burstable: Burstable QoS class is assigned for a Pod when the vCPU or memory limit is more than the requests or not essentially the same. So, when the need arises for a spiked compute requirement, they can be utilized.

Best-Effort: When both the limits and requests are not set, it is classified as a best-effort QoS class. Cloud engineers can avail for non-critical applications.

As we can clearly witness, Best-effort pods are the first choice for removal followed by Burstable and then Guaranteed.

ResourceQuotas & LimitRange

As we saw earlier, the Kubernetes cluster is shared by various teams. There is a possibility that one team may consume most of the Pods leaving other teams scarce. Namespace-level Pod restriction can mitigate this issue (Namespace is a virtual cluster). ResourceQuota is the means through which we can limit Pods’ usage within a Namespace.

Kubernetes administrators create a ResourceQuota for each Namespace. Whenever a user creates or updates a Pod within a Namespace, the ResourceQuota system checks if the limits are exceeded,  If it is crossing the limits, it returns a “403 FORBIDDEN” error denying the action.

LimitRange is another policy that helps with resource constraints individually. As the name goes limits the range of “limits & requests” within a Namespace.

Kubernetes AWS Cloud computing Cloud database Cloud engineering azure

Published at DZone with permission of Sindhu Priya. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Azure, AWS, and GCP: A Multicloud Service Cheat Sheet
  • AWS to Azure Migration: A Cloudy Journey of Challenges and Triumphs
  • 12 Expert Tips for Secure Cloud Deployments
  • Leveraging Apache Airflow on AWS EKS (Part 1): Foundations of Data Orchestration in the Cloud

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!