Optimizing Prometheus Queries With PromQL

Count worker nodes and track resource changes in Prometheus using PromQL. Explore queries, best practices, and dynamic thresholds for Kubernetes monitoring.

By Ganesh Bhat · Jan. 20, 25 · Tutorial

Prometheus is a powerful monitoring tool that provides extensive metrics and insights into your infrastructure and applications, particularly on Kubernetes and OpenShift Container Platform (OCP, an enterprise Kubernetes distribution). When crafting PromQL (Prometheus Query Language) expressions, accuracy and operand compatibility are essential, especially when comparing metrics or calculating thresholds.

In this article, we will explore how to count worker nodes and track changes in resources effectively using PromQL.

Counting Worker Nodes in PromQL

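To get the number of worker nodes in your Kubernetes cluster, the kube_node_info metric (exposed by kube-state-metrics) is often used. However, this metric includes all nodes, such as master, infra, and logging nodes, in addition to worker nodes. To count only the worker nodes, you can refine your query using label matchers. The approach shown here assumes that worker node names contain the string "worker"; adjust the pattern to your cluster's naming convention.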

Here is a query to count only worker nodes:

count(kube_node_info{node=~".*worker.*"})


Explanation

  • kube_node_info is the metric that provides information about all nodes.
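  • {node=~".*worker.*"} keeps only nodes whose names contain the substring "worker". PromQL regex matchers are fully anchored, so the leading and trailing .* are required for a substring match.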
  • count() calculates the total number of matching nodes.

This query ensures that only worker nodes are counted, which is often required for scaling metrics or thresholds in PromQL.
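The effect of the label matcher can be checked outside Prometheus. The following is a minimal Python sketch, not Prometheus code: it models the fully anchored behavior of =~ with re.fullmatch (Prometheus uses RE2 syntax, which agrees with Python's re for a simple pattern like this one). The node names are hypothetical and invented for illustration.

```python
import re
from typing import List

# Hypothetical node names, invented for illustration only.
NODE_NAMES = ["master-0", "infra-1", "worker-0", "ocp-worker-a1", "logging-2"]


def matches_promql_regex(pattern: str, value: str) -> bool:
    """Model PromQL's =~ matcher, which must match the whole label value."""
    return re.fullmatch(pattern, value) is not None


def count_matching(pattern: str, names: List[str]) -> int:
    """Model count(metric{node=~pattern}) over a list of node names."""
    return sum(1 for name in names if matches_promql_regex(pattern, name))


# Substring match: the pattern is wrapped in .* on both sides.
print(count_matching(".*worker.*", NODE_NAMES))

# Without the wildcards, only a node named exactly "worker" would match.
print(count_matching("worker", NODE_NAMES))
```

With these sample names, the first call counts the two nodes containing "worker" and the second counts none, which is why the wildcards matter.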

Tracking Changes in Resource Usage

A common use case in Kubernetes monitoring is tracking the change in the number of pods over time. For example, you might want to detect if pods have increased significantly within the last 30 minutes. Combining this with the worker node count allows you to set thresholds that scale with your cluster's size.

Consider the following query:

max(apiserver_storage_objects{resource="pods"}) - max(apiserver_storage_objects{resource="pods"} offset 30m) > (20 * count(kube_node_info{node=~".*worker.*"}))


Breakdown

1. Left-Hand Side

  • max(apiserver_storage_objects{resource="pods"}) takes the highest current value of the pod object count across all reporting series (for example, one per API server instance).
  • max(apiserver_storage_objects{resource="pods"} offset 30m) evaluates the same expression as it was 30 minutes ago.
  • The subtraction yields the change in the number of pods over the last 30 minutes.

2. Right-Hand Side

  • count(kube_node_info{node=~".*worker.*"}) counts the number of worker nodes.
  • Multiplying this by 20 sets a dynamic threshold based on the number of worker nodes.

3. Comparison

  • The query checks if the change in pod count exceeds the calculated threshold.
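The arithmetic behind the comparison can be worked through with plain numbers. This is a small Python sketch of the same logic, not something Prometheus runs; the function name and the sample pod and node counts are hypothetical.

```python
PODS_PER_WORKER = 20  # the multiplier used in the query above


def pod_growth_exceeds_threshold(pods_now: int, pods_30m_ago: int, worker_count: int) -> bool:
    """Mirror the query: (now - 30m ago) > 20 * worker_count."""
    delta = pods_now - pods_30m_ago             # left-hand side
    threshold = PODS_PER_WORKER * worker_count  # right-hand side
    return delta > threshold                    # strict comparison, as in PromQL's >


# Hypothetical cluster with 5 workers, so the threshold is 20 * 5 = 100 pods.
print(pod_growth_exceeds_threshold(1150, 1000, 5))  # growth of 150 pods
print(pod_growth_exceeds_threshold(1080, 1000, 5))  # growth of 80 pods
print(pod_growth_exceeds_threshold(1100, 1000, 5))  # growth of exactly 100 pods
```

Only the first case crosses the threshold. Growth of exactly 100 pods does not, because the comparison is strict. Adding worker nodes raises the threshold automatically, which is the point of making it dynamic.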

Addressing Syntax Issues in PromQL

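While crafting PromQL expressions, syntax errors or mismatched operand types can lead to unexpected or empty results. A comparison between two instant vectors only returns samples whose label sets match on both sides. In the example above, both sides are aggregated without a by clause, so each is an instant vector with a single, label-free element, and the two match one-to-one. If the left-hand side ever carries labels (for example, because an inner aggregation is removed or gains a by clause), it no longer matches the label-free right-hand side and the comparison returns nothing. Wrapping the left-hand side in an outer max() guards against this by always reducing it to a single element with no labels: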

max(max(apiserver_storage_objects{resource="pods"}) - max(apiserver_storage_objects{resource="pods"} offset 30m)) > (20 * count(kube_node_info{node=~".*worker.*"}))


Why Use max()?

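The outer max() ensures that the result of the subtraction is a single element with an empty label set, which is also what the aggregated right-hand side produces, so the two sides can be compared. Strictly speaking, both sides are single-element instant vectors rather than scalars; PromQL's scalar() function converts such a vector to a true scalar if one is needed.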
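To see why label sets matter in a comparison, the following Python sketch models instant vectors as dictionaries keyed by label set. It is a simplified illustration of PromQL's default one-to-one vector matching, not Prometheus code; the instance labels and sample values are hypothetical.

```python
def labels(**kv):
    """Build a hashable label set, e.g. labels(instance="api-0")."""
    return frozenset(kv.items())


def max_agg(vector):
    """Like max(v) without a by clause: one element with an empty label set."""
    return {labels(): max(vector.values())} if vector else {}


def greater_than(lhs, rhs):
    """Like `lhs > rhs` between instant vectors: keep left-hand samples that
    have an identical label set on the right and a larger value."""
    return {k: v for k, v in lhs.items() if k in rhs and v > rhs[k]}


# Hypothetical pod-count deltas, one series per API server instance.
pod_delta = {
    labels(instance="api-0"): 150.0,
    labels(instance="api-1"): 140.0,
}
# The aggregated threshold side has no labels (e.g. 20 * 5 workers).
threshold = {labels(): 100.0}

print(greater_than(pod_delta, threshold))           # labels differ: empty result
print(greater_than(max_agg(pod_delta), threshold))  # both label-free: one sample
```

The unaggregated left-hand side returns nothing even though both deltas exceed 100, because no label set matches. After aggregation, the comparison returns the single sample with value 150.0.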

General Best Practices

  1. Understand your metrics: Always familiarize yourself with the metrics you are querying. Use the Prometheus UI, or the label_values() template function in Grafana, to inspect available labels and their values.
  2. Test incrementally: Start with smaller queries and validate their results before building complex expressions.
  3. Ensure operand compatibility: When comparing values, make sure both sides are either scalars or instant vectors with matching label sets. Use aggregation functions like max(), sum(), or avg() to drop labels as needed.
  4. Dynamic thresholds: Use cluster-specific metrics (e.g., node count) to set thresholds that scale dynamically with your infrastructure.
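Incremental testing can also be scripted against the Prometheus HTTP API, whose /api/v1/query endpoint evaluates an instant query. The sketch below uses only the Python standard library. The server address is an assumption (adjust it to your environment), and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"  # assumption: a locally reachable server


def build_query_url(base_url, promql):
    """Build the instant-query URL for the /api/v1/query endpoint."""
    params = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def parse_instant_vector(payload):
    """Return a list of (labels, value) pairs from a query response body."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample's value is a [timestamp, "value-as-string"] pair.
    return [(sample["metric"], float(sample["value"][1])) for sample in data["result"]]


def run_instant_query(base_url, promql):
    """Send the query to Prometheus and parse the result (needs network access)."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=10) as resp:
        return parse_instant_vector(json.load(resp))


# Validate the pieces one at a time before combining them, for example:
#   run_instant_query(PROMETHEUS_URL, 'count(kube_node_info{node=~".*worker.*"})')
#   run_instant_query(PROMETHEUS_URL, 'max(apiserver_storage_objects{resource="pods"})')
```

Running each sub-expression separately shows how many samples it returns and which labels they carry, which makes mismatches easier to spot before the full comparison is assembled.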

Conclusion

PromQL is a powerful tool, but crafting accurate and efficient queries requires careful attention to detail. By using refined expressions like count(kube_node_info{node=~".*worker.*"}) to count worker nodes and dynamic thresholds based on cluster size, you can create robust monitoring solutions that adapt to your environment. Always test and validate your queries to ensure they provide meaningful insights.

Feel free to use the examples and best practices discussed here to enhance your monitoring setup and stay ahead of potential issues in your Kubernetes cluster.
