Understanding Prometheus Metric Types: A Guide for Beginners

A beginner-friendly guide to Prometheus metric types, like counters, gauges, histograms, and summaries, with practical examples and common mistakes to avoid.

Suleiman Dibirov

Dec. 09, 24 · Tutorial

Prometheus is an open-source monitoring tool that helps you track how your systems are working. It collects numbers (metrics) from your applications and servers at regular intervals and stores them as time series. This guide will help you understand the four metric types and how to use them.

The Four Basic Types of Prometheus Metrics

1. Counters - Numbers That Only Go Up

A counter is a number that only goes up, or resets to zero when the process restarts, just like a car's odometer that keeps adding miles. It's perfect for tracking things that only increase, like total API requests, error counts, or tasks completed. When a counter resets to zero (for example, during a system restart), query functions such as rate() and increase() detect the drop and adjust for it, so calculations stay correct. Counters are the simplest metric type and should be used whenever you're counting the total occurrences of something.

    # Example of a counter
    http_requests_total{method="POST", endpoint="/api/users"} 2387
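
The reset handling described above can be sketched in a few lines of Python. This is a simplified illustration, not Prometheus's actual implementation (the real increase() also extrapolates to the edges of the time window). The function name and the sample values are made up for the example.

```python
def increase(samples):
    """Total growth of a counter across consecutive scraped values.

    Any drop between two samples is treated as a restart from zero,
    so the whole new value counts as growth since the reset.
    """
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev   # normal case: counter went up
        else:
            total += curr          # reset: counter restarted from zero
    return total


# Hypothetical scrapes: the process restarts after the third sample.
scraped = [2380, 2385, 2387, 3, 10]
print(increase(scraped))  # 17.0  (5 + 2 before the reset, 3 + 7 after)
```

This is why a counter that "goes backwards" after a restart does not produce a negative rate in your graphs.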

What to Know

Only increases or resets to zero
Used for counting total events
Common uses: counting requests, errors, completed tasks

Basic Rules

End counter names with _total
Use only for numbers that increase
Never use for numbers that need to go down

Real Examples

    # Wrong way: Using a counter for current users
    active_users 23
    # Why it's wrong: Current users can go up OR down, but counters can only go up

    # Right way: Using a counter for total logins
    user_logins_total{status="success"} 10483
    # Why it's right: Total logins only increase, perfect for a counter

2. Gauges - Numbers That Go Up and Down

A gauge is a number that can go both up and down, like a thermometer or fuel gauge in your car. It represents a current value at any point in time, such as memory usage, active requests, or CPU temperature. You can think of a gauge as taking a snapshot of something that changes frequently. Unlike counters, gauges are perfect for metrics that can increase or decrease based on system behavior.

    # Example of a gauge
    node_memory_usage_bytes{instance="server-01"} 1234456789
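
To make the "goes up and down" idea concrete, here is a minimal Python sketch of a gauge that tracks requests currently in progress. The Gauge class and the track_in_flight helper are hypothetical and written only for this example. Real client libraries offer similar set, increment, and decrement operations, but check their documentation for the exact API.

```python
from contextlib import contextmanager


class Gauge:
    """Toy gauge: holds one current value that can move in both directions."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = float(value)

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount


@contextmanager
def track_in_flight(gauge):
    """Raise the gauge while a request runs, lower it when the request ends."""
    gauge.inc()
    try:
        yield
    finally:
        gauge.dec()


active_requests = Gauge()
with track_in_flight(active_requests):
    during = active_requests.value   # one request is in progress
after = active_requests.value        # the request has finished
print(during, after)  # 1.0 0.0
```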

What to Know

Can increase or decrease
Shows current value at any time
Good for measuring current state

Basic Rules

Use for values that change up and down
Good for usage and saturation metrics
Don't use for counting total events

Real Examples

    # Right way: Tracking CPU temperature
    cpu_temperature_celsius{core="0"} 54.5
    # Why it's right: Temperature naturally goes up and down

    # Right way: Current database connections
    db_connections_current{database="users"} 47
    # Why it's right: Active connections change both up and down

3. Histograms - Tracking Value Ranges

A histogram groups measurements into ranges (called buckets), like sorting test scores into A, B, C, D, and F grades. Prometheus buckets are cumulative: each bucket counts every value less than or equal to its upper bound (the le label), and a final le="+Inf" bucket counts all values. The histogram also keeps a count of all values and their sum. Histograms are especially useful for measuring things like request duration or response size, where you want to understand the distribution of values. The key feature of histograms is that they let you estimate percentiles later using the histogram_quantile function.

    # Example of a histogram (buckets are cumulative)
    http_request_duration_seconds_bucket{le="0.1"} 24054  # Requests that took 0.1s or less
    http_request_duration_seconds_bucket{le="0.5"} 33444  # Requests that took 0.5s or less
    http_request_duration_seconds_bucket{le="1.0"} 34001  # Requests that took 1.0s or less
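
The sketch below shows, under simplifying assumptions, what happens when a histogram records a value. The Histogram class is hypothetical and written for this example only. It follows the Prometheus convention that a bucket counts every observation less than or equal to its upper bound, with a final +Inf bucket that catches everything.

```python
import math


class Histogram:
    """Toy histogram with cumulative buckets, plus a running count and sum."""

    def __init__(self, upper_bounds):
        # The +Inf bucket is always present and counts every observation.
        self.bounds = sorted(upper_bounds) + [math.inf]
        self.bucket_counts = [0] * len(self.bounds)
        self.count = 0
        self.sum = 0.0

    def observe(self, value):
        self.count += 1
        self.sum += value
        # Cumulative: increment every bucket whose upper bound covers the value.
        for i, bound in enumerate(self.bounds):
            if value <= bound:
                self.bucket_counts[i] += 1


h = Histogram([0.1, 0.5, 1.0])
for duration in [0.05, 0.3, 0.07, 0.8, 2.5]:   # made-up request durations
    h.observe(duration)

print(h.bucket_counts)  # [2, 3, 4, 5] for le=0.1, 0.5, 1.0, +Inf
print(h.count)          # 5
```

Notice that the 2.5 second request only lands in the +Inf bucket, and that each bucket's count includes the counts of all smaller buckets.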

What to Know

Groups values into cumulative ranges (buckets)
Creates count and sum automatically
Helps estimate percentiles

Basic Rules

Pick ranges that make sense for your data
Good for response times and sizes
Don't create too many ranges (each bucket is stored as its own time series, which uses more memory)

Real Examples

    # Wrong way: Too many buckets
    api_response_time_bucket{le="0.1"} 100
    api_response_time_bucket{le="0.2"} 150
    api_response_time_bucket{le="0.3"} 180
    # Why it's wrong: Too many small buckets use extra memory and don't add value

    # Right way: Meaningful bucket sizes
    api_response_time_bucket{le="0.5"} 1000   # Half second
    api_response_time_bucket{le="1.0"} 1500   # One second
    api_response_time_bucket{le="2.0"} 1700   # Two seconds
    # Why it's right: Buckets match meaningful response time targets
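
To see why bucket choice matters, here is a simplified Python sketch of how a percentile can be estimated from cumulative buckets, in the spirit of histogram_quantile. It assumes values are spread evenly inside each bucket (linear interpolation) and that the lowest bucket starts at zero. The function is an illustration, not Prometheus's source code, and the +Inf bucket value is an assumption added to the "right way" numbers above.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper_bound,
    ending with the +Inf bucket and containing at least one finite bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total                       # how many observations lie below the answer
    lower_bound, lower_count = 0.0, 0      # lowest bucket is assumed to start at 0
    for i, (upper_bound, cum_count) in enumerate(buckets):
        if cum_count >= rank:
            if math.isinf(upper_bound):
                # No upper limit to interpolate towards: fall back to the
                # highest finite bound.
                return buckets[i - 1][0]
            in_bucket = cum_count - lower_count
            if in_bucket == 0:
                return upper_bound
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, cum_count
    return math.nan


# "Right way" buckets from above, plus an assumed +Inf bucket.
buckets = [(0.5, 1000), (1.0, 1500), (2.0, 1700), (math.inf, 1700)]
print(round(estimate_quantile(0.5, buckets), 3))  # 0.425
print(round(estimate_quantile(0.9, buckets), 3))  # 1.15
```

The 90th percentile lands somewhere between 1.0 and 2.0 seconds, and the estimate can only be as precise as that bucket is narrow. That is the practical reason to place bucket boundaries near the response times you care about.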
  

4. Summaries - Calculating Percentiles

A summary is similar to a histogram but calculates percentiles directly in the application when collecting the data, like having a calculator that immediately tells you your test score's ranking in the class. It tracks the total count and sum like a histogram, but instead of buckets, it exposes precomputed percentile values (like the 50th, 90th, and 99th percentile), usually over a recent sliding time window. Summaries cost more in the application than histograms because they calculate percentiles on the fly, but the results don't depend on how well you chose bucket boundaries. The trade-off is that summary percentiles can't be combined across instances, and you have to choose which percentiles to track in advance. Use summaries when you need accurate percentiles for a single instance and can't calculate them later.

    # Example of a summary
    http_request_duration_seconds{quantile="0.5"} 0.05   # 50% of requests took 0.05s or less
    http_request_duration_seconds{quantile="0.9"} 0.1    # 90% of requests took 0.1s or less
    http_request_duration_seconds_count 34010            # Total count
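
The following Python sketch shows the general idea of a summary: the application itself keeps recent observations and answers percentile questions directly. It is deliberately simplified. The WindowedSummary class is hypothetical, it keeps the last N samples instead of a time-based window, and it sorts the samples instead of using the streaming estimators that real client libraries rely on.

```python
import math
from collections import deque


class WindowedSummary:
    """Toy summary: percentiles over the most recent observations only."""

    def __init__(self, max_samples=1000):
        self.window = deque(maxlen=max_samples)  # old samples fall out automatically
        self.count = 0       # count and sum cover ALL observations, like a real summary
        self.sum = 0.0

    def observe(self, value):
        self.window.append(value)
        self.count += 1
        self.sum += value

    def quantile(self, q):
        """Nearest-rank quantile of the samples currently in the window."""
        ordered = sorted(self.window)
        if not ordered:
            return math.nan
        index = max(0, math.ceil(q * len(ordered)) - 1)
        return ordered[index]


s = WindowedSummary(max_samples=5)
for value in range(1, 11):   # observe 1, 2, ..., 10
    s.observe(value)

print(s.count)          # 10 (every observation is counted)
print(s.quantile(0.5))  # 8  (median of the last five values: 6..10)
```

Because each instance computes its own percentiles, there is no correct way to average the 0.9 value from several servers into one fleet-wide 90th percentile. That is the main reason histograms are usually the safer default.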

What to Know

Calculates percentiles in the application, at collection time
Includes total count and sum
Costs more in the application than a histogram

Basic Rules

Use when you need accurate percentiles for a single instance
Consider histograms for most cases, especially when you aggregate across instances
Be careful with labels (they use memory)

Common Mistakes to Avoid

1. Counter vs. Gauge Confusion

    # Wrong way: Using counter for temperature
    temperature_total{location="room"} 25
    # Why wrong: Temperature goes up and down, counters can't go down

    # Right way: Using gauge for temperature
    temperature{location="room"} 25
    # Why right: Gauges can show current temperature properly

2. Too Many Labels

    # Wrong way: Too much detail
    http_requests_total{user_id="12345", path="/api/users", method="GET", status="200", browser="chrome"}
    # Why wrong: user_id has a different value for every user, and each unique
    # label combination becomes its own time series, which uses lots of memory

    # Right way: Important details only
    http_requests_total{path="/api/users", method="GET", status="200"}
    # Why right: Keeps useful information without too many combinations
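
A quick back-of-the-envelope calculation shows how fast label combinations grow. The Python sketch below multiplies the number of distinct values per label to get an upper bound on the number of time series. All the numbers are made up for illustration, and in practice not every combination will occur.

```python
from math import prod


def max_series(distinct_values_per_label):
    """Upper bound on time series: one per possible label combination."""
    return prod(distinct_values_per_label.values())


# Hypothetical counts of distinct values for each label.
lean = {"path": 20, "method": 5, "status": 10}
detailed = {**lean, "user_id": 10_000, "browser": 8}

print(max_series(lean))      # 1000
print(max_series(detailed))  # 80000000
```

Adding just two labels turns a thousand possible series into tens of millions, which is why labels with many distinct values, such as user IDs, are best kept out of metrics.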

Simple Recipes for Common Tasks

Calculating Rates

    # Per-second request rate, averaged over the last 5 minutes
    rate(http_requests_total[5m])

    # Error rate percentage (sum() so both sides match even if their labels differ)
    sum(rate(http_errors_total[5m])) / sum(rate(http_requests_total[5m])) * 100
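
If the arithmetic behind these queries is unclear, the Python sketch below works through it with made-up numbers. It assumes just two samples per counter, taken 300 seconds (5 minutes) apart, and no counter reset in between. The real rate() function looks at all samples in the window, handles resets, and extrapolates to the window edges.

```python
def per_second_rate(first, last):
    """Average per-second growth between two (timestamp_seconds, value) samples."""
    (t0, v0), (t1, v1) = first, last
    return (v1 - v0) / (t1 - t0)


# Hypothetical counter samples at the start and end of a 5-minute window.
requests_rate = per_second_rate((0, 12_000), (300, 14_400))  # 2400 requests in 300s
errors_rate = per_second_rate((0, 40), (300, 190))           # 150 errors in 300s

error_percent = errors_rate / requests_rate * 100

print(requests_rate)   # 8.0 requests per second
print(errors_rate)     # 0.5 errors per second
print(error_percent)   # 6.25
```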

Tracking Resource Usage

    # Average memory usage by application
    avg(process_memory_bytes) by (app_name)

    # Maximum CPU usage in the last hour
    max_over_time(cpu_usage_percent[1h])
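
These two recipes do different kinds of aggregation: the first combines many series at one moment in time, the second combines many moments of one series. The Python sketch below illustrates the difference with made-up data. The helper functions are hypothetical stand-ins, not Prometheus code, although max_over_time is named after the PromQL function that takes the maximum of a series over a time range.

```python
from collections import defaultdict


def avg_by(samples, label):
    """Average the current values of many series, grouped by one label."""
    groups = defaultdict(list)
    for labels, value in samples:
        groups[labels[label]].append(value)
    return {key: sum(values) / len(values) for key, values in groups.items()}


def max_over_time(values):
    """Highest value a single series reached across a time window."""
    return max(values)


# Across series: one current value per instance (hypothetical numbers).
memory_now = [
    ({"app_name": "api", "instance": "a"}, 100.0),
    ({"app_name": "api", "instance": "b"}, 300.0),
    ({"app_name": "worker", "instance": "c"}, 50.0),
]
print(avg_by(memory_now, "app_name"))  # {'api': 200.0, 'worker': 50.0}

# Across time: several past values of one series (hypothetical numbers).
cpu_last_hour = [41.0, 87.5, 63.2]
print(max_over_time(cpu_last_hour))    # 87.5
```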

Key Points to Remember

1. Use Counters when:

Counting total events
Tracking errors
Measuring completed tasks

2. Use Gauges when:

Measuring current values
Tracking things that go up and down
Showing resource usage

3. Use Histograms when:

Measuring response times
Looking at value ranges
Estimating percentiles later, or across many instances

4. Use Summaries when:

Needing accurate percentiles for a single instance
Willing to use more resources in the application
Unable to calculate percentiles later

Start with counters and gauges for basic monitoring. Add histograms and summaries when you need to track response times or understand how your values are spread out. Remember, good monitoring starts with choosing the right type of metric for what you want to measure.
