Understanding the relationships between SLO, SLI, and SRE

An SLI is a measure of compliance with an SLO, so there is no SLI without an SLO. This article looks at why SLIs and SLOs matter in SRE and how to implement them.

By Alireza C · Nov. 15, 21 · Opinion
Even after delivering a project to a client, the software engineer’s job is not complete. The next phase is ensuring service reliability. In site reliability engineering (SRE) practice, there are two key concepts that the engineer should know: the service level objective (SLO) and the service level indicator (SLI).

This article looks into the importance of SLIs and SLOs in SRE and how to implement them.

What are service level objectives?

A service level objective is an agreed target for a specific metric, such as uptime or response time. In other words, SLOs are the individual promises a service provider makes to the client, and they set expectations for the service. SLOs also give the IT and DevOps teams a concrete goal to measure their performance against.

A service may have more than one SLO, and SLOs apply to paying and non-paying customers as well as to internal clients in the same organization. For example, when a customer-facing team uses tools provided by another team in the same organization, the two teams need clearly defined service level objectives so that the customer-facing team can meet its contractual obligations.

For an SLO to be effective, it must not be vague, overly complicated, or impossible to measure. Only the relevant SLOs should appear in the agreement, and they should be spelled out in plain language. It is also essential to factor in other issues, such as delays caused by the client.

Using an online service that is called by clients as an example, SLOs can cover system availability, how long a request takes to get a response, the error rate (how often an error is encountered, expressed as a fraction of requests), and the number of requests the service can handle per second.
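The four quantities listed above can be derived from a request log. The following Python is a minimal sketch under stated assumptions, not a prescribed implementation: the `RequestRecord` schema and the `compute_slis` function are hypothetical names, availability is approximated as the share of successful requests, and latency is summarized as a nearest-rank 95th percentile. A real service may define each of these differently.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RequestRecord:
    """One logged request (hypothetical schema, not from the article)."""
    latency_ms: float  # time taken to answer the request
    ok: bool           # True if the request was served without an error


def compute_slis(records: List[RequestRecord], window_seconds: float) -> Dict[str, float]:
    """Summarize a measurement window into four indicator values."""
    if not records or window_seconds <= 0:
        raise ValueError("need at least one request and a positive window")

    total = len(records)
    failures = sum(1 for r in records if not r.ok)

    # Nearest-rank 95th percentile; integer arithmetic computes ceil(0.95 * total).
    latencies = sorted(r.latency_ms for r in records)
    rank = (95 * total + 99) // 100
    p95_latency_ms = latencies[rank - 1]

    return {
        # Assumption: availability is measured as the fraction of successful requests.
        "availability": (total - failures) / total,
        "error_rate": failures / total,
        "p95_latency_ms": p95_latency_ms,
        "throughput_rps": total / window_seconds,
    }
```

Each value in the returned dictionary is a measurement; it only becomes meaningful once it is compared against a target the client actually cares about.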

What are service level indicators?

An SLI is a measure of compliance with an SLO. This means there is no SLI without an SLO.

Returning to the example of the online service, if the service level agreement (SLA) promises availability of 99.95 percent, then your SLO is 99.95 percent. Your SLI is then the actual availability reported by your system.

If your SLI is at or above 99.95 percent, then you have met your obligation to your client. While 100 percent availability is not possible, the goal is to get as close as possible.

Two of the challenges of SLIs are choosing the relevant metrics to track and working out how to track them as accurately as possible. Tracking metrics just because you can, and not because they are essential to the client, is a waste of resources.
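To make the 99.95 percent figure concrete, the sketch below converts an availability SLO into an allowed amount of downtime and checks a measured SLI against it. The 30-day measurement period and the function names are assumptions chosen for illustration; the article does not specify a period.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(slo_percent: float, period_days: int = 30) -> float:
    """Downtime the SLO tolerates over the period (the period length is an assumption)."""
    return period_days * MINUTES_PER_DAY * (100.0 - slo_percent) / 100.0


def measured_availability_percent(downtime_minutes: float, period_days: int = 30) -> float:
    """The SLI: percentage of the period during which the service was up."""
    total_minutes = period_days * MINUTES_PER_DAY
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def meets_slo(sli_percent: float, slo_percent: float) -> bool:
    """The obligation is met when the SLI is at or above the SLO."""
    return sli_percent >= slo_percent


# A 99.95 percent SLO over 30 days tolerates roughly 21.6 minutes of downtime.
budget = allowed_downtime_minutes(99.95)

# Ten minutes of downtime stays within the objective; thirty minutes does not.
within = meets_slo(measured_availability_percent(10), 99.95)
breached = not meets_slo(measured_availability_percent(30), 99.95)
```

Seen this way, a small change in the percentage is a large change in tolerated downtime, which is why the exact number in the SLA matters.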

How does SRE benefit from SLOs and SLIs?

Having excellent and practical SLOs and SLIs is fundamental to seamlessly transitioning from development to operations. SLOs help the team prioritize their work, while SLIs indicate areas where attention is needed to meet client expectations.

Now that you know what SLOs and SLIs stand for, we will look at the best practices for implementing them to improve your SRE practice.

Best practices for SLOs and SLIs

When formulating your SLOs within your SLA, it is important to pay attention to these points:

Take customers’ expectations into account

When drafting your SLA, it is important to know what your customers expect from your service or product. Once you understand what matters to your clients, your team can craft objectives that are practical and that the customer can work with.

Use the plainest language possible in your SLA

Your client might not read the document in your presence, where they could ask you for clarification. If any part of your SLA, which includes the SLOs, is ambiguous, you and your client will probably disagree about expectations down the line.

Not every metric is an SLO

You will avoid a lot of trouble by limiting your SLOs to the practical and essential ones. Use as few SLOs as possible; do not cram in as many as you can to show off your metric-tracking capabilities.

Don’t promise the moon even if you can deliver it

While setting your SLOs, you do not need to promise clients your total capacity. For example, if your system can maintain an uptime of 99.99 percent, you do not have to set your SLO at 99.99 percent. It is better to leave yourself some wiggle room by underpromising and overdelivering. This way, you can absorb unforeseen issues that affect the service you provide.
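The wiggle room can be expressed in minutes. The sketch below compares what the system can sustain (99.99 percent, from the example above) with a more modest published SLO. The published value of 99.9 percent, the 30-day period, and the function names are assumptions picked for illustration only.

```python
def downtime_budget_minutes(slo_percent: float, period_days: int = 30) -> float:
    """Downtime tolerated by an availability target over the period."""
    return period_days * 24 * 60 * (100.0 - slo_percent) / 100.0


def headroom_minutes(achievable_percent: float, published_percent: float,
                     period_days: int = 30) -> float:
    """Extra downtime the published SLO tolerates beyond what the system usually incurs."""
    if published_percent > achievable_percent:
        raise ValueError("published SLO exceeds what the system can sustain")
    return (downtime_budget_minutes(published_percent, period_days)
            - downtime_budget_minutes(achievable_percent, period_days))


# Publishing 99.9 percent while sustaining 99.99 percent leaves roughly
# 38.88 minutes of slack per 30 days for unforeseen issues.
slack = headroom_minutes(achievable_percent=99.99, published_percent=99.9)
```

Publishing the full 99.99 percent would reduce that slack to zero, so any unexpected incident would immediately put the promise at risk.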

Have a sound disaster recovery plan

Before committing to an SLO, prepare a detailed plan for what to do when your SLI drops below your SLO. Failing to do this will result in an uncoordinated response that wastes your team’s time instead of fixing the problem.
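One way to make such a plan actionable is to decide in advance which state triggers which response. The sketch below is a hypothetical illustration: the three states, the warning margin, and the `plan_response` name are assumptions, and the actual steps behind each state would come from your own recovery plan.

```python
def plan_response(sli_percent: float, slo_percent: float,
                  warning_margin: float = 0.01) -> str:
    """Map the current SLI to a pre-agreed response state.

    "breach": the SLI is below the SLO, so the recovery plan should start.
    "warn":   the SLI still meets the SLO but is within the warning margin.
    "ok":     the SLI is comfortably above the SLO.
    """
    if sli_percent < slo_percent:
        return "breach"
    if sli_percent < slo_percent + warning_margin:
        return "warn"
    return "ok"


# With an SLO of 99.95 percent and a margin of 0.01 percentage points:
states = [plan_response(sli, 99.95) for sli in (99.97, 99.955, 99.90)]
```

Deciding on these states before an incident means the team only has to follow the plan when the SLI drops, rather than negotiate one under pressure.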
