Uptime is for Amateurs - SRE on Dev Interrupted
SRE expert Brian Murphy joins me on this episode of Dev Interrupted to explain how SRE teams operate, what success looks like, and why uptime isn't the important metric.
Is the #1 SRE success metric uptime?
No! It's actually customer happiness. Why? Because your website can be up while all of it is crap: parts aren't loading, or it's really degraded, and it's just very bad. But if your Prometheus metric for being up is 1, it's technically doing what it's supposed to do, even if your customer is having a bad time. So uptime isn't always what you want to measure.
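To make the gap concrete, here is a minimal Python sketch, not anything taken from the episode. It compares a naive "is it up?" signal with the share of requests that actually went well for the user. The `Request` type, the function names, the sample data, and the 500 ms latency threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    status: int        # HTTP status code returned to the user
    latency_ms: float  # how long the user waited for the response


def is_up(requests):
    """Naive uptime signal: 1 if the service answered anything at all, else 0."""
    return 1 if len(requests) > 0 else 0


def good_request_ratio(requests, max_latency_ms=500.0):
    """Fraction of requests that succeeded (no 5xx) AND were fast enough.

    The 500 ms threshold is an illustrative assumption, not a recommendation.
    """
    if not requests:
        return 0.0
    good = sum(
        1 for r in requests
        if r.status < 500 and r.latency_ms <= max_latency_ms
    )
    return good / len(requests)


# Hypothetical traffic: the service responds, but half the users have a bad time.
sample = [
    Request(200, 120.0),   # fine
    Request(200, 2400.0),  # succeeded, but painfully slow
    Request(503, 80.0),    # server error
    Request(200, 90.0),    # fine
]

print("up:", is_up(sample))
print("good request ratio:", good_request_ratio(sample))
```

With this sample, the up signal reads 1 while only half of the requests count as good, which is the kind of degraded-but-up situation described above.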
This week on the Dev Interrupted podcast, I'm joined by my friend Brian Murphy. As the SRE Manager at G-Research, Brian has spent his career building and managing successful SRE teams. He joins me on this episode to talk about:
- What success looks like for SRE teams
- What kind of engineer is best for an SRE role
- The best success metrics for SRE teams
- How to get started with SRE at your company
Join us for an AMA with Brian Murphy on Friday (Dec. 4th)
The Dev Interrupted Discord community is hosting a live AMA with Brian Murphy on Friday, December 4th at 7 AM PST. Join our dev leader community today!
Interested in learning about Dev Metrics?
I've put together a few of my favorite metrics-related articles:
The Metrics Runbook for Dev Team Success
Dev Team Health: Data-driven Vital Signs to Watch
How To Run A Data-Driven Dev Team Without Being A Performance Tyrant
Published at DZone with permission of Dan Lines, DZone MVB. See the original article here.