O11y Guide, Cloud-Native Observability Pitfalls: Focusing on "The Pillars"

Continuing in this series examining the common pitfalls of cloud-native observability, take a look at how to avoid the trap of focusing on The Pillars.

By Eric D. Schabell · Feb. 06, 24 · Analysis

Are you looking at your organization's efforts to enter or expand into the cloud-native landscape and feeling a bit daunted by the vast expanse of information surrounding cloud-native observability? When you're moving so fast with Agile practices across your DevOps, SRE, and platform engineering teams, it's no wonder this can seem a bit confusing.

Figure: Feet with arrows pointing to Pillars and Phases
Unfortunately, the choices being made have such a great impact on your business, your budgets, and the ultimate success of your cloud-native initiatives that hasty decisions upfront lead to big headaches very quickly down the road.

In the previous article, we looked at the problem of controlling cost in cloud-native observability. In this article, we'll discuss the next pitfall, another common mistake organizations make. By sharing common pitfalls in this series, the hope is that we can learn from them.

After laying the groundwork in the previous article, it's time to tackle the next pitfall: focusing on The Pillars. I've spent some time in the past talking about Three Phases to Better Observability Outcomes and published an initial take on why Cloud Native Observability Needs Phases, but this article will be a more in-depth dive into the topic.

Focusing on "The Pillars"

For a few years now, vendors have been marketing the idea that you need to focus on certain signals or pillars to achieve what you desire in the world of cloud-native observability.

If you look more closely at this, they are pushing hard for you to concentrate on three pillars: metrics, logs, and traces, with a few even sliding in events to make it sound all-encompassing. These are touted as things you can tangibly check a box on in your observability stack. What they end up doing is creating a focus on functionality and technology features while completely ignoring the problem at hand.

It's like we have a very nice and expensive car that we cherish and it's started to make funny sounds while emitting smoke when we are driving. We rush to our favorite garage and the mechanic listens to our issues, then proceeds to drag out their toolboxes to show off all the great tools they have to fix issues just like ours. While this is going on and on, we look out the window and see that our car is now not just smoking, but it's on fire!

When we ask our on-call engineers, who are on the front line of keeping our cloud-native business thriving, what they need, they describe a process they have to go through across all the areas that they consider worth monitoring in our business.

When talking about the process and how it's important to our business goals, we hear the business talking in phrases like:

  • Better business outcomes
  • Faster remediation of problems that occur
  • Easier problem detection
  • Greater revenue generation
  • Engineering teams focused on delivering business value

These are all in a language the business understands, and they describe the process that needs to be designed for, not the features the tooling needs to have. When we bring this back to cloud-native observability, we want a solution for our on-call engineers that walks them through the following three phases:

  1. Knowing: We start by discovering something is happening as fast as possible, maybe even leading to a quick fix in this phase.
  2. Triaging: If unable to fix immediately, then we start triaging based on specifically targeted information that is directly related to the problem at hand, which then quickly leads to fixing it.
  3. Understanding: Finally, possibly at a later time and at a slower investigative pace, we need to gain a very deep understanding of the issues encountered to ensure they never happen again.
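
To make the flow between these phases concrete, here is a minimal Python sketch of an incident moving through them. It is only an illustration under stated assumptions: the names `Phase`, `Incident`, and `advance`, as well as the `RESOLVED` end state, are hypothetical and do not come from any observability tool.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    KNOWING = 1        # discover that something is happening
    TRIAGING = 2       # narrow down with specifically targeted information
    UNDERSTANDING = 3  # deeper investigation, possibly at a later time
    RESOLVED = 4       # hypothetical end state, not part of the three phases


@dataclass
class Incident:
    summary: str
    phase: Phase = Phase.KNOWING
    fixed: bool = False


def advance(incident: Incident, fixed_now: bool = False) -> Incident:
    """Move an incident one step through the phases."""
    if fixed_now:
        incident.fixed = True
    if incident.phase is Phase.KNOWING:
        # A quick fix skips triage, but understanding still follows.
        incident.phase = Phase.UNDERSTANDING if incident.fixed else Phase.TRIAGING
    elif incident.phase is Phase.TRIAGING:
        # Stay in triage until the problem has actually been fixed.
        if incident.fixed:
            incident.phase = Phase.UNDERSTANDING
    elif incident.phase is Phase.UNDERSTANDING:
        incident.phase = Phase.RESOLVED
    return incident
```

Calling `advance(Incident("checkout latency"))` moves the incident from knowing to triaging, where it stays until a later call passes `fixed_now=True`. The point of the sketch is that each phase asks a different question of the tooling, which is what the dashboards should be designed around.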

We don't want to be confronted with visualizations that were designed to group information by categorized signals, or by The Pillars. For example, here is something that was designed without much thought towards the process needed to solve any kind of issue, but it does capture the signals for you:

Figure: Visualization of metrics, traces, and logs

Good luck with this when you are on-call.

We really want clean, concise, and effective visualizations that present focused insights and put just enough information at our fingertips to make informed decisions quickly. We don't care whether one metric, three labels, one span in a trace, and three log lines form the basis of that view; we need to resolve whatever caused our beeper to go off:

Figure: Sharply-focused dashboard

Sharply-focused insights to get you through the phases.
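
As a rough sketch of what such a focused view could look like in code, the Python below picks only the telemetry directly related to an alert: the alerting metric with its labels, the slowest span from the affected service, and a few log lines from that span's trace. Everything here is an assumption for illustration, including the `FocusedView` and `build_focused_view` names and the dictionary fields (`service`, `trace_id`, `duration_ms`); real backends expose this data differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FocusedView:
    metric: dict           # the one metric (with labels) that triggered the alert
    span: Optional[dict]   # the slowest span from the affected service, if any
    log_lines: list        # a few log lines from that span's trace


def build_focused_view(alert, spans, logs, max_logs=3):
    """Select only the telemetry directly related to the alert."""
    service = alert["labels"]["service"]
    # Keep spans from the alerting service and pick the slowest one.
    candidates = [s for s in spans if s["service"] == service]
    span = max(candidates, key=lambda s: s["duration_ms"], default=None)
    # Keep only log lines that belong to that span's trace.
    related = []
    if span is not None:
        related = [l for l in logs if l["trace_id"] == span["trace_id"]][:max_logs]
    return FocusedView(metric=alert, span=span, log_lines=related)
```

The selection is driven by the problem at hand (which service alerted) instead of by signal type, so metrics, traces, and logs end up side by side in one small view.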

The road to cloud-native success has many pitfalls. Understanding how to avoid focusing on The Pillars, and focusing instead on solutions for the phases of observability, will save much wasted time and energy.

Coming Up Next

Another pitfall organizations struggle with in cloud-native observability is underestimating cardinality issues. In the next article in this series, I'll share why this is a pitfall and how we can avoid it wreaking havoc on our cloud-native observability efforts.


Published at DZone with permission of Eric D. Schabell. See the original article here.

Opinions expressed by DZone contributors are their own.
