O11y Guide, Cloud-Native Observability Pitfalls: Underestimating Cardinality

Continuing in this series examining the common pitfalls of cloud-native observability, take a look at how underestimating cardinality is the silent killer.

By Eric D. Schabell · Feb. 12, 24 · Analysis

Are you looking at your organization's efforts to enter or expand into the cloud-native landscape and feeling a bit daunted by the vast expanse of information surrounding cloud-native observability? When you're moving so fast with agile practices across your DevOps, SREs, and platform engineering teams, it's no wonder this can seem a bit confusing.

Unfortunately, the choices being made have such a great impact on your business, your budgets, and the ultimate success of your cloud-native initiatives that hasty decisions upfront lead to big headaches very quickly down the road.

In the previous article, we talked about how focusing on cloud-native observability pillars has outlived its usefulness. Now it's time to move on to another common mistake organizations make: underestimating cardinality. By sharing common pitfalls in this series, the hope is that we can learn from them.

This article takes a look at how underestimating cardinality in our cloud-native observability can very quickly ruin our day.

Cardinality, the Silent Killer

The biggest problem facing our developers, engineers, and observability teams is that they are flooded with cloud-native data from their systems. Collecting all that we need versus exactly what we use to monitor our organization's applications and infrastructure is a constant struggle. This struggle leads to ever-increasing stress and frustration within teams trying to maintain some sense of order in the constant chaos.

The following quote is from an online forum where an SRE asked the community about dipping his feet into the tracing pond. Tracing is one form of gathering observability information across service calls, following the entire process end to end. He stated that he did not yet use tracing, but wanted to ask those who do why they do, and those who don't why not. Simple question, right?

[Figure: Woman dev sitting at laptop with hands on forehead in frustration. Caption: Frustrations mounting?]

The first answer was dripping with the frustration and sarcasm you find in any organization that is struggling with cloud-native observability:

"I don't yet collect spans/traces because I can hardly get our devs to care about basic metrics, let alone traces. This is a large enterprise with approx. 1000 developers."

Next thing you know, a new deployment in our organization triggers an explosion of data: a small metric change causes an exponential increase in cardinality. By collecting some unique value, this change has caused a huge spike in observability data that is now delaying all queries and dashboards. On-call support staff are being paged, and it plays out just like the 2023 Chronosphere Observability Report research showed: we spend on average 10 hours per week trying to triage and understand incidents. That's a quarter of our 40-hour work week!

Cardinality spikes are another example of the cloud-native complexity that is leading our developers and engineers to feel like they are drowning. From the same observability report, 33% of those surveyed said the above-mentioned issues spill over and disrupt their personal lives, and 39% felt frequently stressed out.
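To make the arithmetic behind such a spike concrete, here is a minimal sketch. It assumes that the worst-case number of time series for a single metric is the product of the number of distinct values of each label. The label names and counts are hypothetical, not taken from the report or from any real system.

```python
from math import prod


def series_count(distinct_values_per_label: dict[str, int]) -> int:
    """Worst-case number of time series for one metric.

    Assumption: every combination of label values can occur, so the
    bound is the product of the distinct-value counts per label.
    """
    return prod(distinct_values_per_label.values())


# Hypothetical metric before the deployment.
before = {"service": 20, "endpoint": 15, "status_code": 5}

# The "small metric change": one new label carrying a unique value per user.
after = {**before, "user_id": 10_000}

print(series_count(before))  # 1500
print(series_count(after))   # 15000000
```

One extra label with 10,000 distinct values multiplies the series count by 10,000, which is the kind of jump that slows every query and dashboard reading that metric.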

The only way to get a handle on this is to have more insights and control over how we are going to deal with high volumes of cloud-native observability data before it has the negative impact described above. We need a way to analyze, refine, and operate on our observability data live during ingestion and be able to take immediate action when problems arise, like cardinality issues.

[Figure: Real-time insight and transformation of observability data]

When we have access to something like the control plane shown above, we can stop incoming cardinality issues before they overload our systems, preventing that data from flooding our backend storage until we can sort out what is going on. This ease of control provides on-call engineers with powerful capabilities for taking immediate and decisive action, resulting in better outcomes.
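As a rough illustration of stopping a cardinality issue during ingestion, the sketch below tracks how many distinct values each label has produced and drops any label that exceeds a limit before the data reaches storage. The `CardinalityGuard` class, its limit, and the label names are all hypothetical. This is a sketch of the general idea under those assumptions, not the API or behavior of any particular product.

```python
from collections import defaultdict


class CardinalityGuard:
    """Hypothetical ingestion-time guard against runaway label cardinality."""

    def __init__(self, max_values_per_label: int = 100) -> None:
        self.max_values = max_values_per_label
        # Distinct values observed so far, per label name.
        self.seen: dict[str, set[str]] = defaultdict(set)
        # Labels that exceeded the limit and are now dropped on ingest.
        self.blocked: set[str] = set()

    def process(self, labels: dict[str, str]) -> dict[str, str]:
        """Return the labels that are safe to forward to backend storage."""
        kept: dict[str, str] = {}
        for name, value in labels.items():
            if name in self.blocked:
                continue
            values = self.seen[name]
            values.add(value)
            if len(values) > self.max_values:
                # Limit exceeded: block this label from now on and free memory.
                self.blocked.add(name)
                values.clear()
                continue
            kept[name] = value
        return kept


guard = CardinalityGuard(max_values_per_label=3)
for i in range(5):
    out = guard.process({"service": "checkout", "user_id": f"u{i}"})

print(out)            # {'service': 'checkout'}
print(guard.blocked)  # {'user_id'}
```

Dropping the offending label instead of the whole data point keeps the metric usable for dashboards while the team investigates what introduced the unique value.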

The road to cloud-native success has many pitfalls, and understanding how to avoid the pillars, focusing instead on solutions for the phases of observability, will save much wasted time and energy.

[Figure: Man with box over his head thinking "If they can't see me...they can't hurt me..." Caption: Ignoring existing landscape?]

Coming Up Next

Another pitfall organizations struggle with in cloud-native observability is ignoring their existing landscape. In the next article in this series, I'll share why this is a pitfall and how we can keep it from wreaking havoc on our cloud-native observability efforts.

Published at DZone with permission of Eric D. Schabell, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
