DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Power BI Embedded Analytics — Part 1.1: Power BI Authoring Data Federation
  • Power BI Embedded Analytics — Part 1: Introduction and Power BI Authoring Overview
  • Building a Cost-Effective ELK Stack for Centralized Logging
  • Optimizing Vector Search Performance With Elasticsearch

Trending

  • The Role of Retrieval Augmented Generation (RAG) in Development of AI-Infused Enterprise Applications
  • Ensuring Configuration Consistency Across Global Data Centers
  • Grafana Loki Fundamentals and Architecture
  • Java's Quiet Revolution: Thriving in the Serverless Kubernetes Era
  1. DZone
  2. Data Engineering
  3. Big Data
  4. Uber's Success With Apache Pinot: Cost Savings and Performance Boost

Uber's Success With Apache Pinot: Cost Savings and Performance Boost

Uber migrated from Elasticsearch to Apache Pinot and saw a 70% reduction in costs, an 80% reduction in CPU cores, and a decrease in page load times and query timeouts.

By 
Kristin Crosier user avatar
Kristin Crosier
·
Apr. 29, 24 · Review
Likes (3)
Comment
Save
Tweet
Share
1.6K Views

Join the DZone community and get the full member experience.

Join For Free

For a company like Uber, real-time data is the lifeblood of both their customer-facing and internal services. Customers rely on real-time data to get rides and grab food at their convenience. Internal teams also depend on up-to-date data to power the infrastructure behind their customer-facing applications, such as for their in-house tool that monitors mobile app crash analytics.

Uber migrated to Apache Pinot to power this in-house tool and experienced significant improvements compared to their previous analytics engine (Elasticsearch). By moving to Pinot, a true real-time analytics platform, Uber saw benefits including:

  • 70% reduction in infrastructure costs (a savings of $2M+ yearly)
  • 80% reduction in CPU cores
  • 66% reduction in data footprint
  • 64% reduction in page load times (from 14 seconds to under 5 seconds)
  • Reduced ingestion lag to <10 milliseconds
  • Decline in query timeouts and elimination of data loss issue

70%

Watch Now

The content of this blog is based on an in-person meetup that featured Apache Pinot user stories. We also reference a blog from Uber’s engineering team that covers how they use Pinot to serve real-time analytics for mobile app crashes. Watch the meetup here:

Or continue reading to learn how Uber achieved these results with Apache Pinot.

How Uber Delivers Real-Time Analytics on Mobile App Crashes

Uber has an automated ingestion pipeline that tracks app crashes and collects investigative data. Some of this data is ingested into Apache Flink for transformations, which is then put back into Kafka topics for downstream consumption. These raw and processed events in Kafka are then consumed by Apache Pinot, which then runs analytical queries whose results are delivered to internal users via Grafana and internal visualization tools. Their pipeline ingests both real-time and offline data (not depicted) to create a complete view of users, known as hybrid tables in Apache Pinot.

mobile app crash data integration

Real-Time App Crash Analytics With Apache Pinot

Uber releases ~11,000 new code and infrastructure changes every week, and they rely on an in-house tool (Healthline) to help detect and solve crash issues. Healthline enables Uber to better measure and meet its Mean Time To Detect (MTTD). For example, they might roll out a new feature that causes unexpected app crashes and must be able to quickly target the source of the crashes by digging into crash data.

The dashboard below shows a week’s worth of crash data for one mobile app and one version of the operating system. In this example, session events occur up to hundreds of thousands of times per second, and crashes measure between 15,000 to 20,000 events per second. Uber combines these metrics to calculate the fatal-free rate, which indicates the health of the application (the goal is to be as close to 100% as possible).

dashboard

With Elasticsearch, a general-purpose search engine, spikes in crash rates would cause ingestion lag and delay the team’s response to identifying issues. By moving to Apache Pinot, which is designed specifically for real-time analytics at a massive scale, the team has seen a decline in the number and severity of ingestion lags.

Analyzing In-Depth Crash Data

In addition to the high-level overview of crash data, Uber also provides in-depth crash-level analytics. They aggregate crash metrics across various dimensions such as the number of crashes per operating system and per version, and the distribution of crashes per version. This use case leverages several Pinot indexes (range, inverted, and text) to share when a type of crash has occurred, which versions are impacted, the number of occurrences, and the number of users and devices affected.

For the in-depth analysis, it was critical that Uber have text search capabilities to read crash error messages. Pinot’s text index is built on top of Lucene and gives them the ability to search crashes by crash message, class name, stacktrace, and more.

total issues

Measuring Sessions at Scale

Uber also uses Pinot to measure unique sessions by device, version, operating system, and hour at scale. Pinot offers real-time processing with high-scale throughput capable of ingesting Uber’s 300,000 analytic events per second. The team has a hybrid setup that includes a real-time table with 10-minute granularity and 3-day data retention, as well as an offline table with hourly and daily granularity and 45-day data retention.

Leveraging Apache Pinot’s HyperLogLog allowed the team to reduce the number of events being stored and perform fewer unique aggregations across events. Pinot also provided very low latencies — under 100 milliseconds for p99.5 latency.

total

Infrastructure Cost Savings

By Uber’s calculations, they’ve saved more than $2 million in annual infrastructure costs by migrating to Pinot. Their Pinot setup led to a 70% decrease in infrastructure costs compared to Elasticsearch. They also saw an 80% reduction in CPU cores and a 66% decrease in their data footprint.

cost savings

With Elasticsearch, Uber was using 22,000 cores of CPU. With Pinot, they’ve reduced that number by 80%. Here’s a snapshot of their Pinot setup:

elasticsearch

Improved Query Performance and User Experience

With Apache Pinot, Uber is able to deliver a better user experience with faster page loading and improved reliability. Migrating to Pinot reduced page load times by 64%, from 14 seconds to under 5 seconds. Pinot also has a better tolerance for spikes in load, leading to quicker lag recovery. Even if the team sees an ingestion lag, Pinot is able to recover quickly within a few minutes.

Compared to Elasticsearch, Pinot has also shown significant improvements across query timeouts and data loss. If a disaster in a mobile application occurred while using Elasticsearch, queries pertaining to that index would time out. Uber solved this issue with Pinot by controlling for segment size. The team also has no data loss issues with Pinot, compared to frequent data issues when Elasticsearch dealt with increased ingestion throughput.

The Next Iteration of Uber’s Pinot Setup

Next up, Uber plans to migrate to native text indexing for their mobile crash data. Their mobile crash data contains a lot of structured data, making it feasible for the team to migrate all of their use cases to native text indexes. This transition will provide cost savings from data storage and reduce the time spent on querying data.

Uber Isn’t the Only Organization Seeing Success by Migrating From Elasticsearch to Pinot

Uniqode (formerly Beaconstac) saw a 10x improvement in overall query performance by switching from Elasticsearch to Pinot. Cisco Webex also migrated their real-time analytics and observability to Pinot after facing high latencies. The Webex team found that Apache Pinot provided 5x to 150x lower latencies than Elasticsearch.

Analytics Data loss Elasticsearch Data (computing) mobile app

Published at DZone with permission of Kristin Crosier. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Power BI Embedded Analytics — Part 1.1: Power BI Authoring Data Federation
  • Power BI Embedded Analytics — Part 1: Introduction and Power BI Authoring Overview
  • Building a Cost-Effective ELK Stack for Centralized Logging
  • Optimizing Vector Search Performance With Elasticsearch

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!