
Why WhatsApp Went Down

WhatsApp's recent outage illustrated the DevOps adage of fail fast and iterate. However, the media and your users will still take notice.

WhatsApp's recent outage freaked out users of this globally popular app. But the context below, from Dave Anderson, a digital experience expert at Dynatrace, shows just how fragile the process of continuously releasing new features is when millions rely on your service.

According to Dave, “You’ve got to feel for WhatsApp today – they’ve got one of the toughest jobs in the world. One in seven people on the planet use the application and expect it to be constantly updated with new features and always performing perfectly. So, when the chat service crashed for UK users yesterday, the company endured widespread, public criticism.

“It’s hard for consumers to understand just how difficult the job of software development is these days. Amazon is known to release new software updates every 11 seconds and we could assume WhatsApp would be cracking a similar pace – releases and updates in minutes or hours, plus we can see they push a new version of the app to the store every 3-4 days. These are very rapid release cycles to fix bugs, optimize the app and make sure security is good. The process is ongoing (even for a free service!) but then one day, a new update breaks the delivery chain and everything stops. The media picks it up and users get vocal.

“In this case, it’s been reported that WhatsApp was testing a new feature where you pin a conversation to the top of the menu – a back-end feature change/update that’s aimed at satisfying users. But pushing new updates through the development and production cycle is always risky, which is why testing and monitoring how the changes will impact an app’s performance is so important. At the first sign of a problem, the developer team needs to be able to take swift action and roll back and fix, or abandon the change if it’s looking like it will impact the user experience or ultimately bring a service down for millions of users.”
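
To make the "monitor, then roll back or abandon" step concrete, here is a minimal sketch of the kind of check a release pipeline might run while a new version serves a small share of traffic. It is not WhatsApp's or Dynatrace's actual process. The `ReleaseHealth` class, the `decide` function, and the thresholds are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReleaseHealth:
    """Request and error counts for one version (hypothetical shape)."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # A version with no traffic has no observed errors yet.
        return self.errors / self.requests if self.requests else 0.0


def decide(baseline: ReleaseHealth, canary: ReleaseHealth,
           max_increase: float = 0.01, min_requests: int = 1000) -> str:
    """Return "wait", "rollback", or "promote" for a new release.

    max_increase and min_requests are assumed thresholds, not
    values taken from any real monitoring product.
    """
    if canary.requests < min_requests:
        return "wait"  # not enough traffic to judge the new version
    if canary.error_rate - baseline.error_rate > max_increase:
        return "rollback"  # new version is measurably worse
    return "promote"


if __name__ == "__main__":
    stable = ReleaseHealth(requests=100_000, errors=100)
    new_version = ReleaseHealth(requests=2_000, errors=60)
    print(decide(stable, new_version))
```

The point of the design is that the decision is automated and tied to a measurable signal, so the team can act at the first sign of trouble rather than after users start complaining.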
