Amazon Search Outage Offers Two Lessons

The recent Amazon search outage is a reminder to monitor everything and communicate proactively, two simple tactics that keep performance issues from hindering business.

The recent Amazon search outage offers two object lessons on the digital customer experience: monitor everything and communicate proactively. Retailers chasing the industry leader can implement a pair of simple tactics to keep performance issues from turning away buyers.

First, the ultimate goal of End-User Experience Monitoring (EUM) is to find a problem before your customers do and fix it before it impacts their experience. Too many sites rely on basic uptime monitoring—sometimes limited to just a home page—to detect slowdowns and outages. This approach would have completely missed an issue like Amazon’s.

Transaction monitoring, on the other hand, would have identified the problem right away. In fact, it did. Here is what the outage looked like from Catchpoint’s vantage point:

On June 2, 2016, users from multiple locations faced issues searching for a product on Amazon’s desktop as well as mobile websites. The problem started around 5:15am EDT and lasted for more than 3 hours, until 8:47am EDT.

While the homepage for the Seattle-based eCommerce company was accessible during this time period, users had problems accessing the ‘Categories’ pages and searching for products. As you can see below, the HTTP status code returned for these pages was 503 (Service Unavailable).

[Image: AMZN Search_3_705]

Adding transaction monitoring to your EUM system—the first lesson here—lets you see and troubleshoot interruptions in processes like these immediately.
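To make the contrast with homepage-only uptime checks concrete, here is a minimal sketch of a synthetic transaction check in Python. It is not how Catchpoint or any particular EUM product works. The step names, the `shop.example` URLs, and the helper names (`http_status`, `run_transaction`, `failures`) are all hypothetical, chosen only for illustration. The idea is simply to request each page in a user journey, not just the home page, and record which steps fail.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class StepResult:
    name: str
    url: str
    status: int  # HTTP status code, or 0 if no response was received

    @property
    def ok(self) -> bool:
        # Treat 2xx and 3xx as success; 4xx, 5xx, and 0 count as failures.
        return 200 <= self.status < 400


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET request, or 0 on network failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx; the code (e.g. 503) is what we want.
        return err.code
    except OSError:
        # Covers URLError, timeouts, and connection errors.
        return 0


def run_transaction(
    steps: Iterable[Tuple[str, str]],
    fetch: Callable[[str], int] = http_status,
) -> List[StepResult]:
    """Check every step of a user journey and return one result per step."""
    return [StepResult(name, url, fetch(url)) for name, url in steps]


def failures(results: List[StepResult]) -> List[StepResult]:
    """Return only the steps that did not succeed."""
    return [r for r in results if not r.ok]


if __name__ == "__main__":
    # Hypothetical journey: home page, then a search, then a category page.
    journey = [
        ("home", "https://shop.example/"),
        ("search", "https://shop.example/search?q=book"),
        ("category", "https://shop.example/category/books"),
    ]
    for failed in failures(run_transaction(journey)):
        print(f"ALERT: step '{failed.name}' returned {failed.status} ({failed.url})")
```

A check limited to the `home` step would have reported a healthy site during an outage like this one, while the `search` and `category` steps would have surfaced the 503 responses. The `fetch` parameter is injectable so the logic can be exercised without touching the network.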

The second lesson stems from what customers see during outages. During this time period, Amazon users were greeted with the following error messages:

[Image: AMZN Search_4_705]

[Image: AMZN Search_5_705]

These error messages reassure users that Amazon knows about the issue, suggest actions that customers can take to continue shopping, and apologize for a potentially frustrating experience. Communicating like this can prevent complaints from flooding the customer service system, keep buyers on the site and transacting, and avoid a flurry of social media criticism.

But transaction monitoring is the key to identifying problems like these so that customer expectations can be managed. Even better, it can help you resolve problems before customers see them in the first place.

Published at DZone with permission of
