
"Something is Technically Wrong" #TwitterDown


"Something is technically wrong". That’s what Twitter said on Tuesday morning January 19 2016. Millions of Twitter users all over the world were blocked from the social network. How could this outage happen?


According to DownDetector, a site that tracks outages of internet services and mobile apps in real time, users had the most trouble with Twitter's website, smartphone app, and tablet apps. Third-party services such as TweetDeck were also intermittently unavailable. It turned out that Twitter had experienced an issue "related to an internal code change" that caused a prolonged outage. On Tuesday afternoon, Twitter said it had reverted the change, which fixed the issue.

(Image: TwitterOutage.png)

This downtime had a huge impact on Twitter's business. The average hourly cost of a critical application failure is estimated at $500,000 to $1 million. Tuesday's outage lasted more than six hours, and the stock price hit a new low, losing 7% and almost $700 million in market value.
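Taking the figures above at face value, a quick back-of-the-envelope estimate (using the quoted hourly cost range and six hours as a floor on the duration) puts the operational cost of the outage alone in the millions, before counting the market-value hit:

```python
# Back-of-the-envelope estimate using the hourly cost range and duration
# quoted in this article; these are the article's figures, not measured data.
hourly_cost_low = 500_000      # USD per hour, lower bound for a critical failure
hourly_cost_high = 1_000_000   # USD per hour, upper bound for a critical failure
outage_hours = 6               # "more than six hours", so treat this as a floor

low_estimate = hourly_cost_low * outage_hours
high_estimate = hourly_cost_high * outage_hours

print(f"Estimated operational cost: ${low_estimate:,} to ${high_estimate:,}")
# Estimated operational cost: $3,000,000 to $6,000,000
```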

More importantly: how could this outage happen, and why did it take so long to fix? Probably someone wrote or edited code, deployed it, and as a result everything went down. It seems the problem-finding process at Twitter is a hell of a job. They didn't know who changed the code, what was changed, or how it affected critical business services. They had to start a time-consuming investigation across DevOps teams to find and resolve the problem. A better way to deal with outages is to fully automate the problem-finding process across teams, as sketched below. Every DevOps team should be aware of what's happening across the full IT stack. Providing business services is, and always will be, a multi-team effort. To prevent future outages, Twitter has to step up its game and take a proactive, visual approach to IT operations. They can't wait for the next big incident to happen.
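As a minimal illustration of what "automating the problem-finding process" can mean in practice, the sketch below records every deployment as a structured event (who deployed what, when, and to which service) so responders can correlate an outage with the last change that shipped. The `post_deploy_event` helper, the event schema, and the `monitoring.example.com` endpoint are hypothetical placeholders, not the API of any specific tool or of Twitter's own systems.

```python
# Minimal sketch: publish a deployment marker that incident tooling can
# correlate with alerts. Endpoint URL, schema, and example values are
# hypothetical placeholders used only for illustration.
import json
import datetime
import urllib.request

MONITORING_EVENTS_URL = "https://monitoring.example.com/events"  # hypothetical

def post_deploy_event(service: str, version: str, author: str, change_summary: str) -> None:
    """Record who deployed what and when, so outages can be traced to changes."""
    event = {
        "type": "deployment",
        "service": service,
        "version": version,
        "author": author,
        "summary": change_summary,
        "timestamp": datetime.datetime.utcnow().isoformat() + "Z",
    }
    request = urllib.request.Request(
        MONITORING_EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(request)  # fire-and-forget for the sake of the sketch

# Called from the deployment pipeline right after a release goes out, e.g.:
# post_deploy_event("timeline-api", "1.4.2", "jdoe", "refactor internal routing code")
```

With markers like these in place, the "who changed what" question becomes a query against recent deployment events instead of a cross-team investigation.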

Let's hope Twitter learns from these outages. In the end, you and I, the users, suffer the most. We can't tweet and have to log in to Facebook to complain about our problems. ;-)


Topics: devops, devops best practices, outage, cloud

Published at DZone with permission of Mark Bakker. See the original article here.

Opinions expressed by DZone contributors are their own.
