1. High availability is hard
After seven years in the business you might think Twitter has operations and scalability nailed. I wouldn’t blame you for hoping, but here’s one thing they said in their IPO filing:
“We are not currently serving traffic equally through our co-located data centers”
What does this mean exactly? Let’s think of your daily drive to work. Remember that one intersection that’s always congested? Could the city designers have envisioned that 50 or 100 years ago? Probably not. In the present day, with all the buildings and roads, can we redesign around it? Not easily. So we adapt, evolve, and deal with the day-to-day realities of an evolving city.
James Urquhart says that these are complex systems. The internet, the cloud and your startup infrastructure are by nature brittle.
2. Fail whale is part of the DNA
The graphic above is a whimsical remake of Twitter’s own by Shanna Banan. Consider, though, that someone at Twitter was tasked with designing a graphic for when the site fails. The DevOps team then built a failure page and keeps it at the ready for when, not if, there’s an outage. It’s symbolic of the many other things your operations team does behind the scenes in expectation of that fateful day.
As Eric Ries argues, design for failure. Then manage it.
3. Investors, Wall Street: we’re working on it
What Twitter is really saying is, “Hey investors, we understand that five nines is extremely difficult. We’re vulnerable in certain ways and want to disclose that.”
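For a sense of scale, here is a back-of-the-envelope sketch of how little downtime each extra nine leaves you. The function name and the 365-day year are my own assumptions for illustration; nothing here comes from Twitter’s filing.

```python
# Yearly downtime budget for an availability target of N nines.
# Assumption: a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(nines: int) -> float:
    """Return the allowed downtime in minutes per year for N nines."""
    unavailability = 10 ** -nines  # e.g. 5 nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in range(2, 6):
        budget = downtime_minutes_per_year(n)
        print(f"{n} nines: {budget:,.1f} minutes/year")
```

At five nines the budget works out to roughly five minutes of downtime for the entire year, which is why so few shops can honestly claim it.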
ReadWrite argues Twitter has not banished the fail whale and is “surprisingly vulnerable”. ReadWrite, I ask you… who has? Google? Nope. Facebook? Nope. Not Airbnb or Reddit, either.
These are world-class firms. They’ve got the deep pockets to do it right, and the engineering talent to match. They still have failures.