Vince Lombardi once said, "It's not whether you get knocked down, it's whether you get up." But in today's world of high-velocity development and release practices, it's also about how fast you can get up.
Today, we are seeing more news about sites from major online retailers and service providers such as HP, Target, PayPal, and Neiman Marcus going down. High site traffic and poor configurations can bring a site down, while scalable architectures and well-planned recovery practices determine how quickly it comes back.
But the best organizations have been failing repeatedly for some time now...on purpose. Organizations like Amazon, Google, and Netflix regularly practice for application, system, and site outages. In fact, Netflix has baked continuous failures into its operations through its internally developed Simian Army. While it might seem unconventional at first, failing all the time allows them to succeed.
Our CTO here at Sonatype, Josh Corman, often reminds us that "All Systems Fail*". Those who take this to heart and prepare for the inevitable are better positioned to recover. In fact, for the best prepared, recovery can be so seamless that it becomes a routine part of their operations.