I recently flew from Boston to the UK through Heathrow Airport. It just happened to be on the day that the UK got about 1.5 inches of snow (sorry, 3.8 centimetres; Weather Underground says just 15 mm, so I’m not sure about that). I spent a little more than four hours sitting on the runway at Heathrow before I was able to get off the airplane. It was a frustrating and tiring experience, but it made me think about disaster recovery.
Having a Disaster Recovery Plan
Most of us have a Disaster Recovery (DR) plan. Of course we do. Well, probably. Well, we take backups. Well, we occasionally take some backups. Well, we’re pretty sure someone within the organization may have taken a backup once… somewhere… probably.
Heathrow Airport has a plan for how to deal with snow. They just start cancelling flights. Well, they have a plan. It just wasn’t in evidence on the day I flew. We landed and, along with at least 43 other airplanes (as reported by my pilot, more on that in a bit), sat and waited until someone, somewhere, figured out what to do with us. The problem? Again, according to my pilot, the planes at the gates were waiting to be deiced, and until they moved out, we couldn’t move in. It was surreal when, after four hours, we slowly rolled up to a terminal (the wrong one, by the way), navigating through all these airplanes scattered all over the runways like it was some kind of disaster film (realization: I never want to be on an airplane when the zombiepocalypse breaks out). Apart from cancelling flights, there was no evidence of any sort of disaster plan, at least none visible from the passenger seat of an airplane.
So, let’s assume you actually do have a disaster recovery plan. When was the last time you updated it? When was the last time you read through it? When was the last time that you actually dusted it off and ran through it?
That last question is the most important. Heathrow Airport knew the snow was coming. It was forecast days out. Verified a few days out. Re-verified the day before. Then it actually arrived and was slightly BETTER than forecast (up to 3 inches, 7.62 centimetres, was expected). You are not going to receive that kind of notification most of the time (although you may, if your server is in a 20-year flood zone and heavy rain is forecast). The primary problem that Heathrow experienced wasn’t even that they didn’t implement their DR plan. No, it was that their DR plan assumed that everyone could show up to work. However, the highway department (or whatever the UK equivalent is) didn’t put down salt or sand, so people couldn’t drive to work (my driver took five hours to get to Heathrow and went off the road once). On top of that, the trains also shut down (how 3.8 centimetres of snow stops a train, I’m a little at a loss to understand, but there it is).
So, Heathrow had a plan, but it was based on incorrect assumptions. How’s yours?
As for my pilot and the crew of the plane, they were outstanding. I was flying British Airways and couldn’t have been happier with them. The situation was completely beyond their control. However, they communicated regularly with us. They let us know what they knew, as soon as they received the information. The flight attendants were friendly and courteous throughout. They were just as frustrated and angry at the situation as we were (both the flight attendant I was sitting near and the pilot said they hadn’t seen anything like this in 27 years of flying). But they approached the situation as professionals and kept their cool. Well done, BA.
I know, this is an IT blog, so talking about weather, planes and airports is not generally in my wheelhouse. However, I think you can easily see where I’m going with this. Not only do you need to have a real, tested DR plan, you also need to test the assumptions that drive it. Further, you need the attitude of my excellent BA crew. You can’t panic. You have to communicate clearly and regularly during the DR process. All of this requires preparation and practice. If you haven’t done either, you’re going to look a lot more like Heathrow Airport and a lot less like BA.