As a consequence of the Dyn attack, many major websites were down, including Twitter: browsers could not resolve the IP addresses of the servers because the authoritative name server (Dyn) was down. Whether that could be addressed globally, I don’t know. There was an interesting discussion on Reddit about my proposal to increase TTL, covering how the resolution policy and algorithms can be improved, why a lower TTL is not always applicable, etc.
But while twitter.com was down, the mobile app was also not working. And while we have no control over the browser, we certainly do have control over the mobile app (the same goes for desktop applications, of course, but I’ll be talking mainly about mobile apps, as they are more dominant). The reason the app was also down is that it most likely uses the twitter.com domain as well (e.g. api.twitter.com). And that’s the right way to do it, except in these rare situations when DNS fails.
In these cases, you can hardcode a list of server IP addresses in your app and fall back to them if the domain-name-based requests fail (e.g. after 3 attempts). If you change your server IP addresses, you just update the app. It doesn’t matter that the new addresses will reach users with an unpredictable delay (until everyone updates) and that some clients won’t have a current IP in the meantime; it is an edge-case fallback mechanism.
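A minimal sketch of that fallback in Python, under a few assumptions: the host name, the addresses (taken from the 203.0.113.0/24 documentation range) and the function name are all made up for illustration, and for simplicity it retries only the DNS lookup rather than the whole request.

```python
import socket

# Hypothetical values for illustration only.
API_HOST = "api.example.com"
FALLBACK_IPS = ["203.0.113.10", "203.0.113.11"]
MAX_DNS_ATTEMPTS = 3


def resolve_with_fallback(host, fallback_ips, attempts=MAX_DNS_ATTEMPTS,
                          resolver=socket.getaddrinfo):
    """Return IP addresses for host, or the hardcoded ones if DNS keeps failing."""
    for _ in range(attempts):
        try:
            infos = resolver(host, 443, type=socket.SOCK_STREAM)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address as a string.
            return [info[4][0] for info in infos]
        except socket.gaierror:
            continue  # the lookup failed; try again
    # Every attempt failed: use the addresses shipped with the app.
    return list(fallback_ips)
```

The resolver is a parameter only so the outage can be simulated without a network. In a real app that talks HTTPS, connecting by IP most likely still requires passing the original host name for certificate validation and the Host header, which this sketch leaves out.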
It’s not, of course, my idea. It’s been used by distributed systems like Bitcoin and BitTorrent. For example, when trying to join the network, a Bitcoin or BitTorrent DHT client tries to connect to a bootstrap node in order to get a list of peers. The client applications contain a list of domain names that resolve, using round-robin DNS, to one of many known bootstrap nodes. However, if DNS resolution fails, the clients also have a small set of hardcoded IP addresses.
As this post points out, you can have multiple fallback strategies. Instead of hardcoding the IP, or better, in addition to it, you can have a fallback domain name: foo.com and foo-fallback.com, managed by different DNS providers. That way your app will have a three-step fallback mechanism: 1. try the primary domain, 2. try the fallback domain, 3. try the fallback IPs.
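The three steps can be sketched as an ordered list of endpoints that the app walks through until one answers. This is only an illustration: the function names and IP addresses are hypothetical, and `send` stands in for whatever your HTTP client does to issue the request against a given host or IP.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical values; the IPs come from a documentation-only range.
PRIMARY_DOMAIN = "foo.com"
FALLBACK_DOMAIN = "foo-fallback.com"
FALLBACK_IPS = ["203.0.113.10", "203.0.113.11"]


def endpoints_in_order(primary: str, fallback_domain: str,
                       fallback_ips: Iterable[str]) -> List[str]:
    """Primary domain first, then the fallback domain, then the hardcoded IPs."""
    return [primary, fallback_domain, *fallback_ips]


def request_with_fallback(send: Callable[[str], str],
                          endpoints: Iterable[str]) -> str:
    """Try each endpoint in turn and return the first successful response."""
    last_error: Optional[Exception] = None
    for endpoint in endpoints:
        try:
            return send(endpoint)
        except OSError as error:
            # OSError covers DNS failures (socket.gaierror) and connection errors.
            last_error = error
    raise ConnectionError("all endpoints failed") from last_error
```

Keeping the order in a plain list makes it easy to add or drop a step later, for example a second fallback domain, without touching the request logic.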
Events like the Dyn attack are (let’s hope) rare, but can be costly to businesses. Adding the fallback mechanisms to at least some of the client software is quick and easy and may reduce the damage.