
Level 3 Outage: A Performance Butterfly Effect

Telecommunications company Level 3 was hit with a massive service outage. This performance nightmare didn't only affect customers. Read about the huge impact.


[This article was written by Ryan Pelette]

This morning, beginning at 8:40 am UTC (4:40 am EDT), customers of the telecommunications giant Level 3 experienced two hours of serious connectivity problems that had a dramatic impact on their sites’ performance. To make matters worse, because Level 3 is a major backbone ISP, the impact was felt not just by its own customers but by any web traffic that passed through its network. The issue originated with Telekom Malaysia, whose prefix hijacking caused a route leak that resulted in global routing problems.
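To make the idea of a prefix hijack more concrete, here is a minimal sketch of how a monitoring script might flag announcements whose origin does not match what is expected. It is not how Level 3 or Telekom Malaysia handle routes: the function name `find_origin_conflicts`, the table of expected origins, and the sample data are all hypothetical, and the prefixes and AS numbers come from the ranges reserved for documentation.

```python
import ipaddress


def find_origin_conflicts(expected_origins, observed_announcements):
    """Return announcements whose origin AS differs from the expected origin.

    expected_origins: dict mapping a prefix string to its expected origin ASN.
    observed_announcements: iterable of (prefix string, origin ASN) pairs.
    Both inputs are hypothetical; a real monitor would read them from
    routing data that this sketch does not model.
    """
    expected = [
        (ipaddress.ip_network(prefix), asn)
        for prefix, asn in expected_origins.items()
    ]
    conflicts = []
    for prefix, origin in observed_announcements:
        announced = ipaddress.ip_network(prefix)
        for known, known_asn in expected:
            # subnet_of() is true for an exact match or a more-specific prefix;
            # the version check avoids comparing IPv4 with IPv6 networks.
            if (
                announced.version == known.version
                and announced.subnet_of(known)
                and origin != known_asn
            ):
                conflicts.append((prefix, origin, known_asn))
    return conflicts


# Documentation-only prefixes and AS numbers, used purely for illustration.
expected = {"192.0.2.0/24": 64496}
observed = [
    ("192.0.2.0/24", 64496),     # matches the expected origin
    ("192.0.2.0/25", 64511),     # more-specific prefix from another origin
    ("198.51.100.0/24", 64511),  # not covered by any expected prefix
]
conflicts = find_origin_conflicts(expected, observed)
print(conflicts)
```

Only the second announcement is reported, because it falls inside a prefix with a known origin but claims a different one. A check like this can tell you that something is wrong, but as the outage shows, it cannot stop other networks from accepting the bad routes.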


The greatest impact was seen in Oceania, but there was also significant impact in Europe, as well as repercussions in Asia and North America. In our tests of the Level 3 network, the majority of problems in Oceania manifested as timeouts and connection failures, which in turn led to packet loss and greatly increased round-trip times due to certain routers being unavailable.

Here is a traceroute from Sydney showing significant packet loss at Global Crossing (Level 3) at hop 7.

1    2 ms    1 ms    1 ms xxx.xxx.xxx.xxx
2    *       *       *    Timed Out
3    1 ms   <1 ms   <1 ms lag30.sglebinte01.aapt.net.au[]
4    1 ms    1 ms    2 ms po41.sglebbrdr11.aapt.net.au[]
5   <1 ms   <1 ms   <1 ms 203-219-106-153.tpgi.com.au[]
6    2 ms    3 ms    3 ms syd-gls-har-int2-be-20.tpgi.com.au[]
7  272 ms  316 ms  272 ms globalcrossing1-10g.hkix.net[]
8    *       *       *    Timed Out
9    *       *       *    Timed Out
10    *       *       *    Timed Out
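A trace like this can also be read programmatically. The following is a rough sketch, assuming the output format shown above, that finds the first hop where the average round-trip time jumps sharply and counts the hops that time out at the end. The function names and the 100 ms threshold are arbitrary choices for illustration, not part of any monitoring product.

```python
import re

HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
RTT = re.compile(r"<?(\d+) ms")

SAMPLE = """\
1    2 ms    1 ms    1 ms xxx.xxx.xxx.xxx
2    *       *       *    Timed Out
3    1 ms   <1 ms   <1 ms lag30.sglebinte01.aapt.net.au[]
4    1 ms    1 ms    2 ms po41.sglebbrdr11.aapt.net.au[]
5   <1 ms   <1 ms   <1 ms 203-219-106-153.tpgi.com.au[]
6    2 ms    3 ms    3 ms syd-gls-har-int2-be-20.tpgi.com.au[]
7  272 ms  316 ms  272 ms globalcrossing1-10g.hkix.net[]
8    *       *       *    Timed Out
9    *       *       *    Timed Out
10    *       *       *    Timed Out
"""


def parse_hops(text):
    """Parse traceroute lines into dicts with hop number, RTTs and timeouts."""
    hops = []
    for line in text.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue
        rest = match.group(2)
        # "<1 ms" is treated as 1 ms, i.e. as an upper bound.
        rtts = [int(value) for value in RTT.findall(rest)]
        hops.append({
            "hop": int(match.group(1)),
            "rtts": rtts,
            "timeouts": rest.count("*"),
        })
    return hops


def first_latency_jump(hops, threshold_ms=100):
    """Return the first hop whose average RTT exceeds the previous responding
    hop's average by at least threshold_ms, or None if there is no such hop."""
    previous = None
    for hop in hops:
        if not hop["rtts"]:
            continue  # hops that timed out carry no latency information
        average = sum(hop["rtts"]) / len(hop["rtts"])
        if previous is not None and average - previous >= threshold_ms:
            return hop["hop"]
        previous = average
    return None


def trailing_timeouts(hops):
    """Count the consecutive hops at the end of the trace with no replies."""
    count = 0
    for hop in reversed(hops):
        if hop["rtts"]:
            break
        count += 1
    return count


hops = parse_hops(SAMPLE)
print(first_latency_jump(hops), trailing_timeouts(hops))
```

Run against the trace above, the sketch flags hop 7 as the point where latency jumps and reports that the three hops after it never answer. A single timed-out hop in the middle of a trace (such as hop 2) is ignored, since routers that simply decline to reply are common and do not by themselves indicate a problem.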

Meanwhile, Europe was experiencing more problems with DNS than anything else, as seen in this DNS traversal from Frankfurt, which shows increased RTT and packet loss that in turn resulted in DNS resolution timeouts.


Stepping back from the details for a moment, this issue shows the potential butterfly effect of global networks. An ISP flaps its wings in Malaysia, causing a chain reaction that results in site failures and timeouts on the other side of the world. It also highlights the fact that despite all of the precautions that can be taken, sometimes our sites’ web performance is completely out of our hands. Even if Level 3 customers had redundancy in place, the scope of the company’s network is so vast that there is no guarantee that another ISP wouldn’t peer with Level 3 at some point. The end result is an outage that could only have been prevented at the source.

Updates: Corporate responses from both carriers.


Published at DZone with permission of Mehdi Daoudi, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
