
Google BGP Leak Impacts Hundreds of Websites' Performance

A few days ago, Google started having connectivity issues. See how this affected thousands of sites and services.

By Kameerath Abdul Kareem · Nov. 16, 18 · News

Multiple protocols and components keep the complex internet engine running. Just like any other well-oiled machine, it needs regular checks to confirm that it is functioning efficiently and delivering optimum performance.

The internet is basically a circuit relaying data signals and packets across different paths. One of the most important processes that make up the internet is IP routing. Several protocols manage the flow of data; Border Gateway Protocol (BGP) governs how data is routed between the autonomous systems that make up the network.

The need to strengthen security protocols and to implement proactive measures that quickly identify performance degradation across a network cannot be stressed enough. This was highlighted by the BGP routing issue Google faced on November 12th. Although the issue was quickly sorted out, it still had a significant impact on user experience across multiple platforms.

Issue Analysis

At 16:30 EST on November 12th, Google noticed connectivity issues across multiple services, including APIs, load balancers, and even its cloud services.

Catchpoint triggered performance alerts as soon as the issue surfaced. The charts below show some of the different Google services that were impacted.

Looking at the performance data from multiple customers, we realized this was a routing problem. For example, in the instance illustrated below, traffic was routed from Germany to Russia.

The RIPEstat data shows the routing path. AS37282 was advertised as the route to Google prefixes. This route information was then accepted by AS4809 (China Telecom) and then picked up by AS20485 (TransTelecom Russia).
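A check like the one described above can be sketched in a few lines of Python: compare the ASNs seen on a route against the set a monitor normally expects and report anything unfamiliar. This is only an illustration under stated assumptions. The function name and the expected set are hypothetical; the example path is assembled from the ASNs named in this article, with AS15169 (Google's ASN, not mentioned in the article) appended as an assumed origin; and AS64500 is a documentation-range placeholder standing in for a usual upstream provider.

```python
from typing import Iterable, List, Set

# Hypothetical allow-list: the origin we expect (AS15169, Google) plus a
# placeholder upstream taken from the documentation ASN range.
EXPECTED_ASNS: Set[int] = {15169, 64500}


def unexpected_asns(as_path: Iterable[int], expected: Set[int]) -> List[int]:
    """Return ASNs on the path that are not in `expected`, in path order.

    Repeated ASNs (for example from AS-path prepending) are reported once.
    """
    seen: Set[int] = set()
    flagged: List[int] = []
    for asn in as_path:
        if asn not in expected and asn not in seen:
            seen.add(asn)
            flagged.append(asn)
    return flagged


# Illustrative path only: TransTelecom -> China Telecom -> AS37282 -> Google.
observed_path = [20485, 4809, 37282, 15169]
print(unexpected_asns(observed_path, EXPECTED_ASNS))  # [20485, 4809, 37282]
```

An empty result means every hop was one the monitor already knew about; a non-empty result is a prompt to look closer, not proof of a leak or hijack.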

Initial reports of the incident, coupled with the suspicious routing paths, pointed to a potential BGP hijack. But a Google representative clarified to Ars Technica that this was an accident and not malicious.

The Nigerian ISP MainOne Cable Company, identified as the origin of the issue, also tweeted that this was an error that occurred during planned maintenance.

Within 30 minutes the issue was resolved. Google issued this statement on its Cloud status dashboard:

"Throughout the duration of this issue Google services were operating as expected and we believe the root cause of the issue was external to Google. We will conduct an internal investigation of this issue and make appropriate improvements to our systems to help prevent or minimize future recurrence."

Not Just Another Third-Party

We are constantly discussing the performance tax that comes with integrating third-party tags. Incidents like this one are a reminder that third-party monitoring should never be overlooked.

The routing issue made Google services unreachable or slow for affected users, which had an immediate impact on performance; multiple websites had unusually high page load times. This was mainly due to the Google AJAX libraries that many websites reference. The outage brings the focus right back to third-party tag management and how performance issues introduced by these tags lead to downtime.

Customers using the AJAX libraries provided by Google (ajax.googleapis.com) had a noticeable drop in performance throughout the duration of the routing incident. Websites that relied on the Google AJAX library did not load properly, leaving the page blank. For example, this website was blank for over 31 seconds.

The waterfall graph shows the unusually high wait time for the Google APIs resource, which pushed the page load time to 54 seconds.
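The kind of reading done on the waterfall above can be approximated in code: given per-request wait times, list the third-party hosts that exceeded a wait budget. The sketch below is a minimal illustration, not Catchpoint's method. The function name, the input shape (URL and wait time in seconds), the budget, and every value in the sample data are assumptions made up for the example.

```python
from typing import Iterable, List, Tuple
from urllib.parse import urlparse


def slow_third_party_requests(
    entries: Iterable[Tuple[str, float]],
    first_party_host: str,
    wait_budget_s: float,
) -> List[Tuple[str, float]]:
    """Return (host, wait_s) for third-party requests over the wait budget.

    Any host other than `first_party_host` counts as third-party here,
    including subdomains. Results are sorted with the slowest request first.
    """
    slow = []
    for url, wait_s in entries:
        host = urlparse(url).hostname
        if host != first_party_host and wait_s > wait_budget_s:
            slow.append((host, wait_s))
    return sorted(slow, key=lambda item: item[1], reverse=True)


# Made-up sample waterfall: one first-party page and two third-party scripts.
sample = [
    ("https://www.example.com/index.html", 0.2),
    ("https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js", 30.0),
    ("https://cdn.example.net/widget.js", 0.4),
]
print(slow_third_party_requests(sample, "www.example.com", 2.0))
# [('ajax.googleapis.com', 30.0)]
```

Sorting by wait time puts the dependency most likely to be blocking the page at the top of the report.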

Multiple features make up an online application, so dependencies on such third-party services are inevitable. Proactive and constant monitoring of these services is key to mitigating the impact on performance. It is even more important to be prepared to handle such incidents; we shared tips on how you can do this in our blog "5 Lessons for Managing a Third Party Outage."
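One common way to prepare for such an incident is to put a time limit on the third-party dependency and fall back to a copy you control. The sketch below shows the idea in Python for illustration only; the article does not prescribe this technique, and on a real web page the equivalent logic would run in the browser. The function and parameter names are hypothetical, and the two fetchers are passed in as plain callables so that no real network call is implied.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Tuple


def fetch_with_fallback(
    fetch_primary: Callable[[], bytes],
    fetch_fallback: Callable[[], bytes],
    timeout_s: float,
) -> Tuple[bytes, str]:
    """Return (content, source), where source is 'primary' or 'fallback'.

    The primary fetch runs in a worker thread. If it raises or does not
    finish within `timeout_s` seconds, the fallback is used instead.
    """
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fetch_primary)
    try:
        return future.result(timeout=timeout_s), "primary"
    except Exception:
        # Covers both a timeout and an error raised by the primary fetch.
        return fetch_fallback(), "fallback"
    finally:
        # Do not block on a slow primary; its thread finishes in the background.
        executor.shutdown(wait=False)
```

A caller would wrap the third-party request in `fetch_primary` and a locally hosted copy in `fetch_fallback`. Note that the abandoned primary call is not cancelled; it simply stops mattering to the caller.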

Performance monitoring is no longer just about the uptime or downtime of an application. Advanced monitoring provides you with the data and tools necessary to identify issues quickly as well as predict and prevent potential performance issues.


Published at DZone with permission of Kameerath Abdul Kareem, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
