Windows Azure Outages in South Central US Data Center

by Roger Jennings · Apr. 20, 12 · Interview

My OakLeaf Systems Azure Table Services Sample Project (Tools v1.4 with Azure Storage Analytics) demo application, which runs on two Windows Azure compute instances in Microsoft’s South Central US (San Antonio) data center, incurred an extraordinary number of compute outages on 4/19/2012. Following is the Pingdom report for that date:

[Images: Pingdom report for 4/19/2012]

The Mon.itor.us monitoring service showed similar downtime.

This application, which I believe is the longest (almost) continuously running Azure application in existence, usually runs within Microsoft’s Compute Service Level Agreement for Windows Azure: “Internet facing roles will have external connectivity at least 99.95% of the time.”
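To put the 99.95% figure in concrete terms, here is a minimal Python sketch of the arithmetic. The 30-day measurement period and the function names `allowed_downtime_minutes` and `uptime_percent` are assumptions made for illustration; they are not taken from Microsoft's SLA text or from Pingdom.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Maximum downtime (in minutes) that still meets the given SLA.

    Assumes the SLA is measured over a period of `days` days
    (30 by default, an assumption made for this illustration).
    """
    total_minutes = days * MINUTES_PER_DAY
    return total_minutes * (1 - sla_percent / 100)


def uptime_percent(downtime_minutes: float, days: int = 30) -> float:
    """Uptime percentage for a period, given the observed downtime in minutes."""
    total_minutes = days * MINUTES_PER_DAY
    return 100 * (1 - downtime_minutes / total_minutes)


if __name__ == "__main__":
    budget = allowed_downtime_minutes(99.95)
    print(f"Downtime budget at 99.95% over 30 days: {budget:.1f} minutes")
    # A hypothetical month with 60 minutes of downtime, for comparison only.
    print(f"Uptime with 60 minutes down: {uptime_percent(60):.3f}%")
```

Under these assumptions, a 99.95% SLA leaves a budget of roughly 21.6 minutes of downtime in a 30-day period, so a single day with many outages can use up the whole month's allowance.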

The following table, from my post of 4/3/2012, “Uptime Report for my Live OakLeaf Systems Azure Table Services Sample Project: March 2012,” lists outages and response times from Pingdom for the last 10 months:

image

The Windows Azure Service Dashboard reported Service Management Degradation in the Status History but no problems with existing hosted services:

image

[RESOLVED] Partial Service Management Degradation

19-Apr-12
11:11 PM UTC We are experiencing a partial service management degradation in the South Central US sub region. At this time some customers may experience failed service management operations in this sub region. Existing hosted services are not affected and deployed applications will continue to run. Storage accounts in this sub region are not affected either. We are actively investigating this issue and working to resolve it as soon as possible. Further updates will be published to keep you apprised of the situation. We apologize for any inconvenience this causes our customers.

20-Apr-12
12:11 AM UTC We are still troubleshooting this issue and capturing all the data that will allow us to resolve it. Further updates will be published to keep you apprised of the situation. We apologize for any inconvenience this causes our customers.

1:11 AM UTC The incident has been mitigated for new hosted services that will be deployed in the South Central US sub region. Customers with hosted services already deployed in this sub region may still experience service management operations failures or timeouts. Further updates will be published to keep you apprised of the situation. We apologize for any inconvenience this causes our customers.

1:47 AM UTC The repair steps have been executed and successfully validated. Full service management functionality has been restored in the affected cluster in the South Central US sub-region. We apologize for any inconvenience this has caused our customers.

However, Windows Azure Worldwide Service Management did not report problems:

image

I have requested a Root Cause Analysis from the Operations Team and will update this post when I receive a reply.

Published at DZone with permission of Roger Jennings. See the original article here.
