The DBA is NOT Infallible or How I Learned to Stop Worrying and Love Automation

In order to get to that zero failure rate and avert catastrophes in database change or any other IT discipline, we need to embrace the autopilot. We need to shuck some manual rituals, stop thinking our limited human capacity is always superior to that of a computer, and trust automation.

by Robert Reeves · Apr. 18, 16 · Opinion

Standard practice for updating the production database is to have a human review the proposed change and implement it. We have done it that way for a long time. We trust the human, or better yet the expert database administrator (DBA), to properly make the change and avoid mistakes caused by people not familiar with the specifics around a unique production database.

To get uber-technical with a theoretical example, we count on that DBA to catch errors such as the seemingly benign addition of a column with a default value set. That may appear to be an innocuous change at first glance, but when the production table is very large (more than 500K rows, for instance), that simple mistake can cause a data manipulation language (DML) lock on the table while the existing rows are populated. That, in turn, would extend the maintenance window and impact the SLA.
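To make that example concrete, here is a minimal sketch of how such a check might be automated. It is an illustration, not a description of any particular tool. The function name `risky_default_additions`, the `row_counts` dictionary, and the sample table names are all hypothetical, and the 500K threshold is simply the figure from the example above. In practice the row counts would come from the database's own statistics, and whether the statement actually locks the table depends on the database engine and version.

```python
import re

# Threshold taken from the example in the text; tune it for your own environment.
LARGE_TABLE_ROWS = 500_000

# Matches statements such as:
#   ALTER TABLE orders ADD COLUMN status VARCHAR(10) DEFAULT 'new';
# The match never crosses a semicolon, so each statement is checked on its own.
ADD_COLUMN_WITH_DEFAULT = re.compile(
    r"ALTER\s+TABLE\s+(?P<table>[\w.]+)\s+ADD\s+(?:COLUMN\s+)?\w+\b[^;]*?\bDEFAULT\b",
    re.IGNORECASE | re.DOTALL,
)


def risky_default_additions(sql, row_counts, threshold=LARGE_TABLE_ROWS):
    """Return tables that gain a defaulted column and exceed the row threshold."""
    flagged = []
    for match in ADD_COLUMN_WITH_DEFAULT.finditer(sql):
        table = match.group("table")
        # Unknown tables count as 0 rows here; a stricter policy could flag them.
        if row_counts.get(table.lower(), 0) > threshold:
            flagged.append(table)
    return flagged


if __name__ == "__main__":
    # Hypothetical migration script and row counts, for illustration only.
    migration = """
    ALTER TABLE orders ADD COLUMN status VARCHAR(10) DEFAULT 'new';
    ALTER TABLE audit_log ADD COLUMN note TEXT;
    ALTER TABLE settings ADD COLUMN theme VARCHAR(20) DEFAULT 'light';
    """
    row_counts = {"orders": 2_300_000, "audit_log": 9_000_000, "settings": 40}
    print(risky_default_additions(migration, row_counts))  # ['orders']
```

A check like this can run on every proposed change before it reaches production, so the rule is applied the same way each time regardless of who is on call.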

We count on individuals to make proper changes and catch errors, and most of the time they get it right. However, as Gliffy experienced recently, humans make mistakes. On March 21, an administrator deleted the production database. I repeat: an administrator deleted the entire production database. It wasn’t until a very long three days later that all data was restored. A single human error with the database resulted in the entire system being down and customers feeling the impact for three painful days.

Let that sink in for a second. The individual trusted to make the change and consider all of its possible consequences made a mistake that led to three whole days of lost business, not to mention the long-term impact on customer trust and relationships.

I don’t mean to kick the poor soul while he or she is down. I’m presenting this incident as evidence of a systemic issue. A breakdown in process and technology. I’ve been rocking a command line since the 1980s. In that time, I’ve seen the same pattern emerge over and over. I call that pattern "The Hand on the Rudder." Actually, I’m going to start calling it an anti-pattern.

As humans, we have a false confidence that because we (personally) are the ones making the changes, the process is somehow superior to one run by a machine. "Let’s not trust the autopilot; I’m a human." Thing is, unlike machines, humans become tired, sick, or hungry. We become distracted thinking about our weekend plans or our sick child at home. Yet time and time again, we believe that there is something inherently valuable in us pushing the "Enter" key on the keyboard.

I’d posit that we often choose to perform these tasks manually because we do not have confidence in automation. Garbage in, garbage out. If we can go through steps manually, we valiantly believe we can catch errors on the fly and respond appropriately. I’m certain the administrator at Gliffy thought the same thing. They were wrong, and it cost them. Big time.

If you need a more mainstream argument for trusting automation over humans, let’s consider the self-driving car. Since 2009, Google’s self-driving cars (SDCs) have logged 1,452,177 miles. In that time, the cars have experienced one lone accident while in autonomous mode. All other accidents occurred while a human was driving the SDC. (You can read the monthly reports here: http://www.google.com/selfdrivingcar/reports/.)

We’ve seen these types of repetitive tasks successfully taken over by automation systems in IT as well. There was a time when I actually performed manual builds on my workstation and used sneaker-net to copy them to a test server on a CD-R. (Kids, sneaker-net is when you walk the file over.) Since CruiseControl was first released, I’ve never performed a manual build. There was a time when the "webmaster" would update webpages using Notepad and FTP them to a server. Now, we use a webhost’s admin console to make changes.

I know what you’re about to say. Yes, we still need a person to create the build process. We still need someone to design the webpages. Humans aren’t exiting IT anytime soon. But the boring, repetitive tasks in the process are a recipe for disaster because the human brain simply was not made to perform boring, repetitive tasks. The human mind specializes in creatively solving problems. This is why we are the most successful species on the planet. (No offense, ants… we won on quality, not quantity.) What humans fail at miserably is completing the same task over and over again with a zero failure rate.

In order to get to that zero failure rate and avert three-day catastrophes in database change or any other IT discipline, we need to embrace the autopilot, not hold it at arm’s length. We need the ability to restrict bad behavior and prevent DBAs and other ordinary mortals from making innocent but inevitable mistakes. We need to shuck some manual rituals, stop thinking our limited human capacity is always superior to that of a computer, and trust automation.

Opinions expressed by DZone contributors are their own.
