Oyster’s Underground Nightmare: When DevOps Kills Retail

How to avoid the mistakes and lost revenue suffered by Starbucks and a host of other retail customers in 2015.

By Ron Gidron · Jan. 05, 16 · Opinion

How great it must have been for more than 100,000 passengers who enjoyed free rail, bus and tube travel last week after London’s ticketing system failed and station barriers had to be left open.

How nice it was for the Starbucks customers who got free coffee on April 24 last year, and how annoying for the hundreds of thousands of customers at Co-op food stores in the UK who were double-charged for their shopping in July.

These are just a few examples of embarrassing errors made by large retailers in 2015. So what caused them and how can they be avoided? To answer this, we need to consider the bigger picture: It seems 2015 marked a turning point in the enterprise software world, one that presents retailers across the globe with both the threat of costly error and an exciting opportunity.

DevOps Done Badly Can Harm Your Business

Development and operations teams are embracing DevOps practices, which means increasing the pace of development. It also means reacting quickly to customer demands and competitive pressures by shipping updates to retail and customer service software in smaller, more frequent batches. Indeed, speed and agility are key competitive advantages that every retailer needs these days.

The question is, do you really have the practices and tools in place to ensure that, as you make frequent, small updates, you don’t end up killing your company, as in the well-known Knight Capital example from the investment world?

In short, you can’t just do what you always did but faster. Instead, take these three key steps to ensure you can transform your software delivery chain with confidence:

  • Don’t rely on key people for production deployments
    • Too many organizations rely on a few key people to do "sensitive" deployments. This is usually because applications are complex and unique, so knowing how to change them safely is specialized knowledge that only a few engineers possess. The trouble is that we no longer have enough contingency time to recover from a rogue update. With the growing number of endpoints in large retail companies and the faster rate of change, the risk of human error rises until a mistake becomes inevitable, no matter how good your key people are.
    • Automation workflows take the human error factor out of deployments without losing visibility and control. Over time, your experts can keep adding pieces of specialized knowledge to an automated deployment workflow that is continuously tested in lower environments and that ensures consistency and efficiency when it’s time to execute in production. After all, a computer never mixes up the order of steps, never forgets to copy a file, and never gets tired and makes mistakes.
  • Enable efficient rollback or redeploy (automation is key for speed and control)
    • If you’ve followed the first step, then this stage is a natural evolution. While the automated workflows themselves are tested in lower environments and run consistently every time, things can go wrong in other places, just as they did for Knight Capital. For this reason, automated rollback is an optional path, yet an essential part of every workflow. It provides safety and reduces MTTR (mean time to repair), which in many cases can be even more critical than finding the root cause itself.
  • Investigate and improve
    • Deployments to large numbers of endpoints and complex systems usually span multiple technical touch points, which is often the reason for failures and glitches. Understanding failures comes second to recovering operations, but it is essential nonetheless. The way to understand root causes in hindsight is to recreate and reenact the process. If your process is manual and performed by people, it is usually almost impossible to reconstruct the exact order of events as they played out during the failed deployment. An automated process can be of great help here too, especially if the underlying platform collates outputs from all touch points into a single sequential "run log" that you can review later.
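
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow of any particular automation product: the names `Step`, `Workflow`, `run_log`, and `demo_failed_deploy` are all hypothetical, and the "deployment" just edits an in-memory list. It shows steps that always run in a fixed order, completed steps being undone in reverse when a later step fails, and every action being appended to one sequential run log.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    """One unit of deployment knowledge: how to apply it and how to undo it."""
    name: str
    apply: Callable[[], None]
    undo: Callable[[], None]


@dataclass
class Workflow:
    """Hypothetical workflow: ordered steps plus a single sequential run log."""
    steps: List[Step]
    run_log: List[str] = field(default_factory=list)

    def run(self) -> bool:
        done: List[Step] = []
        for step in self.steps:  # steps always execute in the same order
            self.run_log.append(f"apply {step.name}: start")
            try:
                step.apply()
            except Exception as exc:
                self.run_log.append(f"apply {step.name}: FAILED ({exc})")
                self._rollback(done)
                return False
            self.run_log.append(f"apply {step.name}: ok")
            done.append(step)
        return True

    def _rollback(self, done: List[Step]) -> None:
        # Undo only the steps that completed, newest first.
        for step in reversed(done):
            step.undo()
            self.run_log.append(f"undo {step.name}: ok")


def demo_failed_deploy():
    """Simulate a deployment whose second step fails (assumed scenario)."""
    files: List[str] = []  # stands in for state on the target system

    def fail_migration() -> None:
        raise RuntimeError("schema mismatch")

    workflow = Workflow(steps=[
        Step("copy-files",
             lambda: files.append("app-v2"),
             lambda: files.remove("app-v2")),
        Step("migrate-db", fail_migration, lambda: None),
        Step("restart-service", lambda: None, lambda: None),
    ])
    ok = workflow.run()
    return ok, workflow.run_log, files
```

Calling `demo_failed_deploy()` returns the outcome, the run log, and the final state, so the failed run can be reviewed afterwards in the exact order it played out. In this sketch the failing step itself is not undone; a real workflow would need to decide how to handle a step that fails halfway through.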

Opinions expressed by DZone contributors are their own.
