Agile Scrum and Infrastructure
Scrum has come to take the surprise out of your infrastructure.
An increasing number of the learners in my Agile Scrum class are from non-software environments. Many are from infrastructure. Coaching teams from infrastructure through the Agile framework can be challenging. Certainly, there are many notable differences from software development that require special consideration. Let’s look at some of those differences and see how one company implemented Agile Scrum for a new data center.
If I generalize about infrastructure and Agile projects, most infrastructure projects have an overall budget with fixed scope and sequence, and a goal to minimize cost. In software development, most Agile projects have an overall budget with variable scope and sequence, and a goal to maximize value. Infrastructure projects tend to have well-defined requirements with few unknowns; Agile software projects often have uncertain requirements with many unknowns.
Infrastructure projects are defined by rigid, configuration-driven dependencies between tasks. Schedule-driven tasks, such as those that must occur during a planned outage, add complexity, as does unplanned work that may upend a perfectly good sprint.
Although the differences are significant, Agile Scrum is all about breaking a project into pieces and delivering them in order of priority to build the solution over time. It is about demonstrating work in progress to receive early and regular feedback. I have seen teams break infrastructure projects into increments and use feedback to inspect and adapt. Let’s consider adding a storage array and see how we can break the work into increments.
If we define a new storage array as an epic, which is a group of stories often spanning more than one sprint, then we might have the following backlog items to complete the epic:
- Datacenter space allocation
- Power/cooling considerations
- Storage array cabling
- Storage array configuration
- Other network and storage configuration
To detail one of these backlog items: if storage array cabling were written as a story, the following might be its tasks:
- Analysis: Plan the cabling
- Build: Pull and terminate the cables
- Test: Run a packet transfer test
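The epic, story, and task breakdown above can be sketched as a small data structure. This is only an illustration: the class names (`Epic`, `Story`, `Task`) and the rule that a story is done when all of its tasks are done are my assumptions, not features of any particular Scrum tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    phase: str          # e.g. "Analysis", "Build", "Test"
    description: str
    done: bool = False


@dataclass
class Story:
    title: str
    tasks: List[Task] = field(default_factory=list)

    def is_done(self) -> bool:
        # Assumed rule: a story is done only when it has tasks and all are complete.
        return bool(self.tasks) and all(t.done for t in self.tasks)


@dataclass
class Epic:
    title: str
    stories: List[Story] = field(default_factory=list)

    def progress(self) -> str:
        finished = sum(1 for s in self.stories if s.is_done())
        return f"{finished}/{len(self.stories)} stories done"


cabling = Story(
    "Storage array cabling",
    [
        Task("Analysis", "Plan the cabling"),
        Task("Build", "Pull and terminate the cables"),
        Task("Test", "Run a packet transfer test"),
    ],
)

epic = Epic(
    "New storage array",
    [
        Story("Datacenter space allocation"),
        Story("Power/cooling considerations"),
        cabling,
        Story("Storage array configuration"),
        Story("Other network and storage configuration"),
    ],
)

# Complete every task in the cabling story, then report epic progress.
for task in cabling.tasks:
    task.done = True

print(epic.progress())  # 1/5 stories done
```

The point of the structure is that each story carries its own test task, so it can be demonstrated on its own at the end of a sprint.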
In this way we defined the storage array cabling story as a demonstrable, testable increment that delivers value and, in many cases, would be small enough to complete in one sprint. Next, we examine a real-world infrastructure project where Agile Scrum was used in a similar manner.
The project was the construction of a new data center for a large global company. Development, infrastructure, and operations were separately managed and funded silos. It was common for development and infrastructure to work in isolation on a project, and to integrate at the last phase of the waterfall project to move the application to operations. Surprises would often surface at the last phase of the project, and running over on schedule and budget was routine. The data center project had to be different. It had to be on time and stable. The project window was very short. There was too much at stake.
Management was tired of finding surprises late in projects, and demanded a change. IT leaders had heard that Agile Scrum was good at uncovering surprises early, and the company launched a team to pilot Agile Scrum for the data center project. The team knew that waiting for all the hardware components to be delivered would take too long; they had to be quicker. They asked, “What is a minimal working environment that could be demonstrated after one sprint?”
The team began to design an environment that could host the new applications until the final data center components were available. It would not be the final solution, but it would be a working system – an increment. The team pulled experts from servers, networking, monitoring, storage, and operations. They wanted all the required skills to deliver on the project engaged and operating as one Scrum team.
There were many challenges to building a minimal working environment, as significant hardware pieces were still on order. One by one, they started to visualize how to overcome the dependencies. They populated the backlog, conducted the sprint planning meeting and started sprinting.
The team noted the plan to remediate challenges as follows: “Missing DNS servers will be compensated by using host files. Servers will do the routing instead of routers. VLAN tagging will be leveraged on interfaces to overcome the lack of network ports. Load balancing and SSL termination will be performed in software instead of hardware. Internal disks will be used for data to make up for missing storage arrays.” These solutions were not final and would be replaced once their hardware parts became available. However, it was a minimal working environment.
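To make the first workaround concrete, here is a minimal sketch of generating temporary hosts-file entries to stand in for the missing DNS servers. The source does not describe how the team produced its host files, so everything here is an assumption for illustration: the hostnames are hypothetical, the addresses come from the 192.0.2.0/24 documentation range (RFC 5737), and `render_hosts` is a made-up helper.

```python
import ipaddress

# Hypothetical hostnames and addresses, for illustration only.
# 192.0.2.0/24 is reserved for documentation (RFC 5737).
TEMP_HOSTS = {
    "db01.dc.example": "192.0.2.20",
    "app01.dc.example": "192.0.2.10",
    "lb01.dc.example": "192.0.2.30",
}


def render_hosts(entries):
    """Return hosts-file text ("address<TAB>hostname"), sorted by address."""
    # ip_address() raises ValueError on a malformed address,
    # so a bad entry fails before anything is written.
    parsed = sorted(
        (ipaddress.ip_address(addr), name) for name, addr in entries.items()
    )
    lines = ["# TEMPORARY: remove once the DNS servers are in place"]
    lines.extend(f"{addr}\t{name}" for addr, name in parsed)
    return "\n".join(lines) + "\n"


print(render_hosts(TEMP_HOSTS), end="")
```

Keeping the entries in one reviewed mapping, with a comment marking them as temporary, makes it easier to replace the workaround once the real hardware arrives.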
The delivery of the first sprint was tested by the deployment of the first application. It was a disaster. “Several configuration files were missing. The developers were working on another version of the database, and there was no monitoring.” To be honest, the result was no better than the waterfall phased approach. However, they were only one sprint into a four-sprint release window.
They had found the surprises early. There was time to adapt and try the deployment again in the next sprint. In the meantime, the infrastructure could be improved in parallel. “After three more test deployments sprints, the benefits were clear. There were a lot fewer integration problems.”
The application went live very smoothly, and they delivered on time and within budget. They continued to deliver improvements every sprint, in both software and infrastructure. The team had discovered the value of working in increments and of inspecting and adapting, and it got a better outcome.
In summary, I recommend the following to those who are considering Agile Scrum for infrastructure projects:
- Reduce the size of the infrastructure piece you are trying to build. Small is good. It reduces risk and uncertainty. It enables parallel work.
- Build a team with all the skills required to deliver on the commitment. It is common for infrastructure groups to be working in silos and integrating at the last possible phase to move the application to operations. A Scrum team requires all members to be committed and active for the entire project.
- Have a single Product Owner for the product. Infrastructure groups are often pulled in different directions by competing and conflicting priorities. It is necessary to have one voice of priority for every product.
- Reduce the size of the sprint to protect planned work from unplanned emergencies. Infrastructure sprints are often upended by code orange, code blue, and other outages. We all know that focus is important to a successful sprint. If we shrink the window of the sprint to one week, we reduce the risk of interruption.
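The last recommendation can be illustrated with some rough arithmetic. This is a sketch under a loud assumption: emergencies arrive at random at a constant average rate (a Poisson process), so the chance that a sprint of d working days sees at least one emergency is 1 - exp(-rate * d). The rate below is invented for illustration, and `p_interrupted` is a hypothetical helper.

```python
import math


def p_interrupted(sprint_days, emergencies_per_day):
    """Probability of at least one emergency during a sprint (Poisson arrivals)."""
    return 1.0 - math.exp(-emergencies_per_day * sprint_days)


# Assumed rate: on average one emergency every 10 working days.
RATE = 1 / 10

for days in (5, 10, 20):
    chance = p_interrupted(days, RATE)
    print(f"{days:>2}-day sprint: {chance:.0%} chance of at least one emergency")
```

Under these assumptions, a one-week sprint is interrupted about 39% of the time, compared with about 63% for two weeks and 86% for four. A shorter sprint does not reduce the number of emergencies overall; it limits how much planned work any single emergency can disrupt.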
Of course, using Agile Scrum for infrastructure can be complicated by many other factors and influences, but these are some ideas that have helped me find early success using Agile Scrum in infrastructure.