Curing System Blindness
I’ve been writing about seeing systems, and got to thinking about a company I did some work for a few years ago–because they were a great example of how focusing on events leads to blame and prevents people from seeing patterns.
Here’s the story. The customer service organization in this company had serious problems with availability of a whole slew of systems that their reps relied on.
It was such a problem that they created a new role and a new department to deal with the issue. The “availability analysts” were charged with collecting data, analyzing the data and supporting a solution to the problem(s).
And collect data they did. They had data on system performance, run times, crashes, errors, and abnormal conditions; server uptime and downtime; software outages (by application and system component); problem escalations, helpdesk calls, and trouble tickets.
When they weren’t collecting data, they were busy creating “the deck” for the (dreaded) management meeting. The deck was pages and pages thick. Page one listed the lost productivity figures for the month, in “productive FTE minutes” lost. Page two listed the number of incidents.
But page five was the big event: a pie chart that showed all the sources of lost productive FTE minutes for the month. The availability analyst walked the group through the chart, pointing out that 25% of the outage minutes were due to network problems, 20% due to mainframe outages, and so forth.
At the end of the pie chart report, the highest ranking manager would demand, “What are you going to do about this?” Everyone else at the table tried to look small. After some squirming, one of the lower ranking people would put forth an idea. “I’ll expect a progress report next month,” the top manager would say, sounding stern. And that was the end of the meeting.
Then, the availability analysts scrambled out to start chasing the problem of the month.
And it all started again the next month when the new pie chart was published.
Both analysts and managers were firmly focused on a snapshot of events, and missed the patterns. The way they were presenting information helped hide the patterns and keep the focus on the latest hot issue.
I worked to help them see the patterns, which led to understanding structure and dynamics–and taking meaningful action.
Now, they did need to respond to events–bring the network back up, or swap out a server, for example. They also needed to adapt to some of the patterns, for example by figuring out how to deal with certain types of outages more effectively until deeper changes took hold.
But as long as they only focused on short-term events–the monthly outage minutes–there was little chance of improving the overall situation. They were system blind.
Published at DZone with permission of Esther Derby , DZone MVB. See the original article here.