
Mining the Ground Truth of Enterprise Toolchains

See what we can learn about enterprise agile and DevOps environments from research like the State of DevOps reports.

by Mik Kersten · Nov. 08, 18 · Analysis

To learn more about what works and what doesn't in large-scale DevOps and agile deployments, we need data. The problem is, that data is notoriously difficult to get ahold of because much of it lies hidden across numerous private repositories.

Efforts such as The State of DevOps reports have helped us gain some understanding by using survey data to answer questions about practices such as the frequency of deployments in a team or organization. However, survey data has its limitations, as Nicole Forsgren and I described in "DevOps Metrics," which we wrote to clarify the trade-offs of system and survey data collection [1]. Today, our understanding of DevOps practices is based largely on this survey data and on anecdotal evidence. Is there a way to expand our view of DevOps to include studies of system data of DevOps at work?

One approach is to examine publicly available repositories, such as those hosted by GitHub or the Eclipse and Apache foundations. However, the conclusions from this research are limited to how open source projects work. Large-scale and enterprise software delivery differs considerably from open source delivery in terms of the scale, scope, and type of work.

In my PhD research, I initially studied open source developers [2]. Gail Murphy, my supervisor, pushed me to expand my study to professional developers in enterprise settings. Having spent most of my career doing open source development, I was shocked at how different the work was. The most interesting thing I learned was the additional complexity with which the professional developers worked on a daily basis. The number of applications, systems, processes, and requirements dwarfed anything I encountered in the much more elegant world of open source.

In my article "The End of the Manufacturing-Line Analogy," I discussed how advanced car manufacturing relates to software production [3]. One of the amazing things about car manufacturing is that the "ground truth" of production is visible on the factory floor. Walking the assembly line provides an instant view of the workflow. Where can we find the ground truth of enterprise software delivery? How might that ground truth change our understanding of what works and what fails in software delivery at scale?

Exploiting the Cambrian Explosion of Tools

In my last post, I summarized how a "Cambrian explosion" has led to the proliferation of hundreds of DevOps tools [4]. One key reason for this explosion is how specialized the tools have become for various stakeholder needs. For example, a large enterprise might have a dozen different specialists involved in software delivery, such as Java experts, AWS (Amazon Web Services) experts, design experts, or support staff. There's now a specialized tool for each role.

That presents an interesting opportunity. The more that these tools are used, the more those particular practitioners' work is captured within them. If only we could get at that data, we would have a unique chance to better understand how DevOps, and software delivery in general, works in practice.

The challenge is that such end-to-end system data is inaccessible. It's hidden behind organizations' firewalls or locked in private repositories. Occasionally, a vendor will have a slice accessible; for example, a software-as-a-service support desk tool vendor might have cross-company information on support tickets. However, that's only one slice of the value stream; it misses all the development and other upstream data and does not provide an end-to-end view.

In my study of open source and professional developers, the trick was to use the developers' access to tool repositories as a proxy for what was happening in those repositories. But, again, that was just a slice of the value stream. However, through that experiment, I realized I had access to exactly the people who had visibility into the end-to-end set of repositories: the enterprise IT tool administrators.

Value Stream Integration Diagrams

My company, Tasktop, works closely with many enterprise IT tool administrators responsible for the agile and DevOps toolchain. Each engagement that our solutions architects undertake results in the creation of a value stream integration diagram. The first time I looked at these diagrams in aggregate, I realized I had a dataset that was as interesting as the one Gail and I collected during my PhD studies. These diagrams depict each of the tool repositories in the value stream, each artifact type stored in those repositories, and, most important, how the artifact types are related. These diagrams were collected not through an academic study but through a data collection process put in place for working with the enterprise IT tool administrators and their tools. The data is biased toward Tasktop's customers and prospects, who tend to be 500 enterprise IT organizations seeking integration across one or more tools.

Tasktop has collected 308 of these diagrams. Figure 1 shows some of them. They're a fascinating window into the ground truth of enterprise toolchains. As such, they might inform future efforts in the collection of software delivery data in interesting ways. Here, I provide a very high-level overview of what we learned from them. A more detailed analysis will appear in my upcoming book Project to Product.

The diagrams provide a moment-in-time summary of each tool in the value stream and information on what the key artifacts captured in each tool are, as well as how they are or should be connected. The diagrams do not exhaustively list all the tool repositories in an organization or all the artifact types. Nor do they provide information about the data in those tools, such as the number and types of defects. But they do provide the ground truth about the composition of these organizations' enterprise IT toolchains.
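To make the shape of this data concrete, here is a minimal sketch of how one such diagram could be represented: the tools, the artifact types stored in each, and the connections between artifact types. The class names, fields, and sample tools (PlannerX, DeskY) are hypothetical, invented purely for illustration; they do not reflect Tasktop's actual data format or any real diagram.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tool:
    name: str      # hypothetical tool name
    category: str  # e.g. "agile planning" or "IT service management"


@dataclass(frozen=True)
class ArtifactLink:
    source_tool: str
    source_type: str
    target_tool: str
    target_type: str


@dataclass
class ValueStreamDiagram:
    tools: dict[str, Tool] = field(default_factory=dict)
    # Maps a tool name to the artifact types stored in that tool.
    artifact_types: dict[str, set[str]] = field(default_factory=dict)
    links: list[ArtifactLink] = field(default_factory=list)

    def add_tool(self, tool: Tool, artifact_types: list[str]) -> None:
        self.tools[tool.name] = tool
        self.artifact_types[tool.name] = set(artifact_types)

    def connect(self, source_tool: str, source_type: str,
                target_tool: str, target_type: str) -> None:
        # Only allow links between artifact types the diagram knows about.
        for tool_name, artifact in ((source_tool, source_type),
                                    (target_tool, target_type)):
            if artifact not in self.artifact_types.get(tool_name, set()):
                raise ValueError(f"{tool_name!r} has no artifact type {artifact!r}")
        self.links.append(
            ArtifactLink(source_tool, source_type, target_tool, target_type)
        )

    def categories_for(self, artifact_type: str) -> set[str]:
        # Which tool categories hold this artifact type?
        return {
            self.tools[name].category
            for name, types in self.artifact_types.items()
            if artifact_type in types
        }


# Made-up example: a defect tracked in two different kinds of tools.
diagram = ValueStreamDiagram()
diagram.add_tool(Tool("PlannerX", "agile planning"), ["feature", "defect"])
diagram.add_tool(Tool("DeskY", "IT service management"), ["incident", "defect"])
diagram.connect("DeskY", "defect", "PlannerX", "defect")
print(sorted(diagram.categories_for("defect")))
```

A structure like this records only the composition of the toolchain, which matches the scope of the diagrams: it says nothing about how many defects or features actually exist in each tool.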

What the Data Revealed

There could be relevant tools outside this set. For example, these organizations have only recently been reporting vulnerability tracking tools as part of their DevOps toolchains. A tool's absence from the results doesn't mean that it wasn't present, just that it wasn't considered for inclusion in the organization's view of the connected value stream at that time.

The diagrams were sourced from Tasktop customers and prospects defining what tools and artifacts they wanted to connect. The majority of the diagrams came from enterprise IT organizations in the Fortune 1000. Table 1 shows the industry breakdown.

Table 2 lists the types of tools used. As expected, agile-planning and application lifecycle management (ALM) tools dominated, but IT service management, project portfolio management, and requirements management also formed a key part of the toolchains. Requirements management tools continued to see significant use, even in the age of agile and DevOps. In contrast, initiatives to connect customer relationship management (CRM) and security tools were still rare. Altogether, the dataset included the use of 55 tools.

Even more interesting is what information was tracked in the tools. Table 3 provides insight into the artifacts created and thus the types of work. At a high level, imagine these artifacts corresponding to the widgets that flow through the various tools that perform software delivery. In an upcoming article, I'll discuss the relevance of these various types of artifacts.

Combining the data from Tables 2 and 3, we observed that the artifacts spanned multiple tools. For example, features were tracked across agile, ALM, requirements management, and sometimes IT service management tools. We interpreted this as another indication that the number of tools and their specialization in large-scale agile and DevOps environments are growing. However, the number of artifact types stored in those tools (see Table 3) is considerably smaller, and the artifacts tend to span multiple tools. For example, a single defect can span agile, ALM, requirements management, and IT service management tools.

Some of the most interesting findings are in Table 4. We see that only 1.3 percent of organizations used a single tool. More interestingly, 69.3 percent of the organizations were connecting artifacts across three or more tools. The more surprising finding was that more than 42 percent of the organizations needed to integrate four or more tools, indicating the complexity involved in developing large-scale enterprise software. It also supports the notion that specialization of roles in software development is common.
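Percentages like these can be derived from a simple count of connected tools per organization. The sketch below shows one way to do that arithmetic; the function names and the sample counts are made up for illustration and are not the data behind Table 4.

```python
from collections import Counter


def share_with_at_least(tool_counts: list[int], threshold: int) -> float:
    """Percentage of organizations connecting `threshold` or more tools."""
    if not tool_counts:
        raise ValueError("tool_counts must not be empty")
    matching = sum(1 for count in tool_counts if count >= threshold)
    return 100.0 * matching / len(tool_counts)


def distribution(tool_counts: list[int]) -> dict[int, float]:
    """Percentage of organizations for each exact number of tools."""
    if not tool_counts:
        raise ValueError("tool_counts must not be empty")
    tally = Counter(tool_counts)
    total = len(tool_counts)
    return {n: 100.0 * tally[n] / total for n in sorted(tally)}


# Hypothetical sample: number of connected tools in each of ten organizations.
sample_counts = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6]

print(distribution(sample_counts))
print(share_with_at_least(sample_counts, 3))
print(share_with_at_least(sample_counts, 4))
```

Note that the "three or more" and "four or more" figures are cumulative, so they overlap: every organization counted in the second group is also counted in the first.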

References

This is the fifth blog in a series promoting the genesis of my book Project To Product. To ensure you don't miss any further content, you can receive future articles and other insights delivered directly to your inbox by signing up to the Project To Product newsletter.

A version of this article was originally published in the May/June 2018 issue of IEEE Software: M. Kersten, "Mining the Ground Truth of Enterprise Toolchains," IEEE Software, vol. 35, no. 3, pp. 12-17, ©2018 IEEE, doi: 10.1109/MS.2018.2141029.

  1. N. Forsgren and M. Kersten, "DevOps Metrics," ACM Queue, vol. 15, no. 6, 2018; queue.acm.org/detail.cfm?id=3182626.
  2. M. Kersten and G.C. Murphy, "Using Task Context to Improve Programmer Productivity," Proc. 14th ACM SIGSOFT Int'l Symp. Foundations of Software Eng. (SIGSOFT/FSE 06), 2006, pp. 1-11; dl.acm.org/citation.cfm?id=1181777.
  3. M. Kersten, "The End of the Manufacturing-Line Analogy," IEEE Software, vol. 34, no. 6, pp. 89-93.
  4. M. Kersten, "A Cambrian Explosion of DevOps Tools," IEEE Software, vol. 35, no. 2, pp. 14-17.

Published at DZone with permission of Mik Kersten, DZone MVB. See the original article here.
