While the proliferation of open source software such as that found on npm is a great thing, what drawbacks can it present to developers?
After 20 Years, 200 Billion
Just as open source software turns 20 years old this week, this milestone of 200 billion downloads is a testament to the incredible magic that happens when communities of developers openly share innovations.
Software Supply Chains at Work
What we’re witnessing here is a software supply chain in full effect. Open source projects contribute packaged code to the community and place it in public warehouses, where development teams around the globe consume it to create new front-end, back-end, or mobile applications for all of us. Every development team on the planet now utilizes a software supply chain that operates at insane speeds with massive throughput.
In a second tweet yesterday, Laurie wrote, “Since we have ~12 million users, that means the average user installed the neatly round number of 1000 packages in 7 days. In reality, most individuals installed very few packages, and a bunch of CI build boxes installed 10s or 100s of thousands of packages each.”
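The arithmetic behind Laurie's "neatly round number" is easy to check. A quick sketch using only the two figures quoted in the tweet (the variable names are illustrative):

```python
# Figures quoted in the tweet; variable names are illustrative only.
users = 12_000_000          # "~12 million users"
packages_per_user = 1_000   # average packages installed per user in 7 days

weekly_installs = users * packages_per_user
print(f"{weekly_installs:,} package installs in 7 days")  # 12,000,000,000
```

That works out to roughly 12 billion installs in a single week, an average that, as Laurie notes, hides a heavy skew toward CI build boxes.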
Essentially, there is an army of robots (automated CI tools) downloading billions of components from the web. While this type of automation is efficient, by itself it does not serve teams well.
Repositories to the Rescue
At Sonatype, we saw similar behavior by developers in the early days of Maven. Individual developers would download Java components directly from Maven Central. Maven users sitting right next to one another in the same room and on the same team would download the same versions of the same components needed for their builds, repeatedly. It was for this reason, among others, that our founders invented the Nexus repository manager. Nexus provided instant benefits to development teams while streamlining their software supply chains. These include:
- Caching. Nexus acts as a local parts warehouse for development teams. New components are downloaded once and stored locally. This eliminates the need to download the same package 1000 times.
- Sharing. Components stored locally can be shared infinitely. Developers don’t need to go out to the internet to get their packages, and teams can begin to standardize on common versions to reduce context switching and technical debt.
- Privatizing. Not all packages can be shared out on the internet. Teams often build their own packages that need to be shared locally but kept proprietary. Nexus also solved this issue.
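To make the caching benefit concrete, here is a minimal sketch of a pull-through cache. It is not how Nexus is implemented; the `CachingProxy` class, the `fake_registry` function, and the package name are all hypothetical stand-ins, and the "download" is simulated rather than a real network call.

```python
from typing import Callable, Dict, Tuple


class CachingProxy:
    """Toy pull-through cache: each (name, version) is fetched upstream at most once."""

    def __init__(self, fetch_upstream: Callable[[str, str], bytes]) -> None:
        self._fetch_upstream = fetch_upstream
        self._store: Dict[Tuple[str, str], bytes] = {}
        self.upstream_calls = 0  # how many times we went out to the internet

    def get(self, name: str, version: str) -> bytes:
        key = (name, version)
        if key not in self._store:
            # Cache miss: download from the public registry once and keep it.
            self.upstream_calls += 1
            self._store[key] = self._fetch_upstream(name, version)
        # Cache hit (or freshly stored): serve from the local warehouse.
        return self._store[key]


def fake_registry(name: str, version: str) -> bytes:
    # Hypothetical stand-in for a real download from a public registry.
    return f"{name}@{version}".encode()


proxy = CachingProxy(fake_registry)
for _ in range(1000):
    proxy.get("example-pkg", "1.0.0")

print(proxy.upstream_calls)  # 1
```

A thousand requests for the same package version result in a single upstream download; every later request is served from the local store.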
While repository managers have been used for quite some time in the Java community, Laurie’s post tells us that adoption of the technology still lags in younger, less mature, packaged code ecosystems.
Not All Components Are Created Equal
I have often said that if you have 100 developers, you have 100 front doors open into your organization. 1000 developers? 1000 front doors.
If you dissect Laurie’s comment further, you will also realize that any developer can bring any package into your organization at any time. While this does make them more efficient, it also points to security and governance concerns. Every component was developed by someone else, donated out to the internet, and consumed in the millions. But you have no true idea of the source or origin of many of the components.
Research I have done in previous years at Sonatype demonstrated that 5.5% (1 in 18) of components downloaded by repository managers had known security vulnerabilities. While the percentage is not very large, keep in mind that a repository manager will only download a component once. Once cached, future downloads of that component are unnecessary, and the components can be reused infinitely by that team.
A Sonatype analysis of over 40,000 Nexus repositories reveals that the average repository holds over 1,600 components. A deeper analysis of the 1,600 components housed in the average repository manager found 192 security vulnerabilities among them, with some components having more than one.
Within software supply chains, repository managers and private container registries represent procurement gates into the development organization. The gates can be left wide open where component flows are not governed or they can represent opportunities for quality and security checkpoints that ensure defects are not passed downstream.
Understanding the importance of secure development practices, Sonatype’s Nexus took local warehousing of parts one step further in 2012 by identifying known security vulnerabilities for components in its caching repositories. Development teams using its Repository Health Check function could identify components that had gone bad over time and then choose safer versions. In late 2015, we took that a step further with automated policies in Nexus Firewall, a subject that is covered well in my other posts.
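The idea of a checkpoint at the procurement gate can be sketched in a few lines. This is an illustration of the concept only, not the way Repository Health Check or Nexus Firewall works; the `Component` type, the `check_component` function, and the advisory data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Component:
    name: str
    version: str


# Hypothetical advisory data: (name, version) -> list of advisory IDs.
KNOWN_VULNERABILITIES: Dict[Tuple[str, str], List[str]] = {
    ("example-lib", "1.0.0"): ["EXAMPLE-2015-0001"],
}


def check_component(component: Component) -> Tuple[bool, List[str]]:
    """Return (allowed, advisories) for a component requested through the gate."""
    advisories = KNOWN_VULNERABILITIES.get((component.name, component.version), [])
    return (len(advisories) == 0, advisories)


for requested in (Component("example-lib", "1.0.0"), Component("example-lib", "1.0.1")):
    allowed, advisories = check_component(requested)
    status = "allow" if allowed else "block"
    print(f"{requested.name}@{requested.version}: {status} {advisories}")
```

In this sketch the version with a known advisory is blocked at the gate, while the version with no advisories on record passes through to the team.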
What’s in Your Repository Manager Is in Production
"Alarmingly, many sites continue to rely on npm packages like YUI and SWFObject that are no longer maintained. In fact, the median website in [NU’s] dataset is using a library version 1,177 days older than the newest release, which explains why so many vulnerable libraries tend to linger on the Web."
What Is Wrong With 200 Billion Downloads?
Nothing from an innovation standpoint. It’s awesome and we can all celebrate this achievement with Laurie.
At the same time, it gives us pause to reflect upon other concerns of our day, including cybersecurity threats, technical debt, and wasteful context switching.
If you are interested in learning more about universal repository managers, I invite you to download Nexus Repository OSS today. It’s free, fully functional, and used by over 150,000 development teams worldwide.