Many applications today are like the human body*: a relatively small proportion of “in-house” code, leveraged by dozens if not hundreds of third-party libraries — everything from object-relational mappers to a single function that left-pads a string. And that leads to a conundrum: how do you pick the libraries that you include in your project? Or in other words, is it OK to download something from the Internet and make it a fundamental part of your business?
*The reference is to the amount of bacteria and other organisms that don't share your DNA but live on or in your body. You'll often find a 10:1 (bacteria:human) ratio quoted, but see this article for commentary on the history and validity of that ratio (tl;dr, it's more like 60:40).
Sometimes, of course, you don't have a choice. If you use the JUnit testing framework, for example, you are going to get the Hamcrest library along with it (and maybe you'll feel some concern that the hamcrest.org domain is no longer registered). But what criteria did you use to pick JUnit?
I was recently faced with that very question, in looking for a library to parse and validate JSON Web Tokens in Java. There are several libraries to choose from; these are the criteria that I used to pick one, from most important to least.
- Following the crowd
I believe that care in documentation is a good proxy for care in implementation. Things I want to see are complete JavaDoc and examples. For projects hosted on GitHub, I should be able to understand the library based solely on the README (and there's no reason that a non-GitHub project should omit a README, although I admit to being guilty in that regard).
- Author credibility
This can be difficult, especially where the “author” is a corporation (although large corporations tend to do their own vetting before letting projects out under the corporate name). In the case of a sole maintainer, I Google the person's name and see what comes up. I'd like to see web pages that demonstrate deep knowledge of the subject (especially for security-related libraries). Even better are slides from a conference, because that implies tha the author has at least some recognition in the community.
- Issue handling
Every library has issues. Does the maintainer respond to them in a reasonable timeframe? A large number of outstanding issues should raise a red flag, as should a maintainer that responds in a non-professional manner. You wouldn't accept that from a coworker (I hope), and by using a package you make the maintainer your coworker.
Once I have decided on a candidate library (or small number of candidates) I try it out for my use case. If it looks good, it becomes part of my application. One thing that I do not do is dig into the library source code.
The promise of open-source software is that you can download the sources and inspect them. The reality is that nobody every does that — and nobody could, because it would be more than a full-time job. So we choose as best we can, and hope that there isn't a dependency-of-a-dependency-of-a-dependency that's going to hurt us.