Why Open Source Is Much More Than Just a Free Tier
With a deeper focus on the data infrastructure industry, where open source has been very successful.
Open source has been on the rise for the past few decades. From small startups to large enterprises, open source has now become a crucial part of the software development process. While open source is often thought of as simply a free alternative to proprietary software, it is actually so much more than that.
In this article, we will explore the reasons why open source has been so successful, the areas where it has not been as successful, and the differences between open source and free tiers of software, with a deeper look at the data infrastructure industry.
How Open Source Has Been Eating Software
The term "open source" was coined in 1998, the same year the Open Source Initiative (OSI) was founded. Since then, open source has taken over the software world. Unix was replaced by Linux, Oracle DB by Postgres and MySQL, BitKeeper by Git, and even VS Code has become the standard IDE. According to a 2020 report by the Linux Foundation, open-source software has grown so much that more than 96% of global organizations are now using it, and the number is increasing year over year. In a 2020 survey by GitHub, 98% of respondents said they use open-source software, and 96% of respondents said they contribute to open-source projects.
As a16z said, as software eats the world, open source has been eating software. One of the most notable areas where open source has been making significant gains is in the data infrastructure industry. Tools such as Hadoop, Spark, Kafka, Airflow, Grafana, Kibana, Metabase, and more have been widely adopted by companies and organizations for big data processing and analytics.
Let’s dive into explaining the reasons behind the success of open source.
Why Open Source Is So Successful
Open source has been extremely successful for a number of reasons.
Control, Flexibility, and Agility
One of the main reasons is control: open source gives teams the flexibility and agility to customize software to meet their specific needs. That can be a competitive advantage, and it is especially important for large enterprises, which often have highly specific requirements.
Accelerated Time to Value
Open-source software is often easier to start with and deploy than proprietary software. It is most often deployed and hosted within a company’s servers or virtual private cloud, which enables developers to avoid going through long and cumbersome security compliance processes. There is also no budget validation process to go through, as there is no charge for open-source software.
Developers can literally start testing a technology the day they discover it. In comparison, proprietary software might not be accessible to developers for several months because of those security compliance and budget validation hurdles.
Better Reliability and Security
There are several reasons why open-source software tends to be more reliable and secure.
For one, it is deployed within the company’s infrastructure. This means companies are in full control over their data.
Second, having direct access to the code enables teams to look for any potential issues directly without depending on external parties whose priorities might not align with those of their company.
Last but not least, the community-driven development model allows for more feedback, contributions, and, most importantly, trust. It allows fast iteration cycles with multiple participants who contribute, and everything is version controlled and transparent. This improves the quality and reliability of a product, as issues are fixed faster and features added more quickly.
Taken together, these factors tend to make open-source software more reliable and secure than proprietary software.
Open-source software tends to see higher innovation velocity, thanks to all the community contributions. Another reason for that velocity is actually an often overlooked point: open source also attracts technical talent, as most engineers prefer working on open source and value the open-source philosophy. And stronger teams innovate faster and better.
No Vendor Lock-In
The last aspect driving open-source adoption is lock-in. Proprietary software usually comes with some degree of lock-in, which makes it hard for companies to switch to another solution when their needs are not being addressed. With open source, if you’re not satisfied with the paid version of an open-source solution, you can go back to hosting the open-source version yourself; you’re not locked in to that paid solution.
Where Open Source Is Not Successful
While there are some industries where open source has become the norm, some industries haven’t been touched by it. Here are a few criteria that are common to all of the industries that do not (yet) use it:
Users of the Open-Source Technology Are Not Technical
Open-source software adoption is mainly driven by engineers attracted by the accessibility and visibility of the code. But because the rest of the organization doesn’t work with software code, the value of open source doesn’t matter much to them. This explains the lack of adoption of open source in marketing, sales, and finance tools.
Lack of Overlap Between Use Cases, Too Much Customizability Needed
Another issue that might occur with open source is that the use case it addresses needs too much customization, meaning that every company uses the software in a different way. While open source is great for addressing custom needs, as users have access to the code itself, a lack of common ground makes it difficult to build a valuable one-size-fits-most foundation to start from.
Lack of Overlap Between User Profile and Contributor Profile
Another challenge facing open source is the lack of overlap between the developer community and the user community. In many cases, the potential contributors to open-source software are not the same people who use it. This limits the contribution potential of the community. Developers are very motivated to make open-source software work if they use it themselves. It actually becomes part of their job to make that software work for their company.
One could argue that this criterion is more indicative of the community contribution potential of a project.
Not a Known Problem
Finally, open-source software will see less adoption if the problem it addresses is not well known, as it will take a lot of market education to make it successful. For example, open-source solutions for IoT, edge computing, and many other areas are still relatively new, and it will take time for the industry to mature and for users to understand the benefits of open source.
On the other hand, generating and collecting logs is a very old problem, which might be a good reason why OpenTelemetry became the second highest velocity project in the CNCF ecosystem. Likewise, pretty much any company has data integration challenges, so little education is needed on the value an open-source solution provides there.
For all the industries where open source is not successful, we see a lot of adoption of free tiers. Without considering the difference in audience, let’s see how they compare.
How Open Source Is Different from a Free Tier
Open-source software is often thought of as a free alternative to proprietary software, but it is important to understand that open source is not the same as a free tier. As mentioned above, open source offers more than a free tier in several key aspects:
- Control and customizability: Open source is not just free to use — it's also free to modify, distribute, and share.
- Engaged community for better reliability and innovation velocity: An open-source community is much more engaged than a community of free users. This leads to a more robust and stable product compared to a free tier, which may not have the same level of support and maintenance.
- Time to value, when security compliance is involved: For software that requires some security compliance process, the accessibility of the open-source code makes that process a lot simpler.
- No vendor lock-in: If you are not satisfied with the paid version of an open-source product, you can always switch back to the open-source version.
The one case where a free tier is more convenient than open source is when quick access is the main concern and no security compliance process is required.
So deciding whether to use a free tier or open-source solution will depend on the specific needs of the project.
If a company is just starting out and wants to test out a technology without committing to a long-term contract, a free tier may be the best option. Keep in mind, however, that a free tier — by its very nature — is designed to offer a limited set of features in hopes of drawing you in to a paid version with greater scale and flexibility.
Hence, if a company requires a more advanced set of features, more customizability, and support from a large community of developers, open source may be a better choice. Additionally, if a company has specific security or compliance requirements, open-source solutions may be more easily compliant, as the code is transparent and can be audited.
It's important to consider the criteria of each solution and weigh the pros and cons to determine which one is the best fit for the specific project or use case.
What About the Data Infrastructure Industry?
In the data infrastructure industry, either open source or a free tier might be the better fit, depending on the problem.
For example, when it comes to data warehousing, a free tier may be the best option. Many vendors offer a limited version of their data warehouse for free, which can be a great way to get started and test out the technology. This allows end users to experiment with different data warehousing solutions without committing to a long-term contract or paying for a full-featured version.
On the other hand, when it comes to data visualization, a free tier may not be the best option. Data visualization is often a critical part of data analysis and decision-making, and a free version may not have all the features and capabilities that a company needs to drive the right decision. In this case, open-source solutions such as Apache Superset may be a better option, as they offer a wide range of features, flexibility, and customizability.
Orchestration users usually have a deep need for customization, as they might want to trigger custom Python jobs, or jobs on other third-party tools. A limited free tier would prevent users from getting the value of the orchestrator. This explains why the three top data orchestration tools — Airflow, Dagster, and Prefect — are all open source!
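To make the customization point concrete, here is a minimal sketch of what "triggering custom Python jobs" amounts to: plain Python functions run in dependency order. It uses only the standard library and is not the API of Airflow, Dagster, or Prefect. The job names and the run_pipeline helper are hypothetical, chosen purely for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(jobs, dependencies):
    """Run each job only after the jobs it depends on have run.

    jobs: dict mapping a job name to a zero-argument callable.
    dependencies: dict mapping a job name to the set of job names it depends on.
    Returns the job names in the order they were executed.
    """
    # Every job starts with no prerequisites; declared ones are overlaid.
    graph = {name: set() for name in jobs}
    for name, upstream in dependencies.items():
        graph[name] = set(upstream)

    executed = []
    for name in TopologicalSorter(graph).static_order():
        jobs[name]()  # any custom Python code can live here
        executed.append(name)
    return executed


# Hypothetical custom jobs: each one just records that it ran.
log = []
jobs = {
    "extract": lambda: log.append("extract"),
    "transform": lambda: log.append("transform"),
    "notify_third_party": lambda: log.append("notify"),
}
dependencies = {
    "transform": {"extract"},
    "notify_third_party": {"transform"},
}

order = run_pipeline(jobs, dependencies)
print(order)
```

A real orchestrator adds scheduling, retries, and monitoring on top of this idea, but the core need is the same: the ability to plug arbitrary code into the graph, which a restricted free tier may not allow.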
Data Integration, Specifically ETL
Last but not least, what about data integration and ETL?
While most connectors are common to a lot of companies, most companies also have custom connector needs.
Also, a free tier will most likely be limited in volume of data, and you might want to test the solution on a large data source, such as a database, in order to see how that ETL solution performs with high volume. A free tier won’t help you address these concerns.
This makes data integration another industry where open source is thriving.
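The custom connector need mentioned above can be illustrated with a small sketch. It assumes a hypothetical connector interface (SourceConnector, with a single read method) and a hypothetical in-house CSV source; neither is taken from any real ETL product. With access to the code, a team can add a source like this itself rather than wait for a vendor to support it.

```python
import csv
import io
from typing import Dict, Iterator, List, Protocol


class SourceConnector(Protocol):
    """Hypothetical interface: a source yields records as dictionaries."""

    def read(self) -> Iterator[Dict[str, str]]: ...


class InHouseCsvSource:
    """Hypothetical custom source: parses CSV text exported by an internal tool."""

    def __init__(self, csv_text: str) -> None:
        self.csv_text = csv_text

    def read(self) -> Iterator[Dict[str, str]]:
        # DictReader uses the first row as the field names.
        yield from csv.DictReader(io.StringIO(self.csv_text))


def sync(source: SourceConnector, destination: List[Dict[str, str]]) -> int:
    """Copy every record from the source into the destination; return the count."""
    count = 0
    for record in source.read():
        destination.append(record)
        count += 1
    return count


# A plain list stands in for a destination table in this sketch.
warehouse_table: List[Dict[str, str]] = []
rows_synced = sync(InHouseCsvSource("id,name\n1,Ada\n2,Grace\n"), warehouse_table)
print(rows_synced, warehouse_table)
```

Because sync only depends on the read method, the same loop works for any other source that follows the interface, which is the kind of shared foundation that makes community-contributed connectors practical.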
In conclusion, open source is much more than a free tier. While free tiers can be a great way to get started and test out new technologies, open-source solutions offer much more control, flexibility, and community support. They are also more likely to be adopted in areas where customizability and adaptability are important, such as visualization, orchestration, and ETL processes. In the data infrastructure industry, only the data warehousing segment hasn’t been eaten by open source yet.
Published at DZone with permission of John Lafleur. See the original article here.