The Shift from Polyrepos to Monorepos
Is shifting from polyrepos to monorepos really the answer? How can such a transition be done smoothly?
As code repositories have grown more complex to reflect the intricate microservices architectures they support, and as teams have started using multiple repos to manage different projects, challenges have started to appear in managing them.
For starters, it is not always possible to maintain coding best practices across repos when multiple projects are structured differently. There are also bottlenecks in packaging and deploying code from multiple repositories.
Organizations are starting to transition from polyrepos to monorepos as a way to bring the complex structure back to a manageable level. Having multiple codebases for web and mobile applications alone can be quite complex. Is shifting from polyrepos to monorepos really the answer? How can such a transition be done smoothly?
The Logic Behind Polyrepos
The reason for keeping services, modules, and libraries in separate repositories is easy to understand. You are dealing with different components, and many of them, while perhaps interchangeable or used by multiple frontend components, are developed by different teams. Keeping them in separate repositories keeps things tidy.
However, problems start to occur when you try to integrate components from multiple repositories as the development cycle comes to its conclusion. A change to a UI element, for example, can have a significant effect on how the entire app functions, and keeping track of that specific change is not easy when there are multiple codebases to track.
Other issues are becoming more common too. The overhead needed to maintain multiple repositories can be huge when the project is not managed for efficiency. A set of services may require the same modules and dependencies, and each repo has to configure those dependencies separately for development to work.
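To make the duplicated-dependency problem concrete, here is a minimal TypeScript sketch that spots the same dependency declared with different version ranges in different repositories. The `Manifest` shape, the `findVersionDrift` name, and the repository names are all assumptions made for illustration; in practice you would read the data from each repo's package.json.

```typescript
// Hypothetical shape: one entry per repository, holding the
// "dependencies" section of that repository's package.json.
type Manifest = { repo: string; dependencies: Record<string, string> };

// Maps dependency name -> (version range -> repositories declaring that range).
type Drift = Map<string, Map<string, string[]>>;

// Returns only the dependencies declared with two or more distinct ranges.
function findVersionDrift(manifests: Manifest[]): Drift {
  const seen: Drift = new Map();
  for (const { repo, dependencies } of manifests) {
    for (const name of Object.keys(dependencies)) {
      const range = dependencies[name];
      const byRange = seen.get(name) ?? new Map<string, string[]>();
      byRange.set(range, [...(byRange.get(range) ?? []), repo]);
      seen.set(name, byRange);
    }
  }
  const drift: Drift = new Map();
  seen.forEach((byRange, name) => {
    if (byRange.size > 1) drift.set(name, byRange);
  });
  return drift;
}

// Illustrative data: two made-up repositories that disagree on react.
const drift = findVersionDrift([
  { repo: "web-app", dependencies: { react: "^17.0.2", lodash: "^4.17.21" } },
  { repo: "mobile-app", dependencies: { react: "^18.2.0", lodash: "^4.17.21" } },
]);
drift.forEach((byRange, name) => {
  byRange.forEach((repos, range) => {
    console.log(`${name} ${range}: ${repos.join(", ")}`);
  });
});
```

In this sample data, only react is reported, because lodash uses the same range in both repositories.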
Slowdowns become quite common. Simple tasks such as fixing bugs require thorough assessment, and the process itself becomes incredibly slow at times. Even with microservices in multiple repos linked correctly, it is still not always possible to assess the effects of a major change on the rest of the app.
The biggest concern with polyrepos is the complexity involved in tracking revisions and package updates. You cannot easily pinpoint which particular revision solves a specific bug, which means rollbacks and annotations become unnecessarily complex too. These small difficulties add up, and they result in polyrepos becoming unsuitable for some organizations.
To make matters worse, teams with their own repositories don’t always use standardized tools and frameworks. Each team could have its own preferred tools, and those tools often introduce unnecessary differences in configuration: variations that still require adjustments by the time you have to integrate microservices and push updates.
The Mono Approach
Centralizing the entire development project in a single repository, the monorepo approach, makes sense when the complexities mentioned earlier start to affect your development cycle. You cannot just combine everything and be done with it, though. Some adjustments are still needed before the monorepo can be effective for the whole team.
Simplifying component sharing is one thing that needs to be done when transitioning to a monorepo. Fortunately, tools like Bit help with the process. Bit, in particular, automates this step by isolating components and their dependencies. When changes are made to one component, Bit makes sure that its dependents are updated too.
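The idea of updating dependents when a component changes can be sketched as a walk over a dependency graph. This is not how Bit is implemented; it is only an illustration, and the `DependencyGraph` shape, the `findDependents` function, and the component names are all hypothetical.

```typescript
// Hypothetical graph: each component name maps to the components it imports.
type DependencyGraph = Record<string, string[]>;

// Returns every component that depends, directly or transitively, on `changed`.
function findDependents(graph: DependencyGraph, changed: string): string[] {
  // Invert the graph: component -> components that import it.
  const dependentsOf = new Map<string, string[]>();
  for (const component of Object.keys(graph)) {
    for (const dependency of graph[component]) {
      dependentsOf.set(dependency, [...(dependentsOf.get(dependency) ?? []), component]);
    }
  }
  // Breadth-first walk over the inverted edges, starting at the changed component.
  const visited = new Set<string>([changed]);
  const queue: string[] = [changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dependent of dependentsOf.get(current) ?? []) {
      if (!visited.has(dependent)) {
        visited.add(dependent);
        queue.push(dependent);
      }
    }
  }
  // The changed component itself is not one of its own dependents.
  visited.delete(changed);
  return Array.from(visited).sort();
}

// Illustrative components: a form uses a button, a login page uses the form.
const components: DependencyGraph = {
  button: [],
  form: ["button"],
  "login-page": ["form"],
  footer: [],
};
console.log(findDependents(components, "button")); // ["form", "login-page"]
```

A change to the button reaches the form and the login page, while the footer is left alone. That is the set a tool would need to rebuild, retest, or bump.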
Bit can be used in tandem with two other tools, Lerna and Yarn Workspaces, and the combination is known to be reliable. Lerna can manage larger packages, while Bit takes care of the small components. Overhead is kept to a minimum when components and resources are shared between multiple packages. Yarn Workspaces, meanwhile, optimizes workflows and ties everything together.
Naturally, transitioning to a monorepo is not without its challenges, the biggest of which is keeping testing streamlined and efficient. You can test specific packages when using multiple repositories. You can easily run tests on individual microservices since microservices are already separate and have their own dependencies configured. With a monorepo, further steps are needed.
How you test specific packages and components in a monorepo depends on the packages that you want to test and the tools you use. PHPUnit, a testing framework for PHP, can run the tests for an individual package. The same can be said for Nx, another popular monorepo tool; however, Nx does not support the publishing of individual packages.
Bit uses a slightly different approach. It associates each package with a compiler, and then links that compiler to a tester that handles everything on a component level. Every component needs to be associated with a tester, but once you configure everything, you can run granular testing across the repository without hassle.
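If you are relying on Yarn Workspaces rather than Bit's testers, per-package testing can be approximated by running each package's own test script. The sketch below only builds the command strings; the `WorkspacePackage` shape, the `testCommands` name, and the `@acme/*` package names are assumptions for illustration, while `yarn workspace <name> run <script>` is the standard Yarn form for running a script inside one workspace.

```typescript
// Hypothetical description of a workspace package: its name and the
// "scripts" section of its package.json.
type WorkspacePackage = { name: string; scripts: Record<string, string> };

// Builds one `yarn workspace <name> run test` command per selected package,
// skipping packages that define no "test" script.
function testCommands(packages: WorkspacePackage[], selected: string[]): string[] {
  const wanted = new Set(selected);
  return packages
    .filter((pkg) => wanted.has(pkg.name) && pkg.scripts["test"] !== undefined)
    .map((pkg) => `yarn workspace ${pkg.name} run test`);
}

// Illustrative packages: the docs package has no test script.
const workspacePackages: WorkspacePackage[] = [
  { name: "@acme/ui", scripts: { test: "jest" } },
  { name: "@acme/api", scripts: { test: "jest", build: "tsc" } },
  { name: "@acme/docs", scripts: { build: "tsc" } },
];
console.log(testCommands(workspacePackages, ["@acme/ui", "@acme/docs"]));
// ["yarn workspace @acme/ui run test"]
```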
Deployment is another story. Most of the tools are not designed to deploy each incremental change. However, you don’t have to bring the entire system down just to update a single component. As long as you have the right compiler and deployment pipeline configured, you can still push selective updates based on which parts of the repository have changed.
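One way to decide which parts of the repository have changed is to map changed file paths back to the packages that own them. The following is a rough sketch under that assumption; the `affectedPackages` function and the directory layout are hypothetical, and the list of changed files would typically come from something like `git diff --name-only` in your pipeline.

```typescript
// Returns the package roots that contain at least one changed file.
function affectedPackages(changedFiles: string[], packageRoots: string[]): string[] {
  const affected = new Set<string>();
  for (const file of changedFiles) {
    // Pick the longest matching root so nested packages win over their parents.
    const owner = packageRoots
      .filter((root) => file.startsWith(root + "/"))
      .sort((a, b) => b.length - a.length)[0];
    if (owner !== undefined) affected.add(owner);
  }
  return Array.from(affected).sort();
}

// Illustrative layout: one package is nested inside another.
const roots = ["packages/api", "packages/web", "packages/web/widgets"];
const changedFiles = [
  "packages/web/widgets/src/chart.ts",
  "packages/api/README.md",
  "docs/intro.md",
];
console.log(affectedPackages(changedFiles, roots));
// ["packages/api", "packages/web/widgets"]
```

Files outside every package root, such as the docs file here, trigger no deployment. A real pipeline would probably also include the dependents of each affected package.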
The Real Advantage of Monorepos
Transitioning from polyrepo to monorepo does not always require refactoring, but conducting a thorough review and refactoring your packages are steps worth taking if you want to gain the most from the transition. Sure, you can move multiple repositories into one and continue development, but it will not be efficient.
This is very similar to the refactoring that cloud migration called for. Monolithic apps can run just fine in any cloud environment, but you gain much more flexibility and substantially increase efficiency in resource usage by making the transition to smaller components, microservices, and cloud-native packages.
There are indeed some appealing advantages to gain.
- All configurations and tests are now stored in one repository, so your CI/CD pipeline can be leaner. Configurations can be reused to build all packages for deployment. All tests are executed using the same standardized configurations too.
- Atomic commits become the norm, especially with tools like Bit simplifying the process. You can focus on the features you’re trying to improve rather than having to juggle updating multiple packages for the sake of consistency.
- A single command takes care of all dependencies. With packages being housed in a single repository, there is no need to worry about updating multiple packages when a component is updated. The same is true for config changes.
- Speaking of dependencies, you don’t have to create multiple package.json files for multiple repositories. This definitely simplifies things, since you also don’t need to reinstall dependencies on multiple repos.
- Last but certainly not least, you can really take advantage of shared packages. Code can be reused, functions can be integrated across the repo, and local symlinks are all that you need to reference resources.
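The local symlinks mentioned in the last point can be pictured as one link per shared package. Workspace tools usually create these links for you, so the sketch below only computes the plan rather than touching the file system; `LocalPackage`, `planLinks`, and the `@acme/*` names are hypothetical.

```typescript
// Hypothetical description of a package living inside the monorepo.
type LocalPackage = { name: string; dir: string };

// One planned symlink: `link` is the path created, `target` is where it points.
type Link = { link: string; target: string };

// Plans one symlink per local package: node_modules/<name> -> <dir>.
function planLinks(packages: LocalPackage[]): Link[] {
  return packages.map((pkg) => ({
    link: `node_modules/${pkg.name}`,
    target: pkg.dir,
  }));
}

// Illustrative packages that other packages in the repo import by name.
const shared: LocalPackage[] = [
  { name: "@acme/ui", dir: "packages/ui" },
  { name: "@acme/utils", dir: "packages/utils" },
];
for (const { link, target } of planLinks(shared)) {
  console.log(`${link} -> ${target}`);
}
// node_modules/@acme/ui -> packages/ui
// node_modules/@acme/utils -> packages/utils
```

With links like these in place, an import of a shared package resolves to the source in the same repository instead of a separately published copy.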
So, is it time to make the shift from polyrepos to monorepos? The answer still depends on the nature of your application and development project. In most cases, the answer is YES. Besides, you always have the option to make the transition only for certain packages.
Published at DZone with permission of Juan Ignacio Giro. See the original article here.