How to Build Your Software Distribution CDN Using Mirrorbits
Code repositories like GitHub don't handle binary distribution very well. With Mirrorbits, you can create your own software distribution CDN.
Lots of independent developers use GitHub or its equivalents to host their code repositories. And since they're simple to use and well understood by the public, that's a great idea. But as anyone who's ever created a project that attracted a whole lot of attention can attest, the platforms aren't well-suited to distributing binaries and other software releases.
In most cases, GitHub projects that post binaries do so by having a single server or VPS get triggered by a webhook to compile whatever binaries the project requires. And that means every time a user tries to download the resultant binaries, they’ll be making requests to that same single server instance. That's both inefficient and a recipe for download problems.
But GitHub, for its part, provides a solution to this problem in the form of its GitHub Pages offering. With it, users gain access to CDN services that make hosting binaries a snap and dramatically improve download performance for worldwide users. But doing so can hand an awful lot of data about your users over to Microsoft, which is an outcome some FOSS developers can't abide.
The same is true if you opt to use one of the many commercial CDN services to solve your distribution issues. But there's a solution already available from some well-known developers who faced, and overcame, the same problem. And although it's not new, it remains relatively unknown in the developer community. It's called Mirrorbits, and here's everything you need to know about using it.
What Is Mirrorbits?
Mirrorbits is an application written in Go that makes it possible to create your own software distribution CDN. It's built to efficiently redirect each download request to the mirror server geographically nearest to the downloader. It's a solution dreamed up by the developers of the famous VLC cross-platform video player, and it's currently in use by many large and popular open-source software projects.
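To make the idea of "nearest server" concrete, here is a minimal sketch of picking the closest mirror by great-circle (haversine) distance. It is only an illustration of the concept, not Mirrorbits' actual selection logic, and the function name, mirror names, and coordinates are all made up for the example.

```shell
#!/usr/bin/env bash
# nearest_mirror LAT LON
# Reads "name latitude longitude" lines on stdin and prints the name of the
# mirror closest to the given client position (haversine distance, in km).
# Illustrative only: this is NOT how Mirrorbits itself ranks mirrors.
nearest_mirror() {
  awk -v clat="$1" -v clon="$2" '
    function rad(d) { return d * 3.141592653589793 / 180 }
    {
      dlat = rad($2 - clat); dlon = rad($3 - clon)
      a = sin(dlat / 2) ^ 2 + cos(rad(clat)) * cos(rad($2)) * sin(dlon / 2) ^ 2
      km = 2 * 6371 * atan2(sqrt(a), sqrt(1 - a))
      if (NR == 1 || km < best) { best = km; name = $1 }
    }
    END { print name }'
}

# Hypothetical mirrors (New York, Frankfurt, Singapore) and a client near Paris.
printf '%s\n' \
  'nyc 40.71 -74.01' \
  'fra 50.11 8.68' \
  'sgp 1.35 103.82' | nearest_mirror 48.86 2.35
```

For the sample client, the Frankfurt entry wins because it is by far the shortest hop of the three.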
That said, Mirrorbits doesn't have much in the way of documentation, which might be why more developers don't use it. But the software isn't that complex to set up and use, as long as you get some of the basics right. To help, here's an overview of what you have to do.
Begin by Getting Organized
Since Mirrorbits will require you to replicate your distribution files across multiple server instances, you need to get your project organized before setting it up. For simplicity's sake, make sure everything that you'd like to replicate is contained within a single main folder and its subfolders. That will make it simpler to create a synchronization process using Rsync or your preferred file transfer utility.
Determine Your Coverage Needs
Before you can use Mirrorbits, you have to figure out what geographic areas your traffic typically comes from. That will allow you to set up an appropriate number of mirror servers in the right places. This is most easily accomplished by examining your server's analytics data. If you don't already have analytics configured, you'll need to do so and collect at least one month's worth of traffic data.
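If your analytics can export one line per download with a country code attached, a quick tally is enough to see where mirrors are needed. The sketch below assumes a hypothetical log format in which the last field of each line is a two-letter country code; real logs will need their own parsing or a GeoIP lookup step first.

```shell
#!/usr/bin/env bash
# count_by_country: read log lines on stdin and print "count country",
# busiest country first. Assumes (hypothetically) that the LAST field of
# every line is a two-letter country code.
count_by_country() {
  awk '{ count[$NF]++ } END { for (c in count) print count[c], c }' |
    sort -k1,1nr -k2,2
}

# Made-up sample data using documentation-only IP addresses.
printf '%s\n' \
  '203.0.113.5 /releases/app-1.0.tar.gz US' \
  '198.51.100.7 /releases/app-1.0.tar.gz DE' \
  '203.0.113.9 /releases/app-1.0.zip US' | count_by_country
```

The countries at the top of the list are the first candidates for a nearby mirror.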
Alternatively, if you're unable or unwilling to track your existing download traffic, you can conduct a user survey to find out what regions your users hail from. You don't need to get any more specific than the country level, as you'll likely need to set up bicoastal mirrors in large nations like the US and China (if you have users in the area).
Select Suitable Mirror Servers
The good news about a solution like Mirrorbits is that it's pretty lightweight and has almost no overhead on the servers it operates on. For that reason, you can get away with using simple DigitalOcean instances or any other VPS service you prefer. Remember, however, that your users' privacy (and your own, as the developer) rests on the practices of the hosting services you utilize. So, if anonymity is an issue, look for offshore hosting that won't retain traffic logs and doesn't require you to identify yourself to set up an account. And keep in mind that you can allow third-party mirrors to aid your project, as VLC itself has done for years.
Set Up Synchronization on Mirrors
Once you've got your additional server instances up and running (preferably with NGINX, a custom domain, and SSL certificates), you're ready to get to work. Begin by configuring your synchronization process to pull data from your origin server at preset intervals. If you have a build time on your origin server that's predictable, the best choice is to set your mirrors to sync around 15 minutes or so after that time. And once you trigger the first sync, make sure that all of your mirrors have a complete copy of your files before you proceed.
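As a rough sketch of that schedule, the script below prints (rather than installs) a crontab entry that pulls from the origin with rsync a set number of minutes after a daily build. The origin URL, local path, and build time are placeholders invented for the example, and it assumes the origin exposes your release folder over rsync.

```shell
#!/usr/bin/env bash
# Hypothetical values: replace with your own origin and local path.
ORIGIN_URL="rsync://origin.example.org/releases/"  # assumed rsync module on the origin
MIRROR_DIR="/srv/mirror/releases/"                 # assumed folder served by the mirror

# Print the rsync command a mirror would run to pull from the origin.
# -a keeps the directory tree and timestamps; --delete removes files
# that no longer exist upstream.
pull_command() {
  printf 'rsync -a --delete %s %s\n' "$ORIGIN_URL" "$MIRROR_DIR"
}

# cron_entry HOUR MINUTE [OFFSET_MINUTES]
# Print a crontab line that runs the pull OFFSET_MINUTES (default 15)
# after a daily build finishing at HOUR:MINUTE, wrapping past midnight.
cron_entry() {
  local build_hour=$1 build_min=$2 offset=${3:-15}
  local total=$(( (10#$build_hour * 60 + 10#$build_min + offset) % 1440 ))
  printf '%d %d * * * %s\n' $(( total % 60 )) $(( total / 60 )) "$(pull_command)"
}

# Example: the build finishes at 02:00, so the mirror syncs at 02:15.
cron_entry 02 00
```

Printing the entry first makes it easy to review before adding it to a mirror's crontab.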
Install Mirrorbits on Your Origin Server
With your mirrors in place, you're ready to get Mirrorbits itself up and running on your origin server. The process, although lightly documented, is straightforward. That's because the software's configuration file, typically located at /etc/mirrorbits.conf, contains decent explanations of what everything does. The only thing about the setup that's a bit tricky relates to the GeoIP databases the software uses to determine user location.
The databases you need are available for free from MaxMind, but there's a catch: they're provided under a proprietary license. For developers who find that unacceptable, you can still get the last copies released under the Creative Commons Attribution-ShareAlike 4.0 International License via the Internet Archive links posted here.
Add Your Mirrors to the Installation
Last but not least, you'll need to add the mirrors you set up in the previous steps to your Mirrorbits instance. This is accomplished with a few simple commands:
mirrorbits add -http https://Your.Mirror.Server \
    -rsync rsync://Your.Mirror.Server/Base_Folder Server_Name
mirrorbits scan --enable Server_Name
mirrorbits refresh
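If you have more than a couple of mirrors, it may be easier to generate those commands from a list. The sketch below only prints the commands so you can review them before running anything; the mirror identifiers and URLs are placeholders, and the flags are simply the ones used in the sequence above.

```shell
#!/usr/bin/env bash
# mirror_commands NAME HTTP_URL RSYNC_URL
# Print (not run) the add and scan commands for a single mirror.
mirror_commands() {
  local name=$1 http_url=$2 rsync_url=$3
  printf 'mirrorbits add -http %s -rsync %s %s\n' "$http_url" "$rsync_url" "$name"
  printf 'mirrorbits scan --enable %s\n' "$name"
}

# Hypothetical mirror list: "identifier http-url rsync-url", one per line.
MIRRORS='us-east https://us-east.example.org rsync://us-east.example.org/releases
eu-west https://eu-west.example.org rsync://eu-west.example.org/releases'

while read -r name http_url rsync_url; do
  mirror_commands "$name" "$http_url" "$rsync_url"
done <<< "$MIRRORS"

# Finish with a single refresh, as in the manual sequence.
echo 'mirrorbits refresh'
```

Once the printed commands look right, you can paste them into a shell on the origin server or pipe the script's output to one.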
Ready to Serve
At this point, you should have a functioning software distribution system that routes your binary download traffic to the server most appropriate for each user. All you have to do is pay attention to see that you've deployed sufficient resources and make changes as necessary. And because it's a system under your complete control, you can custom-tailor it to your project's needs.