Synchronize Bitbucket to GitHub Automatically

Introducing BitSyncHub

Since I'm an automation nut, I was understandably excited when I found Travis CI. Having the test cases for hgapi run automatically from the repository, rather than from a pre-push hook (as I have had it set up since the beginning of time), would avoid the oh-so-embarrassing mistake of forgetting to add a new file and leaving a non-working version in the repo. I just have to set up some service to sync to the GitHub mirror and all will... be... well?

Turns out there was no such service. There was plenty of advice on how to mirror using push hooks in your local repository, but since I don't always commit from the same computer, I would need to keep every instance (including future ones) set up properly, and never again could I be a tad lazy and accept a pull request instead of pushing it from my local repo. This, to me, is not an acceptable state of affairs.

So last week I spent a couple of hours setting up a new service, dubbed BitSyncHub, that accepts POST requests from Bitbucket and synchronizes a (Mercurial) repository with its GitHub mirror. It is set up using uWSGI, hgapi with hg-git, and Celery for job control. It's a bit rough: it does not report errors (since it does not run synchronously), and it always pushes to GitHub using the same certificate and user. Still, I've not been able to break it (recently), it only requires a one-time setup, and it will keep your branches in sync!

Updated/added on request:

The code is _really_ simple. When the request is made, I first get the Mercurial and Git URLs:

gitrepo = post['gitrepo'].value
payload = json.loads(post['payload'].value)
hg_url = "ssh://hg@bitbucket.org%s" % (
git_url = 'git+ssh://git@github.com/%s' % (gitrepo,)
branches = [b.split(':') for b in post['branches'].value.split(',')]
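
To make the shape of the request concrete, here is a minimal sketch of the parsing step run on a made-up form body. It uses `urllib.parse.parse_qs` from the standard library rather than the `post[...]` field objects above. The helper name `parse_sync_request`, the repository name, the payload contents, and the `hg_branch:git_branch` reading of the `branches` field are all assumptions for illustration, not confirmed details of the service.

```python
import json
from urllib.parse import parse_qs, urlencode


def parse_sync_request(body):
    """Extract the Git URL, decoded payload, and branch pairs from a form body.

    Hypothetical helper; the field names mirror the snippet above.
    """
    fields = parse_qs(body)
    gitrepo = fields["gitrepo"][0]
    payload = json.loads(fields["payload"][0])
    git_url = "git+ssh://git@github.com/%s" % (gitrepo,)
    # Assumed format: comma-separated "hg_branch:git_branch" pairs.
    branches = [b.split(":") for b in fields["branches"][0].split(",")]
    return git_url, payload, branches


# Hypothetical request body; every value here is made up for illustration.
example_body = urlencode({
    "gitrepo": "exampleuser/examplerepo.git",
    "payload": json.dumps({"commits": []}),
    "branches": "default:master,stable:stable",
})

git_url, payload, branches = parse_sync_request(example_body)
print(git_url)
print(branches)
```

Each entry in `branches` ends up as a two-element list, which is what the worker code indexes as `branch[0]` and `branch[1]`.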

Then I use Celery to queue the task, and a worker clones, bookmarks, and pushes the repository. This does not work in the PyPI version of hgapi (which lacks clone support), but it will as soon as I upload the next version later today:

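# Clone the Bitbucket (Mercurial) repository into a local working directory
repo = hgapi.hg_clone(source, sync_repo_path)
for branch in branches:
    # Bookmark the Mercurial branch (branch[0]) under the Git branch name (branch[1])
    repo.hg_command('bookmark', '-r', branch[0], branch[1])
    # Push to the GitHub mirror with hg-git enabled for this command
    repo.hg_command('push', '--config=extensions.hggit=', dest)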
I removed error handling and cleanup for clarity here; otherwise it's a straight copy from the running source.
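
As a rough illustration of what the worker ends up running, the sketch below builds the same sequence of Mercurial arguments as plain tuples, so it can be tried without Mercurial or hgapi installed. `plan_sync_commands` is a hypothetical helper and not part of hgapi; each tuple corresponds to one `repo.hg_command(...)` call in the code above, and the destination URL is a made-up example.

```python
def plan_sync_commands(branches, dest):
    """Return the hg argument tuples for a sync, in execution order.

    Hypothetical helper: it only describes the commands, it does not run them.
    """
    commands = []
    for hg_rev, git_branch in branches:
        # Bookmark the Mercurial branch under the Git branch name;
        # hg-git exposes bookmarks as Git branches when pushing.
        commands.append(("bookmark", "-r", hg_rev, git_branch))
        # Push with the hg-git extension enabled for this invocation only.
        commands.append(("push", "--config=extensions.hggit=", dest))
    return commands


# Made-up destination for illustration.
dest = "git+ssh://git@github.com/exampleuser/examplerepo.git"
for args in plan_sync_commands([["default", "master"]], dest):
    print("hg " + " ".join(args))
```

Printing the plan this way is also a cheap way to check a branch mapping before pointing the service at a real repository.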


Published at DZone with permission of Fredrik Håård, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
