I recently helped a team to switch SVN servers and found a few gotchas along the way. This is a short guide on what worked for me and some stuff I tried that didn’t.
The reason for migrating to a new Subversion server was that Atlassian shut down their hosted Subversion service. The team had put off the migration for far too long (there was more than a year’s notice). When they asked me for help, there were only a few days left before the old service would be shut down. They said that they had considered moving to Git and that there was probably a server set up somewhere by the operations department.
I had a quick look at the repo, asked the team about their situation, and found a number of facts to take into account for the move.
- A gzipped dump of the repo was about 1.3GB.
- The repo had not followed the standard trunk/branches/tags structure. A few levels below a trunk directory I found another set of directories: trunk, branches, tags.
- The repo contained code from different projects. Some were legacy projects that were no longer needed, others were being actively developed.
- There was no time for a long service window. I basically had to get everything moved between one working day and the next.
Based on those facts I first tried to use svnsync to make a new copy of the repo, while keeping the old one alive. It got stuck after a few hundred commits and then claimed the sync was broken. I reset the target repo and tried again, with the same result.
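For reference, a typical svnsync setup looks roughly like the sketch below. This is not a record of the exact commands I ran. The function names prepare_mirror and mirror_repo are made up for illustration, and the mirror path and source URL are whatever applies in your setup.

```shell
# Create an empty mirror repo at path $1 that svnsync is allowed to write to.
prepare_mirror() {
    svnadmin create "$1" &&
    # svnsync keeps its bookkeeping in revision properties, so the mirror
    # needs a pre-revprop-change hook that permits those changes.
    printf '#!/bin/sh\nexit 0\n' > "$1/hooks/pre-revprop-change" &&
    chmod +x "$1/hooks/pre-revprop-change"
}

# Mirror the repo at URL $2 into the local mirror at absolute path $1.
mirror_repo() {
    # init links the empty mirror to the source repo
    svnsync init "file://$1" "$2" &&
    # sync replays the source revisions into the mirror, one by one
    svnsync sync "file://$1"
}
```

The appeal of this approach is that the sync can be repeated to pick up new commits while the old server stays in use.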
I gave up on using svnsync and fell back to the basic plan: download a backup and import it into a repo on the target server. That worked.
These are the steps I ran to move the repo. Please note that the svnadmin commands need to be run on the source and destination servers themselves. They cannot be run from a client.
- Dump the repo on the old server:
svnadmin dump source | gzip > dump.gz
- Copy the dump file to the new server
- Create a new empty repo on the new server:
svnadmin create dest
- Import the dump file to the new repo:
gunzip -c dump.gz | svnadmin load dest
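The steps above can be wrapped in a few shell functions, which makes them easier to rerun if something goes wrong. This is only a sketch: the function names are made up for illustration, and the -q flags and the svnlook check are additions that were not part of the steps I ran.

```shell
# Run on the OLD server: write a gzipped dump of the repo at $1 to file $2.
# Note: in a plain pipe, a failure in svnadmin is hidden by gzip's exit
# status, so keep an eye on stderr.
dump_repo() {
    # -q suppresses the per-revision progress messages
    svnadmin dump -q "$1" | gzip > "$2"
}

# Run on the NEW server: create an empty repo at $2 and load dump file $1.
# The dump is streamed, so the uncompressed data is never written to disk.
load_repo() {
    svnadmin create "$2" &&
    gunzip -c "$1" | svnadmin load -q "$2"
}

# Optional sanity check: print the newest revision number of the repo at $1,
# so the old and new servers can be compared.
youngest_rev() {
    svnlook youngest "$1"
}
```

On the old server this would be called as dump_repo source dump.gz, and on the new server as load_repo dump.gz dest followed by youngest_rev dest.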
One thing that really surprised me was the size of the dump file compared to the size of the repo. When I did this, I downloaded a .gz dump file. I didn’t create it myself, so I didn’t know how big the source repo was. All I knew was that the dump file was 1.3GB. Uncompressing it on my local computer gave 5GB, and I only had about 5.5GB left on the destination server. However, it worked out just fine: the final repo was only about 600MB in size.
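The reason for this is that the dump file is far more verbose than the repo’s own storage. The repo stores changes as compressed deltas, and a branch is just a pointer back to the original commit. A default dump, on the other hand, writes out the full text of every version of every changed file. (svnadmin dump has a --deltas option that produces smaller dump files.)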
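If disk space is tight, it is possible to find out how big a dump is uncompressed without ever writing it to disk. A minimal sketch follows. The function name uncompressed_size is made up for illustration.

```shell
# Print the uncompressed size, in bytes, of the gzipped file $1.
# The data is streamed through wc, so nothing is written to disk.
uncompressed_size() {
    # tr strips the padding some wc implementations add around the number
    gunzip -c "$1" | wc -c | tr -d ' '
}
```

Calling uncompressed_size dump.gz gives the number to compare against the free space on the server. Keep in mind that, as described above, the loaded repo can end up much smaller than that number.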
For some people, “Git” is the answer to any question about source control. In this case there were of course some people arguing for a move to Git. While I really do like Git, this wasn’t the right time for it. (I was thinking about writing “love Git” but that would be wrong. I may like or dislike tools, but they are just tools and I try hard not to get so attached to a tool that I love it and start seeing it as the ONLY TRUE ANSWER.)
The reasons that I didn’t recommend Git were:
- The team was not ready. Nobody on the team had Git experience and they didn’t have enough understanding of how different Git is in some ways.
- The source repo was quite messy. It was not possible to make a clean import to Git with git-svn.
- The source repo was far too large to import into a single Git repo. It contained things like build outputs that should not be imported into a new repo.
- Subversion is good enough for the needs of the team and it is less complex than Git.