Bash Script to Convert Subversion to Git

For fun, and practice with bash scripting, I thought I’d see what it would look like to make a script to convert Subversion repos to Git. Mine does a fairly good job of converting the files in a trunk in thirty or so lines of code:


#!/usr/bin/env bash
# Run from an svn checkout of trunk that has also had 'git init' run in it.
if [ ! -d .git ]; then echo "no .git folder - do 'git init'"; exit 10; fi
if [ ! -d .svn ]; then echo "no .svn folder - checkout the trunk of some subversion repo"; exit 10; fi
mkdir -p svn_to_git_commits
# Keep svn metadata and this script's scratch files out of git.
echo -e ".svn\nsvn_to_git_commits\nsvn_to_git_revision.txt\nsvn_to_git_revisions.txt" > .gitignore
git add .gitignore > /dev/null

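# Revision numbers from 'svn log', sorted oldest first.
svn log | grep '^r[0-9]* ' | cut -d' ' -f 1 | cut -d'r' -f 2 | sort -n > svn_to_git_revisions.txt
# Path of the project inside the repository: the Relative URL minus the leading ^ and the /trunk part.
prefix=$(svn info | grep "^Relative URL:" | sed 's/Relative URL: \^//' | sed 's#/trunk##')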

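# Install the interrupt handler once, before the loop.
trap "echo Exited!; exit;" SIGINT SIGTERM

# Replay the revisions oldest first; i counts them for the periodic repack further down.
i=0
while read -r rev; do
    i=$((i + 1))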

    svn up --force -r $rev | sed '/^At revision/d' | sed '/^Updating /d' | sed '/^  /d' | sed '/^ U/d' | sed '/^Updated to/d'
    svn log -v -r $rev > svn_to_git_revision.txt

    # Pull the author, date and commit message out of the 'svn log -v' output.
    revisionLine=$(grep '^r[0-9]* ' svn_to_git_revision.txt)
    author=$(echo $revisionLine | cut -d'|' -f2 | sed 's/(no author)/none/' | cut -d' ' -f2 | sed "s/^$/none/")
    date=$(echo $revisionLine | cut -d'|' -f3 | cut -d'(' -f1)
    # No quote escaping is needed: the message is passed to git as a single quoted argument.
    messageText=$(awk '/^$/ {do_print=1} do_print==1 {print} NF==3 {do_print=0}' svn_to_git_revision.txt | sed '/------/d')

    cat svn_to_git_revision.txt | sed "s/^ *//" | sed 's/(.*)$//' | sed "s/ *$//" | grep "${prefix}/trunk/" | sed "s#${prefix}/trunk/##" | sponge svn_to_git_revision.txt
    grep "^[AMR]" svn_to_git_revision.txt | cut -d' ' -f 2-99 | xargs -I {} git add "{}"
    grep "^D" svn_to_git_revision.txt | cut -d' ' -f 2-99 | xargs -I {} git rm -q --ignore-unmatch -r "{}"
    git commit --author "${author} <${author}@unsure>" --date "${date}" -m "Svn Rev: ${rev}.${messageText}" > svn_to_git_commits/"${rev}".txt

    echo "Svn revision ${rev} on $(echo $date | cut -d' ' -f 1,2)."    
    if [[ $(( i % 4000 )) == 0 ]]; then time -p sh -c 'git repack; git gc'; fi 
done < svn_to_git_revisions.txt
time -p sh -c 'git repack; git gc'
echo "ALL DONE WITH A GIT REPO SIZE OF $(du -h -d 0 .git | cut -f1)."

The above uses ack, sed, grep, and sponge from moreutils (https://joeyh.name/code/moreutils). Note: ack is ack-grep on Linux.
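
To see what the parsing pipelines in the script pull out of an svn log entry, here is a small sketch that runs the same cut/sed/grep chains over a canned entry instead of a live repository. The revision, author, paths and message are invented for illustration, and prefix is set by hand rather than derived from svn info.

```bash
# Canned 'svn log -v -r 42' output. Everything in it is made up for illustration.
sample_log='------------------------------------------------------------------------
r42 | alice | 2015-03-10 12:34:56 +0000 (Tue, 10 Mar 2015) | 1 line
Changed paths:
   M /project/trunk/src/Main.java
   A /project/trunk/docs/notes.txt (from /project/branches/b1/docs/notes.txt:40)
   D /project/trunk/old.txt

Fix the build
------------------------------------------------------------------------'

prefix="/project"   # assumed here; the script derives it from 'svn info'

# Header line -> author and date, using the same field cuts as the script.
revision_line=$(printf '%s\n' "$sample_log" | grep '^r[0-9]* ')
author=$(echo $revision_line | cut -d'|' -f2 | sed 's/(no author)/none/' | cut -d' ' -f2 | sed 's/^$/none/')
date=$(echo $revision_line | cut -d'|' -f3 | cut -d'(' -f1)

# Changed paths -> trimmed, "(from ...)" suffix dropped, trunk-only, relative to trunk.
changes=$(printf '%s\n' "$sample_log" | sed 's/^ *//' | sed 's/(.*)$//' | sed 's/ *$//' | grep "${prefix}/trunk/" | sed "s#${prefix}/trunk/##")

# A, M and R entries are staged with 'git add'; D entries go to 'git rm'.
to_add=$(printf '%s\n' "$changes" | grep '^[AMR]' | cut -d' ' -f 2-99)
to_remove=$(printf '%s\n' "$changes" | grep '^D' | cut -d' ' -f 2-99)

echo "author: ${author}"
echo "date:${date}"
echo "paths to add:"
echo "$to_add"
echo "paths to remove:"
echo "$to_remove"
```

With this sample, the author comes out as alice, the paths to add are src/Main.java and docs/notes.txt, and the path to remove is old.txt. The date keeps the spaces that surrounded it in the svn log header.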

Timings: 12 mins to convert a repository that was ultimately only 4.4MB in size (the .git folder’s disk usage), but over a fairly slow connection.

Compare that to just over 2.75 mins for the same repo with git svn clone, which is over four times faster. The git-svn way probably preserves more metadata on the commits, but the actual files for the final revision are identical for both versions. My script is just for trunks, and would need some tweaks to cover commits happening to branches. It already covers commits merging into trunk from branches.
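
For reference, a trunk-only git-svn import might look like the sketch below. The URL and target directory are hypothetical, the exact options used for the timing above are not stated, and the trunk_clone_command helper is only there so the command can be printed before it is run.

```bash
# Hypothetical repository URL and target directory.
svn_url="https://svn.example.org/repo/project"
target_dir="project-git"

# Print the git-svn invocation for a trunk-only import.
# --trunk=trunk tells git-svn to track just the trunk directory under the URL.
trunk_clone_command() {
    printf 'git svn clone --trunk=trunk %s %s\n' "$1" "$2"
}

# Print the command by default; set RUN_CLONE=yes to run and time it.
if [ "${RUN_CLONE:-no}" = "yes" ]; then
    time -p sh -c "$(trunk_clone_command "$svn_url" "$target_dir")"
else
    trunk_clone_command "$svn_url" "$target_dir"
fi
```

Timing it with time -p sh -c mirrors how the script times its own repack step, so the two numbers are measured the same way.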

I don’t think there is anything that can be done to the script that could boost the speed by more than a small percentage. I even tried GNU parallel instead of xargs, but it blew up, as git does not quietly wait for locks to be released during its operations. Besides, 8 of the 12 mins are spent just doing “svn up” one revision at a time.
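
Since git fails rather than waits when another process holds its lock, anyone experimenting with parallel git add calls would need some form of retry. The function below is a generic sketch of that idea, not something the script above does, and the retry name and its arguments are made up for illustration. It is unlikely to change the overall timing, given that most of it goes on svn up.

```bash
# Retry a command until it succeeds or the attempts run out.
# usage: retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1    # give up; the last failure stands
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Hypothetical use for one file:
#   retry 20 1 git add "some/file"
```

Retrying only papers over the contention: the git processes still take turns on the index, so the work is serialized either way.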

Published at DZone with permission of Paul Hammant, DZone MVB. See the original article here.
