Over a million developers have joined DZone.
Platinum Partner

Extracting Parts of Git Repository and Keeping the History

· DevOps Zone

The DevOps Zone is brought to you in partnership with New Relic. Improving the performance of your app is easy with New Relic's SaaS-based monitoring.

At some point, a software project will grow beyond its original scope. In many cases, some portions of the project become their own mini world. For maintenance purposes, it is often beneficial to separate them into their own projects. Furthermore, the commit history for the extracted project should not be lost. With Git, this can be achieved using git-subtree.

While git-subtree is quite powerful, the feature that we need for this task is its splitting capability. The documentation says the following regarding this split feature:

Extract a new, synthetic project history from the history of the prefix subtree. The new history includes only the commits (including merges) that affected prefix, and each of those commits now has the contents of prefix at the root of the project instead of in a subdirectory. Thus, the newly created history is suitable for export as a separate git repository.

This turns out to be quite simple. In fact, there is already a Stack Overflow answer which describes the necessary step-by-step instructions. The illustration below, also dealing with a real-world repo, hopefully serves as an additional example of this use case.

First of all, make sure you have a fresh version of Git:

git --version

If it says 1.8.3, then get a newer version since there is a bug (fixed in 1.8.4) which will pollute your commit logs badly, i.e. by adding “-n” everywhere.

For this example, let’s say we want to extract the funny automatic name generator (for a container) from the Docker project into its own Git repository. We start by cloning the main Docker repository:

git clone https://github.com/dotcloud/docker.git
cd docker

We then split the name generator, which lives under pkg/namesgenerator, and place it into a separate branch. Here the branch is called namesgen but feel free to name it anything you like.

git-subtree split --prefix=pkg/namesgenerator/ --branch=namesgen

The above process is going to take a while, depending on the size of the repository. When it is completed, we can verify it by inspecting the commit history:

git log namesgen

The next step is to prepare a place for the new repository (choose any directory you prefer). From there, all we need to do is to pull the namesgen branch which was split before:

cd ~
mkdir namesgen
cd namesgen
git init
git pull /path/to/docker/checkout namesgen

That’s it! Of course, normally you want to push this to some remote, e.g. a repository on GitHub or Bitbucket or your own Git endpoint:

git remote add origin git@github.com:joesixpack/namesgen.git
git push -u origin --all

The new repository will only contain the files from pkg/namesgenerator/ directory from Docker repository. And obviously, every commit that touch that directory still appears in the history.

Mission accomplished!

The DevOps Zone is brought to you in partnership with New Relic. Know exactly where and when bottlenecks are occurring within your application frameworks with New Relic APM.


Published at DZone with permission of Ariya Hidayat , DZone MVB .

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}