Extracting Parts of a Git Repository and Keeping the History
At some point, a software project will grow beyond its original scope. In many cases, some portions of the project become their own mini world. For maintenance purposes, it is often beneficial to separate them into their own projects. Furthermore, the commit history for the extracted project should not be lost. With Git, this can be achieved using git-subtree.
While git-subtree is quite powerful, the feature that we need for this task is its splitting capability. The documentation says the following regarding this split feature:
Extract a new, synthetic project history from the history of the prefix subtree. The new history includes only the commits (including merges) that affected prefix, and each of those commits now has the contents of prefix at the root of the project instead of in a subdirectory. Thus, the newly created history is suitable for export as a separate git repository.
This turns out to be quite simple. In fact, there is already a Stack Overflow answer which describes the necessary step-by-step instructions. The illustration below, also dealing with a real-world repo, hopefully serves as an additional example of this use case.
First of all, make sure you have a recent version of Git.
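A quick way to check is sketched below. The second half assumes that git-subtree is installed as a script inside Git's exec path, which is how many packages ship it; on some systems it comes as a separate package, so treat the result as a hint rather than a definitive answer.

```shell
# Print the installed Git version.
git --version

# Assumption: git-subtree lives in the directory reported by `git --exec-path`.
if [ -x "$(git --exec-path)/git-subtree" ]; then
    echo "git subtree is available"
else
    echo "git subtree was not found in the exec path"
fi
```

If the second message appears, installing the subtree package for your distribution (or a newer Git) is the likely fix.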
For this example, let’s say we want to extract the funny automatic name generator (for a container) from the Docker project into its own Git repository. We start by cloning the main Docker repository:
git clone https://github.com/dotcloud/docker.git
cd docker
We then split the name generator, which lives under pkg/namesgenerator, and place it into a separate branch. Here the branch is called namesgen, but feel free to name it anything you like.
git subtree split --prefix=pkg/namesgenerator/ --branch=namesgen
The above process is going to take a while, depending on the size of the repository. When it is completed, we can verify it by inspecting the commit history:
git log namesgen
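To see what the split does without waiting on a large repository, here is a minimal sketch on a throwaway repository. The directory names (big, pkg/tool), the branch name (tool-only), and the commit messages are all hypothetical; it assumes git subtree is installed.

```shell
# Throwaway repository; every name below is made up for illustration.
demo=$(mktemp -d)
git init -q "$demo/big"
cd "$demo/big"
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit 1 touches only the top level.
echo "top" > README
git add README
git commit -q -m "Add README"

# Commit 2 touches the subdirectory we want to extract.
mkdir -p pkg/tool
echo "v1" > pkg/tool/tool.txt
git add pkg/tool
git commit -q -m "Add tool"

# Commit 3 touches the top level again.
echo "more" >> README
git commit -q -am "Update README"

# Commit 4 touches the subdirectory again.
echo "v2" >> pkg/tool/tool.txt
git commit -q -am "Update tool"

# Split the subdirectory's history into its own branch.
git subtree split --prefix=pkg/tool --branch=tool-only

# Only the two commits that touched pkg/tool should be listed here...
git log --format=%s tool-only
# ...and tool.txt should now sit at the root of the tree.
git ls-tree --name-only tool-only
```

The two README commits are absent from the tool-only branch, which is the behavior the documentation quote describes.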
The next step is to prepare a place for the new repository (choose any directory you prefer). From there, all we need to do is to pull the namesgen branch which was split before:
cd ~
mkdir namesgen
cd namesgen
git init
git pull /path/to/docker/checkout namesgen
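The pull step works with any local branch, not only one produced by a split. The sketch below shows it in isolation with two throwaway repositories; the names source, target, and export are hypothetical.

```shell
work=$(mktemp -d)

# Source repository with a branch named "export" (hypothetical name).
git init -q "$work/source"
cd "$work/source"
git config user.email "demo@example.com"
git config user.name "Demo"
git checkout -q -b export
echo "hello" > file.txt
git add file.txt
git commit -q -m "First commit"
echo "world" >> file.txt
git commit -q -am "Second commit"

# Fresh, empty repository that pulls the branch by filesystem path.
git init -q "$work/target"
cd "$work/target"
git pull -q "$work/source" export

# The pulled history lands on the target's current branch.
git log --format=%s
```

Because the target repository starts out empty, the pull simply adopts the fetched history; no merge commit is created.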
That’s it! Of course, normally you want to push this to some remote, e.g. a repository on GitHub or Bitbucket or your own Git endpoint:
git remote add origin firstname.lastname@example.org:joesixpack/namesgen.git
git push -u origin --all
The new repository will contain only the files from the pkg/namesgenerator/ directory of the Docker repository, and every commit that touched that directory still appears in its history.