Create new git repo from the subdirectory of an existing git repo

Goal

Create a git repository from the subdirectory of an existing git repository. Furthermore, preserve the log of only that subdirectory.

Let us unpack this statement. The context will also help understand it a little better.

  • There is a git repository hosted on a private git server, let us call this repo private_repo.

  • Within private_repo, there is a folder called repo-specific, which contains the git hooks. These git hooks are written in Python for the Bioconductor git server. The hooks check each push that comes into each Bioconductor package and run validity checks.

  • Now, there is no reason why this folder should not be open-source. It contains no private information or sensitive code, so it can be shared with the world under a suitable open-source license.

  • This directory repo-specific has a git log that we need to preserve while creating a new repository. However, the entire log of the private_repo should not become available. The log of the entire repository has commit messages that record sensitive information.

Some ASCII art to help visualize the repository structure:

private_repo
|- .git/
|- file.txt
|- subdirectory1/
|- ...
|- repo-specific/
|   |- sub_sub_directory/
|   |- ...
|   |- sub_file.txt

Steps

  1. Clone the entire private_repo into a temporary location, with

    $ git clone git@server.com:path/private_repo
  2. Within the repository,

    $ git filter-branch --subdirectory-filter repo-specific -- --all

    This command makes repo-specific the root of the repository and rewrites the history so that only the commits touching that subdirectory remain.

  3. The next step is to create a new repository on GitHub. The name of the repository is the same as the subdirectory we extract (for consistency).

    The repository on GitHub is called repo-specific, and its SSH URL is git@github.com:Bioconductor/repo-specific.

  4. Back on the command line, we edit the remote in the git repository,

    $ git remote set-url origin git@github.com:Bioconductor/repo-specific
  5. We can now verify that the remote origin is correct,

    $ git remote -v
    origin  git@github.com:Bioconductor/repo-specific (fetch)
    origin  git@github.com:Bioconductor/repo-specific (push)
  6. Push changes to the GitHub repository,

    $ git push --set-upstream origin master
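
Before pushing anything public, it can be reassuring to see the effect of the subdirectory filter on a throwaway repository. The following is a minimal sketch, not part of the original workflow: the demo repository name, the commit messages, and the file contents are made up for illustration, and it assumes a git version that still ships `filter-branch` (the `FILTER_BRANCH_SQUELCH_WARNING` variable only silences the warning that newer versions print).

```shell
#!/bin/sh
# Sketch: show on a throwaway repo that --subdirectory-filter keeps only
# the history of one folder. All names below are hypothetical demo values.
set -e

demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q private_repo_demo
cd private_repo_demo
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit 1 touches only the top level (stands in for the sensitive history).
echo "secret" > file.txt
git add file.txt
git commit -q -m "sensitive top-level change"

# Commit 2 touches only the subdirectory we want to extract.
mkdir repo-specific
echo "hook" > repo-specific/sub_file.txt
git add repo-specific
git commit -q -m "add hook"

# Rewrite every branch so that repo-specific/ becomes the repository root.
FILTER_BRANCH_SQUELCH_WARNING=1 \
  git filter-branch --subdirectory-filter repo-specific -- --all >/dev/null 2>&1

# Inspect what survived the rewrite.
commit_count=$(git rev-list --count HEAD)
root_files=$(git ls-files)
subjects=$(git log --format=%s)
echo "commits: $commit_count"
echo "files:   $root_files"
echo "log:     $subjects"
```

In this demo only the commit that touched repo-specific should remain, with sub_file.txt at the root of the repository. Running the same three inspection commands in the real filtered clone is a quick way to confirm that no sensitive commit message is left in the log before step 6.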

Conclusion

As we can see on GitHub, there is a new repository, Bioconductor/repo-specific. The repository has an open-source LICENSE and a small README.md file providing some information. The process of creating a new repository from a subdirectory is fairly straightforward and is documented in the advanced git commands section of GitHub’s documentation.