git submodules

Today’s lesson is mainly for people who want to use the Talia source code, but it could be useful for many git users, so I’m putting it here. In our code we use git submodules a lot; they are a bit like the good old svn externals, but they have different quirks that are not that easy to understand. And of course neither the official documentation nor the tutorial explains them in much detail. (I’ll assume that you’re already familiar with the general concept.)

Adding another git repository as a submodule couldn’t be easier:

git submodule add <other_repo> <local_path>

This will clone “other_repo” and put it at the given path. The first thing to note is that this is a completely independent repository. It will not be pushed, pulled or committed automatically when the “parent” repository is modified.

Unlike an svn external, the submodule is a “pointer” to a specific revision instead of a URL. This means that, even if development on the submodule repository continues, your project will always use the exact same version of that module until you change that version.

But how does it work?

When creating the submodule, git creates two new things: a file called “.gitmodules” and a “fake” entry in the repository (a so-called gitlink), which represents the submodule in the file system. Just think of it as a symbolic link to a specific revision.
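
To make those two things visible, here is a small sketch that builds a throwaway parent and submodule in a temporary directory and then prints both. The repository names (lib, parent) and the path vendor/lib are made up for illustration, and the protocol.file.allow override is an assumption about your git version: newer releases refuse to clone submodules from local paths without it.

```shell
#!/bin/sh
# Sketch: inspect what "git submodule add" leaves behind.
# The names lib, parent and vendor/lib are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway identity so the commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A tiny repository that plays the role of <other_repo>.
git init -q lib
(cd lib && echo hello > README && git add README && git commit -q -m "initial")

# The parent project that embeds it.
git init -q parent
cd parent
# Assumption: newer git versions refuse to clone submodules from local
# paths unless the file transport is explicitly allowed, hence the -c.
git -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git commit -q -m "add lib as a submodule"

# 1) The .gitmodules file: maps the submodule path to its URL.
cat .gitmodules

# 2) The "fake" entry: mode 160000 plus the pinned commit id.
git ls-files --stage vendor/lib
```

The mode 160000 in the last listing is how git marks an entry as a commit in another repository rather than a file or directory of its own.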

When someone clones the project, they will get the “.gitmodules” file and will have to do the following drill:

git submodule init
git submodule update

First catch: The “init” command will read the “.gitmodules” file and simply add its content to “.git/config”. That’s right, it copies the configuration (most importantly the submodule URL), and from then on git works with the copy. Changing the URL in “.gitmodules” will have no effect after you have called “init”. To bring your local configuration back in line with “.gitmodules”, run:

git submodule sync
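
The first catch is easy to reproduce. The following sketch (same caveats: lib, parent, clone and lib-moved are hypothetical names, and the protocol.file.allow override is only there because newer git versions block local-path submodule clones) records the URL that git sees after “init”, after editing “.gitmodules”, and after “sync”:

```shell
#!/bin/sh
# Sketch: "init" copies the URL, "sync" refreshes the copy.
# All repository names and the lib-moved URL are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
(cd lib && echo hello > README && git add README && git commit -q -m "initial")

git init -q parent
(cd parent &&
  git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib &&
  git commit -q -m "add lib")

# A fresh clone, as a new contributor would get it.
git clone -q parent clone
cd clone
git submodule --quiet init

# After init, the URL lives in .git/config.
url_after_init=$(git config --get submodule.lib.url)

# Pretend the submodule moved: change the URL in .gitmodules only.
git config -f .gitmodules submodule.lib.url "$work/lib-moved"
url_after_edit=$(git config --get submodule.lib.url)

# sync copies the URL from .gitmodules into .git/config again.
git submodule --quiet sync
url_after_sync=$(git config --get submodule.lib.url)

echo "after init: $url_after_init"
echo "after edit: $url_after_edit"
echo "after sync: $url_after_sync"
```

The first two lines should show the same URL, and only the third one picks up the edit.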

Second catch: The submodule is a full-blown git repository. Its contents are completely invisible to the parent repository, but the parent repository tracks the “pointer” to a specific version. When you do a “submodule update”, the system will always check out that particular version.

But if you make modifications to the submodule, by checking out another version/branch (or something like that), and then commit the parent repository, you will modify the “pointer” in the parent repository (assume the submodule is in ./submodule):

git submodule init
git submodule update
# -> ./submodule now points to revision ABCE
cd submodule
git checkout XYZ
# or hack, commit, push, whatever ;-)
cd ..
git add submodule
git commit -m "Point submodule at XYZ"
# -> ./submodule now points to revision XYZ
# Once you push, all other users will get XYZ when they
# do a "git submodule update"

Third catch: Updating submodules is not automatic. When you do a pull or merge, the “pointer” might be updated – but the submodule repository is not.

# submodule is currently at ABCE
git pull
# the parent's pointer now says XYZ, but the submodule repository is unchanged:
cd submodule
git status
# -> ... repository is still at ABCE
cd ..
git submodule update
# submodule repository is updated to XYZ
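
If you want to see the third catch for yourself, here is a self-contained sketch under the same assumptions as before (hypothetical repositories lib, parent and clone in a temporary directory; protocol.file.allow is overridden because newer git versions block local-path submodule clones). The pull is run with --no-recurse-submodules so that the result does not depend on any submodule-related settings in your own git configuration.

```shell
#!/bin/sh
# Sketch: a pull moves the pointer, "submodule update" moves the checkout.
# All repository names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
(cd lib && echo one > file && git add file && git commit -q -m "first")

git init -q parent
(cd parent &&
  git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib &&
  git commit -q -m "add lib")

# A second working copy with the submodule checked out.
git clone -q parent clone
(cd clone && git -c protocol.file.allow=always submodule --quiet update --init)

# Meanwhile: lib gets a new commit and parent moves its pointer to it.
(cd lib && echo two > file && git commit -q -am "second")
new=$(git -C lib rev-parse HEAD)
(cd parent/lib && git fetch -q origin && git checkout -q "$new")
(cd parent && git add lib && git commit -q -m "bump lib")

# Back in the clone: pull the parent only.
cd clone
git pull -q --no-recurse-submodules
wanted=$(git rev-parse HEAD:lib)      # what the parent now points to
before=$(git -C lib rev-parse HEAD)   # what is actually checked out
git status --short                    # the submodule shows up as modified

# Only now is the submodule checkout moved to the recorded commit.
git -c protocol.file.allow=always submodule --quiet update
after=$(git -C lib rev-parse HEAD)

echo "pointer:       $wanted"
echo "before update: $before"
echo "after update:  $after"
```

The “before update” id should still be the old commit, while “after update” should match the pointer.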

(Disclaimer: I compiled this to the best of my knowledge, but I didn’t have time to verify that it works as I promised…)
