How to make shallow git submodules?

New in the upcoming git1.8.4 (July 2013):

git submodule update” can optionally clone the submodule repositories shallowly.

(And git 2.10 Q3 2016 allows to record that with git config -f .gitmodules submodule.<name>.shallow true.
See the end of this answer)

See commit 275cd184d52b5b81cb89e4ec33e540fb2ae61c1f:

Add the --depth option to the add and update commands of “git submodule”, which is then passed on to the clone command. This is useful when the submodule(s) are huge and you’re not really interested in anything but the latest commit.

Tests are added and some indention adjustments were made to conform to the rest of the testfile on “submodule update can handle symbolic links in pwd”.

Signed-off-by: Fredrik Gustafsson <[email protected]>
Acked-by: Jens Lehmann <[email protected]>

That means this works:

# add shallow submodule
git submodule add --depth 1 <repo-url> <path>
git config -f .gitmodules submodule.<path>.shallow true

# later unshallow
git config -f .gitmodules submodule.<path>.shallow false
git submodule update <path>

The commands can be ran in any order. The git submodule command perform the actual clone (using depth 1 this time). And the git config commands make the option permanent for other people who will clone the repo recursively later.

As an example, suppose you have the repo https://github.com/foo/bar and you want to add https://github.com/lorem/ipsum as a submodule, in your repo at path/to/submodule. The commands may look like like the following:

git submodule add --depth 1 [email protected]:lorem/ipsum.git path/to/submodule
git config -f .gitmodules submodule.path/to/submodule.shallow true

The following results in the same thing too (opposite order):

git config -f .gitmodules submodule.path/to/submodule.shallow true
git submodule add --depth 1 [email protected]:lorem/ipsum.git path/to/submodule

The next time someone runs git clone --recursive [email protected]:foo/bar.git, it will pull in the whole history of https://github.com/foo/bar, but it will only shallow-clone the submodule as expected.

With:

--depth

This option is valid for add and update commands.
Create a ‘shallow’ clone with a history truncated to the specified number of revisions.


atwyman adds in the comments:

As far as I can tell this option isn’t usable for submodules which don’t track master very closely. If you set depth 1, then submodule update can only ever succeed if the submodule commit you want is the latest master. Otherwise you get “fatal: reference is not a tree.

That is true.
That is, until git 2.8 (March 2016). With 2.8, the submodule update --depth has one more chance to succeed, even if the SHA1 is directly reachable from one of the remote repo HEADs.

See commit fb43e31 (24 Feb 2016) by Stefan Beller (stefanbeller).
Helped-by: Junio C Hamano (gitster).
(Merged by Junio C Hamano — gitster in commit 9671a76, 26 Feb 2016)

submodule: try harder to fetch needed sha1 by direct fetching sha1

When reviewing a change that also updates a submodule in Gerrit, a common review practice is to download and cherry-pick the patch locally to test it.
However when testing it locally, the ‘git submodule update‘ may fail fetching the correct submodule sha1 as the corresponding commit in the submodule is not yet part of the project history, but also just a proposed change.

If $sha1 was not part of the default fetch, we try to fetch the $sha1 directly. Some servers however do not support direct fetch by sha1, which leads git-fetch to fail quickly.
We can fail ourselves here as the still missing sha1 would lead to a failure later in the checkout stage anyway, so failing here is as good as we can get.


MVG points out in the comments to commit fb43e31 (git 2.9, Feb 2016)

It would seem to me that commit fb43e31 requests the missing commit by SHA1 id, so the uploadpack.allowReachableSHA1InWant and uploadpack.allowTipSHA1InWant settings on the server will probably affect whether this works.
I wrote a post to the git list today, pointing out how the use of shallow submodules could be made to work better for some scenarios, namely if the commit is also a tag.
Let’s wait and see.

I guess this is a reason why fb43e31 made the fetch for a specific SHA1 a fallback after the fetch for the default branch.
Nevertheless, in case of “–depth 1” I think it would make sense to abort early: if none of the listed refs matches the requested one, and asking by SHA1 isn’t supported by the server, then there is no point in fetching anything, since we won’t be able to satisfy the submodule requirement either way.


Update August 2016 (3 years later)

With Git 2.10 (Q3 2016), you will be able to do

 git config -f .gitmodules submodule.<name>.shallow true

See “Git submodule without extra weight” for more.


Git 2.13 (Q2 2017) do add in commit 8d3047c (19 Apr 2017) by Sebastian Schuberth (sschuberth).
(Merged by Sebastian Schuberth — sschuberth in commit 8d3047c, 20 Apr 2017)

a clone of this submodule will be performed as a shallow clone (with a history depth of 1)

However, Ciro Santilli adds in the comments (and details in his answer)

shallow = true on .gitmodules only affects the reference tracked by the HEAD of the remote when using --recurse-submodules, even if the target commit is pointed to by a branch, and even if you put branch = mybranch on the .gitmodules as well.


Git 2.20 (Q4 2018) improves on the submodule support, which has been updated to read from the blob at HEAD:.gitmodules when the .gitmodules file is missing from the working tree.

See commit 2b1257e, commit 76e9bdc (25 Oct 2018), and commit b5c259f, commit 23dd8f5, commit b2faad4, commit 2502ffc, commit 996df4d, commit d1b13df, commit 45f5ef3, commit bcbc780 (05 Oct 2018) by Antonio Ospite (ao2).
(Merged by Junio C Hamano — gitster in commit abb4824, 13 Nov 2018)

submodule: support reading .gitmodules when it’s not in the working tree

When the .gitmodules file is not available in the working tree, try
using the content from the index and from the current branch.
This covers the case when the file is part of the repository but for some
reason it is not checked out, for example because of a sparse checkout.

This makes it possible to use at least the ‘git submodule‘ commands
which read the gitmodules configuration file without fully populating
the working tree.

Writing to .gitmodules will still require that the file is checked out,
so check for that before calling config_set_in_gitmodules_file_gently.

Add a similar check also in git-submodule.sh::cmd_add() to anticipate the eventual failure of the “git submodule add” command when .gitmodules is not safely writeable; this prevents the command from leaving the repository in a spurious state (e.g. the submodule repository was cloned but .gitmodules was not updated because config_set_in_gitmodules_file_gently failed).

Moreover, since config_from_gitmodules() now accesses the global object
store, it is necessary to protect all code paths which call the function
against concurrent access to the global object store.
Currently this only happens in builtin/grep.c::grep_submodules(), so call
grep_read_lock() before invoking code involving config_from_gitmodules().

NOTE: there is one rare case where this new feature does not work
properly yet: nested submodules without .gitmodules in their working tree.


Note: Git 2.24 (Q4 2019) fixes a possible segfault when cloning a submodule shallow.

See commit ddb3c85 (30 Sep 2019) by Ali Utku Selen (auselen).
(Merged by Junio C Hamano — gitster in commit 678a9ca, 09 Oct 2019)


Git 2.25 (Q1 2020), clarifies the git submodule update documentation.

See commit f0e58b3 (24 Nov 2019) by Philippe Blain (phil-blain).
(Merged by Junio C Hamano — gitster in commit ef61045, 05 Dec 2019)

doc: mention that ‘git submodule update’ fetches missing commits

Helped-by: Junio C Hamano
Helped-by: Johannes Schindelin
Signed-off-by: Philippe Blain

git submodule update’ will fetch new commits from the submodule remote if the SHA-1 recorded in the superproject is not found. This was not mentioned in the documentation.


Warning: With Git 2.25 (Q1 2020), the interaction between “git clone --recurse-submodules” and alternate object store was ill-designed.

The documentation and code have been taught to make more clear recommendations when the users see failures.

See commit 4f3e57e, commit 10c64a0 (02 Dec 2019) by Jonathan Tan (jhowtan).
(Merged by Junio C Hamano — gitster in commit 5dd1d59, 10 Dec 2019)

submodule--helper: advise on fatal alternate error

Signed-off-by: Jonathan Tan
Acked-by: Jeff King

When recursively cloning a superproject with some shallow modules defined in its .gitmodules, then recloning with “--reference=<path>“, an error occurs. For example:

git clone --recurse-submodules --branch=master -j8 \
  https://android.googlesource.com/platform/superproject \
  master
git clone --recurse-submodules --branch=master -j8 \
  https://android.googlesource.com/platform/superproject \
  --reference master master2

fails with:

fatal: submodule '<snip>' cannot add alternate: reference repository
'<snip>' is shallow

When a alternate computed from the superproject’s alternate cannot be added, whether in this case or another, advise about configuring the “submodule.alternateErrorStrategy” configuration option and using “--reference-if-able” instead of “--reference” when cloning.

That is detailed in:

With Git 2.25 (Q1 2020), The interaction between “git clone –recurse-submodules” and alternate object store was ill-designed.

Doc: explain submodule.alternateErrorStrategy

Signed-off-by: Jonathan Tan
Acked-by: Jeff King

Commit 31224cbdc7 (“clone: recursive and reference option triggers submodule alternates”, 2016-08-17, Git v2.11.0-rc0 — merge listed in batch #1) taught Git to support the configuration options “submodule.alternateLocation” and “submodule.alternateErrorStrategy” on a superproject.

If “submodule.alternateLocation” is configured to “superproject” on a superproject, whenever a submodule of that superproject is cloned, it instead computes the analogous alternate path for that submodule from $GIT_DIR/objects/info/alternates of the superproject, and references it.

The “submodule.alternateErrorStrategy” option determines what happens if that alternate cannot be referenced.
However, it is not clear that the clone proceeds as if no alternate was specified when that option is not set to “die” (as can be seen in the tests in 31224cbdc7).
Therefore, document it accordingly.

The config submodule documentation now includes:

submodule.alternateErrorStrategy::

Specifies how to treat errors with the alternates for a submodule as computed via submodule.alternateLocation.
Possible values are ignore, info, die.
Default is die.
Note that if set to ignore or info, and if there is an error with the computed alternate, the clone proceeds as if no alternate was specified.


Note: “git submodule update --quiet(man) did not propagate the quiet option down to underlying git fetch(man), which has been corrected with Git 2.32 (Q2 2021).

See commit 62af4bd (30 Apr 2021) by Nicholas Clark (nwc10).
(Merged by Junio C Hamano — gitster in commit 74339f8, 11 May 2021)

submodule update: silence underlying fetch with “--quiet

Signed-off-by: Nicholas Clark

Commands such as

$ git submodule update --quiet --init --depth=1

involving shallow clones, call the shell function fetch_in_submodule, which in turn invokes git fetch.
Pass the --quiet option onward there.

Leave a Comment