How to handle a large git repository?

Update April 2015: Git Large File Storage (LFS) (by GitHub).

It relies on the git-lfs client (see git-lfs.github.com) and has been tested with a server that supports it, lfs-test-server.
You store only metadata in the git repo, and the large file itself lives elsewhere.
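
To make "metadata only" concrete, here is a minimal Python sketch of the kind of small text pointer that Git LFS commits in place of the real file. The three-field layout (version, oid, size) follows the published Git LFS pointer format. The function names make_pointer and parse_pointer are illustrative only and are not part of git-lfs.

```python
import hashlib

# Version line used by Git LFS pointer files.
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def make_pointer(content: bytes) -> str:
    """Build the pointer text that would be committed instead of `content`.

    Hypothetical helper for illustration; the real client does this for you.
    """
    oid = hashlib.sha256(content).hexdigest()
    return f"version {LFS_SPEC_URL}\noid sha256:{oid}\nsize {len(content)}\n"


def parse_pointer(text: str) -> dict:
    """Read a pointer back into its fields (each line is 'key value')."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    return {
        "version": fields["version"],
        "oid": fields["oid"],
        "size": int(fields["size"]),
    }


if __name__ == "__main__":
    # Stand-in for a large binary; a real asset could be gigabytes.
    blob = b"\x00" * 1_000_000
    print(make_pointer(blob))
```

Whatever the size of the binary, the pointer stays well under a kilobyte, which is why clones and fetches of the git history remain fast.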

https://cloud.githubusercontent.com/assets/1319791/7051226/c4570828-ddf4-11e4-87eb-8fc165e5ece4.gif


Original answer (2012)

One solution, for large binary files that don’t change much, is to store them in a separate repository (like a Nexus repository), and version only a text file which declares which version of each binary you need.
Using an “artifact repository” is easier than storing binary elements in a source repo (made for comparing versions and merging between branches, which isn’t of much use for said binaries).
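
As a rough illustration of the "version only a text file" idea, the Python sketch below parses a small manifest kept in git and turns each entry into a download URL. Everything specific here is an assumption for the example: the manifest syntax (group:artifact:extension = version), the host nexus.example.com, and the helper names. The URL layout follows the usual Maven-style repository convention, which your own artifact repository may or may not use.

```python
from typing import Dict


def parse_manifest(text: str) -> Dict[str, str]:
    """Parse lines like 'group:artifact:extension = version' (hypothetical format).

    Blank lines and '#' comments are ignored.
    """
    pins: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        coords, version = (part.strip() for part in line.split("=", 1))
        pins[coords] = version
    return pins


def artifact_url(base_url: str, coords: str, version: str) -> str:
    """Build a Maven-style URL: base/group/path/artifact/version/artifact-version.ext."""
    group, artifact, ext = coords.split(":")
    group_path = group.replace(".", "/")
    return (
        f"{base_url.rstrip('/')}/{group_path}/{artifact}/{version}/"
        f"{artifact}-{version}.{ext}"
    )


if __name__ == "__main__":
    # This text is what would be committed; the binaries themselves are not.
    manifest = """
    # large assets pinned for this revision
    com.example.assets:textures:zip = 1.4.2
    """
    base = "https://nexus.example.com/repository/releases"  # placeholder host
    for coords, version in parse_manifest(manifest).items():
        print(artifact_url(base, coords, version))
```

Bumping a binary then becomes a one-line text change, which diffs and merges like any other source file.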

The other solution, more git-centric, is git-annex:

git-annex allows managing files with git, without checking the file contents into git.
While that may seem paradoxical, it is useful when dealing with files larger than git can currently easily handle, whether due to limitations in memory, time, or disk space.

At the time of writing, however, it was not compatible with Windows.

A more generic solution could be git-media, which also allows you to use Git with large media files without storing the media in Git itself.

Finally, the easiest solution is to isolate those binaries in their own git submodule, as you mention in your question. It isn’t very satisfactory, and the initial clone will still take time, but subsequent updates of the parent repo will be quick.
