Concurrency in a Git repo on a network shared folder

Git requires minimal file locking, and I believe file locking is the main cause of problems when using this kind of shared resource over a network file system. The reason it can get away with this is that most of the files in a Git repo (all the ones that form the object database) are named as a digest of their content, and are immutable once created. So the problem of two clients trying to use the same file for different content doesn't come up there: if two clients write the same object name, they are writing the same bytes.
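To make the "named as a digest of their content" point concrete, here is a minimal sketch of how a blob's name is derived. It assumes the default SHA-1 object format; the function names `git_blob_id` and `loose_object_path` are my own, for illustration only, and are not part of Git.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID Git would give a blob with this content (SHA-1 format)."""
    # Git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Return the path of a loose object, relative to the repo's git directory."""
    # The first two hex digits become a directory, the rest the file name.
    return f"objects/{object_id[:2]}/{object_id[2:]}"


if __name__ == "__main__":
    oid = git_blob_id(b"some file content\n")
    print(oid, loose_object_path(oid))
```

Two clients that store identical content compute the same name and would write identical bytes, so it doesn't matter which write wins.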

The other part of the repository is trickier: the refs are stored in files under the "refs" directory (or in "packed-refs"), and these do change, although the refs/* files are small and are always rewritten whole rather than edited in place. In this case, Git writes the new ref to a temporary ".lock" file, which it creates exclusively, and then renames it over the target file. If the filesystem respects O_EXCL semantics, that's safe. Even if not, the worst that could happen would be a race where one client overwrites another's ref file. Although this would be annoying to encounter, it should not cause corruption as such: you might push to the shared repo and see that push appear to succeed, when in fact someone else's push won. But this could be sorted out simply by pulling (merging in the other guy's commits) and pushing again.
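The lock-then-rename pattern described above can be sketched roughly as follows. This is an illustration of the general technique, not Git's actual code (Git is written in C and also checks the ref's old value before updating); `update_ref` is a hypothetical helper name.

```python
import os


def update_ref(ref_path: str, new_value: str) -> bool:
    """Rewrite a ref file via a ".lock" file; return False if someone else holds the lock."""
    lock_path = ref_path + ".lock"
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        # This is the step that relies on the filesystem honouring O_EXCL.
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False
    try:
        with os.fdopen(fd, "w") as lock_file:
            lock_file.write(new_value + "\n")
            lock_file.flush()
            os.fsync(lock_file.fileno())
        # Rename the finished lock file over the ref, so readers see either
        # the old content or the new content, never a partial write.
        os.replace(lock_path, ref_path)
    except BaseException:
        # Don't leave a stale lock behind if writing or renaming failed.
        try:
            os.unlink(lock_path)
        except FileNotFoundError:
            pass
        raise
    return True
```

If exclusive creation is not reliable on a given network filesystem, two clients could both believe they hold the lock, and the later rename would simply win. That is the "race overwriting a ref file" case, not a half-written file.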

In summary, I don't think that repo corruption is too much of a problem here. It's true that things can go a bit wrong due to locking problems, but the design of the Git repo will minimise the damage.

(Disclaimer: this all sounds good in theory, but I haven't done any concurrent hammering of a repo to test it out, and I only share repos over NFS, not CIFS.)
