What is the internal format of a Git tree object?

The format of a tree object:

tree [content size]\0[Entries having references to other trees and blobs]

The format of each entry having references to other trees and blobs:

[mode] [file/folder name]\0[20-byte binary SHA-1 of the referenced blob or tree]
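
As a rough illustration of the two layouts above, here is a minimal Python sketch that walks an already inflated tree object and splits it into entries. The function name `parse_tree` and the tuple layout are my own choices for this example, not anything Git defines, and I assume entry names can be decoded as UTF-8 (undecodable bytes are replaced).

```python
from typing import List, Tuple


def parse_tree(raw: bytes) -> List[Tuple[str, str, bytes]]:
    """Parse an inflated tree object into (mode, name, sha1) tuples."""
    # The header is everything up to the first NUL: b"tree <size>".
    header, _, body = raw.partition(b"\0")
    kind, _, size = header.partition(b" ")
    if kind != b"tree" or int(size) != len(body):
        raise ValueError("not a well-formed tree object")

    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)    # end of the mode
        nul = body.index(b"\0", space)   # end of the name
        mode = body[pos:space].decode("ascii")
        name = body[space + 1:nul].decode("utf-8", "replace")
        sha1 = body[nul + 1:nul + 21]    # 20 raw bytes, not hex
        entries.append((mode, name, sha1))
        pos = nul + 21                   # the next entry starts right after
    return entries
```

Because the SHA-1 is stored as raw bytes, it may itself contain spaces or NUL bytes, which is why the loop skips a fixed 20 bytes instead of searching for the next delimiter.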

I wrote a script that inflates (zlib-decompresses) tree objects. Its output looks like this:

tree 192\0
40000 octopus-admin\0 a84943494657751ce187be401d6bf59ef7a2583c
40000 octopus-deployment\0 14f589a30cf4bd0ce2d7103aa7186abe0167427f
40000 octopus-product\0 ec559319a263bc7b476e5f01dd2578f255d734fd
100644 pom.xml\0 97e5b6b292d248869780d7b0c65834bfb645e32a
40000 src\0 6e63db37acba41266493ba8fb68c76f83f1bc9dd

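A mode starting with 1, such as 100644, indicates a reference to a blob (a file), whereas 40000 indicates a tree. In the example above, pom.xml is a blob and the other entries are trees. The one exception is 160000, which Git uses for submodule entries that point at a commit.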
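
To make the mode rule concrete, here is a small Python sketch that maps a mode string to the kind of object it references. The helper name `entry_type` is hypothetical. The symlink (120000) and submodule (160000) modes do not appear in the example above; I include them because Git also uses them.

```python
def entry_type(mode: str) -> str:
    """Map a tree-entry mode string to the kind of object it references."""
    kind = int(mode, 8) & 0o170000    # keep only the file-type bits
    if kind == 0o040000:              # directory
        return "tree"
    if kind in (0o100000, 0o120000):  # regular file or symbolic link
        return "blob"
    if kind == 0o160000:              # gitlink, used for submodules
        return "commit"
    raise ValueError(f"unknown mode: {mode}")
```

For the listing above, `entry_type("100644")` gives `"blob"` for pom.xml and `entry_type("40000")` gives `"tree"` for the four directories.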

Note that I added the newlines, and the space after each \0, purely for readability; the actual content contains no line breaks. I also converted the 20 raw bytes of each SHA-1 (of the referenced blob or tree) into a hex string to make them easier to read.
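
For reference, the pretty-printing described above can be sketched in a few lines of Python. This is not my original script, only a minimal reconstruction of the same idea: inflate the stored bytes with zlib, then print one entry per line with the SHA-1 as hex. The name `dump_tree` is made up for this example.

```python
import zlib


def dump_tree(compressed: bytes) -> str:
    """Inflate a stored tree object and render it one entry per line."""
    raw = zlib.decompress(compressed)
    header, _, body = raw.partition(b"\0")
    # Show the NUL terminator as a literal backslash-zero, as in the listing.
    lines = [header.decode("ascii") + "\\0"]
    pos = 0
    while pos < len(body):
        nul = body.index(b"\0", pos)            # end of "<mode> <name>"
        mode_and_name = body[pos:nul].decode("utf-8", "replace")
        sha1_hex = body[nul + 1:nul + 21].hex() # 20 raw bytes -> 40 hex chars
        lines.append(f"{mode_and_name}\\0 {sha1_hex}")
        pos = nul + 21
    return "\n".join(lines)
```

For a loose (unpacked) object, the compressed bytes are the contents of the file under .git/objects/ whose directory name is the first two hex characters of the object ID and whose file name is the remaining 38. Objects stored in packfiles need different handling, which this sketch does not cover.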
