Vendor Branches in Git

I think submodules are the way to go when it comes to “vendor branch”.
Here is how you should use submod… hmmm, just kidding.

Just a thought; you want:

  • to develop both the main project and the sub-project within the same directory (which is called a “system approach”: you develop, tag and merge the whole system)
  • or to view your sub-project as a “vendor branch” (a branch which gives you access to a well-defined version of an external vendor component, or “set of files”, and which is only updated when a new version of that external component is released: that is called a “component approach”, where the whole system is viewed as a collection of separate components developed on their own)

The two approaches are not compatible:

  • The first strategy is compatible with a subtree merge: you are working on both the project and the sub-project.
  • The second one is used with submodules, but submodules are used to define a configuration (the list of tags you need in order to work): each Git submodule, unlike svn:externals, is pinned to a particular commit id, and that is what allows you to define a configuration (as in SCM: “software configuration management”).

I like the second approach because most of the time, when you have a project and a sub-project, their lifecycle is different (they are not developed at the same rhythm, not tagged together at the same time, nor with the same name).

What really prevents that approach (“component-based”) in your question is the “both can be developed and updated from the same working directory” part.
I would really urge you to reconsider that requirement, as most IDEs are perfectly capable of dealing with multiple “sources” directories, and the sub-project development can be done in its own dedicated environment.


samgoody adds:

Imagine an eMap plugin for both Joomla and ModX. Both the plugin and the Joomla-specific code (which is part of Joomla, not of eMap) are developed while the plugin is inside Joomla. All paths are relative, the structure is rigid, and they must be distributed together – even though each project has its own lifecycle.

If I understand correctly, you are in a configuration where the development environment (the set of files you are working on) is pretty much the same as the distribution environment (the same set of files is copied to the release platform).

It all comes down to a granularity issue:

  • if both sets of files cannot exist one without the other, then they should be viewed as one big project (and subtree-merged), but that forces them to be tagged and merged as one.
  • if one depends on the other (which can be developed alone), then they should each be in their own Git repository and project, the first one depending on a specific commit of the second as a submodule: if the submodule is defined in the right subtree of the first component, all relative paths are respected.
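
To make the second case concrete, here is a minimal sketch, assuming two throwaway local repositories. The names vendor, main and plugins/emap are hypothetical, and the protocol.file.allow setting is only there because the sketch uses a local path as the submodule URL. It registers the sub-project at a nested path and shows that the parent records one exact commit:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical sub-project repository with a single commit.
git init -q vendor
git -C vendor config user.name "Example"
git -C vendor config user.email "example@example.com"
echo "plugin code" > vendor/plugin.txt
git -C vendor add plugin.txt
git -C vendor commit -q -m "vendor: first release"
vendor_commit=$(git -C vendor rev-parse HEAD)

# Hypothetical main project; the submodule lives in the subtree where
# the main project expects to find it, so relative paths keep working.
git init -q main
cd main
git config user.name "Example"
git config user.email "example@example.com"
git -c protocol.file.allow=always submodule --quiet add "$work/vendor" plugins/emap
git commit -q -m "main: add emap as a submodule"

# The parent stores a gitlink entry (mode 160000) naming one commit id.
entry_mode=$(git ls-tree HEAD plugins/emap | awk '{print $1}')
pinned_commit=$(git ls-tree HEAD plugins/emap | awk '{print $3}')
echo "mode $entry_mode, pinned to $pinned_commit"
```

The commit id printed at the end should be the sub-project commit that was current when the submodule was added, which is what makes the parent commit a reproducible configuration.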

samgoody adds:

The original thread listed issues with submodules – primarily that GitHub’s download doesn’t include them (vital to me) and that they get stuck on a particular commit.

I am not sure GitHub’s download is still an issue: the “Guides: Developing with Submodules” article does mention:

Best of all: people cloning your my-awesome-framework fork will have no problem pulling down your my-fantastic-plugin submodule, as you’ve registered the public clone URL for the submodule. The commands

$ git submodule init
$ git submodule update

Will pull the submodules into the current repository.

As for the “they get stuck on a particular commit”: that is the whole point of a submodule, allowing you to work with a configuration (a list of tagged versions of components) instead of the latest, potentially unstable set of files.
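
As an illustration of that point, the sketch below (again with hypothetical local repositories named vendor and main, and hypothetical tags v1.0 and v1.1) shows that a new upstream release does not change the parent project until someone deliberately moves the pin:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical sub-project, tagged v1.0.
git init -q vendor
git -C vendor config user.name "Example"
git -C vendor config user.email "example@example.com"
echo "v1" > vendor/plugin.txt
git -C vendor add plugin.txt
git -C vendor commit -q -m "vendor: release 1.0"
git -C vendor tag v1.0

# Hypothetical main project pinning the sub-project at v1.0.
# (protocol.file.allow is only needed for the local-path URL used here.)
git init -q main
cd main
git config user.name "Example"
git config user.email "example@example.com"
git -c protocol.file.allow=always submodule --quiet add "$work/vendor" plugins/emap
git commit -q -m "main: pin emap at v1.0"

# The sub-project publishes a new release.
echo "v2" > "$work/vendor/plugin.txt"
git -C "$work/vendor" commit -q -am "vendor: release 1.1"
git -C "$work/vendor" tag v1.1

# The main project still sees the pinned version.
before=$(cat plugins/emap/plugin.txt)

# Moving the pin is an explicit, recorded decision in the parent.
git -C plugins/emap fetch -q --tags origin
git -C plugins/emap checkout -q v1.1
git add plugins/emap
git commit -q -m "main: move emap pin to v1.1"
after=$(cat plugins/emap/plugin.txt)
pinned=$(git ls-tree HEAD plugins/emap | awk '{print $3}')
echo "before: $before, after: $after"
```

The upgrade shows up as an ordinary commit in the main project, so it can be reviewed, tagged or reverted like any other change.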

samgoody mentions:

I need to avoid both subtrees and submodules (see question), and would rather address this need without arguing too much if the approach is justified

Your requirement is a perfectly legitimate one, and I do not want to judge its justification: my previous answers are only here to provide a larger context and try to illustrate the options usually available with a generic SCM tool.

Subtree merge should be the answer here, but it would mean merging back only the commits made to files of the main project, and not the commits made for the sub-project. If you can manage that kind of partial merge, I reckon it is the right path to follow.
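
For reference, a subtree merge could look roughly like the sketch below. It assumes two hypothetical local repositories (vendor and main), a hypothetical remote name vendor_remote and the prefix plugins/emap, and it only covers the import direction (bringing sub-project releases into the main project), not the partial merge back:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical sub-project repository, first release.
git init -q vendor
git -C vendor config user.name "Example"
git -C vendor config user.email "example@example.com"
echo "plugin v1" > vendor/plugin.txt
git -C vendor add plugin.txt
git -C vendor commit -q -m "vendor: release 1.0"
vendor_branch=$(git -C vendor rev-parse --abbrev-ref HEAD)

# Hypothetical main project with its own history.
git init -q main
cd main
git config user.name "Example"
git config user.email "example@example.com"
echo "main code" > main.txt
git add main.txt
git commit -q -m "main: initial commit"

# Import the sub-project history under plugins/emap/.
git remote add vendor_remote "$work/vendor"
git fetch -q vendor_remote
git merge -q -s ours --no-commit --allow-unrelated-histories "vendor_remote/$vendor_branch"
git read-tree --prefix=plugins/emap/ -u "vendor_remote/$vendor_branch"
git commit -q -m "main: subtree-merge emap 1.0 into plugins/emap"

# The sub-project publishes a new release...
echo "plugin v2" > "$work/vendor/plugin.txt"
git -C "$work/vendor" commit -q -am "vendor: release 1.1"

# ...and the main project merges it into the same subdirectory.
git fetch -q vendor_remote
git merge -q -X subtree=plugins/emap -m "main: merge emap 1.1" "vendor_remote/$vendor_branch"
merged=$(cat plugins/emap/plugin.txt)
echo "plugins/emap/plugin.txt now contains: $merged"
```

Here the sub-project files are ordinary tracked files of the main repository (no .gitmodules, no gitlink), which is why both end up tagged and merged as one.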

I do not see however a native Git way to do what you want that does not use subtree-merge or submodule.
I hope a true Git guru will post here a more adequate answer.
