Why is branching and merging easier in Mercurial than in Subversion?

In Subversion (and CVS), the repository comes first and foremost. In Git
and Mercurial there is not really a concept of a repository in the
same way; here the changes themselves are the central theme.

The merging hassle in CVS/SVN comes from the fact that these systems do not
remember the parentage of changes. In Git and Mercurial,
not only can a commit have multiple children, it can also have multiple
parents!

That can easily be observed using one of the graphical tools, gitk or hg view.
In the following example, branch #2 was forked from #1 at
commit A, and has since been merged once (at M, merged with commit B):

o---A---o---B---o---C         (branch #1)
     \       \
      o---o---M---X---?       (branch #2)

Note how A and B have two children, whereas M has two parents. These
relationships are recorded in the repository. Let’s say the maintainer of
branch #2 now wants to merge the latest changes from branch #1. They can
issue a command such as:

$ git merge branch-1

and the tool will automatically know that the merge base is B (because it
was recorded in commit M, an ancestor of the tip of #2) and
that it only has to merge whatever happened
between B and C. CVS does not record this information, nor did SVN prior to
version 1.5. In these systems, the graph
would look like:

o---A---o---B---o---C         (branch #1)
     \    
      o---o---M---X---?       (branch #2)

where M is just a gigantic “squashed” commit of everything that happened between A and B,
applied on top of the previous tip of branch #2. Note that after the deed is done, there is no trace
left (except potentially in human-readable comments) of where M
originated, nor of how many commits were collapsed together, which makes the
history much more impenetrable.

Worse still, performing a second merge becomes a nightmare: one has to figure out
what the merge base was at the time of the first merge (and one has to know
that there has been a merge in the first place!), then
present that information to the tool so that it does not try to replay A..B on
top of M. All of this is difficult enough when working in close collaboration, but is
simply impossible in a distributed environment.
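
To make the difference concrete, here is a minimal Python sketch (a toy model, not how Git or Mercurial are actually implemented) that stores only the parent links of the two graphs above and computes the merge base for merging C into X. The names "root", "a1", "b1", "p1" and "p2" are placeholders I made up for the unnamed "o" commits, and merge_bases is a hypothetical helper.

```python
# Parent links for the first graph; M records both of its parents.
RECORDED = {
    "root": [],
    "A": ["root"],
    "a1": ["A"],
    "B": ["a1"],
    "b1": ["B"],
    "C": ["b1"],
    "p1": ["A"],         # first commit on branch #2
    "p2": ["p1"],
    "M": ["p2", "B"],    # merge commit: two parents
    "X": ["M"],
}

# Second graph: the same changes were copied into M, but the link to B is lost.
UNRECORDED = dict(RECORDED, M=["p2"])


def ancestors(parents, commit):
    """Return the set made of `commit` and every commit reachable from it."""
    seen = set()
    stack = [commit]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def merge_bases(parents, left, right):
    """Common ancestors of both tips that are not ancestors of another common ancestor."""
    common = ancestors(parents, left) & ancestors(parents, right)
    return {
        c for c in common
        if not any(c != other and c in ancestors(parents, other) for other in common)
    }
```

With the recorded graph, merge_bases(RECORDED, "X", "C") gives {"B"}, so only B..C needs merging. With the unrecorded graph, merge_bases(UNRECORDED, "X", "C") gives {"A"}, which is exactly the situation where the tool would try to replay A..B a second time.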

A (related) problem is that there is no way to answer the question: “does X
contain B?” where B is a
potentially important bug fix. So why not just record that information in the commit, since
it is known at merge time?
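
Once the parents are recorded, “does X contain B?” is just a reachability question on the graph. A rough Python sketch, again using a toy parent map rather than any real Git or Mercurial API (the lowercase commit names and the contains helper are my own placeholders):

```python
# Parent links for the example graph; M records both of its parents.
RECORDED = {
    "root": [], "A": ["root"], "a1": ["A"], "B": ["a1"], "b1": ["B"], "C": ["b1"],
    "p1": ["A"], "p2": ["p1"], "M": ["p2", "B"], "X": ["M"],
}

# Same history as CVS or old SVN would see it: M has a single parent.
UNRECORDED = dict(RECORDED, M=["p2"])


def contains(parents, tip, commit):
    """True if `commit` is `tip` itself or one of its ancestors."""
    seen = set()
    stack = [tip]
    while stack:
        node = stack.pop()
        if node == commit:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False
```

Here contains(RECORDED, "X", "B") is True, while contains(UNRECORDED, "X", "B") is False even though the changes from B are physically present in X. The real tools expose the same kind of query, for example git branch --contains or git merge-base --is-ancestor.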

P.S. I have no experience with the merge-recording abilities of SVN 1.5+, but the workflow seems to be much more
contrived than in the distributed systems. If that is indeed the case, it’s probably because, as mentioned
in the above comment, the focus is put on repository organization rather than on the changes themselves.
