Git: Is there a way to figure out where a commit was cherry-pick’ed from?

By default, the information about the original, “cherry” commit is not recorded as part of the new commit.

Record the Source Commit in the Commit Message

If you can enforce a particular workflow or set of options, git cherry-pick has the -x option, which its documentation describes as follows:

When recording the commit, append to the original commit message a note that indicates which commit this change was cherry-picked from.

This is obviously useless if you cannot rely on the cherry pickers using the option.
Also, since the recorded information is just plain text (not an actual reference as far as Git is concerned), even if you use -x you still have to take steps to keep the original commit alive (e.g. make sure it is part of the DAG of a tag or a non-rewinding branch).
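
When -x was used, the note it appends has the form "(cherry picked from commit <hash>)" on a line of its own, so the source hash can be pulled back out of the message. A minimal sketch, assuming the note was not edited by hand afterwards; the function name picked_from is made up for illustration:

```shell
# Print the source commit(s) recorded by "git cherry-pick -x" in the
# message of the given commit. Prints nothing if there is no such note.
# (picked_from is a hypothetical helper name, not a Git command.)
picked_from() {
    git log -1 --format=%B "$1" |
    sed -n 's/^(cherry picked from commit \([0-9a-f]\{40,64\}\))$/\1/p'
}
```

For example, picked_from HEAD prints the hash of the original commit if HEAD was created with git cherry-pick -x. Whether that hash still resolves to a commit depends on the original having been kept alive, as described above.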

git cherry and git patch-id

If you can restrict your search to two particular branches of the history DAG, then git cherry can find both “unpicked” and “picked” cherries.

Note: This command (and the related git patch-id) can only identify conflict-free cherries that were individually picked without extra changes. If there was a conflict while picking the cherry (e.g. you had to slightly modify it to get it to apply), or you used -n/--no-commit to stage extra changes (e.g. multiple cherries in a single commit), or the content of the commit was rewritten after the picking, then you will have to rely on commit message comparison (or the -x information if it was recorded).

git cherry is not really designed to identify the origin of picked cherries, but we can abuse it a bit to identify single cherry pairs.

Given the following history DAG (as in the original poster’s example):

1---2---3---B---D  master
         \
          A---C    dev
# D is a cherry-picked version of C

you will see something like this:

% git cherry master dev
+ A
- C
% git cherry dev master
+ B
- D

(A, B, C, and D are full SHA-1 hashes in the real output)

Since we see one cherry (the - lines) in each list, they must form a cherry pair. D was cherry-picked from C (or vice versa; you cannot tell from the DAG alone, though the commit dates might help).

If you are dealing with more than one potential cherry, you will have to “roll your own” program to do the mapping. The code should be easy in any language with associative arrays, hashes, dictionaries, or equivalent. As a shell function built around awk, it might look like this:

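match_cherries() {
    a="$(git rev-parse --verify "$1")" &&
    b="$(git rev-parse --verify "$2")" &&
    # Compute a patch ID for every commit reachable from only one of the two tips
    git rev-list "$a...$b" | xargs git show | git patch-id |
    awk '
        # $1 = patch ID, $2 = commit ID; group commits that share a patch ID
        { p[$1] = p[$1] " " $2; n[$1]++ }
    END {
            # Print only the groups with more than one commit (the cherry pairs)
            for (i in p)
                if (n[i] > 1) print substr(p[i], 2)
        }'
}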
match_cherries master dev

With an extended example that has two picked cherries:

1---2---3---B---D---E  master
         \
          A---C        dev
# D is a cherry-picked version of C
# E is a cherry-picked version of A

The output might look like this:

% match_cherries master dev
D C
E A

(A, C, D, and E are full SHA-1 hashes in the real output)

This tells us that C and D represent the same change and that E and A represent the same change. As before, there is no way to tell which of each pair was “the first” unless you also consider (e.g.) the commit dates of each commit.
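
One way to bring the dates in: git cherry-pick keeps the author date of the original but gives the new commit a fresh committer date, so sorting a pair by committer date usually puts the original first. A rough sketch, assuming neither commit was rewritten later (a rebase or amend resets the committer date); oldest_first is a hypothetical helper name:

```shell
# Print the two given commits, one per line, oldest committer date first.
# (oldest_first is a made-up name; it only wraps git log, sort, and cut.)
oldest_first() {
    git log --no-walk --format='%ct %H' "$1" "$2" |
    sort -n |
    cut -d' ' -f2
}
```

Running oldest_first D C on a pair reported by match_cherries would then list the probable original on the first line. Treat the result as a hint only, since committer dates can be set to anything.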

Commit Message Comparison

If your cherries were not picked with -x, or they are “dirty” (they had conflicts, or other changes were added to them, e.g. with --no-commit plus staging extra changes, with git commit --amend, or with some other “history rewriting” mechanism), then you may have to fall back on the less reliable technique of comparing commit messages.

This technique works best if you can find some bit of the commit message that is likely to be unique to the commit and is unlikely to have changed in the commit that resulted from the cherry pick. The bit that would work best would depend on the style of commit messages used in your project.

Once you have picked out an “identifying part” of the message, you can use git log to find commits (also demonstrated in Jefromi’s answer).

git log --grep='unique part of the commit message' dev...master

The argument to --grep is actually a regular expression, so you might need to escape any regexp metacharacters ([]*?.\).

If you are not sure which branches might hold the original commit and the new commit, you can use --all as Jefromi showed.
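
If escaping the metacharacters by hand is a nuisance, git log also accepts --fixed-strings (-F), which makes --grep treat the pattern as a literal string. A small sketch that combines it with --all; the function name find_by_message is made up for illustration:

```shell
# List every commit, on any ref, whose message contains the given text
# literally (no regexp interpretation). Output: "<hash> <subject>".
# (find_by_message is a hypothetical helper name.)
find_by_message() {
    git log --all --fixed-strings --grep="$1" --format='%H %s'
}
```

For example, find_by_message '[core] fix a.b' matches only messages containing that exact text, whereas the same string used as a regular expression would also match a message such as "c fix axb".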
