Merge recursive strategy

Tuesday, September 27, 2011 | Pablo Santos | merging

NB: This article has been updated throughout the years. The last update was made in December 2018.

The basics: elements of a merge

The source: the changeset you're merging from. Changeset 16 in the example below.

The destination: the changeset you're merging to. Changeset 15 in the example below.

The ancestor: the changeset (or commit) which is the nearest common parent of the source and the destination. This is changeset 10 in the example below.


When is merge recursive needed?

Please note: the example is a little bit forced since there is no good reason, initially, for the developer to merge from changeset 11 into 16 instead of merging from changeset 15 (the latest on the branch main at the point of the merge). But let's assume it had to be done for a reason; say, changeset 11 was stable while 13 and 15 weren't at the time. The point is: between 15 and 16 there is not a single unique ancestor, but rather two ancestors at the same "distance": 12 and 11.

How does merge recursive work?


Why merge recursive is better: a step by step example


0. We had a file foo.c with content foo.c=bcd (three lines: first is b, second is c and third is d).

1. We edit foo.c on a branch (task001) and add a new line so it ends up being foo.c=bcde.

2. We modify the second line of the file on main. Now the file is foo.c=bCd.

3. We create a new branch (task002) from changeset 2 and add a new line at the beginning so it is now foo.c=abCd.

4. Going back to task001, we modify the line we just added: foo.c=bcdE.

5. We "undo" the change we made on main in changeset 2: foo.c=bcd.

6. We merge 4 and 3 and create 6 as foo.c=abCdE. (We combine the changes we made on task002 (adding a new line at the beginning) with the ones in task001 (adding E at the end) and also the change coming from 2 on /main.)

7. We now merge 4 and 5, introducing the change from 4 (last line added) into main: foo.c=bcdE.
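To see why changesets 6 and 7 end up with two candidate ancestors, the steps above can be written down as a small parent graph. The following is only a sketch: the PARENTS table is my reading of steps 0 to 7, and the helper names (ancestors, merge_bases) are made up for illustration; they are not the API of Git, Mercurial or Plastic SCM.

```python
# Parent links as read from steps 0-7 (an assumption based on the text).
PARENTS = {
    0: [],
    1: [0],     # task001: add line e
    2: [0],     # main: c -> C
    3: [2],     # task002: add line a at the beginning
    4: [1],     # task001: e -> E
    5: [2],     # main: undo, C -> c
    6: [3, 4],  # task002: merge of 4 and 3
    7: [5, 4],  # main: merge of 4 and 5
}


def ancestors(cset):
    """Return cset plus every changeset reachable through parent links."""
    seen, stack = set(), [cset]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(PARENTS[node])
    return seen


def merge_bases(src, dst):
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(src) & ancestors(dst)
    return sorted(
        c for c in common
        if not any(c != other and c in ancestors(other) for other in common)
    )


print(merge_bases(6, 7))  # [2, 4]
```

Changesets 0 and 1 are also common to 6 and 7, but they are discarded because they are ancestors of 4 (and 0 of 2 as well), which leaves 2 and 4 as the two candidates.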


4=bcdE

6=abCdE

7=bcdE

First line -> a (added on 6)

Second line -> b (it's there on the three contributors)

Third line -> C (changed on 6 but unchanged on 4 and 7)

Fourth line -> d (unchanged)

Fifth line -> E (unchanged)
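The line-by-line reasoning above is the usual three-way rule: if only one contributor changed a line with respect to the ancestor, that change wins. Here is a minimal sketch of that rule, assuming the lines have already been aligned by hand (None marks a line that does not exist in a contributor) and taking 7 as foo.c=bcdE, as in step 7. The function name pick and the rows table are illustrative only.

```python
def pick(base, src, dst):
    """Three-way rule for one aligned line; None means the line is absent."""
    if src == dst:
        return src  # both sides agree
    if src == base:
        return dst  # only the destination changed
    if dst == base:
        return src  # only the source changed
    raise ValueError("conflict: both sides changed the line differently")


# Each row is (ancestor 4, source 6, destination 7), aligned by hand.
rows = [
    (None, "a", None),  # new first line, added on 6
    ("b", "b", "b"),
    ("c", "C", "c"),    # changed on 6, unchanged on 4 and 7
    ("d", "d", "d"),
    ("E", "E", "E"),
]

merged = "".join(
    line for line in (pick(*row) for row in rows) if line is not None
)
print(merged)  # abCdE
```

With 4 as the ancestor, the tool has no way to know that C was later fixed on main: from its point of view only 6 touched that line.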


How does recursive merge fix the mess?


Ancestor: foo=bCdE

Source: foo=abCdE

Destination: foo=bcdE

Result: foo=abcdE, which is what we were looking for!
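Both outcomes can be reproduced with a toy three-way merge. This is a sketch under stated assumptions, not the algorithm Git or Plastic SCM actually ships: it uses Python's difflib to find what each side changed against the ancestor, treats every character as one line (the notation of this article), and reports a conflict whenever two different edits touch the same place. The helpers changes and merge3 are hypothetical names, and changeset 0 is used as the ancestor of 4 and 2 because that is what the steps imply.

```python
from difflib import SequenceMatcher


def changes(base, side):
    """Edits (start, end, new_text) that turn base into side."""
    matcher = SequenceMatcher(a=base, b=side, autojunk=False)
    return [
        (i1, i2, side[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def merge3(base, src, dst):
    """Apply the edits of both sides to base; identical edits count once."""
    edits = sorted(set(changes(base, src)) | set(changes(base, dst)))
    out, pos, prev_start = [], 0, None
    for start, end, text in edits:
        if start < pos or start == prev_start:
            raise ValueError(f"conflict near line {start + 1}")
        out.append(base[pos:start])  # untouched part of the ancestor
        out.append(text)             # the change made by one side
        pos, prev_start = end, start
    out.append(base[pos:])
    return "".join(out)


# File contents per changeset, one character per line (from the steps).
c0, c2, c4 = "bcd", "bCd", "bcdE"
c6, c7 = "abCdE", "bcdE"

print(merge3(c4, c6, c7))       # abCdE: ancestor 4 alone, the fix is lost
virtual = merge3(c0, c4, c2)    # virtual ancestor built from 4 and 2
print(virtual)                  # bCdE
print(merge3(virtual, c6, c7))  # abcdE: the fix from changeset 5 survives
```

The only difference between the first and the last merge is the ancestor. Against the virtual ancestor bCdE, changeset 7 is seen as the side that changed C back to c, so that change is kept.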


Why is it so good?

More on recursive merge strategy

Pablo Santos I'm the CTO and Founder at Códice.

I've been leading Plastic SCM since 2005. My passion is helping teams work better through version control.

I had the opportunity to see teams from many different industries at work while I helped them improving their version control practices.

I really enjoy teaching (I've been a University professor for 6+ years) and sharing my experience in talks and articles.

And I love simple code. You can reach me at @psluaces.

You must have heard about merge recursive, which is the default algorithm that Git uses when merging two branches. How does it work and why is it good?

You've got two branches you want to merge. The basic elements to consider are the source, the destination and the ancestor.

So, we will merge 16 and 15 using 10 as ancestor. We need an ancestor to be used in "three-way" merges. However, sometimes the scenario is not that simple.

What if we find "two common ancestors"? The branch explorer view below shows an alternative in which there are two possible "common ancestors".

[Image: branch explorer view with two possible common ancestors]

While this won't happen frequently, it is quite likely to happen with long-lived branches or complex branch topologies. (The case depicted above is the shortest one leading to the "multiple ancestor" problem, but it can also happen with several changesets and branches in between the "crossed" merges.)

One solution is to "select" one of the ancestors as the valid one for the merge (which is the option Mercurial takes) but, as we will see below, it has many drawbacks.

When more than one valid ancestor is found, the recursive-merge strategy will create a new unique "virtual ancestor" merging the ones initially found. The following image depicts the algorithm:

[Image: the recursive-merge algorithm creating a virtual ancestor]

A new ancestor 2 will be used as "ancestor" to merge the "src" and "dst".

The "merge recursive strategy" is able to find a better solution than just "selecting one of the two", as I'll describe below.

Let me use the following "notation" in the next example: a file foo.c with three lines (b, c and d) will be described as /foo.c = bcd.

Of course, for the sake of simplicity I'll be using "stupid lines" like "abc", but assume the example is valid for real code too.

Let's take a look at the case in the following diagram, which the numbered steps 0 to 7 describe changeset by changeset.

[Image: diagram of the example, changesets 0 to 7]

If we now merge task002 on main (changeset 7 and changeset 6), what should we get? We should get the "addition" on task002 (a new line a at the beginning) on top of 7. The expected result is foo.c=abcdE. We shouldn't get the line with the uppercase C as it is on task002, because we fixed it afterwards on main.

As you can see in the diagram, I highlighted the changesets 4 and 2 because they're the two possible common ancestors of 6 and 7. Which one should we choose? Mercurial will choose 4 because its algorithm chooses the "deepest" ancestor in the case where more than one is found.

What happens if we choose 4? We will merge 6 and 7, and the automatic result (by any 3-way merge tool) will be foo=abCdE: the first line a is added on 6, and the third line is C because it changed on 6 but is unchanged on 4 and 7. So, we automatically get foo=abCdE, which is wrong! We took C instead of c due to the wrong ancestor selection.

As I described above, the first thing recursive merge is going to do is to calculate a new "virtual ancestor" merging 4 and 2, as the following picture shows:

[Image: virtual ancestor calculated by merging 4 and 2]

The result, changeset X, is foo=bCdE. Later, changeset X is used as the "ancestor" of 6 and 7, and then we get foo=abcdE (ancestor foo=bCdE, source foo=abCdE, destination foo=bcdE). The calculated result takes into account the "fix" done in changeset 5 and therefore the result is correct!

If you have to deal with branching and merging and you don't have a good merge algorithm, you can end up with broken files without warning! Branching and merging are the two weapons you must have in your developer's toolset... but ensure you have the best possible ones, the ones that really do the job.

In short: Git will do it correctly, Hg will break the result, and SVN and others will simply mess up the whole thing. Plastic SCM also includes a powerful merge-recursive algorithm, so it is able to produce the same result. (In fact, our algorithm is even more powerful, correctly handling cases that even Git is unable to deal with successfully.)

Recursive merge is not only good for "criss-cross" merge situations but also for more "regular" ones where merge history simply gets complicated. Watch our on-the-spot, detail-rich explanation of recursive merge in Plastic SCM.