You are viewing bramcohen

Sat, Feb. 28th, 2009, 11:14 pm
version control tidbits

I wrote up an explanation of when patience diff gives significantly different results, and why they're important.

The trickiest case for line-based merging is when the number of blank lines between two functions changes. Were the new lines inserted at the beginning or the end? If two different people add a single line, should that merge cleanly? What if one person adds one blank line and another person adds two? Should that be a conflict of one blank line on one side and two on the other, or a single line added cleanly plus a conflict of no lines on one side and one blank line on the other, or no conflict and merge to the original length plus two, or no conflict and merge to the original length plus three? How about if a function was added in the middle of a section of blank lines and the number of blank lines after it was changed - should the new lines be at the end of the new function or the beginning of the old one? What if lines were removed, where should they have been removed from?

There aren't clear answers to these questions. Any answer you come up with will involve some tradeoffs, and it's a judgement call what behavior really is best. The one clear lesson is that you shouldn't change the number of blank lines between functions willy-nilly, because it will confuse the version control system.
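
To make the ambiguity concrete, here's a minimal sketch in Python. The helper `single_insert_positions` is a made-up name for illustration, not part of any diff tool; it just lists every position where a single inserted line would explain the difference between two versions of a file.

```python
def single_insert_positions(old, new):
    """Return every index i such that deleting new[i] turns new back into old."""
    return [i for i in range(len(new)) if new[:i] + new[i + 1:] == old]


# Two functions separated by two blank lines...
old = ["def f():", "    pass", "", "", "def g():", "    pass"]
# ...and the same file after someone added one more blank line.
new = ["def f():", "    pass", "", "", "", "def g():", "    pass"]

positions = single_insert_positions(old, new)
print(positions)  # [2, 3, 4]
```

All three positions produce exactly the same text, so a diff has to pick one arbitrarily, and the two sides of a merge have no particular reason to pick the same one.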

A difficult distributed merge scenario which hasn't been written up anywhere is one I like to call the circular staircase. Three different branches are all made off of trunk, and all make changes to the same section of code. Think of the branches as all sitting around a circular table. Each branch then simultaneously pulls the most recent version of the branch to its right, and resolves the conflict in favor of its own version. This situation is weird because any two of the three branches will now merge cleanly together, but pulling in the third one should cause a conflict. I won't give the full explanation as to why that is here, because my point is that the scenario is completely counterintuitive and inscrutable. As soon as the relationship between branches ceases to be a tree, it becomes impossible for users to understand what's going on. The lesson is that you should have clear relationships between your branches, and not do anything goofy. At this point I'm in favor of the version control system not allowing you to do goofy things, and keeping track of the coherent relationships between branches which you do have. I have more ideas than were given in my last post on this subject, but more on that at a later time.
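
The circular staircase can be reproduced in a toy model. The sketch below is purely illustrative and is not how any real version control system tracks history: each branch state carries its current value plus the set of values its history has already overridden (`beaten`), and `State`, `merge` and `resolve_in_favor` are hypothetical names. In this model a merge is clean only when exactly one side has already overridden the other's value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    value: str          # current content of the contested section
    beaten: frozenset   # values this branch's history has already overridden


def resolve_in_favor(mine, theirs):
    """Pull `theirs` into `mine`, keeping mine's value and recording the override."""
    return State(mine.value, mine.beaten | theirs.beaten | {theirs.value})


def merge(x, y):
    """Return the merged State, or None when the toy model reports a conflict."""
    if x.value == y.value:
        return State(x.value, x.beaten | y.beaten)
    x_wins = y.value in x.beaten
    y_wins = x.value in y.beaten
    if x_wins == y_wins:
        # Neither side overrode the other, or each overrode the other.
        return None
    winner, loser = (x, y) if x_wins else (y, x)
    return State(winner.value, winner.beaten | loser.beaten | {loser.value})


trunk = frozenset({"t"})
a0, b0, c0 = State("a", trunk), State("b", trunk), State("c", trunk)

# Everyone simultaneously pulls from the branch to their right and keeps their own version.
a1 = resolve_in_favor(a0, b0)
b1 = resolve_in_favor(b0, c0)
c1 = resolve_in_favor(c0, a0)

ab = merge(a1, b1)     # clean: a already overrode b
abc = merge(ab, c1)    # conflict: the merged history overrode c, but c overrode a
print(ab.value, abc)   # prints: a None
```

The same thing happens whichever pair you start with: the first merge is clean, and pulling in the third branch conflicts.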

Another interesting case is the following:
     a
    / \
   b   c
  / \ /
 c   b
  \ /
   ?

If we want to support cherry-picking, we have a problem here. On the one hand, the change to c has already been applied and then rejected on the right-hand side. On the other hand, the change to c on the left happened 'after' the change on the right, so by the staircase criterion the new value should be c.

I think that if you want to support implicit cherry-picking you need to suck it up and accept that this scenario creates the weird action at a distance of merging to b. I don't really want to argue it all that much though, because the horse has already been beaten to death, and basically nobody in the real world is asking for implicit cherry-picking, because nobody has proposed a good UI for it, and nobody other than a handful of experts understands how it behaves, and because it directly conflicts with implicit undo.