
Sun, Apr. 17th, 2011, 06:56 pm
Git Can't Be Made Consistent

This post complains about Git lacking eventual consistency. I have a little secret for you: Git can't be made to have eventual consistency. Everybody seems to think the problem is a technical one, of complexity vs. simplicity of implementation. They're wrong. The problem is semantics. Git follows the semantics which you want 99% of the time, at the cost of having some edge cases which it's inherently just plain broken on.

When you make a change in Git (and Mercurial) you're essentially making the following statement:

This is the way things are now. Forget whatever happened in the past, this is what matters.


Which is subtly and importantly different from what a lot of people assume it should be:

Add this patch to the corpus of all changes which have ever been made, and are what defines the most recent version.


The example linked above has a lot of extraneous confusing stuff in it. Here's an example which cuts through all the crap:

  A
 / \
B   B
|
A


In this example, one person changed a file's contents from A to B, then back to A, while someone else changed A to B and left it that way. The question is: what should happen when the two heads are merged together? The answer depends deeply on the assumed semantics of what the person meant when they reverted back to A. Either they meant 'oops, I shouldn't have committed this code to this branch' or they meant 'this was a bad change, delete it forever'. In practice people mean the former the vast majority of the time, and its later effects are much more intuitive and predictable. In fact, it's generally a good idea to make a separate branch with the change to B at the same time the reversion to A is done, so further development can be done on that branch before being merged back in later. So the preferred answer is that it should clean merge to B, the way 3-way merge does it.
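
To make that concrete, here's a minimal sketch in Python of the per-file 3-way merge rule (the function name is mine, not Git's): whichever side differs from the common ancestor wins, so the revert back to A looks like "no change" and loses to B.

    def three_way_merge(base, ours, theirs):
        """Per-file 3-way merge against the common ancestor."""
        if ours == theirs:
            return ours    # both sides agree, nothing to decide
        if ours == base:
            return theirs  # only their side changed: take it
        if theirs == base:
            return ours    # only our side changed: keep it
        raise ValueError("conflict: both sides changed the file")

    # The example above: ancestor A, one head reverted to A, the
    # other head kept B. The revert is invisible, so B wins.
    print(three_way_merge("A", "A", "B"))  # -> B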

Unfortunately, this decision comes at significant cost. The biggest problem is that it inherently gives up on implicit cherry-picking. I came up with some magic merge code which allowed you to cut and paste small sections of code between branches, and the underlying version control system would simply figure out what you were up to and make it all work, but nobody seemed much interested in that functionality, and it unambiguously forced the merge result in this case to be A.
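
For contrast, here's the same scenario under the patch-corpus semantics, sketched as a toy model (the set-based bookkeeping is my illustration, not how any real system stores changes): a revert counts as an explicit rejection of a change, and rejections survive merges.

    def corpus_merge(applied1, reverted1, applied2, reverted2):
        """Toy 'corpus of changes' merge: union both histories,
        and let explicit reverts beat applications."""
        return (applied1 | applied2) - (reverted1 | reverted2)

    # One head applied B then reverted it; the other applied B
    # and kept it. The revert wins, so nothing survives and the
    # merge comes out as A.
    print(corpus_merge({"B"}, {"B"}, {"B"}, set()))  # -> set()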

A smaller problem, but one which seems to perturb people more, is that there are some massively busted edge cases. The worst one is this:

  A
 / \
B   B
|   |
A   A


Obviously in this case both sides should clean merge to A, but what if people merge like this?

  A
 / \
B   B
|\ /|
A X A
|/ \|


Because of the cases we just went over, both sides should clean merge to B. What if they are then merged with each other? Since both sides are the same, there's only one thing they can merge to: B.

  A
 / \
B   B
|\ /|
A X A
|/ \|
B   B
 \ /
  B


Hey, where'd the A go? Everybody reverted their changes from B back to A, and then via the dark magic of merging the B came back out of the ether, and no amount of further merging will get rid of it again!
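
Replaying the earlier three_way_merge sketch through the criss-cross shows how mechanical the resurrection is (real Git picks or synthesizes a merge base for criss-crosses, but the outcome here is the same):

    # Each side reverted to A, then merged in the other side's
    # earlier B commit. The common ancestor is the original A, so
    # the revert looks like "no change" and B wins on both sides.
    left  = three_way_merge("A", "A", "B")   # -> "B"
    right = three_way_merge("A", "A", "B")   # -> "B"

    # The final merge: both inputs are already B, so no choice of
    # merge base can ever recover A.
    print(three_way_merge("A", left, right))  # -> "B"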

The solution to this problem in practice is Don't Do That. Having multiple branches which are constantly pulling in each other's changes at a slight lag is bad development practice anyway, so people treat their version control system nicely and cross their fingers that the semantic tradeoff they made doesn't ever cause problems.

Mon, Apr. 18th, 2011 09:04 am (UTC)
MrJoy: Re: Complaining about pathological insanity...

I've seen the scenarios you describe arise, and in every case it's been a matter of people pushing buttons and hoping for the best. I have never seen a case where people intended to flip-flop between versions and then got hung up because the merge didn't do what they expected and they couldn't clearly and simply say "y'know what, screw B, I want A to 'win'."

You either know what you expect to happen to the code when you do a merge, or you do not. If you know what you expect, git provides plenty of tools to help you ensure that you get what you expect. If you don't actually know what to expect and are blindly hoping for the best -- then you have no clear sense of your codebase, or the changes being made by others (either the mechanics or the intent).

Mon, Apr. 18th, 2011 02:43 pm (UTC)
bramcohen: Re: Complaining about pathological insanity...

You've both sneered at me for claiming that there's usually a unique latest common ancestor in practice, saying that I'm coming at it from SVN experience, and said that anyone following a methodology which doesn't result in a latest common ancestor is engaged in pathological insanity. You've now covered everything which someone might do as being crazy, and your only claim is that Git somehow has magical pixie dust to make neither of these cases happen, even though there are no other cases possible.

I'm going to start deleting further comments from you unless you start understanding what's being discussed and drop the Git fanboyism. You really aren't contributing anything to the conversation.

Wed, Apr. 20th, 2011 06:34 am (UTC)
mcandre: Re: Complaining about pathological insanity...

Computers exist to perform computations, to do them automatically, scalably, and most importantly, predictably. Regardless of how an edge case manifests, we need computer software to resolve that edge case consistently. Most computer code isn't directly connected to bumbling humans but rather connected via a long daisy chain of code to other code, and finally a user interface. Fuckups like the failure at Dhahran happen when the properties of a system are taken for granted.