What I learned about Git during a 20-developer commit frenzy

Age Mooij

Last week we had one of our infamous tech rallies and we had a lot of fun trying to build our own Posterous clone in a day, i.e. turning emails into blog posts.

The project sources are hosted on GitHub. About 20 developers, spread over 4 teams, worked on this one-day project, and one of the things I noticed is that most of us had no more than very basic Git experience. This meant that we ran into a lot of merge conflicts, and solving those is not always easy. Below is a little rundown of the Git learning stages we went through in team MongoDB to deal with this.

Most of us started out with an approach that comes very naturally to long-time Subversion users: just work on master, commit your changes locally, pull to get commits made by other people, resolve any merge conflicts, and push to the central server. Unfortunately, this approach leads to lots of merges and merge conflicts, since other people are doing exactly the same thing. Git is pretty good at auto-merging, but every such merge creates an extra merge commit, and those make your history look like this:

[Image: Git Merge Hell commit history diagram]
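
The effect is easy to reproduce in a throwaway sandbox. The sketch below is not taken from the rally: the developers "alice" and "bob", the file names, and the commit_file helper are made up for illustration, and a bare repository in a temporary directory stands in for the central server. Recent Git versions may refuse a plain `git pull` on diverged branches until you pick a strategy, so the merge behaviour described above is requested explicitly with `--no-rebase`.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"

# Hypothetical helper: create and commit a one-line file named after its argument.
commit_file() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "Add $1.txt"
}

# A bare repository plays the role of the central server.
git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master

# Alice publishes the first commit.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
git remote add origin ../central.git
commit_file base
git push -q -u origin master
cd ..

# Bob clones, commits and pushes before Alice does.
git clone -q central.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
commit_file bob
git push -q
cd ..

# Alice commits locally, then pulls: the histories have diverged, so Git merges.
cd alice
commit_file alice
git pull -q --no-rebase --no-edit
merge_commits=$(git rev-list --merges --count HEAD)
git --no-pager log --graph --oneline
echo "merge commits on master: $merge_commits"
```

With only two developers and one round of commits, the graph already contains a merge commit that carries no change of its own. Multiply that by 20 developers and a full day of pulling and you get the picture above.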

Improvement Attempt 1

Iwein told us about a workflow that he had been using on one of his open source projects. The idea is to do all your commits on a small local development branch while keeping master clean. When it comes time to push to the central server, you go through a convoluted series of steps: update the master branch to the latest central version, rebase your local branch against that, merge the result back to master, and finally push that to the central repository. The workflow looks something like this:

git checkout local-dev
# ... do some commits ...
git checkout master
git pull
git checkout local-dev
git rebase master
git checkout master
git merge local-dev
# ... run your tests again, and if they are green ...
git push

This worked pretty well, but it won't win any beauty prizes, and with all those steps mistakes are bound to happen. On top of that, while you are going through those steps, someone else has probably pushed new commits again. How about something a little simpler?

Improvement Attempt 2

Once again it was Iwein who came up with a much simpler version that brought us almost all the way back to our first approach. We worked on our local master branch as before (or on any branch of our choosing, as long as it tracks the remote master), but when we pulled in changes made by other people, we used rebasing instead of merging, so that our local commits were replayed on top of theirs. Like so:

git checkout master
# ... do some commits ...
git pull --rebase
# ... run your tests again, and if they are green ...
git push

This worked a lot better, and this was the approach that we ended up using for the rest of the day.
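
To see the difference with the merge-based approach, here is a minimal sandbox sketch of the same situation: two developers commit on master at the same time, but the second one pulls with `--rebase`. As before, "alice", "bob", the file names and the commit_file helper are assumptions made up for the example, not part of our project.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"

# Hypothetical helper: create and commit a one-line file named after its argument.
commit_file() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "Add $1.txt"
}

# A bare repository plays the role of the central server.
git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master

# Alice publishes the first commit.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
git remote add origin ../central.git
commit_file base
git push -q -u origin master
cd ..

# Bob clones, commits and pushes before Alice does.
git clone -q central.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
commit_file bob
git push -q
cd ..

# Alice commits locally, then pulls with --rebase: her commit is replayed
# on top of Bob's instead of being merged with it.
cd alice
commit_file alice
git pull -q --rebase
merge_commits=$(git rev-list --merges --count HEAD)
top_subject=$(git log -1 --format=%s)
git --no-pager log --graph --oneline
git push -q
echo "merge commits on master: $merge_commits"
```

The resulting history is a straight line with Alice's commit on top, and the final push is a simple fast-forward. If a team settles on this workflow, `git config pull.rebase true` should make a plain `git pull` behave this way by default.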

Afterwards I did some reading and, like all things Git, there are people who say you should never do the above and there are people who swear by it. The Git Ready site has a good article about it and StackOverflow has an interesting discussion on why or why not to use this technique so you can make up your own minds.
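
One reason for the caution, which also comes up in the comments below, is that a rebase does not move your local commits: it replays them as new commits with new IDs. The sketch below shows this in a sandbox; "alice", "bob", the file names and the commit_file helper are again hypothetical names chosen for the example.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"

# Hypothetical helper: create and commit a one-line file named after its argument.
commit_file() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "Add $1.txt"
}

# Central repository with one published commit from Alice.
git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
git remote add origin ../central.git
commit_file base
git push -q -u origin master
cd ..

# Bob pushes a commit in the meantime.
git clone -q central.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
commit_file bob
git push -q
cd ..

# Alice records the ID of her local commit before and after the rebase.
cd alice
commit_file alice
before=$(git rev-parse HEAD)
git pull -q --rebase
after=$(git rev-parse HEAD)

# The old commit is no longer part of master's history.
if git merge-base --is-ancestor "$before" HEAD; then
  still_in_history=yes
else
  still_in_history=no
fi
echo "before rebase: $before"
echo "after rebase:  $after"
echo "old commit still in history: $still_in_history"
```

As long as the replayed commit had never been pushed, nobody else notices the change of ID. If it had already been shared, other developers would still hold the old commit, which is where the trouble starts.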

Check out the online Pro Git book for a very good Git introduction and/or manual.

Comments (12)

  1. David Gageot

    September 21, 2010 at 8:17 am

    Nice article.

    Rebasing (like merging) can be annoyingly difficult if you have ongoing changes that you don't want to commit yet. You can add these three lines to the [alias] section of your .gitconfig to smooth things out.

    wip = !"git add -A; git ls-files --deleted -z | xargs -0 git rm; git commit -m \"wip\""
    unwip = !"git log -n 1 | grep -q -c wip && git reset HEAD~1"
    pr = !"git fetch;git wip;git rebase origin;git unwip"

  2. Vincent Partington

    September 21, 2010 at 8:35 am

    Hi Age, interesting approaches. But they do assume one thing: that you do not work together on the feature branch. At XebiaLabs we also use feature branches in Git (can't live without 'em!), but we also push them to the origin when we switch partners during pairing, or at the end of the day as a way to back up the precious code. With that setup you have to remove the branch from the origin before rebasing. And rebasing is a lot harder to understand when things get sticky (git rebase --continue hell), so we usually just merge the branch back into master when we're done with it.

    The only time we do rebase is when we need some stuff on the master in the branch. One time we did this was when somebody else introduced some nice test utilities on master that I wanted to use in a feature branch.

  3. Age Mooy

    September 21, 2010 at 9:07 am

    I don't quite understand why you have to remove the branch from the origin before rebasing. Do you use remote tracking branches for your feature branches?

    AFAIK you can work on a feature branch, rebase or merge any commits other people make on that branch, and push to your heart's content. You can of course also rebase (or even merge) in any commits from the master branch until you are ready to merge your feature branch back to master.

  4. Age Mooy

    September 21, 2010 at 9:09 am

    Also check out "easily manage git remote branches" on Git Ready for a handy tool that makes working with remote branches easier.

  5. Jeroen van Erp

    September 21, 2010 at 10:00 am

    Hi Age,

    When you rebase a branch, you are in essence rewriting its history. If someone else shares that branch with you, they'll end up in a bit of a merge hell, as their commit IDs no longer match up with yours. And indeed we do use remote tracking branches.
    Here is the relevant excerpt on rebasing from the Pragmatic Guide To Git:

    Rebase takes a series of commits—normally a branch—and replays them on top of another commit—normally the last commit in another branch. The parent commit changes so all of the commit IDs are recalculated. This can cause problems for other developers who have your code because the IDs don’t match up.
    There’s a simple rule of thumb with git rebase: use it as much as you want, only on local commits. Once you’ve shared changes with another developer, the headache is generally not worth the trouble.

  6. Age Mooy

    September 21, 2010 at 10:33 am

    Aha, of course, makes sense. This just proves that I have forgotten most of my Git knowledge by being forced to use Subversion every day ;(

    But the workflow for dealing with tracking branches would still be essentially the same, right? You make some local commits, you pull with rebase so that the changes other people made end up in front of the changes you made, and then you push. No remote history gets rewritten. Or am I missing something?

  7. Jeroen van Erp

    September 21, 2010 at 10:38 am

    That makes sense indeed, but then you keep your rebases to your branch, and you don't rebase onto master. That should indeed work, even when working distributed on a feature branch.

  8. Iwein Fuld

    September 21, 2010 at 2:44 pm

    One thing that we didn't manage to cover was polishing commits with `git rebase -i` and `git commit --amend`. These are worth looking into when you're keen on showing smart change sets.

  9. Age Mooy

    September 22, 2010 at 9:58 am

    David, isn't that what "git stash" is for?

  10. Agata Przybyszewska

    September 22, 2010 at 10:13 am

    Nice article - best practices of this new technology are still on their way, emerging ...
    How did you make the commit-history-diagram?


  11. Iwein Fuld

    September 22, 2010 at 10:21 am

    @Jeroen, Vincent, I've actually accidentally tried to rebase a remote branch (you can even see it in the diagram that Age uploaded). We didn't practice the "recover from an upstream rebase" because we had better things to do 🙂

  12. Age Mooy

    September 22, 2010 at 10:28 am

    The commit history diagram was generated by GitX, a native Mac client. It is one of the most forked/branched GitHub projects I know and I use the fork by "brotherbard".

    The normal (but ugly) "gitk" or "git gui" tool will generate similar diagrams though.
