Source Control Done Right

2011-09-06

wva:
Why the hell would you want to keep documentation out of the source control system?
What makes a latex/org/... file different from a c/py/.. file? In both cases you want to track and merge changes and see who did what when, and branch as ideas are tried or different "stable" versions are needed...

Documentation is usually written in Word (or using a similarly ill-maintainable program), so in a source-control system it usually needs to be stored in binary form. In such a form it may not be as easy to establish what the differences are between versions.

If you're maintaining your documentation in e.g. TeX, then it may well be more appropriate to use a source-control system for the docs.

Another option is to use a wiki for the documentation.

Sanity · 2011-09-06

First: The revised "Distributed" diagram is not the only way to set up a DVCS. It's by far the easiest, but certainly not the only way. It's also worth mentioning that for a typical project, a local Git repository is actually going to require less space than a Subversion checkout, so storage isn't relevant.

But there's more to it than that. Adding individual shelves to Subversion was the best and worst thing our team did. Best, because it meant that people no longer went days at a time without checking in, afraid of breaking the build with their experimental stuff. That fear only prolonged the experimental stuff, since they couldn't use version control the whole time.

Before this, we were all working off trunk, which has the additional issue that anytime a developer checks something in, they potentially create a merge conflict in someone else's working copy the next time that person does an 'svn up'.

Worst, because Subversion completely fell over when we tried to use it that way. Especially when we added the merge tracking feature, which does make merges less likely to require manual intervention, but also made them take roughly half an hour. At that point, we were afraid to branch more than we had to, because merging would be painful -- but then, it's usually a given that merging is painful.

Git changed all that. After migrating the project to Git, those half-hour merges took seconds. Plus, we get backup for free, and we can work offline easily.

This is a valid concern:

"With the ease of forking, the simplicity of merging, and allure of pulling, it only seems logical to branch by shelf and end up picking up Jenga pieces."

But the real win isn't that each developer has a shelf, it's that even a feature which will take me only a few hours to develop can get its own branch. I can start work on a new feature without worrying that it'll interfere with anything else -- an urgent bug request could come in, and I'll just switch back to my 'master' or 'shelf' (or even 'trunk') branch and work from that, or create a new branch from the latest release if it's urgent enough to rush out a patch.
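
To make that workflow concrete, here is a minimal sketch that drives git from Python in a throwaway repository. It assumes git is installed and on the PATH. The branch name 'feature-x', the file names, and the commit messages are all made up for illustration, and this is only one of several reasonable ways to do it.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run one git command inside `repo` and return its trimmed stdout."""
    cmd = ["git",
           "-c", "user.name=Demo",             # hypothetical identity for the demo
           "-c", "user.email=demo@example.com",
           "-c", "commit.gpgsign=false",
           *args]
    done = subprocess.run(cmd, cwd=repo, check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()


def feature_branch_demo():
    repo = Path(tempfile.mkdtemp())
    git(repo, "init", "-q")
    # The default branch is 'master' or 'main' depending on configuration.
    main = git(repo, "symbolic-ref", "--short", "HEAD")

    (repo / "app.txt").write_text("v1\n")
    git(repo, "add", "app.txt")
    git(repo, "commit", "-q", "-m", "Initial release")

    # A feature that takes a few hours still gets its own branch.
    git(repo, "checkout", "-q", "-b", "feature-x")
    (repo / "feature.txt").write_text("work in progress\n")
    git(repo, "add", "feature.txt")
    git(repo, "commit", "-q", "-m", "Start feature X")

    # An urgent bug comes in: switch back and fix it on the main branch.
    git(repo, "checkout", "-q", main)
    (repo / "app.txt").write_text("v1 with bugfix\n")
    git(repo, "commit", "-q", "-a", "-m", "Fix urgent bug")

    # Later, merge the finished feature back in.
    git(repo, "merge", "-q", "--no-edit", "feature-x")

    files = sorted(p.name for p in repo.iterdir() if p.name != ".git")
    return files, git(repo, "rev-list", "--count", "HEAD")
```

Calling feature_branch_demo() returns the files present after the merge and the number of commits reachable from the main branch. The bugfix never touches the feature branch until the merge.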

There's nothing inherently 'distributed' about any of the above. If non-distributed systems made it easy for a developer to create a new branch on the server, work on it for an hour, and then merge it in seconds, that'd be great. But making merging efficient and easy is a problem every DVCS has had to solve, much more so than a traditional SCM with a central server.

But the distributed pieces matter, too. As mentioned before, developer checkouts are now a sort of backup of your version history. I think it also helps that I'm in the habit of committing as soon as I have anything, which makes it a lot easier to keep commits small and to the point. The nice thing about a DVCS here is that I can easily roll back and re-commit that last change before I push it to anywhere publicly visible -- git's "--amend" option, for example -- so if I catch a typo or something minor before I push, I avoid a lot of "fixed typo" commits. This is even more useful with open source -- I can completely edit my commit history, rearranging things until it looks much more logical than it was, before I submit them for review.
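
As a rough illustration of the "--amend" habit, the sketch below commits a file with a typo, fixes it, and folds the fix into the same commit instead of adding a "fixed typo" commit. It assumes git is installed and on the PATH. The file name, contents, and commit message are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run one git command inside `repo` and return its trimmed stdout."""
    cmd = ["git",
           "-c", "user.name=Demo",             # hypothetical identity for the demo
           "-c", "user.email=demo@example.com",
           "-c", "commit.gpgsign=false",
           *args]
    done = subprocess.run(cmd, cwd=repo, check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()


def amend_demo():
    repo = Path(tempfile.mkdtemp())
    git(repo, "init", "-q")

    # First attempt, committed early, typo included.
    (repo / "notes.txt").write_text("teh quick fix\n")
    git(repo, "add", "notes.txt")
    git(repo, "commit", "-q", "-m", "Add notes")

    # Typo spotted before pushing: correct it and rewrite the last commit.
    (repo / "notes.txt").write_text("the quick fix\n")
    git(repo, "commit", "-q", "-a", "--amend", "--no-edit")

    commit_count = git(repo, "rev-list", "--count", "HEAD")
    committed_text = git(repo, "show", "HEAD:notes.txt")
    return commit_count, committed_text
```

The history still contains a single commit afterwards, and that commit holds the corrected text. This is only safe because nothing has been pushed yet.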

For what it's worth, I also don't agree that every merge needs to be manually reviewed. If you've got a decent test suite, run that. Manual code reviews are useful, but they should be done independently of merges.

I can't really say I'm surprised. What would be surprising is if Alex ever admitted that a hot, new, over-hyped technology actually improved anything. But I have to say, there really are better and worse source control systems, and the new breed of DVCS is a huge improvement over systems like SVN.

2011-09-06

QJo:
Documentation is usually written in Word (or using a similarly ill-maintainable program), so in a source-control system it usually needs to be stored in binary form. In such a form it may not be as easy to establish what the differences are between versions. ... Another option is to use a wiki for the documentation.

Document control systems are usually optimized to deal with these problems. They generally understand the file formats involved (e.g. MS Office's various formats). While they may store each revision as a binary, they have features to take advantage of the file formats. For example, one system we use at work (sorry, I don't know who makes it - it might be an in-house system) has the following features:

Stores metadata (document number, creator, revision, etc.) in MS Office document property fields.
Automatically creates corporate-standard header/footer sections on all documents, making the metadata visible.
Includes facilities for document approval, so managers can sign off on released versions of a document, distinguishing them from works-in-progress.
Automatically generates PDFs from all supported document types (so people viewing files don't need to have the original app installed).

These are things that are really useful (possibly even critical) for document storage, but would be mostly pointless when applied to source code.

2011-09-06

You're doing it wrong!

C:\VersionControl
MyProg.201109060900.c
MyProg.201109060904.c
MyProg.201109060915.c

It's so easy.
