# Source Control Done Right

• ParkinT 2011-09-06 07:30
After reading this I need to be Committed.
Or, at least, Checked-Out.

...use your source control system to properly manage source code changes. That’s why you’re using it in the first place.

That's why I am a huge proponent of 'git'.

Perhaps I need to be Committed.
Or, at least, Checked-Out.
• Wagner 2011-09-06 08:16
ParkinT:
After reading this I need to be Committed.
Or, at least, Checked-Out.

...use your source control system to properly manage source code changes. That’s why you’re using it in the first place.

That's why I am a huge proponent of 'git'.

Perhaps I need to be Committed.
Or, at least, Checked-Out.

Oh my. After reading these horrible puns I feel the need to push a fork into you and pull it out repeatedly.
• Dave 2011-09-06 08:25
I've actually had the conversation about the difference between document management and source control before with co-workers. But I admit to not knowing what to use instead. Any suggestions?
• letseatlunch 2011-09-06 08:25
is it just me, or does anyone else feel like they must be in the twilight zone because this was posted before 8:30?
• wva 2011-09-06 08:26
Why the hell would you want to keep documentation out of the source control system?

What makes a latex/org/... file different from a c/py/.. file? In both cases you want to track and merge changes and see who did what when, and branch as ideas are tried or different "stable" versions are needed...
• Chris Cowdery 2011-09-06 08:28
Dimension one looks like bits to me. Surely they should be broken up into 8-bit blocks to become bytes?

Chris.
• SG_01 2011-09-06 08:30
The companies I've worked for so far, and likely any I'll work for in the future, have always had to store vast amounts of binary data next to the code. And due to requirements it always has to be in the same system.

But that's the game industry for you: assets of one version might not be compatible with assets of a different version, so the assets are linked to the code, and using multiple systems for that would make life a living nightmare!
• QJo 2011-09-06 08:30
A few years back I was tasked with merging version 12 into the by-now completely different product that had been forked off from version 9 (specifically, version 9.4) four years previously. Of course, the QA procedures had since been rewritten, and QA itself had been offshored, so relying on the test schedule that had been used for versions 9.4, 10, 11 and 12 would have been inadequate; 300 pages of QA instructions had to be regenerated from scratch.

There was so much to do and the deadlines were so tight that I ended up working 70-hour weeks on average (all overtime unpaid) and being denied any decent leave for 18 months. Surprise was expressed at my exit interview that one of the reasons I was leaving was that I was tired.

The moral of the story: if you're going to make stupid management decisions, make sure that the engineer who said at the time that it was a stupid idea isn't going to be assigned the task of fixing it, or you may lose him.
• stubola 2011-09-06 08:33
In the codebase I work with, there is a document written in MS Word HTML (don't ask me why; not my decision). While a simple change to a cell in a table looks easy when editing in Word, the underlying HTML changes drastically with that simple edit. Making an SVN patch that touched that file ended up being ridiculous because nobody wanted to wade through the crapload of changes. We eventually moved to more of a wiki system, but I think that's a good reason why some docs should be left out of a source control system.
• QJo 2011-09-06 08:35
wva:
Why the hell would you want to keep documentation out of the source control system?

What makes a latex/org/... file different from a c/py/.. file? In both cases you want to track and merge changes and see who did what when, and branch as ideas are tried or different "stable" versions are needed...

Documentation is usually written in Word (or a similarly ill-maintainable program), so in a source-control system it usually needs to be stored in binary form. In that form it may not be easy to establish the differences between versions.

If you're maintaining your documentation in e.g. TeX, then it may well be more appropriate to use a source-control system for the docs.

Another option is to use a wiki for the documentation.
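
For what it's worth, if binary documents do end up in source control, most modern systems can at least be told not to treat them as text. A minimal sketch with git (the patterns and file name are just examples; the first line is throwaway scaffolding so the sketch runs):

```shell
# scaffolding: a throwaway repo so the sketch actually runs
cd "$(mktemp -d)" && git init -q

# Mark Word documents as binary so git never tries to diff or merge them as text
printf '*.doc  binary\n*.docx binary\n' > .gitattributes
git add .gitattributes

# git now reports the attribute for any matching path:
git check-attr binary -- spec.docx    # → spec.docx: binary: set
```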
• pjt33 2011-09-06 08:43
stubola:
In the codebase I work with, there is a document written in MS Word HTML (don't ask me why; not my decision). While a simple change to a cell in a table looks easy when editing with Word, the underlying HTML seemed to change very drastically with that simple change. Making a patch in SVN with a change of that file ended up just being ridiculous because nobody wanted to go through the crapload of changes.

Sounds remarkably similar to my experience with Visual Studio's designer tools. The only way to get readable diffs is to hand-edit the autogenerated code which you're not supposed to touch.
• The "Real" WTF 2011-09-06 08:45
mm...is it SourceSafe (who uses source safe??)

or is it writing an educational article about source control in a website dedicated to bad practices?

i think we're long overdue an article about WTF practices.
taken from select implementations on this website through the years and put together into one incoherent collection of WTF practices.

also - how about a WTF rating:
you often hear people say "i worked on a really crappy system"
or "you wouldn't believe how bad the code here is"

but these claims are often not backed up due to "copyright" or "theft of source code" or something.
well - i have a solution:
WTF rating - a comparable scale of WTFP (WTF points).
the rating tries to capture the level of WTFness (bugs, crashes, user complaints, complexity and incomprehensibility) introduced by each WTF.
Points accumulate, reflecting the fact that the larger the system the more WTFs you will find, and the worse off it is.

i.e.
1000 WTFP - Writing your own version of a bool (with that long missing 3rd option)
10 WTFP - giving a variable a meaningless name
20 WTFP - giving a variable an opposite name
50 WTFP - giving a variable an incomprehensible name (NotNonInvisible)

you can also give points to WTFP (WTF Practices)
i.e.
300 - Hiring experts that take months to write their own name in code
500 - Hiring cheap outsourcers that build you a crappy system and then paying 3 times the price to have it fixed
6000 - using ACCESS as a database
..

Finally we can award WTFP (WTF Prizes) to the system with the highest WTFP and WTFP.
• The poop... of DOOM! 2011-09-06 08:58
The "Real" WTF:
6000 - using ACCESS as a database

7000 WTFP for using VB
7000 WTFP for using PHP
• Paratus 2011-09-06 09:03
The poop... of DOOM!:
The "Real" WTF:

6000 - using ACCESS as a database

7000 WTFP for using VB
7000 WTFP for using PHP

VB and PHP are certainly RWTFs, but there's no way that they're worse than using Access.
• The poop... of DOOM! 2011-09-06 09:07
Paratus:
The poop... of DOOM!:
The "Real" WTF:

6000 - using ACCESS as a database

7000 WTFP for using VB
7000 WTFP for using PHP

VB and PHP are certainly RWTFs, but there's no way that they're worse than using Access.

He said using Access as a database, so you can combine that.

A PHP application calling an Access database would result in 13000 WTFP (and a developer who's been committed to a mental hospital)
• the beholder 2011-09-06 09:08
Wait, what? The other universe's Alex is the Bizarro one??
• QJo 2011-09-06 09:11
The poop... of DOOM!:
7000 WTFP for using VB
7000 WTFP for using PHP

10^4 WTFP for storing passwords and other sensitive info in plaintext in a database and only thinking you've encoded them

2 x 10^4 WTFP for consistently releasing untested code changes to Production so as to save time assuaging the rage of the customer who's fallen foul of the last swathe of untested code changes that were released to Production

10^5 WTFP for balancing the mission-critical servers on the cistern in the women's bathroom of the office next door and/or frying your breakfast on the chassis of a similarly-deployed server in the office kitchen

10^6 WTFP for dismissing your most capable staff for daring to suggest that your arse-brained scheme may need some further refinement, nay, even a redesign, before rushing ahead with implementation

10^7 WTFP for turning up at an interview for your perfect job wearing brown shoes
• QJo 2011-09-06 09:15
QJo:
10^7 WTFP for turning up at an interview for your perfect job wearing brown shoes

... and of course 10^9 WTFP for confusion of "its" and "it's" in a comment in a program module
• Anonymous Cow-Herd 2011-09-06 09:24
QJo:
10^7 WTFP for turning up at an interview for your perfect job wearing brown shoes

Pff, I once overslept on an interview day, and as a result arrived unwashed, unshaved, with an unironed shirt and shoes that had been dusted off rather than polished, and 20 minutes late. I got the job and started the following Monday.

Maybe Graham's number WTFP for using SourceSafe for anything other than to demonstrate to management that you're using it.
• The poop... of DOOM! 2011-09-06 09:28
500 WTFP for having the only developer on the project write the tests after development, then having him run the tests and calling that QA (not to mention being surprised there are still bugs in the thing, since it's been tested!)
• L. 2011-09-06 09:38
Over 9000 WTFP for using MySQL in a project where a database is really required.

And... I strongly disagree with putting PHP (5+) on the same level as ASP. I believe we can safely assume WTFP(ASP/VB) = 10 * WTFP(PHP).

But if we're into language bashing (which imo is not a serious WTF... except asp/vb/foxpro/cobol/...), why not go for a good old religious war and state the following:

if ($language->creator == 'microsoft') { $language->wtfp = nanotime(); }
• Sanity 2011-09-06 09:39
First: The revised "Distributed" diagram is not the only way to set up a DVCS. It's by far the easiest, but certainly not the only way. It's also worth mentioning that for a typical project, a local Git repository is actually going to require less space than a Subversion checkout, so storage isn't relevant.

But there's more to it than that. Adding individual shelves to Subversion was the best and worst thing our team did. Best, because people no longer went days at a time without checking in, afraid to break the build with their experimental stuff; keeping that work outside version control had only prolonged it.

Before this, we were all working off trunk, which has the additional issue that any time a developer checks something in, they potentially create a merge conflict with someone else's working copy on the next 'svn up'.

Worst, because Subversion completely fell over when we tried to use it that way. Especially when we added the merge tracking feature, which does make merges less likely to require manual intervention, but also made them take roughly half an hour. At that point, we were afraid to branch more than we had to, because merging would be painful -- but then, it's usually a given that merging is painful.

Git changed all that. After migrating the project to Git, those half-hour merges took seconds. Plus, we get backup for free, and we can work offline easily.

This is a valid concern:

"With the ease of forking, the simplicity of merging, and allure of pulling, it only seems logical to branch by shelf and end up picking up Jenga pieces."

But the real win isn't that each developer has a shelf, it's that even a feature which will take me only a few hours to develop can get its own branch. I can start work on a new feature without worrying that it'll interfere with anything else -- an urgent bug request could come in, and I'll just switch back to my 'master' or 'shelf' (or even 'trunk') branch and work from that, or create a new branch from the latest release if it's urgent enough to rush out a patch.
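
That cheap-branch workflow looks roughly like this in git (branch names and the v1.2 tag are hypothetical; the first two lines are throwaway scaffolding so the sketch runs):

```shell
# scaffolding: a throwaway repo with one commit and a release tag
cd "$(mktemp -d)" && git init -q && git config user.email you@example.com && git config user.name you
echo base > f.txt && git add f.txt && git commit -qm "initial" && git tag v1.2

# Start a small feature on its own branch
git checkout -qb quick-feature
echo work >> f.txt
git add -A && git commit -qm "WIP: quick feature"

# An urgent bug report comes in: switch back instantly...
git checkout -q -                  # return to the branch we came from
# ...or patch the latest release directly (v1.2 is a hypothetical tag)
git checkout -qb hotfix/urgent v1.2
```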

There's nothing inherently 'distributed' about any of the above. If non-distributed systems made it easy for a developer to create a new branch on the server, work on it for an hour, and then merge it in seconds, that'd be great. But making merging efficient and easy is a problem every DVCS has had to solve, much more so than a traditional SCM with a central server.

But the distributed pieces matter, too. As mentioned before, developer checkouts are now a sort of backup of your version history. I think it also helps that I'm in the habit of committing as soon as I have anything, which makes it a lot easier to keep commits small and to the point. The nice thing about a DVCS here is that I can easily roll back and re-commit that last change before I push it to anywhere publicly visible -- git's "--amend" option, for example -- so if I catch a typo or something minor before I push, I avoid a lot of "fixed typo" commits. This is even more useful with open source -- I can completely edit my commit history, rearranging things until it looks much more logical than it was, before I submit them for review.
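
The pre-push cleanup described above, sketched with git's --amend (file name and messages are made up; the first line is scaffolding so the sketch runs):

```shell
# scaffolding: a throwaway repo so the sketch actually runs
cd "$(mktemp -d)" && git init -q && git config user.email you@example.com && git config user.name you

echo "release notes with a typo" > notes.txt
git add notes.txt
git commit -qm "Add release notes"

# Spot the mistake before pushing: fix the file...
echo "release notes, corrected" > notes.txt
git add notes.txt
# ...and fold the fix into the previous commit instead of adding a new one.
# History stays clean: no extra "fixed typo" commit.
git commit -q --amend --no-edit
```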

For what it's worth, I also don't agree that every merge needs to be manually reviewed. If you've got a decent test suite, run that. Manual code reviews are useful, but they should be done independently of merges.

I can't really say I'm surprised. What would be surprising is if Alex ever admitted that a hot, new, over-hyped technology actually improved anything. But I have to say, there really are better and worse source control systems, and the new breed of DVCS is a huge improvement over systems like SVN.
• David C. 2011-09-06 09:45
QJo:
Documentation is usually written in Word (or using a similarly ill-maintainable program) so in a source-control system usually need to be stored in binary form. In such a form it may not be as easy to establish what the differences are between versions.
...
Another option is to use a wiki for the documentation.
Document control systems are usually optimized to deal with these problems. They generally understand the file formats involved (e.g. MS Office's various formats). While they may store each revision as a binary, they have features to take advantage of the file formats. For example, one system we use at work (sorry, I don't know who makes it - it might be an in-house system) has the following features:

* Stores metadata (document number, creator, revision, etc.) in MS Office document property fields.

* Automatically creates corporate-standard header/footer sections on all documents, making the metadata visible.

* Includes facilities for document approval, so managers can sign-off on released versions of a document, distinguishing it from works-in-progress.

* Automatic generation of PDFs from all supported document types (so people viewing files don't need to have the original app installed.)

These are things that are really useful (possibly even critical) for document storage, but would be mostly pointless when applied to source code.
• QJo 2011-09-06 09:46
Anonymous Cow-Herd:
QJo:
10^7 WTFP for turning up at an interview for your perfect job wearing brown shoes

Pff, I once overslept on an interview day, and as a result arrived unwashed, unshaved, with an unironed shirt and shoes that had been dusted off rather than polished, and 20 minutes late. I got the job and started the following Monday.

Maybe Graham's number WTFP for using SourceSafe for anything other than to demonstrate to management that you're using it.

Any advance on Graham's Number? Who's first up for suggesting a particular WTF is worth Aleph-Null WTFP?
• Decoded 2011-09-06 09:49
100110010010110001101111111100001111100111011110
110011010001001010111000010111011111011110001101
100001011101111101111000110110011001001011000110
100110010010110001101111111100001111100111011110
110011010001001010111000010111011111011110001101
100001011101111101111000110110011001001011000110
100110010010110001101111111100001111100111011110
110011010001001010111000010111011111011110001101
100001011101111101111000110110011001001011000110
100110010010110001101111111100001111100111011110
110011010001001010111000010111011111011110001101
100001011101111101111000110110011001001011000110

10011001 00101100 01101111 11110000 11111001 11011110
11001101 00010010 10111000 01011101 11110111 10001101
10000101 11011111 01111000 11011001 10010010 11000110
10011001 00101100 01101111 11110000 11111001 11011110
11001101 00010010 10111000 01011101 11110111 10001101
10000101 11011111 01111000 11011001 10010010 11000110
10011001 00101100 01101111 11110000 11111001 11011110
11001101 00010010 10111000 01011101 11110111 10001101
10000101 11011111 01111000 11011001 10010010 11000110
10011001 00101100 01101111 11110000 11111001 11011110
11001101 00010010 10111000 01011101 11110111 10001101
10000101 11011111 01111000 11011001 10010010 11000110

153 44 111 240 249 222
205 18 184 93 247 141
133 223 120 217 146 198
153 44 111 240 249 222
205 18 184 93 247 141
133 223 120 217 146 198
153 44 111 240 249 222
205 18 184 93 247 141
133 223 120 217 146 198
153 44 111 240 249 222
205 18 184 93 247 141
133 223 120 217 146 198

™ , o ð ù Þ
Í  ¸ ] ÷ 
… ß x Ù ’ Æ
™ , o ð ù Þ
Í  ¸ ] ÷ 
… ß x Ù ’ Æ
™ , o ð ù Þ
Í  ¸ ] ÷ 
… ß x Ù ’ Æ
™ , o ð ù Þ
Í  ¸ ] ÷ 
… ß x Ù ’ Æ

I'm so very disappointed.

~Nick~
• David C. 2011-09-06 09:59
The most important thing, IMO, about any VCS is its ability to make merges as painless as possible.

Many offer very little. They may generate diffs and point out conflicts, but they make you resolve even the most trivial of conflicts. When you've created a private branch/shelf and need to merge in others' changes without damaging your own edits, a good merge system is critical.

Unfortunately, most systems are very bad at this, including every free system I've used.

Without (hopefully) sounding like an advertisement, I've found that the commercial product, Perforce, is the only one that gets this right. The server tracks a file's entire revision history, through all of its permutations of branches (and there may be hundreds, for some key files belonging to large projects.) When you need to do a merge (which they call "integrate"), the system uses the version history to find the common ancestor between your file and the one you're merging in (even if this common ancestor is separated by dozens of intermediate branches.) It then does a 3-way diff on the files (yours, the one you're merging in, and the common ancestor), presenting all conflicts as all three versions of the conflicting lines. Sections where only one source (yours or the merged-in version) differ from the ancestor are automatically merged without any user intervention. (You can, of course, still review the merged changes and fix any mistakes, which still happen occasionally.)

With this system, you can actually merge any branch into any other branch, not just into direct parent/child branches. The server will track the operations and make the right thing happen, even if the branch/merge history starts looking like a tangled ball of rubber bands.
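
The ancestor-based merge described here isn't Perforce-specific; the core operation can be sketched with the standard diff3 tool (file names and contents are hypothetical):

```shell
cd "$(mktemp -d)"   # scaffolding: work in a throwaway directory

# Three hypothetical versions of the same file:
printf 'alpha\nbeta\ngamma\n' > ancestor.txt   # common ancestor from the history
printf 'ALPHA\nbeta\ngamma\n' > mine.txt       # my change (line 1)
printf 'alpha\nbeta\nGAMMA\n' > theirs.txt     # their change (line 3)

# -m merges automatically wherever only one side differs from the ancestor;
# genuine conflicts would be emitted with conflict markers instead.
diff3 -m mine.txt ancestor.txt theirs.txt > merged.txt
cat merged.txt    # → ALPHA / beta / GAMMA
```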

It's not a perfect system. You still sometimes have to manually merge files, but they manage to automate all of the easy situations, so you only have to manually merge the really nasty changes that no system is likely to be able to handle automatically.

(No, I don't work for Perforce, but my employer has used their product for many years and I'm very happy with it.)
• Poo 2011-09-06 10:00
Dave:
I've actually had the conversation about the difference between document management and source control before with co-workers. But I admit to not knowing what to use instead. Any suggestions?

Alfresco can be used for document management with versioning of documents.
• Poo 2011-09-06 10:07
David C.:
The most important thing, IMO, about any VCS is its ability to make merges as painless as possible.
...

So, have you tried git?
• me 2011-09-06 10:08
QJo:
Documentation is usually written in Word (or using a similarly ill-maintainable program)

...and that, as they say, is the real WTF...
• QJo 2011-09-06 10:22
me:
QJo:
Documentation is usually written in Word (or using a similarly ill-maintainable program)

...and that, as they say, is the real WTF...

No argument against that point of view from me.
• egon 2011-09-06 10:23
I tend to like distributed systems more...

A local copy of the repo is most useful when the main server is down or you have no network. In a centralized setup you can't commit, switch/create/merge (working) branches, or even see the logs in that situation; what good is a source control system if you can't use it?

Also, I've always thought of "commit" and "push" as "commit" and "inflict". One just stores data; the other inflicts it on other people. That extra step usually makes me think twice before sending stuff to the main server.
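
The commit-versus-"inflict" distinction, in git terms (the repo layout here is throwaway scaffolding):

```shell
# scaffolding: a throwaway "server" repo plus a working clone
cd "$(mktemp -d)"
git init -q --bare server.git
git clone -q server.git work && cd work
git config user.email you@example.com && git config user.name you

# "commit": record the change in the local repository only
echo experiment > feature.txt
git add feature.txt
git commit -qm "Experimental change"

# "inflict": publish the local commits to the shared server for everyone else
git push -q origin HEAD
```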
• Hello 2011-09-06 10:31
QJo:

There was so much to do and the deadlines were so tight it resulted in me working on the average of 70 hour weeks (all overtime unpaid) and being denied any decent leave for 18 months. Surprise was expressed at my exit interview that one of the reasons I was leaving was that I was tired.

You had a choice: a choice to quit immediately when faced with 30 hours of overtime each week. But you didn't, and then you complain afterwards. Perhaps now you have learned to stick up for yourself.
• lyates 2011-09-06 10:50
Thanks, Alex, for this back to basics SCM tutorial.
It's concise and very informative.

-Larry
• QJo 2011-09-06 10:53
Hello:
QJo:

There was so much to do and the deadlines were so tight it resulted in me working on the average of 70 hour weeks (all overtime unpaid) and being denied any decent leave for 18 months. Surprise was expressed at my exit interview that one of the reasons I was leaving was that I was tired.

You had a choice, a choice to quit immediately when faced with 30 hours overtime each week. But you didn't and then you complain afterwards. Perhaps now you have learned to stick up for yourself.

Good call. You're dead right, I should have jumped ship way before I did. But (a) I hadn't realised at the time how heavy it was going to be, and (b) it took some time to find somewhere to jump to.

Bear in mind that when you have responsibilities and dependents that it is absolutely your job to look after, you cannot just up and leave a job without preparing the ground first. When you're a bit older maybe you'll learn this.
• The "Real" WTF 2011-09-06 11:06
L.:
Over 9000 WTFP for using MySQL in a project where a database is really required.

And .. I strongly disagree with putting php (5+) on the same level as ASP . I believe we can safely assume WTFP(ASP/vb)=10*WTFP(php).

But if we're into the language bashing (which imo is not serious WTF .. except asp/vb/foxpro/cobol/...) why not go for a good old religion war and state the following:

if($language.creator=='microsoft'){$language.wtfp=nanotime();}

I think most languages are not inherently WTFP worthy.

except for perl.

it may be the only "real" language whose 99 Bottles code looks somewhat similar to brainf*ck

oh and why is my comment spam?? what did i do wrong?
• ac 2011-09-06 11:24
QJo:
Good call. You're dead right, I should have jumped ship way before I did. But (a) I hadn't realised at the time how heavy it was going to be, (b) it took some time to find somewhere to jump to.

Bear in mind that when you have responsibilities and dependents that are absolutely your job to look after, you can not just walk up and leave a job without preparing the ground first. When you're a bit older maybe you'll learn this.

Bear in mind that after doing a lot of work for free (30 hours of overtime per week for 18 months is over 2000 hours), your employers may already not think very highly of you. Sure, they'll think you're a hard-working fellow and might say so if your prospective new employer makes reference calls. More likely, though, they'll laugh their asses off because you gave them so much work for free, and soon forget you ever existed. They most likely know very well that they're asking you to do two people's jobs for one salary. If you jump ship and drop all those "responsibilities and dependents" once you realize the task is totally unreasonable, it probably won't change what they think of you.

There's one thing I learned early on in my career, by looking at the people around me. Those who put in all that (unpaid) overtime for someone else leave a huge part of their life between four walls and get nothing but a nice reference call, if your new employer actually does call for references.
• trtrwtf 2011-09-06 11:24
QJo:

Documentation is usually written in Word (or using a similarly ill-maintainable program) so in a source-control system usually need to be stored in binary form. In such a form it may not be as easy to establish what the differences are between versions.

There is some documentation written in Word, but it's hardly the industry standard. XML markup of various sorts (e.g., DocBook) is much more reasonable, since it can generate web, PDF and other output forms. I believe all of the popular authoring systems save as flavors of XML.
• QJo 2011-09-06 11:27
The "Real" WTF:
I think most languages are not inherently WTFP worthy.

except for perl.

APL is arguably WTF-worthy.
• Dave 2011-09-06 11:28
"The third – and the least used – special type of Fork operation is called Shelf."

From the point of view of the master repository of a large project, aren't most developer repositories and branches in a distributed system shelves?

My branch of a project on GitHub will never be used to make a build of the official library for instance, so aside from my PoV (and perhaps others working with me) my repo/branch is a shelf.

Well, it'll never be used _directly_ to make an official build at least - if I submit a patch from my code that is accepted, or the people with access to the master repo/branch otherwise choose to pull in some of my changes, then my branch has an indirect effect on the "real" build. But as that indirect effect is gated by the master repo's owners, I'd say they are still in control of that, and my copy is still a shelf from their PoV (and the PoV of the public at large, if they use the library) by the definition in this article.

That would make the shelf a very common type of fork, rather than the least used one.
• Zetetic 2011-09-06 11:41
$perl -ne 'chomp; print pack "B*",$_' <<END | ent -b
100110010010110001101111111100001111100111011110
110011010001001010111000010111011111011110001101
100001011101111101111000110110011001001011000110
100110010010110001101111111100001111100111011110
110011010001001010111000010111011111011110001101
100001011101111101111000110110011001001011000110
100110010010110001101111111100001111100111011110
110011010001001010111000010111011111011110001101
100001011101111101111000110110011001001011000110
100110010010110001101111111100001111100111011110
110011010001001010111000010111011111011110001101
100001011101111101111000110110011001001011000110
END
Entropy = 0.986040 bits per bit.

Optimum compression would reduce the size
of this 576 bit file by 1 percent.

Chi square distribution for 576 samples is 11.11, and randomly
would exceed this value 0.05 percent of the times.

Arithmetic mean value of data bits is 0.5694 (0.5 = random).
Monte Carlo value for Pi is 2.666666667 (error 15.12 percent).
Serial correlation coefficient is 0.036979 (totally uncorrelated = 0.0).
• blah 2011-09-06 12:17
Of course, knowing how to merge doesn’t exactly help you understand why parallel repositories needing merging in the first place.
// Maybe I needing this
• RRDY 2011-09-06 12:38
ac:
QJo:
Good call. You're dead right, I should have jumped ship way before I did. But (a) I hadn't realised at the time how heavy it was going to be, (b) it took some time to find somewhere to jump to.

Bear in mind that when you have responsibilities and dependents that are absolutely your job to look after, you can not just walk up and leave a job without preparing the ground first. When you're a bit older maybe you'll learn this.

Bear in mind that if you do a lot of work for free (e.g. 30 hours of overtime per week for 18 months > 2000 hours), your employers may already not think very highly of you. Sure, they'll think you're a hard-working fellow and might recommend that your prospective new employer make reference calls. However, they'll most likely laugh their asses off because you gave them so much work for free, and soon forget that you ever existed. They most likely know very well that they're asking you to do two people's jobs for one salary. If you jump ship and drop all those "responsibilities and dependents" once you realize the task is totally unreasonable, it probably won't change what they think of you.

There's one thing I learned early on in my career, by looking at the people around me. Those who put in all that (unpaid) overtime for someone else leave a huge part of their life between four walls and get nothing but a nice reference call, if the new employer actually does call for references.

I don't think you understood what QJo meant by 'responsibilities and dependents'. Not the JOB responsibilities. Spouse and kids, more likely.
• frits 2011-09-06 12:40
RRDY:
ac:
QJo:
Good call. You're dead right, I should have jumped ship way before I did. But (a) I hadn't realised at the time how heavy it was going to be, (b) it took some time to find somewhere to jump to.

Bear in mind that when you have responsibilities and dependents that are absolutely your job to look after, you can not just walk up and leave a job without preparing the ground first. When you're a bit older maybe you'll learn this.

Bear in mind that if you do a lot of work for free (e.g. 30 hours of overtime per week for 18 months > 2000 hours), your employers may already not think very highly of you. Sure, they'll think you're a hard-working fellow and might recommend that your prospective new employer make reference calls. However, they'll most likely laugh their asses off because you gave them so much work for free, and soon forget that you ever existed. They most likely know very well that they're asking you to do two people's jobs for one salary. If you jump ship and drop all those "responsibilities and dependents" once you realize the task is totally unreasonable, it probably won't change what they think of you.

There's one thing I learned early on in my career, by looking at the people around me. Those who put in all that (unpaid) overtime for someone else leave a huge part of their life between four walls and get nothing but a nice reference call, if the new employer actually does call for references.

I don't think you understood what QJo meant by 'responsibilities and dependents'. Not the JOB responsibilities. Spouse and kids, more likely.

Exactly. Anyone who doesn't realize this is a n00b at life, or maybe a Forever Alone Guy.
• tomhanks 2011-09-06 13:22
QJo:
wva:
Why the hell would you want to keep documentation out of the source control system?

What makes a latex/org/... file different from a c/py/.. file? In both cases you want to track and merge changes and see who did what when, and branch as ideas are tried or different "stable" versions are needed...

Documentation is usually written in Word (or using a similarly ill-maintainable program), so in a source-control system it usually needs to be stored in binary form. In such a form it may not be as easy to establish what the differences are between versions.

If you're maintaining your documentation in e.g. TeX, then it may well be more appropriate to use a source-control system for the docs.

Another option is to use a wiki for the documentation.

trwtf is writing documentation in the first place

am i right?
• snoofle 2011-09-06 13:37
QJo:
30 hours weekly unpaid overtime for a long time
We've all done that early on in our careers. Once burned, twice shy. Once you learn to see it coming you can make for the exit long before you're exhausted (who wants to show up at a new job the first day - needing a vacation?)

It's the fools who keep doing it over and over that cause management to continue this practice of abusing and then discarding employees.
• Nagesh 2011-09-06 13:52
I am not to be using sorce control for porpuses of incriminations.
• Smitty 2011-09-06 14:03

Bear in mind that when you have responsibilities and dependents that are absolutely your job to look after, you can not just walk up and leave a job without preparing the ground first. When you're a bit older maybe you'll learn this.

QFT.
• Part-time dev 2011-09-06 14:35
I rather suspect that Alex doesn't understand distributed source control systems like Git and Mercurial.

Once you figure out how it actually works and stop thinking in terms of svn and SourceSafe then you start to realise why it really is the best way to work.

That's not surprising, because it really doesn't behave the same way as traditional source control.

Joel Spolsky said a while ago “To me, the fact that they make branching and merging easier just means that your coworkers are more likely to branch and merge, and you’re more likely to be confused.”

He later realised that he was totally wrong - http://www.joelonsoftware.com/items/2010/03/17.html

Alex seems to be making a similar mistake here.

One of the really key things I've found with Git is that you never have to 'check-out' a file and two people working on the same file is rarely an issue.

For example:

I'm told "Fix bug A", my colleague is told "Fix bug B".

We both need to work on the same file to fix these bugs for whatever reason. (Maybe it's WTF code, maybe not)

With git, we can both go ahead and do our bugfixes on our local repos, then push the changes to each other once we're each happy that our stuff is good.

If I'm slower than her, when I sync with her to merge in her changes (or a central server) I magically get her fixes - unless we both edit the same method, it Just Works.
- Much of the time it's fine as long as we haven't both edited the same exact line.

With VSS, if she gets to the repository before me then I can't do anything. In other ones, it's extremely easy for either of us to accidentally wipe out the other's work.
(Sync A, Sync B, Commit A, Commit B - A's commit just vanished!)

That's really annoying.

Distributed source control means that every single developer automatically has their own 'shelf' to work with, and that merging the shelves to make 'trunk' is usually quite easy.

As mentioned earlier, this also means devs are more likely to keep 'commits' down to very small, focused changes - which are much easier to deal with later on.

-Plus you can all keep working - including rollbacks, 'forks' and 'labelling' - when the network (or server) is down and know that it's not going to be painful to put it all together when the sysadmins get it fixed.
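That "it Just Works" merge is easy to demonstrate. Below is a minimal sketch using a single throwaway local repo, with two branches standing in for the two developers' clones; all file, branch, and user names are invented for illustration:

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email me@example.com
git config user.name me

# A file with two well-separated functions, so the two fixes don't touch.
cat > app.c <<'EOF'
int bug_a(void) {
    return 0; /* broken */
}

/* ...unrelated code keeping the hunks apart... */

int bug_b(void) {
    return 0; /* broken */
}
EOF
git add app.c; git commit -qm "initial"
base=$(git rev-parse --abbrev-ref HEAD)

git checkout -qb fix-a                   # "my" repo: fix bug A (line 2)
sed -i '2s/return 0/return 1/' app.c
git commit -qam "Fix bug A"

git checkout -q "$base"
git checkout -qb fix-b                   # "her" repo: fix bug B (line 8)
sed -i '8s/return 0/return 2/' app.c
git commit -qam "Fix bug B"

git merge -q --no-edit fix-a             # both fixes land; no manual merge needed
```

Because the two fixes touch well-separated hunks of the same file, git's three-way merge combines them without asking anyone to resolve anything.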
• annie the moose 2011-09-06 14:35
You're doing it wrong!

C:\VersionControl
MyProg.201109060900.c
MyProg.201109060904.c
MyProg.201109060915.c

It's so easy.
• Abso 2011-09-06 14:42
Part-time dev:
With VSS, if she gets to the repository before me then I can't do anything. In other ones, it's extremely easy for either of us to accidentally wipe out the other's work.
(Sync A, Sync B, Commit A, Commit B - A's commit just vanished!)

That's really annoying.

VSS may be that much of a WTF, but that doesn't mean that every non-distributed source control system is. In this specific situation, even CVS will refuse to commit B's changes until B updates/syncs again.
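Git's centralized-style usage has the same safeguard: a push that would discard someone else's commit is refused as non-fast-forward until you sync. A throwaway sketch (repo and user names are invented):

```shell
set -e
work=$(mktemp -d); cd "$work"
git init -q --bare shared.git            # stands in for the central server
git clone -q shared.git alice 2>/dev/null
git clone -q shared.git bob 2>/dev/null
for d in alice bob; do
  git -C "$d" config user.email "$d@example.com"
  git -C "$d" config user.name "$d"
done

git -C alice commit -q --allow-empty -m "A's commit"
branch=$(git -C alice rev-parse --abbrev-ref HEAD)
git -C alice push -q origin "$branch"    # A commits and pushes first

git -C bob commit -q --allow-empty -m "B's commit"   # B never synced
if git -C bob push -q origin HEAD:"$branch" 2>/dev/null; then
  echo "push accepted"
else
  echo "push refused until B syncs"      # A's commit cannot silently vanish
fi
```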
• gnasher729 2011-09-06 14:43
QJo:
There was so much to do and the deadlines were so tight it resulted in me working on the average of 70 hour weeks (all overtime unpaid) and being denied any decent leave for 18 months. Surprise was expressed at my exit interview that one of the reasons I was leaving was that I was tired.

If you had worked 40 hours a week and told them for 18 months that everything was going to plan, they would have got exactly what they paid for, you would have enjoyed those 18 months a lot more, and you would have found a new job just the same.
• Ghost of Nagesh 2011-09-06 14:43
The poop... of DOOM!:
Paratus:
The poop... of DOOM!:
The "Real" WTF:

6000 - using ACCESS as a database

7000 WTFP for using VB
7000 WTFP for using PHP

VB and PHP are certainly RWTFs, but there's no way that they're worse than using Access.

He said using Access as a database, so you can combine that.

A PHP application calling an Access database would result in 13000 WTFP (and a developer who's been committed to a mental hospital)

And hosting the system on Unix and the DB on windows. *WIn!!!!*
• Matt Westwood 2011-09-06 14:59
QJo:
Anonymous Cow-Herd:
QJo:
10^7 WTFP for turning up at an interview for your perfect job wearing brown shoes

Pff, I once overslept on an interview day, and as a result arrived unwashed, unshaved, with an unironed shirt and shoes that had been dusted off rather than polished, and 20 minutes late. I got the job and started the following Monday.

Maybe Graham's number WTFP for using SourceSafe for anything other than to demonstrate to management that you're using it.

Any advance on Graham's Number! Who's first up for suggesting a particular WTF is worth Aleph-Null WTFP?

Um okay, got one. Being passed over for promotion in favour of the CEO's nephew. Boring and pedestrian, but, yeah.
• Abso 2011-09-06 15:40
So what's the "right" way to deal with features that aren't really assigned to a specific release? At any given time, the project I'm working on has several half-finished features which will be shipped either in the next release (if they're finished in time) or in the one after.

How we currently handle this is similar to the anti-pattern Alex describes, with "dev" and "main" branches/shelves. But within each of those, testing and such is done on immutable, labelled builds. We seem to avoid the Jenga pattern most of the time by associating commits with bugs and merging all the commits for a given bug at the same time.

We also have a branch for each release, but it's not created until the release is mostly finished.

I guess the obvious fix would be to improve our release-defining, but that's not likely to actually happen.

How do other people handle this sort of thing?
Part-time dev:
One of the really key things I've found with Git is that you never have to 'check-out' a file and two people working on the same file is rarely an issue.

This has nothing to do with distributed vs centralized: if a source control system has a locking mechanism, then a "Check-out/Edit/Check-in" style of development is possible. Some systems (Microsoft Visual SourceSafe) mandate locking, whereas others (SourceGear Vault) don't.

Part-time dev:
If I'm slower than her, when I sync with her to merge in her changes (or a central server) I magically get her fixes - unless we both edit the same method, it Just Works.

Again, nothing new. This is the whole idea behind Edit/Merge/Commit style of development. Distributed doesn't make merging any easier.

Part-time dev:
you can all keep working - including rollbacks, 'forks' and 'labelling' - when the network (or server) is down and know that it's not going to be painful to put it all together when the sysadmins get it fixed.

Well, there's no good reason to fork (label, branch, or shelf) your code "offline", so you're really only left with one thing: viewing history. There are some advantages to that.
• Matt Westwood 2011-09-06 15:48
snoofle:
QJo:
30 hours weekly unpaid overtime for a long time
We've all done that early on in our careers. Once burned, twice shy. Once you learn to see it coming you can make for the exit long before you're exhausted (who wants to show up at a new job the first day - needing a vacation?)

It's the fools who keep doing it over and over that cause management to continue this practice of abusing and then discarding employees.

Yeah but if you've gone 18 months without leave you've built up enough to take it all at once and have a nice loooooong rest before starting at the new place all fresh and, er, having forgotten how to get out of bed at 6 a.m. Er, yeah, I can see why that wouldn't work.
• Part-time dev 2011-09-06 15:58
Abso:
Part-time dev:
With VSS, if she gets to the repository before me then I can't do anything. In other ones, it's extremely easy for either of us to accidentally wipe out the other's work.
(Sync A, Sync B, Commit A, Commit B - A's commit just vanished!)

That's really annoying.

VSS may be that much of a WTF, but that doesn't mean that every non-distributed source control system is. In this specific situation, even CVS will refuse to commit B's changes until B updates/syncs again.
Interesting, that's better than I'd been led to believe.

That's still going to be annoying though, and can easily become very nasty with a large development team.
Sync - Commit BONG (A beat you to it) - Sync - Commit BONG (C) - Sync - Commit... Finally! What was I doing again?

So VSS really is the worst application ever conceived for this...
I was starting to think that maybe we were just using it wrong! (More ammo for switching to something better.)
• boomzilla 2011-09-06 16:02

Part-time dev:
If I'm slower than her, when I sync with her to merge in her changes (or a central server) I magically get her fixes - unless we both edit the same method, it Just Works.

Again, nothing new. This is the whole idea behind Edit/Merge/Commit style of development. Distributed doesn't make merging any easier.

That's true, except that most (all?) DVCSes have figured out that Edit/Merge/Commit sucks. Much better to Edit/Commit/Merge with anonymous branches where necessary.

This way, if we happen to work on the same file, we aren't forced to merge prior to committing our fixes. We can solve our problems in isolation, and then worry about combining them. Much simpler.

CVCS could do this, of course, but generally don't AFAIK.
Abso:
So what's the "right" way to deal with features that aren't really assigned to a specific release?

...

I guess the obvious fix would be to improve our release-defining, but that's not likely to actually happen.

Your release process is a little broken, but only semantically. Instead of considering these "in progress" features that will go in a "TBD release", assign them to a specific release from the get-go. You can always move them around as things change.

There's nothing wrong with using shelves - heck, create one for each feature if you really want - just don't create your "release candidate" builds from them. Merge changes into a release branch and *then* create a build from that to run through the gauntlet.
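That flow can be sketched with git (branch and tag names here are invented): feature work lives on its own shelf, a release branch collects finished features, and candidate builds are tagged only on the release branch.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email me@example.com; git config user.name me
echo core > app.txt; git add app.txt; git commit -qm "baseline"
main=$(git rev-parse --abbrev-ref HEAD)

git checkout -qb feature/search              # one shelf per feature
echo search >> app.txt; git commit -qam "search feature finished"

git checkout -q "$main"
git checkout -qb release/1.4                 # the release branch
git merge -q --no-ff --no-edit feature/search
git tag v1.4-rc1                             # candidate builds are cut from here only
```

Builds made from the tag on the release branch are immutable and reproducible; nothing is ever built directly off a feature shelf.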
• Abso 2011-09-06 16:06
Part-time dev:
Abso:
Part-time dev:
With VSS, if she gets to the repository before me then I can't do anything. In other ones, it's extremely easy for either of us to accidentally wipe out the other's work.
(Sync A, Sync B, Commit A, Commit B - A's commit just vanished!)

That's really annoying.

VSS may be that much of a WTF, but that doesn't mean that every non-distributed source control system is. In this specific situation, even CVS will refuse to commit B's changes until B updates/syncs again.
Interesting, that's better than I'd been led to believe.

That's still going to be annoying though, and can easily become very nasty with a large development team.
Sync - Commit BONG (A beat you to it) - Sync - Commit BONG (C) - Sync - Commit... Finally! What was I doing again?

So VSS really is the worst application ever conceived for this...
I was starting to think that maybe we were just using it wrong! (More ammo for switching to something better.)

Maybe? Our team is pretty small, so it's rarely a problem for me. But if you have three people committing changes to the same file at the same time, someone is going to get stuck doing the merge regardless of what source control you're using.
boomzilla:
That's true, except that most (all?) DVCSes have figured out that Edit/Merge/Commit sucks. Much better to Edit/Commit/Merge with anonymous branches where necessary.

This way, if we happen to work on the same file, we aren't forced to merge prior to committing our fixes. We can solve our problems in isolation, and then worry about combining them. Much simpler.

CVCS could do this, of course, but generally don't AFAIK.

Good point, and one that seems to go back to personal/team preference. I find it awfully silly to say "let's just merge later", but then again I like the Check-out/Edit/Check-in style myself. Not enough to put up a fight, though. I would probably complain about it, however.
• Marvin the Martian 2011-09-06 16:24
As a mathematician sticking around philosophers, I have to say that the choice of word "dimension" is quite, very, bad.

Dimensions should be independent things, that together span the whole set of possibilities; phase space or whatever you want to call it. But here it's a hierarchical ordering... They're levels of precision in a way. So why not "level" (to plug in the basic concept of low/high level languages) or "order"? That's what you're conveying anyway.

That, or some far-fetched analogy (bits&bytes= bones or cells, files&filesystem= flesh or organs, mutations= .. ) to be worked out.
• trtrwtf 2011-09-06 16:25
Abso:
But if you have three people committing changes to the same file at the same time, someone is going to get stuck doing the merge regardless of what source control you're using.

Not if you use HAL_VS! It's wonderful! It even writes the code if you ask it nicely...
• boomzilla 2011-09-06 16:27

boomzilla:
Much better to Edit/Commit/Merge with anonymous branches where necessary.

Good point, and one that seems to go back to personal/team preference. I find it awfully silly to say "let's just merge later", but then again I like the Check-out/Edit/Check-in style myself. Not enough to put up a fight, though. I would probably complain about it, however.

"Awfully silly" is a lot better than PITA. Also, it makes it easier to make small commits yourself without worrying about having to merge with other changes.

This isn't simply for the same files, either. It's quite possible that you rely on some behavior in a part of the system that you're not changing but that someone else is. Dealing with that mid stream just makes things more difficult than they need to be.

It's also more obvious that you're actually merging since you actually use a merge command, as opposed to an update. I can't imagine working under a checkout style regime on anything of substance.
Part-time dev:
That's still going to be annoying though, and can easily become very nasty with a large development team.
Sync - Commit BONG (A beat you to it) - Sync - Commit BONG (C) - Sync - Commit... Finally! What was I doing again?

This doesn't happen in practice. If you have that many people simultaneously working on the same files, you will have much bigger problems.

Big teams are compartmentalized by module, so in effect it's just a bunch of small teams integrating modules together. These integration decisions (i.e. cross-over) are best determined ahead of time (and documented!), not at commit-time.

Part-time dev:
So VSS really is the worst application ever conceived for this...
I was starting to think that maybe we were just using it wrong! (More ammo for switching to something better.)

VSS is among the best of the worst I'd say. You haven't seen SCM hell until you've worked with configspecs. *shudder*
• Matt Westwood 2011-09-06 16:29
Abso:
Part-time dev:
Abso:
Part-time dev:
With VSS, if she gets to the repository before me then I can't do anything. In other ones, it's extremely easy for either of us to accidentally wipe out the other's work.
(Sync A, Sync B, Commit A, Commit B - A's commit just vanished!)

That's really annoying.

VSS may be that much of a WTF, but that doesn't mean that every non-distributed source control system is. In this specific situation, even CVS will refuse to commit B's changes until B updates/syncs again.
Interesting, that's better than I'd been led to believe.

That's still going to be annoying though, and can easily become very nasty with a large development team.
Sync - Commit BONG (A beat you to it) - Sync - Commit BONG (C) - Sync - Commit... Finally! What was I doing again?

So VSS really is the worst application ever conceived for this...
I was starting to think that maybe we were just using it wrong! (More ammo for switching to something better.)

Maybe? Our team is pretty small, so it's rarely a problem for me. But if you have three people committing changes to the same file at the same time, someone is going to get stuck doing the merge regardless of what source control you're using.

If you find it's happening a lot, and you're always working on the same file(s), then you might find it pays to do some refactoring.

OTOH if it's because they always let three people loose at a programming job at once, and you're always fighting with each other over a commit, there's something iffy about the business process.

I would have thought it rare for more than one person to need to work on the same file at once unless there's something really funny with your system configuration.
• Abso 2011-09-06 16:32
Abso:
So what's the "right" way to deal with features that aren't really assigned to a specific release?

...

I guess the obvious fix would be to improve our release-defining, but that's not likely to actually happen.

Your release process is a little broken, but only semantically. Instead of considering these "in progress" features that will go in a "TBD release", assign them to a specific release from the get-go. You can always move them around as things change.

There's nothing wrong with using shelves - heck, create one for each feature if you really want - just don't create your "release candidate" builds from them. Merge changes into a release branch and *then* create a build from that to run through the gauntlet.

Features are technically assigned to a release from the get go, it's just that we often move more of them than we leave in the release. Sometimes our manager has difficulty accepting that we can't put every feature ever into the next release, while also getting it out the door someday.

And we do create a branch for each release before the first release candidate is built. It does end up with everything that makes it to the main branch before then, but it's generally only the dev branch that has works in progress. So I guess our system isn't all that far from "right" after all.

• Arancaytar 2011-09-06 16:47
Changes made to a branch are generally merged back in, but one thing must happen: at some point, the branch has to go away. If it doesn’t, it’s just a fork.

Sometimes (actually most of the time in my experience) the branch represents a feature-frozen version that is being readied for a stable release... no merging there, though many changesets will be propagated "upstream" in the process.
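In git terms, propagating individual changesets "upstream" without merging the whole frozen branch is what cherry-pick does. A minimal sketch with made-up branch and file names:

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email me@example.com; git config user.name me
echo base > app.txt; git add app.txt; git commit -qm "baseline"
main=$(git rev-parse --abbrev-ref HEAD)

git checkout -qb stable-1.0                  # feature-frozen release branch
echo fix > hotfix.txt; git add hotfix.txt; git commit -qm "critical fix"
fix=$(git rev-parse HEAD)

git checkout -q "$main"
git cherry-pick -x "$fix"                    # propagate just that changeset upstream
```

The release branch itself is never merged away; only the individual fix travels, and `-x` records which commit it came from.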
• Part-time dev 2011-09-06 16:58
This has nothing to do with distributed vs centralized: if a source control system has a locking mechanism, then a "Check-out/Edit/Check-in" style of development is possible.

My point here is that locking is actually a bad thing.
We're not talking records in a DB table that update nearly-instantaneously, we're talking about items that need several minutes, if not hours of work to update.

With regards to Edit/Merge/Commit:

I linked to Joel Spolsky as he explained it much better than me.
My post was pointing to an effect of the way distributed source control works that I particularly like.

Distributed version control systems do not think in terms of bits/paths or file versions.

They work in terms of deltas, and *only* deltas.
Version [GUIDB] is Version [GUIDA] +this, -that.

This means that you don't Edit/Merge/Commit - you Edit/Commit/Edit/Commit/Edit/Commit then PUSH and/or PULL - and the push/pull is when the merge happens.

In both DVCS and CVCS there will of course be places where the merge needs a human to sort it out, no matter how clever the system.

However, in a distributed system, that human merging always happens at the end, not in the middle - so it doesn't break your chain of thought.

This encourages small commits in DVCS, while the behaviour of CVCS actively discourages them.

It's generally acknowledged that small focused commits are better - less work later if you find something has broken.

Small focused commits in DVCS mean less work now and less work later, while in CVCS, small focused commits mean more work now.

Well, there's no good reason to fork (label, branch, or shelf) your code "offline"
Of course there are - the same reasons you'd fork otherwise!

Life doesn't stop just because somebody nuked the server - those projects still have deadlines!
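The rhythm described here - several small local commits, with the merge question deferred until histories meet at push/pull time - looks like this in git (throwaway repo, invented names):

```shell
set -e
work=$(mktemp -d); cd "$work"
git init -q --bare shared.git                # stands in for the central server
git clone -q shared.git me 2>/dev/null
git -C me config user.email me@example.com
git -C me config user.name me

for n in 1 2 3; do                           # Edit/Commit, repeatedly, all local
  echo "step $n" >> me/notes.txt
  git -C me add notes.txt
  git -C me commit -qm "small focused commit $n"
done

branch=$(git -C me rev-parse --abbrev-ref HEAD)
git -C me push -q origin "$branch"           # the merge question only arises here
```

Every commit in the loop works with the server unplugged; only the final push needs the network.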
• Part-time dev 2011-09-06 17:00
Matt Westwood:
If you find it's happening a lot, and you're always working on the same file(s), then you might find it pays to do some refactoring.

OTOH, refactoring is of course one of the tasks that makes that event rather likely!
• Abso 2011-09-06 17:24
trtrwtf:
Abso:
But if you have three people committing changes to the same file at the same time, someone is going to get stuck doing the merge regardless of what source control you're using.

Not if you use HAL_VS! It's wonderful! It even writes the code if you ask it nicely...

Yes, but it has a bug where it sometimes deletes vital functions.
• Mr.'; Drop Database -- 2011-09-06 17:55
The "Real" WTF:
L.:
Over 9000 WTFP for using MySQL in a project where a database is really required.

And... I strongly disagree with putting PHP (5+) on the same level as ASP. I believe we can safely assume WTFP(ASP/vb)=10*WTFP(php).

But if we're into the language bashing (which imo is not serious WTF .. except asp/vb/foxpro/cobol/...) why not go for a good old religion war and state the following:

if($language.creator=='microsoft'){$language.wtfp=nanotime();}
I think most languages are not inherently WTFP worthy.

except for perl.

it may be the only "real" language whose 99 bottles code looks somewhat similar to brainf*ck

oh and why is my comment spam?? what did i do wrong?
Spam filters often mistake comments for spam if they contain a link or two. Or maybe the spam filter tries to filter out comments about Perl being bad because it's possible to write obfuscated code in it. There's a lot to dislike about Perl, but let's stick to real reasons, please? :)
my @arr = (10, 20);            # Array variables are prefixed with at signs
print $arr[0]."\n";            # Except when they're not
push(@arr, 30);                # Some functions can take arrays as parameters
threeArgFn(@arr);              # Others will "unpack" the array's elements into separate arguments
my %args = (                   # A hashmap, but Perl calls it a "hash" for no reason
    searchTerm => $cgi->param("s"),
    category => $cgi->param("cat"),
);
# If no query-string parameters were passed, %args is now equal to ("searchTerm" => "category").
# This is due to the unpacking behaviour and because the double arrow is treated almost the same as a comma
# Arrays and "hashes" can only contain scalars. Perl provides references, which wrap things up as scalars...
$args{subCats} = [1, 2, 3];    # Square brackets create an array reference
push(@{ $args{subCats} }, 4);  # But now you must dereference it every time you want to perform a common operation on it
eval {                         # "try"
    die("exception");          # "throw"
};
if ($@) {                      # "catch"
}
Etc...
• notromda 2011-09-06 20:34
Abso:
So what's the "right" way to deal with features that aren't really assigned to a specific release? At any given time, the project I'm working on has several half-finished features which will be shipped either in the next release (if they're finished in time) or in the one after.

How we currently handle this is similar to the anti-pattern Alex describes, with "dev" and "main" branches/shelves. But within each of those, testing and such is done on immutable, labelled builds. We seem to avoid the Jenga pattern most of the time by associating commits with bugs and merging all the commits for a given bug at the same time.

The way it's handled with git: create a new branch for *every* new feature, and *every* bug report/ticket number.

While working on a given branch you can always pull in from other sources, and when it's done, merge it back into whatever release branch you want.

Git makes this all very easy.
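A minimal sketch of that branch-per-ticket flow (the ticket number, branch, and file names are invented):

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email me@example.com; git config user.name me
echo v1 > app.txt; git add app.txt; git commit -qm "release baseline"
rel=$(git rev-parse --abbrev-ref HEAD)

git checkout -qb ticket-1234 "$rel"          # one branch per bug ticket
echo fix-1234 >> app.txt; git commit -qam "Fix ticket 1234"

git checkout -q "$rel"
git merge -q --no-edit ticket-1234           # merge into the chosen release branch
git branch -q -d ticket-1234                 # the branch goes away: no lingering fork
```

Deleting the branch after the merge is what keeps this a branch rather than a fork, in the article's terms.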
• Luiz Felipe 2011-09-06 20:37
The poop... of DOOM!:
Paratus:
The poop... of DOOM!:
The "Real" WTF:

6000 - using ACCESS as a database

7000 WTFP for using VB
7000 WTFP for using PHP

VB and PHP are certainly RWTFs, but there's no way that they're worse than using Access.

He said using Access as a database, so you can combine that.

A PHP application calling an Access database would result in 13000 WTFP (and a developer who's been committed to a mental hospital)

20000 WTFP for using Firebird/InterBase (it's worse than Access).

Access is a little DB for simple use; it's not a WTF to use it in the right situation, but it's easy to abuse. There's nothing wrong with using a simple RDBMS.

Firebird is crap; Access can sustain more records and users.

• Sanity 2011-09-07 00:05
David C.:
The most important thing, IMO, about any VCS is its ability to make merges as painless as possible.

And as I said in my comment, every DVCS has had to solve this in one way or another in order to be at all viable.

David C.:
Without (hopefully) sounding like an advertisement, I've found that the commercial product, Perforce, is the only one that gets this right. The server tracks a file's entire revision history, through all of its permutations of branches (and there may be hundreds, for some key files belonging to large projects.) When you need to do a merge (which they call "integrate"), the system uses the version history to find the common ancestor between your file and the one you're merging in (even if this common ancestor is separated by dozens of intermediate branches.) It then does a 3-way diff on the files (yours, the one you're merging in, and the common ancestor), presenting all conflicts as all three versions of the conflicting lines. Sections where only one source (yours or the merged-in version) differ from the ancestor are automatically merged without any user intervention. (You can, of course, still review the merged changes and fix any mistakes, which still happen occasionally.)

You not only just described how Git seems to work (and it does this in seconds, most often in under a second), but you also described how recent builds of SVN with merge-tracking work (it just ended up taking half an hour for it to finish merging, and that's not counting any manual conflict resolution).
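The automatic 3-way behaviour is easy to see for yourself; in this sketch (repo and file names invented), two branches change different lines relative to the common ancestor, and the merge completes without conflict:

```shell
#!/bin/sh
set -e
# Two branches edit different lines of the same file; the 3-way merge
# (yours, theirs, common ancestor) resolves automatically.
git init -q merge-demo && cd merge-demo
git config user.email dev@example.com && git config user.name Dev
printf 'alpha\nbeta\ngamma\n' > file.txt
git add file.txt && git commit -q -m "common ancestor"
git branch -M main

git checkout -q -b left
printf 'ALPHA\nbeta\ngamma\n' > file.txt   # change the first line only
git commit -q -am "left: first line"

git checkout -q main && git checkout -q -b right
printf 'alpha\nbeta\nGAMMA\n' > file.txt   # change the last line only
git commit -q -am "right: last line"

git checkout -q main
git merge -q --no-edit left                # fast-forward
git merge -q --no-edit right               # genuine 3-way merge, no conflict
cat file.txt                               # both edits are present
```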

David C.:
With this system, you can actually merge any branch into any other branch, not just into direct parent/child branches. The server will track the operations and make the right thing happen, even if the branch/merge history starts looking like a tangled ball of rubber bands.

A little thought will show that the "tangled ball of rubber bands" is precisely the problem DVCS was invented to solve. Maybe an illustration is in order: Linus' role in Linux these days is essentially merging patches from other people. To do this, he pulls and merges from about a hundred top-level contributors (generally subsystem maintainers), who themselves pull and merge, or apply patches from, people lower down.

There does have to be a common ancestor for the repository itself, but beyond that, everything else typically just works.

David C.:
It's not a perfect system. You still sometimes have to manually merge files, but they manage to automate all of the easy situations, so you only have to manually merge the really nasty changes that no system is likely to be able to handle automatically.

And for those manual merges -- few and far between though they may be -- I have my Git configured to launch kdiff3, so I get a nice graphical 3-way diff I can edit.
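Wiring that up is a one-line config; a sketch (assumes kdiff3 is installed, and uses a throwaway repo so as not to touch global settings):

```shell
#!/bin/sh
set -e
# Tell Git to use kdiff3 as the graphical merge tool. Normally you'd pass
# --global; a local throwaway repo is used here to keep the example
# self-contained.
git init -q cfg-demo && cd cfg-demo
git config merge.tool kdiff3
# When a merge stops on conflicts, `git mergetool` then opens each
# conflicted file in a 3-way view (ours, theirs, common ancestor).
```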

I can believe you're not astroturfing for Perforce, but if you're going to claim that it's the only one which gets merging right, you really need to try Git. It's free (and open source), requires very little set-up (a server is optional, not required), and there are free books full of documentation out there.
Part-time dev:
My point here is that locking is actually a bad thing.

There's nothing wrong with locking. You just don't like it.

That's fine... and I don't like not locking; it's just a matter of team preference.

Part-time dev:
Distributed version control systems do not think in terms of bits/paths or file versions.

They work in terms of deltas, and *only* deltas.
Version [GUIDB] is Version [GUIDA] +this, -that.

Yes yes, and time is really just discrete snapshots of the universe as it moves in a direction towards greater entropy. But realistically, we need watches and timezones... and realistically, files are bits/paths and (in revision control) have a history of changes.

I understand how directed acyclic graphs work and that a file can be multi-headed and have multiple "current" versions. But what does that mean in practice? You can't get the file without resolving the merge... thus it's effectively just a reverse lock.

Again, nothing new here. Except a lot of confusion for the developers who have a hard enough time grasping the 3-dimensions of revision control.

Part-time dev:
It's generally acknowledged that small focused commits are better - less work later if you find something has broken.

A commit should represent a reasonable attempt at implementing a specific task. Thus, it's the *tasks* that should be kept small, not the commits. This is an important distinction -- if tasks are big but commits are kept small, then commits and tasks become further separated.
• eMBee 2011-09-07 01:48
annie the moose:
You're doing it wrong!

C:\VersionControl
MyProg.201109060900.c
MyProg.201109060904.c
MyProg.201109060915.c

It's so easy.

i liked VMS built in versioning:
MyProg.c;1
MyProg.c;2
MyProg.c;3
• Anon 2011-09-07 04:38
Trwtf is 3857 words?
• AndyCanfield 2011-09-07 04:43
Backup a directory tree and reload and you get a clone of what you started with. Put that directory tree into version control, and check it out again, and it's nowhere near a clone.

I'm doing my first Drupal web site, and looked into Subversion and Git. I was horrified at how much Version Control does ***not*** track:

File ownership.

File access; e.g. which pieces must be writable by others?

File timestamps; forcing modtime=NOW on checkout is convenient for 'make', but there's no option to preserve the original times.

Database contents; e.g. MySQL. Try to put a MySQL backup into version control and you get one diff line per table.
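One common workaround for the metadata gaps (ownership, modes) is to record them into a file the VCS *does* track, and replay it after checkout; a rough sketch (paths invented, `stat -c` is GNU coreutils syntax):

```shell
#!/bin/sh
set -e
# Record mode and owner of every file into a tracked manifest...
mkdir -p site && echo '<html></html>' > site/index.html
find site -type f -exec stat -c '%a %U %n' {} \; > permissions.manifest

# ...and replay it after checkout (chown needs root, so only chmod is shown):
while read -r mode owner path; do
    chmod "$mode" "$path"
done < permissions.manifest
```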

My Linux (case sensitive) system had two files, named "Install.text" and "install.text"; the Subversion repository was on OS X (case insensitive). An svn checkout on OS X confused svn terribly. Not sure whether the repository or the checkout was blown.

As far as I can see, "Version Control" means "Source Code Version Control", and it is not yet ready for Web 3.0.
• L. 2011-09-07 04:49
Luiz Felipe:
The poop... of DOOM!:
Paratus:
The poop... of DOOM!:
The "Real" WTF:

6000 - using ACCESS as a database

7000 WTFP for using VB
7000 WTFP for using PHP

VB and PHP are certainly RWTFs, but there's no way that they're worse than using Access.

He said using Access as a database, so you can combine that.

A PHP application calling an Access database would result in 13000 WTFP (and a developer who's been committed to a mental hospital)

20000 Using firebird/interbase (it's worse than Access).

Access is a little DB for simple use; it's not a WTF to use it in the correct situation, but it's easy to abuse. There's nothing wrong with using a simple RDBMS.

Firebird is crap; Access can sustain more records and users.

Access is total crap; there is no valid reason to use Access instead of MySQL (which is already a simple RDBMS that sucks a lot).
I do agree that for very simple and basic DB use, one can stick to MySQL or other half-assed DBMSs, but it is also clear that a LOT of these cases are misunderstood.

I.e. developers who know nothing about SQL think it's only good for storing objects in a table, thus take no advantage of the tool, and thus design an application that uses few or no features - which IS a WTF in itself, for using the wrong tool for the job.

I'm not a DBA and I'm quite surprised to see how much other devs have no clue about SQL in general (yes, all of you who use MySQL can be included in this if you think innoDB is strictly ACID compliant for example, etc.) - in the end, know your tools and use them right, also remember some tools are USELESS for some projects, there is NO using them right (like access for anything or MySQL for complex applications).

In the end, the only good ones are and will be those who try to do better every single time, spend time reading and learning all they can (and posting their own fails on tdwtf for our enjoyment).
• The Poop... of DOOM! 2011-09-07 05:36
snoofle:
QJo:
30 hours weekly unpaid overtime for a long time
We've all done that early on in our careers. Once burned, twice shy. Once you learn to see it coming you can make for the exit long before you're exhausted (who wants to show up at a new job the first day - needing a vacation?)

It's the fools who keep doing it over and over that cause management to continue this practice of abusing and then discarding employees.

Been there, done that (showing up at a new job the first day - needing a vacation, due to having been burnt out like that by the previous job). Also had a heavy flu that first week, cause some idiot colleague at the job before found it necessary to come into work while being seriously ill, only to do nothing but moan while hanging over the kitchen sink all day long, every day.

Another real WTF: Not staying at home when you're too ill to go to work.
• Gizz 2011-09-07 05:55
...source code control was done on floppy disks. The release code was written to a floppy (5.25") and write protected and put in the fire safe. To comply with BS5750, we also printed the source out on a huge sheaf of paper. As a backup.
Happy days.
• Anonymous Cow-Herd 2011-09-07 06:04
Part-time dev:
That's still going to be annoying though, and can easily become very nasty with a large development team.
Sync - Commit BONG (A beat you to it) - Sync - Commit BONG (C) - Sync - Commit... Finally! What was I doing again?

Can you even do that? Last time I had to bring a ten-foot pole near to a VSS repo, it was a case of "Sorry, you can't edit this file, A has checked it out already." Maybe that was a result of us not realizing that SourceSafe could do it in the more expected way (my excuse was that I'd never used it before, my senior partner had used it for some 10 years so he doesn't really have one).
• Ru 2011-09-07 06:06
AndyCanfield:
Backup a directory tree and reload and you get a clone of what you started with. Put that directory tree into version control, and check it out again, and it's nowhere near a clone.

The article and indeed preceding comments mentioned this very fact. Do try to keep up.

AndyCanfield:
As far as I can see, "Version Control" means "Source Code Version Control", and it is not yet ready for Web 3.0.

Web 3.0? Now is that a geometric increase in bullshit, or an arithmetic one? I was under the impression that the Next Big Fad was finally implementing the semantic web.

What it certainly seems to be heading towards is a complete reimplementation of an operating system using nothing but javascript and HTML. In this situation, I'd expect file metadata to be in its header in some suitable form, and therefore trivially amenable to source control.
• Ru 2011-09-07 06:12
Not even Microsoft use it internally. Haven't done for years. They've had their own perforce-based thing for a little while (which was awful) but nowadays they've eaten their own dogfood and moved to TFS.

Given that there are lots of lovely tools for migrating out of awful old control systems that are so atrocious even their creators prefer never to look at them ever again, TRWTF would presumably be carrying on using it.
• Part time dev 2011-09-07 06:23
Part-time dev:
Distributed version control systems do not think in terms of bits/paths or file versions.

They work in terms of deltas, and *only* deltas.
Version [GUIDB] is Version [GUIDA] +this, -that.
Yes yes, and time is really just discrete snapshots of the universe as it moves in a direction towards greater entropy. But realistically, we need watches and timezones... and realistically, files are bits/paths and (in revision control) have a history of changes.

Actually, individual files that are part of an application can't meaningfully have independent histories of changes, because they are almost always interdependent.

The file myobject.h v5 probably won't be compatible with myobject.cpp v5.

The repository needs to be able to tell you the state of the *entire* 'fork' at each point in time, so you can pull out myobject.h and myobject.cpp as they both were at a specific point in time.

The core thing is that you shouldn't think in terms of files, you should be thinking in terms of changes made to the whole 'fork'.

Of course, this isn't specific to DVCS versus centralized VCS.

However, it was thinking in terms of 'files' rather than 'fork state' that got VSS and CVS into that mess.
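In Git terms, "fork state" means every commit names the whole tree, so matched versions of interdependent files always come out together; a sketch reusing the myobject example above:

```shell
#!/bin/sh
set -e
# Every commit records the state of the whole tree, so myobject.h and
# myobject.cpp can be pulled out together, as they were at one point in time.
git init -q tree-demo && cd tree-demo
git config user.email dev@example.com && git config user.name Dev
echo "v1 header" > myobject.h && echo "v1 impl" > myobject.cpp
git add -A && git commit -q -m "v1"
V1=$(git rev-parse HEAD)

echo "v2 header" > myobject.h && echo "v2 impl" > myobject.cpp
git commit -q -am "v2"

# Retrieve both files, matched, from the v1 commit:
git checkout -q "$V1" -- myobject.h myobject.cpp
```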
Part-time dev:
It's generally acknowledged that small focused commits are better - less work later if you find something has broken.

A commit should represent a reasonable attempt at implementing a specific task. Thus, it's the *tasks* that should be kept small, not the commits. This is an important distinction -- if tasks are big but commits are kept small, then commits and tasks become further separated.

Yes, that is a very good point.

However, most tasks are divisible - and generally they are easily divisible beyond what a reasonable manager should need to ask a programmer.

"Fix Bug A" is generally a reasonable request.
However, once the programmer gets into the code, they'll probably find that there are several disparate elements causing the bug, all of which need to be fixed to solve it properly.

So as "Fix Bug A" is now known to actually be several smaller tasks, the programmer should provide these as separate commits.

This is something DVCS makes easy - the programmer doesn't need to ask anyone, doesn't risk losing a lock on the necessary files or need to wait for a lock on another file once they realise it's important, and doesn't need to merge anything (introducing unknown elements) until the 'big' task of "Fix Bug A" is done.

So DVCS encourages good practice, while centralized VCS actively discourages it.
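A sketch of that flow (file names and messages invented): the one task "Fix Bug A" lands as two focused commits, with no locks and no server round-trips:

```shell
#!/bin/sh
set -e
# "Fix Bug A" turns out to have two unrelated causes; stage and commit
# each fix separately.
git init -q bugfix-demo && cd bugfix-demo
git config user.email dev@example.com && git config user.name Dev
echo "parser" > parser.c && echo "cache" > cache.c
git add -A && git commit -q -m "initial"

echo "bounds check" >> parser.c        # first cause of the bug
echo "drop stale entries" >> cache.c   # second, unrelated cause

git add parser.c && git commit -q -m "Bug A: bounds check in parser"
git add cache.c  && git commit -q -m "Bug A: drop stale cache entries"
```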
• Part time dev 2011-09-07 06:33
Ru:
Not even Microsoft use it internally. Haven't done for years. They've had their own perforce-based thing for a little while (which was awful) but nowadays they've eaten their own dogfood and moved to TFS.

Given that there are lots of lovely tools for migrating out of awful old control systems that are so atrocious even their creators prefer never to look at them ever again, TRWTF would presumably be carrying on using it.
Yes, I'm stuck with it for two projects.

Rational ClearCase is used for others; that's better, as it at least has atomic commits, but it's not much of an improvement and is rather complex to use.

Manglement appear to think it would be too difficult to migrate to anything else.
• Anonymous Cow-Herd 2011-09-07 07:21
L.:
(yes, all of you who use MySQL can be included in this if you think innoDB is strictly ACID compliant for example, etc.)

I guess you're including MySQL and InnoBASE in this, since they seem to think InnoDB is ACID-compliant. Eight of the page 1 results for "innodb acid compliant" claim that it is, the other two are a bug report where someone claims that it isn't only to find they're wrong (and by "bug report", I mean "rant that ended up in the bug tracker"), and a MySQL vs PostgreSQL comparison which claims it but doesn't substantiate it. So, we could do with an explanation of why it's not the case, and those external anyone-can-edit sources could do with updating with said same.
• L. 2011-09-07 07:41
Anonymous Cow-Herd:
L.:
(yes, all of you who use MySQL can be included in this if you think innoDB is strictly ACID compliant for example, etc.)

I guess you're including MySQL and InnoBASE in this, since they seem to think InnoDB is ACID-compliant. Eight of the page 1 results for "innodb acid compliant" claim that it is, the other two are a bug report where someone claims that it isn't only to find they're wrong (and by "bug report", I mean "rant that ended up in the bug tracker"), and a MySQL vs PostgreSQL comparison which claims it but doesn't substantiate it. So, we could do with an explanation of why it's not the case, and those external anyone-can-edit sources could do with updating with said same.

The only ones claiming that MySQL is ACID compliant are MySQL / Oracle themselves.

ACID : 'C' compliance means any transaction will bring the database from a consistent state to a consistent state, both of which of course respect every single rule implemented in the system.

Due to the way MySQL treats CASCADE, triggers will NOT be fired on cascade operations, which violates the consistency rule by making a cascaded action bypass triggers which inherently contain consistency rules.

On the same topic, MSSQL's trigger nesting is limited to 32 levels, which implies that in the event that a 33rd trigger should have been fired, the database will be left in an inconsistent state, thus breaking 'C' compliance as well.

On the exact same topic, PostgreSQL's trigger nesting is NOT limited and their doc states developers should be careful not to create infinite trigger loops.

I do not know Oracle a lot but I would expect it to do the same as Postgres, considering how both are extremely focused on SQL standards, consistency and reliability.

Yes, most people don't care and most people don't notice and most people don't quite understand what ACID means and buy the sticker whether it's true or not, and that is why you can read everywhere that InnoDB is fine - written by people who don't use triggers/cascades/both (at least I hope so ...consequences would be interesting).

On the same ACID topic, for those who are interested, the 'I' is a very interesting beast ;)
• Ru 2011-09-07 08:26
Anonymous Cow-Herd:
L.:
(yes, all of you who use MySQL can be included in this if you think innoDB is strictly ACID compliant for example, etc.)

I guess you're including MySQL and InnoBASE in this, since they seem to think InnoDB is ACID-compliant. Eight of the page 1 results for "innodb acid compliant" claim that it is, the other two are a bug report where someone claims that it isn't only to find they're wrong (and by "bug report", I mean "rant that ended up in the bug tracker"), and a MySQL vs PostgreSQL comparison which claims it but doesn't substantiate it. So, we could do with an explanation of why it's not the case, and those external anyone-can-edit sources could do with updating with said same.

The Wikipedia page seems to suggest that using a BDB backend makes MySQL ACID capable. First I've ever heard of that, though.

Anyhoo, you could listen to http://nosql.mypopescu.com/post/1085685966/mysql-is-not-acid-compliant, if you're bored. Might be a bit outdated nowadays. Some of you may find it familiar...
• Nagesh 2011-09-07 09:05
letseatlunch:
is it just me or is any one else feel that they must be in the twilight zone because this was posted before 8:30?

8:30 is when tdwtf artical is tipicaly publish here in Hyderabad. In U.S. time this is being more aproximate 3:00pm?
• QJo 2011-09-07 09:10
The Poop... of DOOM!:
snoofle:
QJo:
30 hours weekly unpaid overtime for a long time
We've all done that early on in our careers. Once burned, twice shy. Once you learn to see it coming you can make for the exit long before you're exhausted (who wants to show up at a new job the first day - needing a vacation?)

It's the fools who keep doing it over and over that cause management to continue this practice of abusing and then discarding employees.

Been there, done that (showing up at a new job the first day - needing a vacation, due to having been burnt out like that by the previous job). Also had a heavy flu that first week, cause some idiot colleague at the job before found it necessary to come into work while being seriously ill, only to do nothing but moan while hanging over the kitchen sink all day long, every day.

Another real WTF: Not staying at home when you're too ill to go to work.

Especially nowadays when there exists (a) wireless internet and (b) neat little tables which can extend over a sickbed that will easily accommodate a laptop.

Send the email, tell them "WFB".
• QJo 2011-09-07 09:14
Gizz:
...source code control was done on floppy disks. The release code was written to a floppy (5.25") and write protected and put in the fire safe. To comply with BS5750, we also printed the source out on a huge sheaf of paper. As a backup.
Happy days.

IIRC "write protect" was performed by sticking a piece of opaque tape (insulating tape, gaffa tape, wotever) over the notch cut out of the cardboard envelope. Peeling the tape off made the disc writable again.
• The Daily What The Comment 2011-09-07 09:16
Nagesh:
letseatlunch:
is it just me or is any one else feel that they must be in the twilight zone because this was posted before 8:30?

8:30 is when tdwtf artical is tipicaly publish here in Hyderabad. In U.S. time this is being more aproximate 3:00pm?

Hey, Nagesh, do you ever wonder what all those red squiggly lines under your comments means?

CAPTCHA: incassum -- incassum you didn't know, it means you are barely literate.
• jnareb 2011-09-07 09:34
The most important thing that distributed version control (DVCS) brings is the third workflow (besides the Check-Out (Lock) - Edit - Check-in (Unlock) and Edit - Merge (Update) - Commit workflows mentioned in the article):

* Edit - Commit - Merge

See e.g. "Understanding Version Control" by Eric S. Raymond, or "Version Control by Example" by Eric Sink.
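A minimal sketch of Edit - Commit - Merge (repo names invented): the local commit is recorded *before* the merge, so reconciling with upstream can never clobber unrecorded work:

```shell
#!/bin/sh
set -e
# Upstream repo with one file...
git init -q upstream && cd upstream
git config user.email a@example.com && git config user.name A
echo base > notes.txt && git add -A && git commit -q -m "base"
git branch -M main
cd .. && git clone -q upstream work

# ...upstream moves on while we work...
cd upstream
echo theirs > other.txt && git add other.txt && git commit -q -m "their change"

# ...Edit, then Commit locally, then Merge with upstream:
cd ../work
git config user.email b@example.com && git config user.name B
echo mine >> notes.txt
git commit -q -am "my change"          # committed before any merge
git fetch -q origin
git merge -q --no-edit origin/main     # the merge happens last
```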
• frits 2011-09-07 09:56
The Daily What The Comment:
Nagesh:
letseatlunch:
is it just me or is any one else feel that they must be in the twilight zone because this was posted before 8:30?

8:30 is when tdwtf artical is tipicaly publish here in Hyderabad. In U.S. time this is being more aproximate 3:00pm?

Hey, Nagesh, do you ever wonder what all those red squiggly lines under your comments means?

CAPTCHA: incassum -- incassum you didn't know, it means you are barely literate.

I believe it's a way Nagesh uses to measure the trolliness of his comments.

There's a certain magical amount of total length of red squigglies that, when achieved, captures the most flames. Too short only snags the hardcore spelling snobs, and too long only gets a few morans.
• Jupiter 2011-09-07 09:57
L.:
Anonymous Cow-Herd:
L.:
(yes, all of you who use MySQL can be included in this if you think innoDB is strictly ACID compliant for example, etc.)

I guess you're including MySQL and InnoBASE in this, since they seem to think InnoDB is ACID-compliant. Eight of the page 1 results for "innodb acid compliant" claim that it is, the other two are a bug report where someone claims that it isn't only to find they're wrong (and by "bug report", I mean "rant that ended up in the bug tracker"), and a MySQL vs PostgreSQL comparison which claims it but doesn't substantiate it. So, we could do with an explanation of why it's not the case, and those external anyone-can-edit sources could do with updating with said same.

The only ones claiming that MySQL is ACID compliant are MySQL / Oracle themselves.

ACID : 'C' compliance means any transaction will bring the database from a consistent state to a consistent state, both of which of course respect every single rule implemented in the system.

Due to the way MySQL treats CASCADE, triggers will NOT be fired on cascade operations, which violates the consistency rule by making a cascaded action bypass triggers which inherently contain consistency rules.

On the same topic, MSSQL's trigger nesting is limited to 32 levels, which implies that in the event that a 33rd trigger should have been fired, the database will be left in an inconsistent state, thus breaking 'C' compliance as well.

On the exact same topic, PostgreSQL's trigger nesting is NOT limited and their doc states developers should be careful not to create infinite trigger loops.

I do not know Oracle a lot but I would expect it to do the same as Postgres, considering how both are extremely focused on SQL standards, consistency and reliability.

Yes, most people don't care and most people don't notice and most people don't quite understand what ACID means and buy the sticker whether it's true or not, and that is why you can read everywhere that InnoDB is fine - written by people who don't use triggers/cascades/both (at least I hope so ...consequences would be interesting).

On the same ACID topic, for those who are interested, the 'I' is a very interesting beast ;)

you must be fun at dinner parties...
• jnareb 2011-09-07 10:05
David C.:
The most important thing, IMO, about any VCS is its ability to make merges as painless as possible.

I wholeheartedly agree. It is merging that has to be easy, not only branching.

David C.:
Many offer very little. [...]

Unfortunately, most systems are very bad at this, and every free system I've used is included.

It looks like you didn't use any of the modern free (open source) version control systems: Git, Mercurial, Bazaar, etc.

David C.:
Without (hopefully) sounding like an advertisement, I've found that the commercial product, Perforce, is the only one that gets this right. The server tracks a file's entire revision history, through all of its permutations of branches (and there may be hundreds, for some key files belonging to large projects.) When you need to do a merge (which they call "integrate"), the system uses the version history to find the common ancestor between your file and the one you're merging in (even if this common ancestor is separated by dozens of intermediate branches.) It then does a 3-way diff on the files (yours, the one you're merging in, and the common ancestor), presenting all conflicts as all three versions of the conflicting lines. Sections where only one source (yours or the merged-in version) differ from the ancestor are automatically merged without any user intervention. (You can, of course, still review the merged changes and fix any mistakes, which still happen occasionally.)

That is exactly what e.g. Git does (as does any version control system that implements merge tracking), though the common ancestor is a version of the whole project, not of an individual file.

I don't know whether (or how) Perforce does it, but with the default "recursive" merge strategy that Git uses, it can deal with cases where there are multiple common ancestors, e.g. the so-called criss-cross merge.

David C.:
With this system (Perforce), you can actually merge any branch into any other branch, not just into direct parent/child branches. The server will track the operations and make the right thing happen, even if the branch/merge history starts looking like a tangled ball of rubber bands.

Same with Git. And all it requires is for merge commits to remember all their parents...

Nb. with Git you can even merge unrelated branches, e.g. incorporating code which was developed independently as now part of a larger project.
• L. 2011-09-07 10:25
Unfortunately the Wikipedia page is full of shit - unsupported pro-MySQL crap (like "Google uses MySQL" instead of "some very minor Google apps use MySQL", etc.) - and I've had the displeasure of witnessing the bias around it. On the other hand... Wikipedia is MySQL-based (and I doubt they even have a single DBA on their dev team, when you see the number of dead links to paragraphs long deleted ;) ).

Never tried BDB or the other engine they say should be ACID... but if it's as ACID as InnoDB (which they used to say was ACID until I modified it), ...meh.

Nobody talks about it because the DBAs who see that as a problem are DBAs used to better and stricter RDBMSs (Oracle, PostgreSQL, ...) who would at best use MySQL as a "cheap" solution, if at all.
• QJo 2011-09-07 10:32
L.:
Unfortunately the wikipedia page is full of shit, unsupported pro-mysql crap and stuff (like google uses mysql instead of 'some very minor google apps use mysql'.. Etc.) and I've had the displeasure to witness the bias around it - on the other hand ... wikipedia is mysql based. (and I doubt they even have a single dba in their dev team, when you see the number of dead links to paragraphs long deleted ;) ).

Never tried the BDB or the other one they say should be ACID .. .but if it's as ACID as innodb (which they used to say was acid until i modified it --), ... meh

Nobody talks about it because dba's who see that as a problem are dba's used to better and stricter rdbms's (Oracle,postgreSQL,...) who would at best use mysql as a "cheap" solution, if at all.

So, we'll be able to see your replacement Wikipedia article on this subject when?

Understanding its limitations, I find Wikipedia a huge asset. I believe it's a Good Thing to correct inaccuracies and mistakes as soon as you see them. Although arguments over matters of opinion based on personal preferences are probably best kept away from, as we all have far too much of that sort of thing to do here instead.
• L. 2011-09-07 10:32
Touché.

On the other hand, when you know not everyone else is as passionate, you still have a chance ;)

And, between "funny" and "interesting", I know which one I'll pick any day -- passion is great.

And if the ACID 'I' sounds uninteresting to you, some of your code surely deserves to stand atop the very first comment ;)
• L. 2011-09-07 10:38
Wikipedia is a huge asset, surely the main reason I think a good smartphone and 3G are good.

Look at the talk page for MySQL; there is already something about that subject and clearly no one wants to see those updates made...
• QJo 2011-09-07 10:59
L.:
Anonymous Cow-Herd:
L.:
(yes, all of you who use MySQL can be included in this if you think innoDB is strictly ACID compliant for example, etc.)

I guess you're including MySQL and InnoBASE in this, since they seem to think InnoDB is ACID-compliant. Eight of the page 1 results for "innodb acid compliant" claim that it is, the other two are a bug report where someone claims that it isn't only to find they're wrong (and by "bug report", I mean "rant that ended up in the bug tracker"), and a MySQL vs PostgreSQL comparison which claims it but doesn't substantiate it. So, we could do with an explanation of why it's not the case, and those external anyone-can-edit sources could do with updating with said same.

The only ones claiming that MySQL is ACID compliant are MySQL / Oracle themselves.

ACID : 'C' compliance means any transaction will bring the database from a consistent state to a consistent state, both of which of course respect every single rule implemented in the system.

Due to the way MySQL treats CASCADE, triggers will NOT be fired on cascade operations, which violates the consistency rule by making a cascaded action bypass triggers which inherently contain consistency rules.

On the same topic, MSSQL's trigger nesting is limited to 32 levels, which implies that in the event that a 33rd trigger should have been fired, the database will be left in an inconsistent state, thus breaking 'C' compliance as well.

On the exact same topic, PostgreSQL's trigger nesting is NOT limited and their doc states developers should be careful not to create infinite trigger loops.

I do not know Oracle a lot but I would expect it to do the same as Postgres, considering how both are extremely focused on SQL standards, consistency and reliability.

Yes, most people don't care and most people don't notice and most people don't quite understand what ACID means and buy the sticker whether it's true or not, and that is why you can read everywhere that InnoDB is fine - written by people who don't use triggers/cascades/both (at least I hope so ...consequences would be interesting).

On the same ACID topic, for those who are interested, the 'I' is a very interesting beast ;)

If one is not in any position to pick the tools one uses, then it would be prudent for any engineer to learn the limitations of the tool to ensure that none of those limits are exceeded. I will wholeheartedly agree: use of a particular language, tool or environment etc. is not in itself a WTF, but making non-robust tools available for easy use by non-technical personnel definitely is.

This discussion about MySQL is a prime example of this. Without knowing about the limitations on triggers, and without familiarising myself with the ACID compliance (DB admin is not within my main field of expertise), I would have been blissfully ignorant about all this.

However, knowing that more than 32 levels of trigger are a Bad Thing on MySQL, I am a wiser man.

Having said that, my intuition already informs me that 32 levels of trigger is probably a WTF however you cut it.
• Anonymous Cow-Herd 2011-09-07 11:17
L.:
Wikipedia is a huge asset, surely the main reason I think a good smartphone and 3G are good.

Look at the talk page for MySQL; there is already something about that subject, and clearly no one wants to see those updates made...

I read it as: no one wants to see those updates made without some decent sourcing to back them up. Understandably, they're not interested in "This says A, this says B, and I say A+B=C" - they're attempting an encyclopaedia, after all. As I mentioned earlier, most sources seem to say that InnoDB is ACID compliant; therefore, as far as Wikipedia is concerned, the proposition might not be entirely true, but it is the only one that is properly "verifiable" (by their definition).

tl;dr it's not enough to provide the pieces, you need someone else to have completed the jigsaw.
• David C. 2011-09-07 11:27
Poo:
So, have you tried git?
Yes. It's far better than CVS and SVN, and branch creation is trivially simple, but non-trivial merges (e.g. between two lines that don't directly descend from each other, but have a common ancestor) don't always happen cleanly.

And when there are merge conflicts, it does a 2-way diff (yours and the other branch), instead of a 3-way diff (adding in the common-ancestor revision) which sometimes makes it hard to figure out what changed without extra work.

Finally, git has little concept of revision history for individual files, preferring to work instead on the entire repo. So I can't easily get a list of the last 5 changes to a single file, or diff the current version against the previous version. I have written scripts to do this, but I don't think I should have to.
• L. 2011-09-07 11:30
32 levels is MSSQL, and no, it is not *necessarily* a WTF to have them.
(MySQL is the other problem.)

I do agree that I haven't yet found a good reason to model anything using trigger chains extending beyond 32 levels, but it makes sense that there would be extremely complex systems for which the best logical model would make extensive use of triggers.
• David C. 2011-09-07 11:36
Part-time dev:
I rather suspect that Alex doesn't understand distributed source control systems like Git and Mercurial. ... One of the really key things I've found with Git is that you never have to 'check-out' a file and two people working on the same file is rarely an issue. ... With VSS, if she gets to the repository before me then I can't do anything. In other ones, it's extremely easy for either of us to accidentally wipe out the other's work. (Sync A, Sync B, Commit A, Commit B - A's commit just vanished!)

This feature doesn't require a distributed system.

Perforce (sorry again for sounding like an ad) lets multiple users check out the same file at once. When submitting (i.e. "committing" in git terminology), the first one in has no issues. The others, when submitting are told that the file changed. They then issue a "resolve" command to merge the changes (using a 3-way diff to deal with conflicts) and re-submit once everything is merged satisfactorily.

So your sequence ends up as:
Sync A. Sync B.
A edits. B edits.
Submit A.
Submit B - generating error
Sync B - informing B about changes that need to be resolved
Resolve B - merging the diffs
Submit B - which now succeeds

All this using a centralized server.

Git does the same thing, except that you get the errors and have to perform the merges at "push" time instead of at "commit" time.

The ability to have multiple people editing files at once is critical for any project of non-trivial scope, but the feature can be implemented using centralized servers as well as with distributed systems.
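To make the comparison concrete, the same sequence plays out in git roughly like this. This is a disposable sandbox sketch: the shared bare repo, clone names, and file names are all invented for illustration.

```shell
# Disposable sandbox: a shared bare repo plus two developer clones.
set -e
cd "$(mktemp -d)"
git init -q --bare shared.git
git -C shared.git symbolic-ref HEAD refs/heads/main

# Seed the shared repo with an initial commit so both clones share history.
git clone -q shared.git seed
( cd seed && git symbolic-ref HEAD refs/heads/main &&
  echo "v1" > README && git add README &&
  git -c user.email=s@x -c user.name=S commit -qm "initial" &&
  git push -q origin main )

git clone -q shared.git a
git clone -q shared.git b

# A and B both edit (different files) and commit locally.
( cd a && echo "change A" > notes.txt && git add notes.txt &&
  git -c user.email=a@x -c user.name=A commit -qm "A's change" )
( cd b && echo "change B" > log.txt && git add log.txt &&
  git -c user.email=b@x -c user.name=B commit -qm "B's change" )

# A pushes first with no trouble.
( cd a && git push -q origin main )

# B's push is rejected, so B pulls (fetch + merge) and pushes again.
( cd b && git push -q origin main 2>/dev/null || echo "push rejected" )
( cd b && git -c user.email=b@x -c user.name=B pull -q --no-rebase --no-edit origin main &&
  git push -q origin main )
```

Note that B's commit exists locally before the merge, which is where the flow differs from the Perforce resolve-at-submit sequence above.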
• L. 2011-09-07 11:38
Looks like you read the page.
It's all a matter of subject knowledge:
- If you understand math, you will accept that anyone writes 1+1=2
- If you understand ACID and MySQL info pages, you will accept that MySQL InnoDB is not acid compliant

There is no jigsaw, the information is right there without any modification IF you know the subject.
• Abso 2011-09-07 12:08
Your introduction to source control probably was a lot like mine: “here’s how you open SourceSafe, here’s your login, and here’s how you get your files... now get to work.”

Actually, it was closer to "you don't have to use version control for your class projects, but it's a good idea. Here's how to set up RCS..."

The same prof also recommended learning either vi or emacs. That was a great course.
• swahl 2011-09-07 13:08
QJo:
Having said that, my intuition already informs me that 32 levels of trigger is probably a WTF however you cut it.

This reminds me that "The only numbers that make sense are zero, one, and infinity." I can't remember where I first heard it, but I found a reference:

http://www.catb.org/jargon/html/Z/Zero-One-Infinity-Rule.html

Gee, I guess URLs really do make the system think it's spam.
• Matt Westwood 2011-09-07 15:26
swahl:
QJo:
Having said that, my intuition already informs me that 32 levels of trigger is probably a WTF however you cut it.

This reminds me that "The only numbers that make sense are zero, one, and infinity." I can't remember where I first heard it, but I found a reference:

http://www.catb.org/jargon/html/Z/Zero-One-Infinity-Rule.html

Gee, I guess URLs really do make the system think it's spam.

Robert Ainsley: Bluff Your Way In Maths (a.k.a. The Bluffer's Guide to Maths):
"You will be expected to be something of a professional mathematician at university, and you should choose your image accordingly. There are three sharply defined groups of university mathematicians which we will number 0, 1 and \infty (the numbers 2 and 3 do not, of course, exist in university mathematics)."
• onlyme 2011-09-07 15:34
annie the moose:
You're doing it wrong!

C:\VersionControl
MyProg.201109060900.c
MyProg.201109060904.c
MyProg.201109060915.c

It's so easy.

I worked in one shop where the code was labeled prog.old, prog.new, prog.bad. The really hard part was finding the correlation between source and binaries. I instituted simple version control (whatever came native in Unix; I don't remember what it was) and life became sane.
• boomzilla 2011-09-07 16:20
David C.:

So your sequence ends up as:
Sync A. Sync B.
A edits. B edits.
Submit A.
Submit B - generating error
Sync B - informing B about changes that need to be resolved
Resolve B - merging the diffs
Submit B - which now succeeds

All this using a centralized server.

Git does the same thing, except that you get the errors and have to perform the merges at "push" time instead of at "commit" time.

No, it doesn't do the same thing. B gets fully committed using git (or other modern DVCSes). This is a nontrivial difference, since there's no risk of losing B's changes during the merge. This is also one of the problems with svn.
David C.:

The ability to have multiple people editing files at once is critical for any project of non-trivial scope, but the feature can be implemented using centralized servers as well as with distributed systems.

I agree that there's no reason why not, but I'm not aware of one that does it like the modern DVCSes do, including Perforce, at least based on your description.
• matchbox 2011-09-07 17:20
Great article... please post some more on various software engineering topics :)

However, I have some serious questions.

1. Can anyone give me a good example where "Branching by Rule" would feel like the right thing to do? I can only come up with "Branching by Exception" examples.

2. The author seems to dislike distributed source control a little, but I had one scenario in the past where I wished I had it, and I would like to hear your thoughts.
Let's assume I'm on the train (no internet connection) and I want to work on two tickets. I've got all the code and environment set up on my laptop, so I'm good to go.

With a distributed source control system I would finish the first ticket and commit it locally, with some appropriate meta information to close the ticket too.

With a traditional source control system I can either fix both tickets together and commit them together, or I could create two branches in advance, one for each ticket, just to avoid mixing the two, which seems like a lot of overhead if they are just small bugfixes.

What are your thoughts on that? How would you approach this situation?
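For reference, the distributed per-ticket flow I have in mind would look something like this with git. This is a throwaway sketch; the repo layout, ticket numbers, and file names are invented.

```shell
# Throwaway repo standing in for the laptop checkout.
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
git -c user.email=me@x -c user.name=Me commit -q --allow-empty -m "baseline"
base=$(git rev-parse HEAD)

# Ticket 1 gets its own local branch and commit -- no network needed.
git checkout -qb ticket-101 "$base"
echo "fix one" > fix1.txt
git add fix1.txt
git -c user.email=me@x -c user.name=Me commit -qm "Fix ticket 101"

# Ticket 2 branches from the same baseline, so the two fixes stay independent.
git checkout -qb ticket-102 "$base"
echo "fix two" > fix2.txt
git add fix2.txt
git -c user.email=me@x -c user.name=Me commit -qm "Fix ticket 102"

# Once back online, each branch can be pushed and reviewed separately.
```

Local branches here cost essentially nothing, which is the part a centralized system makes expensive.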
• some dude 2011-09-07 17:27
QJo:
wva:
Why the hell would you want to keep documentation out of the source control system?

What makes a latex/org/... file different from a c/py/.. file? In both cases you want to track and merge changes and see who did what when, and branch as ideas are tried or different "stable" versions are needed...

Documentation is usually written in Word...

Then you're doing it wrong. Way wrong.
• matchbox 2011-09-07 17:34
some dude:
QJo:
wva:
Why the hell would you want to keep documentation out of the source control system?
What makes a latex/org/...

Documentation is usually written in Word...

Then you're doing it wrong. Way wrong.

Documentation is usually written in Doxygen or Javadoc, I'd say. Don't you mean specifications and requirements? Those are reasonable to write in Word, since they're read by non-developers too.
• Anonymous Coward 2011-09-07 18:55
David C.:
Finally, git has little concept of revision history for individual files, preferring to work instead on the entire repo. So I can't easily get a list of the last 5 changes to a single file, or diff the current version against the previous version. I have written scripts to do this, but I don't think I should have to.

You are correct about git having little concept of individual files. The upside is that it makes it easy to follow code that is moved around between files. This does not, however, make it hard to follow individual files. Are you running an ancient version of git? It has been easy to track file changes for years.

> git clone https://github.com/git/git.git
> cd git
Change log for a file:
> git log url.c
Changes between current and another commit for specific file:
> git diff 3793a url.c
View changes in a gui:
> gitk url.c

matchbox:
1. Can anyone give me a good example where "Branching by Rule" would feel like the right thing to do?

As a rule of thumb, you should Branch By Rule when most of the releases are considered exceptional under Branch by Exception. There are a lot of scenarios for this, but here's one: 50 developers split into 6 teams that maintain a large application that's released on a monthly basis. Each team would work on a feature that's planned for a release 2, 3, 4, 5, or 6 months out.

matchbox:
2. The author seems to dislike distributed source control a little

I'm more frustrated by the buzz and excitement about it. The "delayed merging" and "easy shelving" have existed in proprietary systems (Perforce, AccuRev, etc.), and could have easily been added to Subversion clients.

Heck, Subversion could have even added the one benefit of distributed systems (offline history viewing), but instead, we just started from scratch again with Git/Mercurial/etc.

It's like sea mammals: one step forward (underwater, woo hoo!), several steps back (no gills).
• Mr.'; Drop Database -- 2011-09-07 21:42
David C.:
Yes. It's far better than CVS and SVN, and branch creation is trivially simple, but non-trivial merges (e.g. between two lines that don't directly descend from each other, but have a common ancestor) don't always happen cleanly.

And when there are merge conflicts, it does a 2-way diff (yours and the other branch), instead of a 3-way diff (adding in the common-ancestor revision) which sometimes makes it hard to figure out what changed without extra work.

Finally, git has little concept of revision history for individual files, preferring to work instead on the entire repo. So I can't easily get a list of the last 5 changes to a single file, or diff the current version against the previous version. I have written scripts to do this, but I don't think I should have to.
Git is considered to be good at merging, so I'm not sure what specific issues you've been having. I think there's a bit of terminology confusion here: a three-way diff means that the merging code compares both versions to an ancestor version (all source control systems work this way), but it doesn't necessarily mean that those ancestor lines are displayed with a merge conflict.
Git can be configured to display the ancestor lines (git config --global merge.conflictstyle diff3) but it's unfortunate that this isn't the default.

For viewing the last 5 changes to a specific file, try git log -n 5 -p filename.txt
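To see what the diff3 style buys you in practice, here's a disposable sandbox that manufactures a conflict (branch and file names invented):

```shell
# Disposable repo with a deliberate conflict on one line.
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
echo "original line" > greeting.txt
git add greeting.txt
git -c user.email=me@x -c user.name=Me commit -qm "base"

# Two branches change the same line in different ways.
git checkout -qb feature
echo "feature version" > greeting.txt
git -c user.email=me@x -c user.name=Me commit -qam "feature edit"

git checkout -q -
echo "mainline version" > greeting.txt
git -c user.email=me@x -c user.name=Me commit -qam "mainline edit"

# With conflictstyle=diff3, the conflict hunk also shows the common
# ancestor, between the '|||||||' and '=======' markers.
git -c merge.conflictstyle=diff3 merge feature || true
cat greeting.txt
```

With the default style, the "original line" ancestor section would simply be missing from the conflict markers.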
• Luiz Felipe 2011-09-07 22:08
L.:
Luiz Felipe:
The poop... of DOOM!:
Paratus:
The poop... of DOOM!:
The "Real" WTF:

6000 - using ACCESS as a database

7000 WTFP for using VB
7000 WTFP for using PHP

VB and PHP are certainly RWTFs, but there's no way that they're worse than using Access.

He said using Access as a database, so you can combine that.

A PHP application calling an Access database would result in 13000 WTFP (and a developer who's been committed to a mental hospital)

20000 for using Firebird/InterBase (it's worse than Access).

Access is a little DB for simple use; it's not a WTF to use it in the correct situation, but it's easy to abuse. There's nothing wrong with using a simple RDBMS.

Firebird is crap; Access can sustain more records and users.

Access is total crap; there is no valid reason to use Access instead of MySQL (which is already a simple RDBMS that sucks a lot).
I do agree that for very simple and basic DB use one can stick to MySQL or other half-assed DBMSs, but it is also clear that a LOT of these cases are misunderstood.

E.g. developers who know nothing about SQL think a database is only good for storing objects in a table, thus take no advantage of the tool, and thus design an application that uses few or no features - which IS a WTF in itself, for using the wrong tools for the job.

I'm not a DBA, and I'm quite surprised to see how little other devs know about SQL in general (yes, all of you who use MySQL can be included in this if you think InnoDB is strictly ACID compliant, for example) - in the end, know your tools and use them right, and remember that some tools are USELESS for some projects; there is NO using them right (like Access for anything, or MySQL for complex applications).

In the end, the only good ones are and will be those who try to do better every single time, spend time reading and learning all they can (and posting their own fails on tdwtf for our enjoyment).

I agree that developers don't know SQL, and that MySQL's MyISAM is not ACID.

But Access is not bad; it's only a tiny database.

Except that Outlook uses a variant of the blue Jet that Access uses, and it works well (as long as you don't have more than a 2 GB database of crappy emails; beyond that you need Exchange).

Also, the Windows Installer uses Access technology to install all of Windows, Office, and Visual Studio; whatever uses an MSI installer uses the red Jet, a variant of the blue Jet.

Access (JET) is fully ACID compliant, except that when you have more than roughly 512 locks (a filesystem limit), it will break. It supports SQL as well.
Crap is what people have done with it, partly because of classic ASP and "web developers" who think they can use a database; those people ruin everything.

People like to blame things that they don't know.
I use JET to store temporary transactions when my client's central SQL server goes offline because of LAN/net problems (cheap equipment and electromagnetic interference). For this purpose it is very good.

I also agree that JET is useless for most software, because it's so simple and limited. But you can't call a knife a bad tool just because you need to cut down a tree.

Sorry, my English is poor; it's not my native language.

• QJo 2011-09-08 04:21
matchbox:
some dude:
QJo:
wva:
Why the hell would you want to keep documentation out of the source control system?
What makes a latex/org/...

Documentation is usually written in Word...

Then you're doing it wrong. Way wrong.

Documentation is usually written in Doxygen or Javadoc i'd say. Don't your mean specification and requirements? Which is reasonable to write in Word since it's read by non-developers too.

Silly (and possibly deliberate for the sake of being obstreperous) misunderstanding. Javadocs are actually part of the code and are generated automatically and dynamically. As such, this documentation is, by default, part of the source code itself and this aspect of the documentation is subsumed into the source code version control system.

By "documentation" in the context of "what ought to be stored in the document version control system", we are talking about standalone documents, which are written either by or for the customer, which define what the application is supposed to do in the first place. It consists of things like invitation to tender, project initiation documents, purchase agreements, project plans, records of meeting minutes, technical architecture documents, business requirements, technical requirements, migration strategies, UAT strategies, and so on.

I would be prepared to agree that writing it in Word (and Excel) is "wrong. Way wrong" except that in order to be able to do business with our potential customers <b>at all</b> we need to be able to generate and receive documentation in such a format as the customer is prepared to work with. In every single project in which I have been involved, at least <b>some</b> documentation is written using Word and Excel.

Nasty as this is, it is a business truth which is ultimately futile to try and circumvent.

Here endeth the lesson.
• L. 2011-09-08 04:42
No, seriously, Access IS a WTF: it has numerous failings, you just said it's not ACID compliant either, it runs on Windows (lol ?? Windows Server is a WTF too), and you have numerous less limited alternatives.
• The Poop... of DOOM! 2011-09-08 05:33
L.:
No, seriously, Access IS a WTF: it has numerous failings, you just said it's not ACID compliant either, it runs on Windows (lol ?? Windows Server is a WTF too), and you have numerous less limited alternatives.

I see your Windows server and I raise you an OSX server. All this: "It's supposed to be for end-users with no technical knowledge" and "it just works" bullcrap and then they make it into a server. You get a "genius" to pop a DVD into the drive and then you got a server, fully secure and set up to your specific needs? Rightoh!
• Ru 2011-09-08 05:57
Heck, Subversion could have even added the one benefit of distributed systems (offline history viewing), but instead, we just started from scratch again with Git/Mercurial/etc.

Fixing old systems is not the Open Source Way. Start by assuming everything made before now is utterly compromised and kludged by well meaning but clueless engineers trying to fix fundamental problems that simply can't go away without rearchitecting.

Sometimes it is even true.

It's like sea mammals: one step forward (underwater, woo hoo!), several steps back (no gills).

Gills are awesome if you're not endothermic. Otherwise, they're a bit like running your blood supply through a pair of bloody great heatsinks.

Tuna and sharks and probably other species have some sort of awful hack in the form of clever heat exchangers that let them have a body temperature a couple of degrees above ambient (which enables them to be a bit more energetic than other species) but it isn't going to be any more than a half-arsed attempt at fixing a fundamental problem with the architecture ;-)

Sea mammals on the other hand get to be quite adaptable to a range of temperatures and environments, they can be very effective predators and they get to have penetrative sex. That's a bit of a killer app, as I'm sure you'll agree.
• Gibbon1 2011-09-08 06:29
gnasher729:
If you had worked 40 hours a week and told them for 18 months that everything was going to plan, they would have got exactly what they paid for, you would have enjoyed those 18 months a lot more, and you would have found a new job just the same.

You could ask yourself what would Wally do? Wally would do nothing other than write status reports for 18 months. Or you could do what a friend did back in the day, write status reports while working a contract job under their noses (Double Tap).
• L. 2011-09-08 06:49
The Poop... of DOOM!:
L.:
No, seriously, Access IS a WTF: it has numerous failings, you just said it's not ACID compliant either, it runs on Windows (lol ?? Windows Server is a WTF too), and you have numerous less limited alternatives.

I see your Windows server and I raise you an OSX server. All this: "It's supposed to be for end-users with no technical knowledge" and "it just works" bullcrap and then they make it into a server. You get a "genius" to pop a DVD into the drive and then you got a server, fully secure and set up to your specific needs? Rightoh!

Alright you win ... damn OSX. Is there really anything that could compete with it ??
• QJo 2011-09-08 07:19
Gibbon1:
gnasher729:
If you had worked 40 hours a week and told them for 18 months that everything was going to plan, they would have got exactly what they paid for, you would have enjoyed those 18 months a lot more, and you would have found a new job just the same.

You could ask yourself what would Wally do? Wally would do nothing other than write status reports for 18 months. Or you could do what a friend did back in the day, write status reports while working a contract job under their noses (Double Tap).

Looking back on it, I think the reason I stuck with it so long was that I was actually enjoying the challenge, and it appeared at the time to be an opportunity to add some proper quality. Unfortunately that sort of relentlessness eventually takes its toll and you change your attitude towards it.
• G 2011-09-08 08:37
Wouldn't it be great to have a place where the Git/Maven/Hibernate/you name it fanboys would not pollute everything with their belief confessions?

Lack of meaning in your work? Finally something you feel you are ahead with?

That alone makes me dislike Git (not to mention its terrible user interface, and its lack of support for empty folders, partial checkouts, and partial pushes...).

• Zygo 2011-09-08 10:53
I've worked on both proprietary and open-source projects, and I've had to submit patches to someone else and hope they're ultimately included in the shipping product (and fix them and resubmit if they're rejected) in both cases. It's called code review, and proprietary developers do it too.

The main difference between open-source and proprietary development that I've seen is that proprietary products have multiple standards for quality and use different standards on different branches in the same SCM repo. If a customer is paying for an ugly hack, they may get such a hack in their branch, while the same patch might be rejected by the main product team--but that's just like any patch for an open-source project that gets shipped in a product somewhere, but isn't merged upstream.

It's possible to set up a DVCS as a drop-in replacement for a non-distributed SCM, but doing so wastes the opportunity to do process flow improvements that DVCS can enable. Since a DVCS 4th-dimension object can physically live anywhere, there's no reason why integration, build, QA, production, custom development services, and major product revision branches can't have their own repos with stars of users around them--and plenty of reasons why they shouldn't all necessarily share one giant churning burning repo, no matter what SCM you're using.

No one should start a new project on SVN today. Subversion doesn't just need a central server with excellent network connectivity--it needs a central server 150 to 200 times the size of the equivalent git server for program source code, and that server needs a low-latency network link to its users as well as a high-bandwidth one. If you have a large product, that kind of waste puts stress on every system near it, from storage to backups to IT hardware budgets to network operations staff.
• frits 2011-09-08 11:40
Isn't it advisable to avoid shiny-new-toy syndrome when it comes to source and revision control?
• David C. 2011-09-08 12:03
boomzilla:
David C.:

So your sequence ends up as:
Sync A. Sync B.
A edits. B edits.
Submit A.
Submit B - generating error
Sync B - informing B about changes that need to be resolved
Resolve B - merging the diffs
Submit B - which now succeeds

All this using a centralized server.

Git does the same thing, except that you get the errors and have to perform the merges at "push" time instead of at "commit" time.

No, it doesn't do the same thing. B gets fully committed using git (or other modern DVCSes). This is a nontrivial difference, since there's no risk of losing B's changes during the merge. This is also one of the problems with svn.
It really is the same thing. When you push your changes to the parent repo, you need to merge your changes with the other changes, and resolve conflicts.

The distributed systems preserve individual local changes because each person works with a local copy. Effectively, every client is a separate "shelf" branch.

A centralized system can do the exact same thing if every developer creates his own private branch. He can periodically merge the parent into his branch, similar to a "git pull", commit his changes to his branch without conflict, and then merge his changes back to the parent, resolving conflicts, similar to a "git push".

Same functionality, different command sequence. This applies equally well to any VCS that allows developers to easily create/merge branches at will, whether they are distributed or centralized.
• David C. 2011-09-08 12:24
Mr.'; Drop Database --:
Git is considered to be good at merging, so I'm not sure what specific issues you've been having. I think there's a bit of terminology confusion here: a three-way diff means that the merging code compares both versions to an ancestor version (all source control systems work this way), but it doesn't necessarily mean that those ancestor lines are displayed with a merge conflict.
Git can be configured to display the ancestor lines (git config --global merge.conflictstyle diff3) but it's unfortunate that this isn't the default.

For viewing the last 5 changes to a specific file, try git log -n 5 -p filename.txt
Thanks. I didn't know you could make it show the ancestor lines with conflicts. IMO, that makes it much, much easier to resolve said conflicts. I assumed, because it wasn't showing the ancestor, that it didn't use it in the merge process either. I'll add this configuration option to my current git clients. (My current work involves git, which has been quite a learning curve compared to Perforce, which my previous project used. But not nearly as scary as ClearCase, which is also used here.)

WRT viewing changes, what I'd like to do is what I frequently did with Perforce. Over there, I could type "p4 diff foo#4" and it would show me the diffs between the current version and the fourth commit on the current branch.

"git log -n 5" shows me the most recent five commits, but not the diffs.

I wrote a perl script to give me the functionality I like. My "gitediff" script allows me to type "gitediff foo.c#-5" which will copy the fifth-most-recent commit to a temporary file, launch emacs with that and the current file, and start an "ediff" to let me compare them.

The script for this was not hard to write, but it wasn't trivial either. It does a "git log" to get the commits for a file, and numbers them. Then it counts the number of edits provided as an argument to get the commit ID string for that revision, then does a "git show" to extract the file before handing it off to emacs.

It's about a 230 line script for producing all kinds of diffs using git:

gitediff foo - compare foo against the latest committed version (HEAD)

gitediff foo#<ver> - compare foo against a specified version

gitediff foo#<ver1>#<ver2> - compare two revisions of foo

where <ver> may be either an integer - representing a sequential commit on the current branch, or a negative integer - representing the most recent "nth" version, or a git commit string.

I had to write the logic for this because the built-in syntax is repo-based instead of file-based. For example, HEAD~5 shows the file as it was in the fifth-most-recent commit, even if the file in question didn't change since then. In contrast #-5 (in my script's syntax) represents the fifth-most-recent change to the specified file, even if that change took place hundreds of commits ago.

If people are interested, I can post this script for others to enjoy. Or maybe people can point out an easier approach to the problem.
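For comparison, the core of that file-scoped lookup can be done with stock git by limiting rev-list to the path. A sketch in a throwaway repo (file names invented):

```shell
# Throwaway repo: greeting.txt changes three times among other commits.
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
for i in 1 2 3; do
  echo "revision $i" > greeting.txt
  git add greeting.txt
  git -c user.email=me@x -c user.name=Me commit -qm "edit greeting $i"
  echo "noise $i" > other.txt
  git add other.txt
  git -c user.email=me@x -c user.name=Me commit -qm "edit other $i"
done

# The 2nd-most-recent commit that touched greeting.txt -- unlike HEAD~2,
# which counts every commit in the repo, not just ones touching the file.
rev=$(git rev-list -n 2 HEAD -- greeting.txt | tail -n 1)
git diff "$rev" HEAD -- greeting.txt
```

Swapping the `-n` count gives the "nth-most-recent change to this file" behavior, which is most of what the script described above implements.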
• some1 2011-09-08 13:28
frits:
Isn't it advisable to avoid shiny-new-toy syndrome when it comes to source and revision control?

Isn't it advisable to always avoid shiny-new-toy syndrome unless there is some justifiable benefit beyond "it's new" and "it's shiny"? ;) Guess that's why I'm still happy with XP. With CVS->SVN there was some real benefit. With SVN->Git I guess it depends on the project.
• Chris 2011-09-08 14:59
even the lamest source control system (*cough*SourceSafe*cough*) will far outperform a Mercurial set-up with a bunch of haphazard commits and pushes

In other words, you can write Fortran in any language.
• Matt Westwood 2011-09-08 16:45
Chris:
even the lamest source control system (*cough*SourceSafe*cough*) will far outperform a Mercurial set-up with a bunch of haphazard commits and pushes

In other words, you can write Fortran in any language.

Except COBOL, which of course isn't powerful enough.
• Mr.'; Drop Database -- 2011-09-08 19:19
David C.:
WRT viewing changes, what I'd like to do is what I frequently did with Perforce. Over there, I could type "p4 diff foo#4" and it would show me the diffs between the current version and the fourth commit on the current branch.

"git log -n 5" shows me the most recent five commits, but not the diffs.
You need the -p flag to make "git log" show diffs. It'll show one separate diff for each commit. I don't think there's a one-liner to view the combined diff across those versions though, short of specifying the file name twice and using shell trickery:
git diff $(git log -n 5 --pretty=format:%H filename.txt | tail -n 1) HEAD filename.txt

Which I suppose is part of what your script does. So you're right, git doesn't provide the best tools for that sort of thing.
• jnareb 2011-09-09 04:44
David C.:
Poo:
So, have you tried git?
Yes. It's far better than CVS and SVN, and branch creation is trivially simple, but non-trivial merges (e.g. between two lines that don't directly descend from each other, but have a common ancestor) don't always happen cleanly.

And when there are merge conflicts, it does a 2-way diff (yours and the other branch), instead of a 3-way diff (adding in the common-ancestor revision) which sometimes makes it hard to figure out what changed without extra work.

I think you meant here that it does not by default display the ancestor version in merge conflict markers, because Git always uses a 3-way merge when merging branches. You can make Git include the ancestor version either by setting the merge.conflictstyle config variable to "diff3", or by running git checkout --conflict=diff3 file.

David C.:
Finally, git has little concept of revision history for individual files, preferring to work instead on the entire repo. So I can't easily get a list of the last 5 changes to a single file, or diff the current version against the previous version. I have written scripts to do this, but I don't think I should have to.

You can: git log -- file, and git diff HEAD^! -- file (see the git-log manpage for details on history simplification with respect to the former).
• boomzilla 2011-09-09 07:45
David C.:

boomzilla:

No, it doesn't do the same thing. B gets fully committed using git (or other modern DVCSes). This is a nontrivial difference, since there's no risk of losing B's changes during the merge. This is also one of the problems with svn.
It really is the same thing. When you push your changes to the parent repo, you need to merge your changes with the other changes, and resolve conflicts.

Are you being deliberately dense? It truly isn't the same thing.

You do not have to do that merge. It's possible to have multiple anonymous branches (at least with mercurial, and I'd assume for most others, too).

David C.:

Same functionality, different command sequence. This applies equally well to any VCS that allows developers to easily create/merge branches at will, whether they are distributed or centralized.

Yes, it is possible to get the same end result with a lot more work by the users, and assuming that they always follow this pattern. But that's the problem. No one really does (it's just not the way humans function). It's a case of a tool making a common problem easier to solve. So far, I'm not aware of a centralized VCS that does it.
• valid user 2011-09-09 08:49
• anon 2011-09-09 10:31
valid user:

After showing how it's done wrong all the time, I appreciate that they show how to do it right once in a while.
• jnareb 2011-09-09 10:38
matchbox:
1. Can anyone give me a good example where "Branching by Rule" would feel like the right thing to do?

As a rule of thumb, you should Branch by Rule when most of the releases would be considered exceptional under Branch by Exception. There are a lot of scenarios for this, but here's one: 50 developers split into 6 teams that maintain a large application that's released on a monthly basis. Each team would work on a feature that's planned for a release 2, 3, 4, 5, or 6 months out.

You can find a good example of the feature-branch (lots of branches) approach in the last part of Eric Sink's Version Control by Example (available online). Git development itself makes use of feature branches.

In short: using feature branches allows you to select which features to include in the next release, and which are to be postponed.
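A sketch of what that selection can look like in Git (the branch names are made up for illustration):

```shell
# Cut this month's release branch from the mainline:
git checkout -b release/2011-10 master

# Merge only the feature branches that are ready:
git merge --no-ff --no-edit feature/login-rework
git merge --no-ff --no-edit feature/report-export

# Unready features simply stay on their branches,
# postponed until a later release.
```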

matchbox:
2. The author seems to dislike distributed source control a little

I'm more frustrated by the buzz and excitement about it. The "delayed merging" and "easy shelving" have existed in proprietary systems (Perforce, AccuRev, etc.), and could have easily been added to Subversion clients.

Heck, Subversion could have even added the one benefit of distributed systems (offline history viewing), but instead, we just started from scratch again with Git/Mercurial/etc.

"Delayed merging" and "easy shelving" are only a subset of the workflows that a DVCS allows.

Also, Subversion made some design decisions which cannot work in a distributed system, like global revision numbering (it requires a central numbering authority), and some bad design decisions, like "branch / tag is a copy" (following Perforce, AFAIK)... which makes branch deletion and merging complicated, and tags next to useless.

Starting from scratch (as in the case of Git) was the only sensible choice.
• jnareb 2011-09-09 10:40
G:
That alone make me dislike Git (not too mention its terrible user interface, the lack of handling empty folders, partial checkouts, and pushes,... )

Partial checkouts are available in modern Git (though not partial clone).

I don't know what you meant by ", and pushes,..." there.
David C.:
It really is the same thing. When you push your changes to the parent repo, you need to merge your changes with the other changes, and resolve conflicts.

The distributed systems preserve individual local changes because each person works with a local copy. Effectively, every client is a separate "shelf" branch.

A centralized system can do the exact same thing if every developer creates his own private branch. He can periodically merge the parent into his branch, similar to a "git pull", commit his changes to his branch without conflict, and then merge his changes back to the parent, resolving conflicts, similar to a "git push".

Same functionality, different command sequence. This applies equally well to any VCS that allows developers to easily create/merge branches at will, whether they are distributed or centralized.

It is not the same thing, as you are not forced to push after each commit.

Or how will a centralized system solve this:
I am far away from the workplace (maybe on vacation in another country) with my notebook set up, but without a (reliable/usable/any) internet connection. There is a big problem with a deadline soon after my leave ends, so there is no time to solve it AFTER returning to work, but there is time to connect to the company net before that deadline.

The problem consists of a couple of small, unrelated subproblems.

With git I would simply make a branch for every such subproblem, work in the branch, make a lot of commits, and eventually even merge all those branches together at the end. Then after returning to work, I would just merge master from the server, resolve conflicts (if any) and push back - maybe only seconds of work.

With a centralized VCS I would probably be stuck at the very start with making all those branches, not to mention the inability to commit on those branches (and to eventually undo some bad decisions, as a VCS allows, when it works).

Just because of the missing connection, I am effectively losing the better half of the functionality of a VCS on centralized systems.
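The offline workflow described above, sketched as Git commands (the branch names are invented for the example):

```shell
# Offline on the laptop: one local branch per subproblem,
# committing as often as needed.
git checkout -b fix/parser master
# ...edit, test...
git commit -am "Handle empty input"

git checkout -b fix/timeout master
# ...edit, test...
git commit -am "Raise the socket timeout"

# Still offline: merge the finished subproblems back into local master.
git checkout master
git merge --no-edit fix/parser
git merge --no-edit fix/timeout

# Back on the company network: reconcile and publish in one go.
git pull --rebase origin master
git push origin master
```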
• David C. 2011-09-12 12:28
It is not the same thing, as you are not forced to push after each commit.

Or how will a centralized system solve this:
I am far away from the workplace (maybe on vacation in another country) with my notebook set up, but without a (reliable/usable/any) internet connection. There is a big problem with a deadline soon after my leave ends, so there is no time to solve it AFTER returning to work, but there is time to connect to the company net before that deadline.

The problem consists of a couple of small, unrelated subproblems.

With git I would simply make a branch for every such subproblem, work in the branch, make a lot of commits, and eventually even merge all those branches together at the end. Then after returning to work, I would just merge master from the server, resolve conflicts (if any) and push back - maybe only seconds of work.

With a centralized VCS I would probably be stuck at the very start with making all those branches, not to mention the inability to commit on those branches (and to eventually undo some bad decisions, as a VCS allows, when it works).

Just because of the missing connection, I am effectively losing the better half of the functionality of a VCS on centralized systems.
You don't have to merge after each commit with a centralized system either. It's only required if everybody is working from the same branch.

Your distribution is simply creating extra branches behind the scenes. There are no semantic differences.

You talk about your git solution being to create a bunch of branches for your tasks. Centralized systems all allow this. I do it all the time. And when you're done with a branch, you merge it back to the parent, which will make you resolve conflicts.

As for "not forced to push after each commit", why does that change anything? The server maintains all the branches (including the ones you create while working). It won't conflict with anything else unless others start working with your branches. The only potential advantage here is not needing the network bandwidth at commit time.

You seem to be hung up on the fact that some centralized systems make it difficult to operate with hundreds or thousands of branches in flight at once. I know that some of the most popular free ones (like CVS) fall over and die under those circumstances, but that's a problem with specific products, not with the concept of a central server.

Distribution lets people commit changes while disconnected from the network, and this is useful for many applications, but it doesn't create any other capabilities.
David C.:
It is not the same thing, as you are not forced to push after each commit.

Or how will a centralized system solve this:
I am far away from the workplace (maybe on vacation in another country) with my notebook set up, but without a (reliable/usable/any) internet connection. There is a big problem with a deadline soon after my leave ends, so there is no time to solve it AFTER returning to work, but there is time to connect to the company net before that deadline.

The problem consists of a couple of small, unrelated subproblems.

With git I would simply make a branch for every such subproblem, work in the branch, make a lot of commits, and eventually even merge all those branches together at the end. Then after returning to work, I would just merge master from the server, resolve conflicts (if any) and push back - maybe only seconds of work.

With a centralized VCS I would probably be stuck at the very start with making all those branches, not to mention the inability to commit on those branches (and to eventually undo some bad decisions, as a VCS allows, when it works).

Just because of the missing connection, I am effectively losing the better half of the functionality of a VCS on centralized systems.
You don't have to merge after each commit with a centralized system either. It's only required if everybody is working from the same branch.

Your distribution is simply creating extra branches behind the scenes. There are no semantic differences.
No, it does not. It creates the branches on the client, not on the server, so I can do it offline. In a centralized VCS I cannot do that.

And I can create as many branches as I want without polluting the shared system with them.
David C.:

But only if you have a good connection

David C.:
I do it all the time. And when you're done with a branch, you merge it back to the parent, which will make you resolve conflicts.

As for "not forced to push after each commit", why does that change antying? The server maintains all the branches (including the ones you create while working).
Only if I am online.

David C.:
It won't conflict with anything else unless others start working with your branches.
While in a DVCS it simply does not conflict, regardless of what others do or don't do.
David C.:
The only potential advantage here is not needing the network bandwidth at commit time.
Which allows me to commit as often as I need, without caring about resolving some conflict. And to modify as many files as I need without causing problems for everyone else.
David C.:

You seem to be hung up on the fact that some centralized systems make it difficult to operate with hundreds or thousands of branches in flight at once. I know that some of the most popular free ones (like CVS) fall over and die under those circumstances, but that's a problem with specific products, not with the concept of a central server.
Also, problems with the concept are the connectivity, the space wasted on the server, and the difficulty of arranging safe cooperation among many developers working on the same project at one time
David C.:

Distribution lets people commit changes while disconnected from the network, and this is useful for many applications, but it doesn't create any other capabilities.
It also allows people to share changes in more ways than are possible (or acceptable) in a centralized VCS.

Some things can be "emulated" in a centralized VCS with a lot of unnecessary work, but usually the centralized VCSs are meant to solve a different, more restricted problem.
Let's say (as happened to me) that there are three programmers working on one system. Then they decide to part ways, so two will continue on the system and the third will fork the system as a different project. They divided the company they formed and went their separate ways.

In a DVCS, the third simply removed one line of his configuration and was free.

In a centralized VCS, the third would have to set up a new server (including the hardware needed to run it), copy over the full project with all its history, and then change the configuration to point at the new server.

But let's continue: then the second parted with the first and had to undergo the same task.

But later the second and third formed another pact and wanted to continue developing together, while keeping their respective histories.

In a DVCS it just needed one address in the configuration and that was all for the setup. I do not know how to manage that simply in a centralized VCS.
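With Git, the "one address" really is a single setting; the URLs below are hypothetical, for illustration only:

```shell
# The departing developer points his clone at its new home
# (the full history comes along for free - it is already local):
git remote set-url origin git@newhost.example.com:project.git

# Later, to join two forks back together while keeping both histories:
git remote add partner git@partner.example.com:project.git
git fetch partner
git merge --no-edit partner/master
```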
• Dahpluth 2011-09-13 05:06
QJo:
wva:
Why the hell would you want to keep documentation out of the source control system?

What makes a latex/org/... file different from a c/py/.. file? In both cases you want to track and merge changes and see who did what when, and branch as ideas are tried or different "stable" versions are needed...

Documentation is usually written in Word (or using a similarly ill-maintainable program), so in a source-control system it usually needs to be stored in binary form. In such a form it may not be as easy to establish what the differences are between versions.

If you're maintaining your documentation in e.g. TeX, then it may well be more appropriate to use a source-control system for the docs.

Another option is to use a wiki for the documentation.

There is a built-in diff tool in MS Word (since it can/could do versioning on its own), so if your versioning system supports calling third-party diff tools, it shouldn't be a problem having the document in source control.
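If the versioning system happens to be Git, the hook for this is a textconv diff driver. A sketch, assuming you have some converter that prints the document's text to stdout (doc-to-text below is a stand-in name; a docx2txt wrapper is one real-world choice):

```shell
# .gitattributes: route Word files through a custom diff driver
echo '*.docx diff=word' >> .gitattributes

# Point the driver at a converter that writes plain text to stdout;
# "doc-to-text" is hypothetical - substitute whatever tool you have.
git config diff.word.textconv doc-to-text
```

After this, plain `git diff` shows textual differences between versions of a .docx file instead of "Binary files differ".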
• abbas 2011-09-13 06:27
The "Real" WTF:
6000 - using ACCESS as a database

666000 - using Excel as a database.
Let's say (as happened to me) that there are three programmers working on one system. Then they decide to part ways, so two will continue on the system and the third will fork the system as a different project. They divided the company they formed and went their separate ways.

This is quite possibly the most ridiculous argument in favor of DVCS I've heard to date. DVCS saves a few hours off of the untold hours needed to dissolve a business?

If this is your workflow, then you're not developing software. You're hacking.

There's nothing wrong with hacking, but it's a different world than commercial software development.

If I remember correctly, parting the company took less effort than setting up the SVN archive. And less working time, too.

But the point was not leaving the company; the point was how easy forking and merging projects is. That could happen even without a company to start with :)
• tgape 2011-09-15 00:07
In DVCS it just needed one address to configuration and that was all for the setup. I do not know, how to manage that simply in centralized VCS.

Thanks to a completely in-house version of that state flow, with multiple people in each of the first, second, and third positions, I managed to get the chance to *do* that, using RCS. Admittedly, RCS isn't a centralized VCS, but we migrated to CVS while I was working on that, so it kinda counts. And then the first joined up again, along with a fourth that they'd hooked up with during the separation.

OMG, it's a pain. It wasn't that difficult to write a script to take a particular branch of one RCS/CVS file and upload it to another RCS/CVS file at a particular point. However, determining the appropriate point to attach things is not particularly trivial either - especially when there's some collaboration between the groups while they're split.

In some cases, that resulted in one of the later versions being identical - because one group handed a file to the other group, and the other group just took it. In that case, it'd be appropriate to split the tree one is importing at that point, attaching the rest of the tree there, rather than continuing to import into basically the same structure as the original file.

In other cases, that merely resulted in a massive change in one group's tree in a single checkin, that caused it to almost, but not quite, sync with another version in the other group's tree. Ideally, this would be handled as above, but it becomes much more difficult to detect.

Eventually, though, I found the way to do it "simply".

1. Choose which VCS instance gets to keep its head, at least for the moment. This is now the target VCS.
2. Run a process that looks through all of the other VCSs to merge, and all of the stray foo.c.bak1234 type files you have, and sorts them in date order.
3. Starting with the oldest version, programmatically compare it to every version in the target VCS. Commit it as a branch off of the version it most resembles. Using RCS, and preserving date stamps, this required a bit of trickery to get it to accept this, in some instances. But the code was easier to work out than going through thousands of different files and figuring out who had the oldest archive, since each group had originated some script or another and then shared it with the others.

It doesn't sound that simple, but the point was, it was all logic I could get the computer to do for me, so the task that was looming at me looking like it'd take years went away over the course of a weekend. I'm sure there's stuff that could be optimized - you could, for example, store the length of every version, and not bother comparing any two versions where the difference in lengths between versions was greater than the minimum difference found so far. But that doesn't matter; the task is done and I'm not doing it again.

Then, of course, we had the task of figuring out, so what changes did we then want to merge into the main branch... I don't know that any version control system could make that easy.
• Always42 2011-09-18 11:06
I think the discussion around version control has stagnated, at least for DVCS users.

In the beginning, the two sides were as follows.

The skeptics, before they grew weary, had naive questions/objections that revolved around these old bogies:

"Why?"
"Isn't that complicated?"
"The whole repository?"
"No control?"
"But which one is the true version?"
... etc

And the newly converted distributed zealots (of which I was one):
"How can you not?"
"The work-flow flexibility will change your life."
"Fully functional off-line capability."
"The merging is awesome."
"Easy/free context switching."
"It is so easy to share."
... etc.

Nowadays we have the same two groups but the conversation has changed.

On the centralised/traditional side (or among those purporting to be agnostic to sound more credible), folks are on the defensive.
Their articles and viewpoint can be summarised as follows: human beings live in 4 'physical' dimensions, with different people writing code (for the same application) in spurts through time. This then leads them to obnoxiously point out how important it is to have a plan to manage/approach this effectively... ah, duh?
Where they go wrong is in trying to push the theory, based on the above, that what tool you use makes no difference to you as a developer.

The distributed folks are now bored and have moved on with life.
There is an undeniable difference in the experience of authoring code using a distributed VCS that you have to experience to grasp, not as part of a tutorial but as part of real life development.

Come and join us: if your project forces you to use SVN/CVS/TFS, there is git-svn and co. (with similar tools for the other DVCSs). Or you can join an open-source project using Git; there is nothing stopping you.
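For the curious, the git-svn bridge mentioned above looks roughly like this (the SVN URL is hypothetical; it needs a reachable Subversion server to actually run):

```shell
# Mirror a Subversion project into a local Git repository
# (-s assumes the standard trunk/branches/tags layout):
git svn clone -s http://svn.example.com/project
cd project

# Work locally with the full DVCS workflow, then sync with SVN:
git svn rebase    # like "svn update": pull in new SVN revisions
git svn dcommit   # push local commits back as SVN revisions
```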