- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
Once I forgot to initialize a variable and it sprung a leak. Oops.
Admin
Doesn't this mean he overwrites the pointer data_string with the pointer to "data data data". Since both are of type char*?
Admin
Sounds simple; check the diffs for equal amount of added lines and removed lines?
Admin
This reminded me of something from about ten years ago. We wrote computer games, and all the AI routines, game item properties, etc., were all defined in text files in a special scripting language (that could be used by non-programmers).
One of our junior programmers was working on a level and made a lot of changes to a number of files, and broke something. I did a diff to see where the changes were, and it turned out he'd also gone in and changed the capitalization (which was ignored by our compiler) of basically all the variables and keywords. "if" to "If", "monsterone" to "MonsterOne", that sort of thing. He also changed the space padding a bit.
So the diff showed that every line in every file had changed. We had to roll back all his "work."
Admin
data_string is of type char[15], not char*. In this context, this syntax is a mere syntactical sugar for char data_string[15] = { 'd', 'a', 't', 'a', ' ', 'd', 'a', 't', 'a', ' ','d', 'a', 't', 'a', 0 };. So no, he just initialized an array.
Admin
The change is actually very good. The WTF is not the code change, it's that the change isn't important enough for changing every file right before a release.
The way they worked (that is a WTF itself for most) only works when all the developers know they can't do something like that without telling everybody, and especially right before a release.
Admin
This kind of pattern is not unusual
Admin
Looks like Winston had a good idea, unorderly code is a breeding ground for bugs. He probably fixed quite a few unknown bugs with changes like that, and made tracking other bugs easier.
The argument that he wasted his own time doesn't hold either, because he was new and probably learned a lot about the code by looking through every single file.
I certainly hope he'll get promoted.
Admin
So.
You leave on a work-trip, and give one of the developers the important task to fix a memory leak, in a core application, that is about to be released in a new version. You come back, having sold this application to a bunch of high-profile clients.
After returning in the office, you see massive amounts (carnage) of useless cleanups, and a non-fixed memory leak. I'm sure you'd WTF at that. Why is this so hard to understand?
Admin
True enough, but that pattern requires a pointer and dynamic memory allocation which is apparently outside the scope of Winston's chances.
However, the argument could be made that Winston "changed every single variable to be initialized on the stack" meaning that in the original code, there was dynamic memory allocation and post newbie, there was none.
In that case, the example is woefully inadequate. However, that would account for an extreme change in the size of the executable.
Admin
I forgot to quote this:
Admin
OK, so you hire a guy in part because he offers to tidy up code in his spare time. Then you find that he actually does it. Panic ensues?
Admin
You've got it! The real WTF here is the woefully inadequate example!
Admin
I'm not sure if that's a dig at me or not, but I'm going to lol anyway...
LOL
:P
Admin
Yep, he's a "hatter" all right... a mad hatter!
Admin
Because the writeup doesn't state whether or not the memory leak was fixed. I presume that it was, or there would've (should've) been a line where Winston explained being baffled by that darned bug.
I figured TRWTF is that part where when confronted with these positive changes, they prevented him from ever touching their precious crummy code again.
Admin
captcha: vulputate - who comes up with this kind of stuff ???
Admin
That only applies for static locals, the example from the story wasn't static. Non-static locals will always be placed on the stack and never in the bss, if they were placed in the bss, it would break recursion.
Admin
I'm guessing that a tool he was using to locate the memory leak complained about the uninitialized local variables.
Admin
Global variables can go on the BSS, not stack allocated ones. The actual string will be in .data (or .strings) if it's initialized with either method.
Admin
No, that actually makes sense; when you are tracking leaks you want to account for all allocations, assignments and deallocations, and putting in automatic assignments makes the code that much easier to read and focus on (not everywhere of course, but on the methods that you are studying sure).
Still can't find the WTF, and I'm going on 33 years of coding experience. Either I'll be humbled or... there's none.
Admin
Actually, data_string IS a char*, pointing to data_string[0]. So, he alloated a 15 byte array and gave it the adress of... a 15 byte long static string? Weird...
Admin
Regarding those who just said "Roll it back, no biggie"...
Consider if Winston checked in 20% of the files each day during the two weeks.
Each morning every other developer "refreshed", made (functional) changes, and checked in their code.
If you rollback to before Winston touched it, you also lose all of the other developers work!!!
(Now if Winston had worked on a BRANCH, and then did an atomic merge.......)
Admin
There are a lot of people trying to defend Winston and trying to justify his code changes but I don't think those people quite get it. This was a small team with very limited process. Whilst a lot of people are ready to claim that that is the real WTF, the truth is that this is just reality for a lot of shops. And let's be perfectly honest, the "Process" is what you use to stop the lesser skilled coders from screwing up your codebase. In a small team of highly skilled professionals you truly can do away with the rigidity of the process and actually get things done quicker that way. But it takes discipline, experience and good team skills to do this. Winston rolled into this shop with only experience - no discipline and no ability to work in a close-knit team.
Forget the actual nature of the code changes and forget the fact that this shop didn't have a rigid process. Would any of the professionals who read this site really act like Winston did in this scenario? Does anyone seriously condone this behaviour? The WTF here is the attitude of the likes of Winston, plain and simple.
Admin
Exactly!
When there are promotions to be made, who the f*** thinks it is better to promote the new guy over all the folks who have (presumably) demonstrated a consistent service to the company?
Admin
In a past life/job, whenever I worked on a source file, I removed the zillions of lines of commented out code. Nobody seemed to miss it or even notice it was gone. Partly because nobody else knew how to run diff against past versions.
Admin
I never liked atomic merges. When they blow up there's fallout all over the place, and the whole project becomes unlivable. Ban the atomic merge, stick to conventional merges!
Admin
Under the circumstances, I'd promote the new bozo rather than one of the old bozos.
And if you're going to completely ignore that big red three-alarm bell that goes off when the berk you're interviewing says "I seek out and destroy poorly written code," then it's clearly a management WTF. When will these cretins ever learn? Obviously, they should have promoted him to a job where his access to the repository is read-only immediately. Why waste the intervening two weeks?
(And I'm just as puzzled about this weird code-bloat, and the unsupported claims of stack usage, as everyone else here. I sense over-anonymization. The proposition that an "Enterprise" application is written in C rather than Java is a bit of a give-away.)
Admin
IIUC, that only applies to globals and statics. The variables here are plain old local variables. If this function was called recursively, there might not be enough space in the .bss to contain all the copies of the string - but there should be on the stack.
Admin
For the record, the change in the example is pretty close to the right way to do it. I have never seen anyone strcpy into on stack arrays in that manner. The correct way looks more like char string[] = "data data data"; Maybe that's the WTF, but I doubt it. If I was tracking down a tricky memory leak, I'd change all the fishy data initialisation code as well.
Also, source control tools like CVS let you compare line by line, so touching all the files is not an issue.
The WTF appears to be with the submitter.
Admin
Admin
Brilliant! The first approach we all should take when tackling a difficult problem is a shotgun approach. That way you can just hope that it fixes the problem and don't have to actually identify and prove that root cause has been corrected. It's too much work to actually identify the root cause, write a unit test, and prove that your specific change actually resolves the issue. If you change everything you must have addressed something??? Err, wait...
Admin
This article is awesome. Clearly they read my last post about how one sided these posts always are so they made up one from the other side.
I'll bet if you look hard you'll find the exact opposite of this post... told from the new guys point of view were everyone that works there is an idiot and they do everything wrong and they slacker old guys actually make fun of him for being diligent and fixing their crappy code.
Not to mention the "programmer" that wrote this article doesn't know c++ from his ass.
Admin
The REAL WTF is geeks who think they're funny!
Admin
I don't know what the two CD problem is. Obviously if a piece of software comes on two CDs instead of one, it must be twice as good! The next release had better be on four CDs.
Admin
Let there be two files...
t1.c
t2.c
If we do:
We obtain:
Over 100Kb more for a single variable.
So, yeah, that could quite well mean an additional CD, or ten.
Admin
If this is true then his changes should have made the compiled versions smaller, not larger. He got rid of the uninitialized variables.
Admin
BSS is for global variables, not local (stack-allocated) variables.
Admin
Shame he didn't work on a brance.
Admin
You know, I just tried that (but with 1000 character buffers) and the object sizes were exactly the same. Granted, gcc optimizes for me a bit, but do you think that maybe your import of string.h brought a little too much with it? I'd suggest trying adding a trivial string.h function to both samples to even it out. If that doesn't work, I'd be interested in the output of objdmp.
Admin
Actually I think the bug may have been in Visual SourceSafe. The way it worked back then (and probably today as well) is it scrambles the source file lines and remembers how to put it back together. Think about a strip paper shredder and you put your source code through it so each line is on a different strip. Mix them up using a known algorithm, save to the server and when you get it back you reverse the process. Sadly sometimes VSS forgot how to put it back together.
Oh and the f****** word is not needed to make your point.
Admin
Umm, to those defending Winston, may I remind you that we are told he changed hundreds of files just before a major release. Let's grant for the sake of argument that his change is better style. But you're making another key assumption here: That Winston changed all these hundreds of files, and did not make one single mistake anywhere. Suppose, just suppose, that in all those hundreds of files that he changed, somewhere along the line the old code said, say ...
int x; x=24;
And Winston changed this to:
int x=42;
Surely if you're making many, many changes, some little typo like that is almost inevitable sooner or later. And if you're making many, many changes for stylistic reasons, the odds are that you don't even know what the function of the modules you're changing is. After a change fest like that, one would have to test EVERY function of the system to make sure that nothing had been accidentally broken. I sincerely doubt that Winston did that.
Years ago I had a programmer working for me who would do stuff like this regularly. One time he made major changes just before a release, and it turned out that he had made some tiny typo in one place that introduced a major bug. When I called him on it, he angrily replied, "I only had a few hours to test it before the new release went out! With the amount of changes I made, you couldn't expect me to find a bug like that in only a few hours of testing!" To which, of course, I replied, "That's exactly the point. That's why you should not make wholesale changes that there is no time to test."
Admin
I worked on a brance once, burlap chafes me so. :(
Admin
The difference is the 100000 position array. With the strcpy only the first elements of string size + null terminator change. With the other example ALL elements of the array are initialized, and the ones that aren't initialized by the string are set to 0. Cut the size of the array by 50K and the binary size will probably follow.
It seems like GCC is creating a block of data in the program image for initilization and just copying it directly into memory.
Admin
A lot depends upon your choice of platform, compiler, and compiler flags. You aren't using flags, which isn't very enterprisey (or indeed typical). Checking your results with msys and gcc 3.4.2, they seem very plausible.
However, checking the stripped executables with the magic of emacs, there seem to be an awful lot of nulls hanging around at the end of t1.buf. That might just have something to do with the fact that the compiler "pre-fills" the end of the buffer in t1 with nulls, whereas in t2 it uses a frame or something (god knows; I'm not a compiler freak) to reposition this absurdly large buffer at runtime -- on a simplistic view, moving it physically "out" of the stack, even though it's logically "inside" the stack.
Might just explain the "Over 100Kb more for a single variable," what with buf being declared as an array of 100Kb bytes 'n'all.
Somewhere between this contrived example and the OP reality lies ... TDWTF obfuscation. (Thanks, Mr Eliot.) In the OP, buffers are declared as an array of exactly the right size, which makes this example moot.
I have no doubt that there's some sort of silliness of the variety you postulate going on, but without having a good deal more detail, it really isn't worth even thinking about.
Admin
Especially when multiplied by "thousands of source files" and "every executable and library."
On the other hand...
In other words, if you use the example code in the article, the code actually gets smaller when it's modified that way...which probably just means that the code looks more like JarFil's and not so much like the original article.
Admin
There is actually subtle difference between those two initializations:
For this case: a) int main (int argc, char **argv) { int x; char data_string[15]; ... x = 2; strcpy(data_string,"data data data"); ... } b) int main (int argc, char **argv) { int x = 2; char data_string[15] = "data data data"; ... }
b) is superior! one gets compiler check for the size, always initialized stuff and also clearer code (who likes to read bunch of strcpys??)
For a) int main (int argc, char **argv) { int x; char data_string[100000]; ... x = 2; strcpy(data_string,"data data data"); ... } b) int main (int argc, char **argv) { int x = 2; char data_string[100000] = "data data data"; ... } Now well we have different situation! Let's ignore it would be entirely stupid to allocate such a huge array on stack, but let's inspect what really happens... Because compiler is obliged to zero initialize rest of array in case b) it calls memset OR actually uses prepared content for this variable. This can be usually triggered when compiler is instructed to optimize for speed. Ups - 100kilos larger (and compared to "a" slower) code. So in this case a) is clearly superior.
Admin
Admin
I think we're back to my original point that, what with TDWTF obfuscation (and, to put it kindly, minimal understanding of C), there's nothing much technical to discuss here.
If we're missing 95% of the original technical details, and the other 5% are obscured by this bizarre inability to detect "what's on the heap" and "what's not on the heap," then we're back to the default position. Which is that absolutely everybody concerned, including Chris, isn't quite as good as they think they are.
To put it mildly.
Admin
Man, it must suck for the poor fool in your shop who has to update the copyright notices (or any other proprietary rights notices, e.g. due to change of corporate ownership) on your source files. Do you hire a temp to do that, or do you use this policy to get rid of deadweight every year?
;)