Comment On Recursive Whitespace Removal

C is a double edged sword. On one hand, it's simple and powerful enough that, given enough effort, you can accomplish just about anything you want. However, this power is limited insomuch that you don't have many of the friendly helper-functions that exist in higher level languages, such as string manipulation for example, unless you go ahead and create them yourself. [expand full text]
« PrevPage 1 | Page 2 | Page 3 | Page 4Next »

Re: Recursive Whitespace Removal

2012-12-19 09:45 • by @Deprecated
397568 in reply to 397561
Steve The Cynic:
Xarthaneon the Unclear:
Personally, I'd just do something like this:

char* trim_right(char* str)

{
for(int i = strlen(str) - 1; i >= 0; i--)
if(str[i] == ' ')
str[i] = '\0';

return str;
}


I realize it's not as graceful - or concise in terms of lines of code - as the given anti-example, but I can guarantee it will be faster, and those spaces will be replaced with a NUL like the original code intended.

CAPTCHA: causa - the original might causa problem!

It's not as graceful, and it's also wrong. strlen() does not return int, and is not obliged to return something the same size as int (someone cited 64-bit systems with 32-bit ints as an example). And even if strlen returns an int-sized unsigned value, you have a problem if that value is > INT_MAX+1. This happens at just 32769 characters on a 16-bit system...

C++ is often cited as taking your whole leg when it does succeed in shooting you in the foot. C is quite capable of that, too, and lacks many of the elaborate safeties in C++.


Also, for the input "What The F \0", the original code will return "What The F\0\0" whereas this code will return "What\0The\0F\0\0", or just "What" if you print it out or something.

Re: Recursive Whitespace Removal

2012-12-19 09:47 • by jarvis (unregistered)
397569 in reply to 397536
chernobyl:
Well, how much whitespace characters do you expect? One? Two? A billion? The programmer has probably taken into account the kind of data fed into the function and has decided that this approach would work.


And in a nutshell that's why we have functioning exploits in the wild!

Re: Recursive Whitespace Removal

2012-12-19 09:53 • by gallier2 (unregistered)
397570 in reply to 397565
You just made it worse. Now the compiler hasn't even the chance to give you the warning of the downcast.
The correction to your code is to declare your i as a size_t. That's what it was invented for. It's in the standard, even built-in language operators like sizeof use that type (that is imho one of the few wtf of the C language, an operator returning a type, which has its definition in an external header file "stddef.h").

As for systems where sizeof (size_t) > sizeof (int) you can count all 64 bit systems (Solaris, Linux, *BSD and Windows).
Microcontrollers with Harvard architecture also often have int in the size of the RAM address size (16 bits) and size_t in the size of the ROM address size (24 bits)

Re: Recursive Whitespace Removal

2012-12-19 09:55 • by Dee (unregistered)
Apart from the buffer underwrite described above, I think tail recursion optimization would make this an acceptable solution.

Re: Recursive Whitespace Removal

2012-12-19 09:57 • by PleegWat (unregistered)
[code]
void trim_right( char * str )
{
char * p = NULL;
while( *str )
{
if( p && !isspace(*str) )
p = NULL;
if( !p && isspace(*str) )
p = str;
}

if( p ) *p = '\0';
}

Re: Recursive Whitespace Removal

2012-12-19 09:59 • by RichP
397573 in reply to 397520
Not That Paul:
C: All the power of Assembly with most of the flexibility.


and twice the readability!

Re: Recursive Whitespace Removal

2012-12-19 10:04 • by CoderHero (unregistered)
397574 in reply to 397536
chernobyl:
Well, how much whitespace characters do you expect? One? Two? A billion? The programmer has probably taken into account the kind of data fed into the function and has decided that this approach would work. The function is a bit unusual and sloppy but I would not call it a WTF.


It would be problematic if this were coming from a columnar data feed where the data has 300 columns, and the content is the word "hi".
But then again, as long as the stack is big enough, no problem right?

Re: Recursive Whitespace Removal

2012-12-19 10:04 • by Tim Ward (unregistered)
397575 in reply to 397516
Maybe they also learned about compilers that do tail recursion optimisation.

But I would agree that unless you are knowingly coding for a specific compiler for a good reason you shouldn't rely on this, and that this function isn't a good example of why you might want to do so.

Re: Recursive Whitespace Removal

2012-12-19 10:06 • by eVil (unregistered)
I'd just like to point out that tail recursion is fine for Haskell programmers for a number of reasons:

1) The entire point of Haskell, as a functional language, is to allow recursion and other similarly functional constructs.

2) Haskell will handle long chains of recursion gracefully, without spitting stack overflows at you.

3) They chose Haskell, because they're not hugely worried about performance.


None of these are appropriate reasons to use tail recursion in C++ however.

Re: Recursive Whitespace Removal

2012-12-19 10:06 • by Rick
397577 in reply to 397554
Mordred:
C does guarantee order of evaluation.

a = ++b + func(b); // evaluation order is specified
a = func(++b, b); // evaluation order is NOT specified

Re: Recursive Whitespace Removal

2012-12-19 10:14 • by gallier2 (unregistered)
397578 in reply to 397572
PleegWat:

void trim_right( char * str )
{
char * p = NULL;
while( *str )
{
if( p && !isspace(*str) )
p = NULL;
if( !p && isspace(*str) )
p = str;
}

if( p ) *p = '\0';
}



You forgot a -- somewhere and I don't think that overwriting p with NULL after you found your first non blank is a good idea either. Your code to write 0 would then not trigger.

But in a code review, I would reject it because it is too complicated for what it does. The multiple writes when blanking a lot of trailing spaces will hit the "write coalescing buffers and caches" that every modern CPU has nowaday. Only in the extreme case were writes were order of magnitudes more costly than reads would I consider such an approach.

Re: Recursive Whitespace Removal

2012-12-19 10:21 • by Steve The Cynic
397579 in reply to 397565
Xarthaneon the Unclear:
But, while I agree my rewrite is not as graceful, other than the strlen issue, fixed as such:

char* trim_right(char* str)

{
for(int i = ((int)strlen(str)) - 1; i >= 0; i--)
if(str[i] == ' ')
str[i] = '\0';

return str;
}


how is it wrong?

It's wrong for three main reasons:

1. if strlen() returns a wider type than int (e.g. 64-bit size_t, 32-bit int), and the string's length exceeds the width of int (e.g. a 65KB string on 16-bit int), then you only look at the remainder when dividing by 2<<bitsof(int) (e.g. 1K for a 65K string on 16-bit).

2. Regardless of the width of strlen() and int, if the string is INT_MAX+2 or more (but less than UINT_MAX+1) characters, you will not modify the string at all because i will be immediately negative.

3. As pointed out above by someone else, your function stomps on all spaces, even the ones with non-spaces to the right, thus turning your function into an inefficient version of truncateToFirstWord().

Re: Recursive Whitespace Removal

2012-12-19 10:28 • by so what (unregistered)
Not sure I see the wtf. This works, and (knowing the financial world fairly well) will be called at most 2-3 times in recursion. Much better than iterating over the entire string to find the first space in the last group of spaces.

At least he didn't do a mem copy every time or something retarded like that.

Re: Recursive Whitespace Removal

2012-12-19 10:34 • by Steve The Cynic
397582 in reply to 397577
Rick:
Mordred:
C does guarantee order of evaluation.

a = ++b + func(b); // evaluation order is specified
a = func(++b, b); // evaluation order is NOT specified

Actually, in the first case, you are wrong. The compiler is allowed to execute func(b) before or after ++b, depending on what sort of capricious mood it is in.

The worst "multiple interpretations of undefined code" I ever saw was this:

unsigned char *bp = /* stuff */;

unsigned char table[16] = { /* values */ };

loop N times
{
*bp++ = (table[*bp & 0x0f] << 4) | table[*bp >> 4];
}


All the compilers we had in the company, for various architectures and operating systems, produced the superficially expected result (look up the values, shift and combine them, store the result, increment the pointer).

All the compilers, except one, that is.

The x86-hosted to-MIPS cross-compiler I was using to bring this code to the controller CPU of an ATM switch (Asynchronous Transfer Mode, not Automated Teller Machine) evaluated bp++ completely (and stashed the before-increment value) before evaluating anything else, then evaluated the right-hand side using the incremented value of bp, then stored the result using the stashed before-increment value.

The contents of the table were designed so that the calculation reversed the bit-order of the bytes, so if it was working correctly and you did it twice to the same memory, there was no net result.

This stuff all added up to something that looked like a symptom of something else, and I was expecting to have that other problem, so it took me three days to track down what was happening.

Fixed:
unsigned char *bp = /* stuff */;

unsigned char table[16] = { /* values */ };

loop N times
{
*bp = (table[*bp & 0x0f] << 4) | table[*bp >> 4];
bp++;
}


Undefined behaviour: Just say, "No."

Re: Recursive Whitespace Removal

2012-12-19 10:51 • by Mike (unregistered)
This comments section looks like it has mostly been written by a bunch of non programming former programmer trolls all pointing and laughing while desperately trying to be offended by the slightest little edge case scenario.

But the fact remains that the world isn't ending and the code is executing with good performance in real time. A minor bug that has probably never manifested has been noticed and should be fixed.

As for recursion; anyone who thinks pushing to a stack is a slow and inefficient operation... now there is the real wtf.

Re: Recursive Whitespace Removal

2012-12-19 11:02 • by Coyne
397585 in reply to 397529
Garrison Fiord:
Even though I was taught recursion in almost every programming class I took, I've found that it's almost always a bad idea. In fact, like the do-while loop, I think I've only found 1 time in my career that recursion was more elegant than a simple loop.


For most RW solutions, yes, because most of them are pretty simple and really don't justify recursion. It's just that recursion is such an amusing hammer.

Factorial, for instance, is often shown in recursive form but is faster, uses less memory, and isn't really more difficult to understand in simple loop form.

But there are RW algorithms where recursion is very useful; one that comes to mind right off is quicksort.

And, of course, there is the inimitable "Towers of Hanoi". A toy problem, yes, but if you want a "head hurts" experience try implementing that without recursion sometime.

Re: Recursive Whitespace Removal

2012-12-19 11:18 • by Anonymous Bob (unregistered)
397586 in reply to 397519
Your Name *:
So what?

Sure, could be more robust, but it is working as intended. Any decent modern C compiler will eliminate the tail recursion anyway.

IMHO this is far from TDWTF material.


Besides the mentioned bugs, they've made a O(n) algorithm into an O(n^2) algorithm.

Also, if you ported this to a small embedded computer with limited resources, you could easily run into a stack overflow. This type of thing happened to me years ago when I ported some 32-bit Unix code to 16-bit Windows.

Re: Recursive Whitespace Removal

2012-12-19 11:24 • by Rnd( (unregistered)
397587 in reply to 397586
Anonymous Bob:
Your Name *:
So what?

Sure, could be more robust, but it is working as intended. Any decent modern C compiler will eliminate the tail recursion anyway.

IMHO this is far from TDWTF material.


Besides the mentioned bugs, they've made a O(n) algorithm into an O(n^2) algorithm.

Also, if you ported this to a small embedded computer with limited resources, you could easily run into a stack overflow. This type of thing happened to me years ago when I ported some 32-bit Unix code to 16-bit Windows.


One without file system?

Re: Recursive Whitespace Removal

2012-12-19 11:28 • by OldCoder (unregistered)
397588 in reply to 397546
Steve The Cynic:

The bug is actually worse than that. Any call where the string does not contain non-blanks (e.g. the empty string, or any number of blanks), and the byte immediately before the string is a blank, will destroy that blank. If the byte immediately before the string does not exist (not mapped, etc.) it will crash when checking to see if the character is blank. If that byte is marked read-only, the write will crash. If the system is a DeathStation 9000, several million people will die in a nuclear fireball.

Ah! Now I know what Jeff Goldblum did to the alien mothership in Independence Day!

He changed their trim_right routine!

Re: Recursive Whitespace Removal

2012-12-19 11:29 • by gallier2 (unregistered)
397589 in reply to 397583
Wrong on the bunch of non programming former programmer trolls, I'm a still C programming programmer troll.

As for the code, first a little context. I work on a project for a big organisation doing translation work, we have a huge system with a lot of components written in several languages (Java, perl, VBA, python, C++) but the core code of the backend was written in plain C. Mostly C90, now modernized in C99 were possible. The codebase was written around 1997 and put in production around 2000. The system runs since then rather successfully.

Now to my example. Last september we got one document that didn't crash in production but returned an erroneous output file. I tracked the error and it was in the part of the code a colleague wrote several years ago (around 2004). It was exactly the same bug as in the WTF of today. The use of an int instead of a size_t in return of strlen, a loop condition that made the program overwrite str[len-1] with len==0. The result was in some extreme conditions, the output file was a little bit garbled. Had my colleague used the size_t, the writing of str[len-1] would have resulted in a "segmentation fault" directly (because len-1 would have yielded ULONG_MAX) and we would have found the bug in 2004 and not delivered crappy files for 8 years.
(to counter directly the naysayers who will say that our quality control failed bla bla bla, an erroneous file wouldn't be noticed by the end user, it would be comparable to a missing blank in a book).


Re: Recursive Whitespace Removal

2012-12-19 11:34 • by Garrison Fiord (unregistered)
397590 in reply to 397585
Coyne:
Garrison Fiord:
Even though I was taught recursion in almost every programming class I took, I've found that it's almost always a bad idea. In fact, like the do-while loop, I think I've only found 1 time in my career that recursion was more elegant than a simple loop.


For most RW solutions, yes, because most of them are pretty simple and really don't justify recursion. It's just that recursion is such an amusing hammer.

Factorial, for instance, is often shown in recursive form but is faster, uses less memory, and isn't really more difficult to understand in simple loop form.

But there are RW algorithms where recursion is very useful; one that comes to mind right off is quicksort.

And, of course, there is the inimitable "Towers of Hanoi". A toy problem, yes, but if you want a "head hurts" experience try implementing that without recursion sometime.

It appears you overlooked the word "almost".

Re: Recursive Whitespace Removal

2012-12-19 11:35 • by Captain Oblivious (unregistered)
397591 in reply to 397525
Lisa:
Doesn't pretty much all of lisp work this way?


Yes, but Lisp and other functional languages do things like "tail call elimination", to turn a recursive scheme into a loop for the machine to run in constant space.

The Real WTF is that C compilers aren't guaranteed to do tail call elimination. Recursion is too good (expressive) a scheme to ignore.

Re: Recursive Whitespace Removal

2012-12-19 11:48 • by Edmund (unregistered)
397594 in reply to 397529
Recursion can be more elegant if the domain you're working in has objects defined in a "recursive" way, or at least a way which lends itself to recursive operations.

Re: Recursive Whitespace Removal

2012-12-19 11:48 • by Moe (unregistered)
I wouldn't trust code written by the programmer in ANY language.

Back, in the .com boom, I was tasked with giving technical interviews to prospective Java programmers. After a half dozen Java questions to judge their skill level, I put in some C questions.

The guys who were "thanked for their time" usually asked a question like this:

"That's C! Why are you asking that in a Java interview!?"

"If you can't write good C, how the hell can you write good Java?"

The ones who nailed the interview didn't need to ask the question -- they knew why. Three weeks later, I got a memo from the exec in charge of software development observing the technical quality of the hew hires.

The bad news is that the company was sucked down the toilet in the .com crash -- but my experience is far from unique in that respect....

Re: Recursive Whitespace Removal

2012-12-19 11:51 • by vahokif (unregistered)
It's tail recursive. gcc -O2 compiles it into a loop.

Re: Recursive Whitespace Removal

2012-12-19 11:54 • by Steve The Cynic
397597 in reply to 397585
Coyne:
And, of course, there is the inimitable "Towers of Hanoi". A toy problem, yes, but if you want a "head hurts" experience try implementing that without recursion sometime.

Ah, yes, Arkey-malarkey... No, it's not even hard to do without recursion.

At any point, there are at most two legal moves. It is always legal to move the smallest disk, because no disk is smaller (and therefore blocking). The other two exposed disks are different sizes, by definition, and you can move the smaller of them onto the larger.

The non-recursive solution is, therefore:

Every other move moves the smallest disk. All the other moves make the other legal move.

The only question, then, is which way to move the smallest disk.

If the goal is to use peg B to move the stack from peg A to peg C, then:
1. If the number of disks is odd, the smallest disk moves A->C->B->A
2. If the number of disks is even, the smallest disk moves A->B->C->A

Example: Four disks (15 moves)

A: 4321 B: C: <== move 1 to B only legal move at this point
A: 432 B: 1 C: <== move 2 to C just moved 1, so make a non-1 move
A: 43 B: 1 C: 2 <== move 1 to C just moved non-1, so make a 1 move
A: 43 B: C: 21 <== move 3 to B just moved 1, so make a non-1 move
A: 4 B: 3 C: 21 <== move 1 to A just moved non-1, so make a 1 move
A: 41 B: 3 C: 2 <== move 2 to B just moved 1, so make a non-1 move
A: 41 B: 32 C: <== move 1 to B just moved non-1, so make a 1 move
A: 4 B: 321 C: <== move 4 to B just moved 1, so make a non-1 move
A: B: 321 C: 4 <== move 1 to C just moved non-1, so make a 1 move
A: B: 32 C: 41 <== move 2 to A just moved 1, so make a non-1 move
A: 2 B: 3 C: 41 <== move 1 to A just moved non-1, so make a 1 move
A: 21 B: 3 C: 4 <== move 3 to C just moved 1, so make a non-1 move
A: 21 B: C: 43 <== move 1 to B just moved non-1, so make a 1 move
A: 2 B: 1 C: 43 <== move 2 to C just moved 1, so make a non-1 move
A: B: 1 C: 432 <== move 1 to C the only legal move now
A: B: C: 4321 <== done

Re: Recursive Whitespace Removal

2012-12-19 11:59 • by silent bob (unregistered)
397599 in reply to 397552
Frankly, my cat loves the dogs breakfast when the it comes up. He really meows in pleasure.

Re: Recursive Whitespace Removal

2012-12-19 12:14 • by Martijn (unregistered)
397601 in reply to 397518
I see no real problem with that. According to http://llvm.org/demo/index.cgi it produces

.file "/tmp/webcompile/_17418_0.bc"
.text
.globl strlen1
.align 16, 0x90
.type strlen1,@function
strlen1: # @strlen1
.Ltmp0:
.cfi_startproc
# BB#0:
xorl %eax, %eax
cmpb $0, (%rdi)
je .LBB0_3
# BB#1: # %tailrecurse.preheader
xorl %eax, %eax
.align 16, 0x90
.LBB0_2: # %tailrecurse
# =>This Inner Loop Header: Depth=1
cmpb $0, 1(%rdi,%rax)
leaq 1(%rax), %rax
jne .LBB0_2
.LBB0_3: # %tailrecurse._crit_edge
# kill: EAX<def> EAX<kill> RAX<kill>
ret
.Ltmp1:
.size strlen1, .Ltmp1-strlen1
.Ltmp2:
.cfi_endproc
.Leh_func_end0:


.section ".note.GNU-stack","",@progbits

vs

.file "/tmp/webcompile/_19359_0.bc"
.text
.globl strlen1
.align 16, 0x90
.type strlen1,@function
strlen1: # @strlen1
.Ltmp0:
.cfi_startproc
# BB#0:
xorl %eax, %eax
movl 4(%esp), %ecx
cmpb $0, (%ecx)
je .LBB0_2
.align 16, 0x90
.LBB0_1: # %.lr.ph
# =>This Inner Loop Header: Depth=1
cmpb $0, 1(%ecx,%eax)
leal 1(%eax), %eax
jne .LBB0_1
.LBB0_2: # %._crit_edge
ret
.Ltmp1:
.size strlen1, .Ltmp1-strlen1
.Ltmp2:
.cfi_endproc
.Leh_func_end0:


.section ".note.GNU-stack","",@progbits

generated by

unsigned int strlen1(char* str)
{
unsigned int res = 0;
while (str[res] != '\0') res++;
return res;
}

Re: Recursive Whitespace Removal

2012-12-19 12:22 • by Captain Oblivious (unregistered)
397602 in reply to 397576
eVil:
I'd just like to point out that tail recursion is fine for Haskell programmers for a number of reasons:

1) The entire point of Haskell, as a functional language, is to allow recursion and other similarly functional constructs.

2) Haskell will handle long chains of recursion gracefully, without spitting stack overflows at you.

3) They chose Haskell, because they're not hugely worried about performance.


None of these are appropriate reasons to use tail recursion in C++ however.


I don't disagree with you exactly, but as a statically typed, compiled language, Haskell is several times faster than dynamic scripting languages, and reaches performance comparable to tuned C.

I know that specific domains require high performance. But Haskell is finding a lot of use in "high performance" industries like finance specifically because correctness is easy to program, performance is good, and the tight inner loops can be written in C when it isn't good enough.

GHC is getting very good at performance, even without going to the FFI.

That said, stay away from Haskell. You guys don't want your minds blown. You want to sit in your cubicle and treat your work like an assembly line. Your bosses want you to type out buggy code which will be fixed "later" (i.e., if it ever becomes a problem customers care about).

http://lukeplant.me.uk/blog/posts/why-learning-haskell-python-makes-you-a-worse-programmer/
http://stackoverflow.com/questions/2704652/monad-in-plain-english-for-the-oop-programmer-with-no-fp-background/13656209#13656209

Re: Recursive Whitespace Removal

2012-12-19 12:27 • by Rick
397603 in reply to 397582
Steve The Cynic:
Rick:
Mordred:
C does guarantee order of evaluation.

a = ++b + func(b); // evaluation order is specified
a = func(++b, b); // evaluation order is NOT specified

Actually, in the first case, you are wrong. The compiler is allowed to execute func(b) before or after ++b, depending on what sort of capricious mood it is in.

The worst "multiple interpretations of undefined code" I ever saw was this:

unsigned char *bp = /* stuff */;

unsigned char table[16] = { /* values */ };

loop N times
{
*bp++ = (table[*bp & 0x0f] << 4) | table[*bp >> 4];
}


All the compilers we had in the company, for various architectures and operating systems, produced the superficially expected result (look up the values, shift and combine them, store the result, increment the pointer).

All the compilers, except one, that is.

The x86-hosted to-MIPS cross-compiler I was using to bring this code to the controller CPU of an ATM switch (Asynchronous Transfer Mode, not Automated Teller Machine) evaluated bp++ completely (and stashed the before-increment value) before evaluating anything else, then evaluated the right-hand side using the incremented value of bp, then stored the result using the stashed before-increment value.

The contents of the table were designed so that the calculation reversed the bit-order of the bytes, so if it was working correctly and you did it twice to the same memory, there was no net result.

This stuff all added up to something that looked like a symptom of something else, and I was expecting to have that other problem, so it took me three days to track down what was happening.

Fixed:
unsigned char *bp = /* stuff */;

unsigned char table[16] = { /* values */ };

loop N times
{
*bp = (table[*bp & 0x0f] << 4) | table[*bp >> 4];
bp++;
}


Undefined behaviour: Just say, "No."

The order is definitely defined. MIPS simply got it wrong. (Wow, haven't heard that name in a while.)

Re: Recursive Whitespace Removal

2012-12-19 12:29 • by Mr.Bob (unregistered)
All these comments about recursion is evil, he's calling strlen() twice, blah blah blah, and no one is pointing out the more serious problems like the lack of checking for a null pointer on entry or how passing an empty string causes the routine to write to invalid memory?

Jeez, it's like griping about how they used Comic Sans to spell "Titanic" on the side of your ship...

Re: Recursive Whitespace Removal

2012-12-19 12:30 • by dnabre (unregistered)
397605 in reply to 397518
Hopefully they're using a standard version, which is more like


size_t strlen(const char *s) {
size_t len = 0;
while(s[len]!='\0') ++len;
return len;
}

or at least a tail-recursive version.

Re: Recursive Whitespace Removal

2012-12-19 12:31 • by towel (unregistered)
397606 in reply to 397516
Black Bart:
When you have a hammer in your hand, everything becomes a nail.


When you have a nail in your head, everything becomes a unicorn.

Re: Recursive Whitespace Removal

2012-12-19 12:34 • by dnabre (unregistered)
397607 in reply to 397575
Most C compilers nowadays will do tail/sibling recursion ok, but either need a high enough -O or a specific flag.

It's reasonable to rely on this, but you have to the right the flags to make sure it happens. Last I looked up it up (back around 2008 I think) the current gcc and microsoft C compiler supported it.

I would assume that LLVM handles it fine.

Re: Recursive Whitespace Removal

2012-12-19 12:37 • by gallier2 (unregistered)
397608 in reply to 397604
Mr.Bob:
All these comments about recursion is evil, he's calling strlen() twice, blah blah blah, and no one is pointing out the more serious problems like the lack of checking for a null pointer on entry or how passing an empty string causes the routine to write to invalid memory?

Jeez, it's like griping about how they used Comic Sans to spell "Titanic" on the side of your ship...


You should read the comments before posting. I mentioned the invalid memory overwrite in the second comment.

Re: Recursive Whitespace Removal

2012-12-19 12:40 • by gallier2 (unregistered)
397609 in reply to 397603
Rick:
Steve The Cynic:
Rick:
Mordred:
C does guarantee order of evaluation.

a = ++b + func(b); // evaluation order is specified
a = func(++b, b); // evaluation order is NOT specified

Actually, in the first case, you are wrong. The compiler is allowed to execute func(b) before or after ++b, depending on what sort of capricious mood it is in.

The worst "multiple interpretations of undefined code" I ever saw was this:

unsigned char *bp = /* stuff */;

unsigned char table[16] = { /* values */ };

loop N times
{
*bp++ = (table[*bp & 0x0f] << 4) | table[*bp >> 4];
}


All the compilers we had in the company, for various architectures and operating systems, produced the superficially expected result (look up the values, shift and combine them, store the result, increment the pointer).

All the compilers, except one, that is.

The x86-hosted to-MIPS cross-compiler I was using to bring this code to the controller CPU of an ATM switch (Asynchronous Transfer Mode, not Automated Teller Machine) evaluated bp++ completely (and stashed the before-increment value) before evaluating anything else, then evaluated the right-hand side using the incremented value of bp, then stored the result using the stashed before-increment value.

The contents of the table were designed so that the calculation reversed the bit-order of the bytes, so if it was working correctly and you did it twice to the same memory, there was no net result.

This stuff all added up to something that looked like a symptom of something else, and I was expecting to have that other problem, so it took me three days to track down what was happening.

Fixed:
unsigned char *bp = /* stuff */;

unsigned char table[16] = { /* values */ };

loop N times
{
*bp = (table[*bp & 0x0f] << 4) | table[*bp >> 4];
bp++;
}


Undefined behaviour: Just say, "No."

The order is definitely defined. MIPS simply got it wrong. (Wow, haven't heard that name in a while.)


No, the order in which the lvalue and the rvalue are evaluated are undefined. The only thing defined is that the assignment is done after these evaluations.

see http://stackoverflow.com/questions/4362501/any-good-reason-why-assignment-operator-isnt-a-sequence-point
for a detailed explanation.

Re: Recursive Whitespace Removal

2012-12-19 13:20 • by Meower68 (unregistered)
Written by somebody who normally writes Scheme code. It is, after all, tail recursive. Doesn't that result in an optimal loop? Most Scheme interpreters or compilers will replace the call with a goto.

Oh, wait. This isn't Scheme. This is C.

Re: Recursive Whitespace Removal

2012-12-19 13:27 • by /Wegge (unregistered)
397612 in reply to 397548
Techpaul:

Well I would love to be able to guarantee that

while(l && str[--l] == ' ')

Would always work as order of evaluation is not guaranteed, do decrement inside loop.


The order of that evaluation is very well defined, for any conforming C compiler.

Re: Recursive Whitespace Removal

2012-12-19 13:33 • by David T (unregistered)
397613 in reply to 397580
so what:
Not sure I see the wtf. This works, and (knowing the financial world fairly well) will be called at most 2-3 times in recursion. Much better than iterating over the entire string to find the first space in the last group of spaces.


It _does_ iterate over the entire string, twice for each recursive call, to find the length. With a single pointer to keep track of the last non-space found, it could be finished after the _first_ iteration over the string.

Re: Recursive Whitespace Removal

2012-12-19 13:39 • by Canuck (unregistered)
397614 in reply to 397564
Actually . . .

Once there was a programmer who had a problem. He decided to use recursion. Then he had two midget programmers each whom had a problem. They decided to use recursion...

Re: Recursive Whitespace Removal

2012-12-19 13:47 • by Captcha:quibus (unregistered)
Ha ha! He's using a programming pattern that his language wasn't designed for!

That's funny.

Re: Recursive Whitespace Removal

2012-12-19 13:54 • by Mr.Bob (unregistered)
397617 in reply to 397608
gallier2:
Mr.Bob:
All these comments about recursion is evil, he's calling strlen() twice, blah blah blah, and no one is pointing out the more serious problems like the lack of checking for a null pointer on entry or how passing an empty string causes the routine to write to invalid memory?

Jeez, it's like griping about how they used Comic Sans to spell "Titanic" on the side of your ship...


You should read the comments before posting. I mentioned the invalid memory overwrite in the second comment.


I do see it there; my apologies, good sir.

Re: Recursive Whitespace Removal

2012-12-19 14:01 • by Ilkka (unregistered)
397618 in reply to 397526
Also it doesn't trim strings that contain only a single space. Although that might be by... "design", to put it politely.

Re: Recursive Whitespace Removal

2012-12-19 14:07 • by AN AMAZING CODER (unregistered)
397619 in reply to 397552
A developer:
This is the sort of academic BS I had to do in university programming courses.
For some reason professors think recursion is the cats meow when in reality it's the dog's breakfast.


I have a buddy who's taking CS right now. All he's been talking about for the last 3 months is recursion recursion recursion. He said his professor wants him to do things this way.

I've been wondering WTF is up with that, so I'm glad to hear I'm not the only one.

I tried to explain to him that "recursion is dead simple, and just one way of solving something that you'll probably not use very often in practice". He responded with a question about recursion.

Re: Recursive Whitespace Removal

2012-12-19 14:16 • by TGV
397620 in reply to 397580
so what:
Not sure I see the wtf. This works, and (knowing the financial world fairly well) will be called at most 2-3 times in recursion. Much better than iterating over the entire string to find the first space in the last group of spaces.

No. The code has the potential to write in memory outside the string and calls strlen, which iterates the whole string, something you don't seem to understand, every iteration twice, except for the last one. Your programming license has been revoked. Please go do management.

Mike:
This comments section looks like it has mostly been written by a bunch of non programming former programmer trolls ... the code is executing with good performance in real time.

Another one that doesn't get it. Please hand in your programming license.

vahokif:
It's tail recursive. gcc -O2 compiles it into a loop.

Which, as pointed out, is a minor WTF in comparison to the other issues with the code.

Re: Recursive Whitespace Removal

2012-12-19 14:19 • by Devastated of Rochdale (unregistered)
Words cannot express the weird gut-wrench I felt when my eyes alighted on this code in my RSS feed. I think it may have given me norovirus.

I don't have anything constructive to add (sorry) because when I try to focus on the code my vision starts to blur, presumably in self-defense.

Suffice it to say the real WTF may be that this is the first code snippet ever to make me feel the need to comment on it, even if that comment is little more than a howl of anguish.

Re: Recursive Whitespace Removal

2012-12-19 14:35 • by Delve
397622 in reply to 397604
Mr.Bob:
All these comments about recursion is evil, he's calling strlen() twice, blah blah blah, and no one is pointing out the more serious problems like the lack of checking for a null pointer on entry or how passing an empty string causes the routine to write to invalid memory?

Jeez, it's like griping about how they used Comic Sans to spell "Titanic" on the side of your ship...


Except that if he had used a simple, straightforward loop it would have obviated the more serious problems he introduced. A string (or another array) is a loop-friendly data structure. Use the right tool for the job. What makes this TDWTF-worthy is the novel (hopefully) use of recursion as an anti-pattern. Everything else is just plain poor quality.

Re: Recursive Whitespace Removal

2012-12-19 14:47 • by ¯\(°_o)/¯ I DUNNO LOL (unregistered)
As long as we're playing Code Golf...

void trim_right(char *str)
{
  char *end = str;
  while (*str) {
    if (*str++) != ' ') {
      end = str;
    }
  }
  *end = 0;
}

Why worry about size_t?

Re: Recursive Whitespace Removal

2012-12-19 14:49 • by ronpaii (unregistered)
397624 in reply to 397558
If you are optimizing, why replace all the white space with NUL instead of only the 1st space character on the right? In C a string ends with the 1st NUL. String function don't check the whole buffer looking for another string.

Re: Recursive Whitespace Removal

2012-12-19 14:55 • by Evan (unregistered)
All you people yelling "tail recursion" don't understand what you're talking about.

The problem isn't the lack of iteration -- TCO solves that nicely. The problem is that every recursive call is going to do another strlen() call. TCO does nothing to eliminate that.
« PrevPage 1 | Page 2 | Page 3 | Page 4Next »

Add Comment