The Daily WTF: Curious Perversions in Information Technology

foxyshadis · 2005-03-07

I think you guys are forgetting one ingenious way of dealing with this.

if(true) // may have unpredictable results for unusually large values of true
{
return false;
}
else
{
return true;
}

Obviously conditionals are much slower than raw booleans and must be avoided for scalable code.

AndrewB · 2005-03-08

ftumph:
GWheeler wrote:
I'm surprised no one has commented on the silly pattern of

if condition {
valid = true;
}
else {
valid = false;
}

which obviously can be expressed more succinctly as

valid = condition;

You beat me to it. This is a WTF that I see ALL THE TIME.
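
For anyone who wants to convince themselves, here is a minimal C sketch of the two forms side by side. The function names `verbose_valid` and `direct_valid` are made up for illustration and appear nowhere in the article. Both return the same value for every input, and an optimizing compiler will usually (though not necessarily) generate the same code for both.

```c
#include <stdbool.h>

/* The long-winded pattern: branch on the condition, assign a literal. */
bool verbose_valid(bool condition)
{
    bool valid;
    if (condition) {
        valid = true;
    } else {
        valid = false;
    }
    return valid;
}

/* The succinct form: the condition already is the value we want. */
bool direct_valid(bool condition)
{
    bool valid = condition;
    return valid;
}
```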

AndrewB · 2005-03-08

Is that a performance hit of one instruction, or are both implementations compiled identically? Either way, it's a harmless WTF.

Also, I think that

[code language="c#"]
if (value = condition)
{
	// Do stuff
}
[/code]

is a very dangerous way of doing it; dangerous to the reader, of course. That can SO easily be misinterpreted as if (value == condition).
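
To make the difference concrete, here is a small sketch in C (not C#, where the assignment form only compiles when the assigned expression is a bool). The helper names `assign_then_test` and `compare_only` are hypothetical. With `value` and `condition` both false, the comparison takes the branch and the assignment does not.

```c
#include <stdbool.h>

/* Assignment inside the test: stores condition into *value, then
   branches on the stored value. The doubled parentheses are the usual
   way to signal that the assignment is intentional. */
bool assign_then_test(bool *value, bool condition)
{
    if ((*value = condition)) {
        return true;    /* the "Do stuff" branch */
    }
    return false;
}

/* Comparison: leaves *value untouched and branches on equality. */
bool compare_only(const bool *value, bool condition)
{
    if (*value == condition) {
        return true;    /* the "Do stuff" branch */
    }
    return false;
}
```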

icelava · 2005-03-08

GWheeler:
which obviously can be expressed more succinctly as
valid = condition;

The problem is, his condition is already unreadable.

2005-03-08

> Text -- Tekst

Maybe he is a Pole. [;)]

2005-03-08

AndrewVos:
err. some anonymous guy said: maybe it would be faster to trim first then tolower second, hmmmmm. don't both those functions scan chars anyway, u not talking sense son.

Ummm no. Actually, trim does something like this:

trim()
{

while(str[0] == SPACE || str[0] == '\t' || str[0] == '\r' || str[0] == '\n')
{
    str.remove (0);
}

char c = str[str.length() -1];
while (c == SPACE || c == '\t' || c == '\r' || c == '\n')
{
    str.remove (str.length() -1);
}

} // trim

As you can see, trim has to scan characters. This is typically inexpensive, since most of the time you only want to get rid of EOL characters or spaces at the beginning of a line, so you'll (typically) trim about 5 (maybe 10) characters. What is expensive is that every character has to be compared about 4 or 5 times (vertical tabs, etc.). That would be bad if done to a (very) large string.

LCase does something like this:
lcase()
{
    for each char (c) in string
    {
          c = c xor 64; // Note that there are no comparisons here.
    }
}

Comparisons are expensive because the processor does something called branch (if) prediction. If its prediction is wrong, it has to flush the whole instruction cache.

In conclusion, it might be faster to trim first, but only by a few clock cycles. You would not notice the difference, unless maybe you did 1,000,000,000 trims vs just as many LCases.

The scary thing about the other piece of code is that he uses a Variant to pass the data (by not declaring sTekst as a String), thus causing VB (6 and lower) to internally convert the string he passes in into a Variant for the call, and then convert it back to a string to pass it to the message box function.

P.S.
("Typically" means in a typical string-processing program. Don't reply with "but my program trims 500 spaces...".)

2005-03-08

:

> What an 'ingenious' way to spell Text -- Tekst

It is Dutch for text. I'd guess the author is Dutch or Flemish.

Or Afrikaans.

Jeff S · 2005-03-08

:

AndrewVos wrote:

err. some anonymous guy said: maybe it would be faster to trim first then tolower second, hmmmmm. don't both those functions scan chars anyway, u not talking sense son.

Ummm no. Actually, trim does something like this:

trim()
{

while(str[0] == SPACE || str[0] == '\t' || str[0] == '\r' || str[0] == '\n')
{
    str.remove (0);
}

char c = str[str.length() -1];
while (c == SPACE || c == '\t' || c == '\r' || c == '\n')
{
    str.remove (str.length() -1);
}

} // trim

As you can see, trim has to scan characters. This is typically inexpensive, since most of the time you only want to get rid of EOL characters or spaces at the beginning of a line, so you'll (typically) trim about 5 (maybe 10) characters. What is expensive is that every character has to be compared about 4 or 5 times (vertical tabs, etc.). That would be bad if done to a (very) large string.

LCase does something like this:
lcase()
{
    for each char (c) in string
    {
          c = c xor 64; // Note that there are no comparisons here.
    }
}

I sure hope you aren't the guy who implemented these functions! The first would be horribly inefficient (str.remove for each whitespace?) and the second doesn't even come close to working.

(Here are my "poster intention odds": WTF: 78%; Troll: 20%; Clever Ironic Sarcasm: 2%; Devil's Advocate: 0%)

JamesCurran · 2005-03-08

:

while(str[0] == SPACE || str[0] == '\t' || str[0] == '\r' || str[0] == '\n')
{
str.remove (0);
}

What is expensive is that every character has to be compared about 4 or 5 times (vertical tabs, etc.). That would be bad if done to a (very) large string.

Eeeek! That's bad code. You really don't understand what's going on.

It would be more like (in C/C++):

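int i = 0;
while (isspace((unsigned char)str[i]))
      ++i;
int j = (int)strlen(str) - 1;
while (j >= i && isspace((unsigned char)str[j]))   /* stop if the string is all whitespace */
      --j;

/* memmove rather than strncpy: source and destination overlap */
memmove(str, &str[i], j - i + 1);
str[j - i + 1] = '\0';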

isspace() would be a macro/inline function which would expand to:

(attribtab[str[i]] & WHITESPACE)

attribtab would have one entry for each character, each with flags describing that character so that the entry for 'A' would be something like ALPHA|UPPERCASE|PRINTABLE.

So, not four comparisons, but one lookup, one AND and one boolean comparison.

Oh, and we shift the characters just once, instead of once for each character removed.
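
Pulled together into a complete function, the scan-then-shift-once idea might look like the sketch below. It is only one way to write it: `trim_in_place` is a name invented here, it uses the standard `isspace` from `<ctype.h>` instead of a hand-rolled table, and it assumes `str` points to a writable, NUL-terminated buffer.

```c
#include <ctype.h>
#include <string.h>

/* Trim leading and trailing whitespace in place and return str.
   The kept characters are moved exactly once, however many
   whitespace characters are dropped. */
char *trim_in_place(char *str)
{
    size_t start = 0;
    while (isspace((unsigned char)str[start]))
        ++start;                        /* stops at the terminator at the latest */

    size_t end = strlen(str);           /* one past the last kept character */
    while (end > start && isspace((unsigned char)str[end - 1]))
        --end;

    size_t len = end - start;
    memmove(str, str + start, len);     /* memmove: the regions may overlap */
    str[len] = '\0';
    return str;
}
```

The `end > start` guard keeps the backward scan from running off the front of a string that is empty or all whitespace.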

Next up, there's LCase:

LCase does something like this:
lcase()
{
    for each char (c) in string
    {
          c = c xor 64; // Note that there are no comparisons here.
    }
}

This will only work if you can guarantee that the string contains ONLY uppercase characters. Actually, it won't work even then, because you want to XOR with 32, but we know what you meant. (Actually, we really want to OR with 32 to force everything to lowercase; XORing would flip upper and lower case.)

In reality LCase would be closer to:

lcase(string str)
{
    foreach (char c in str)
    {
          if (isupper(c))
                c = c or 32;
    }
}

So we really can't get away from a comparison.

isupper is much like isspace above.
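
A rough, runnable take on the table-driven version, for illustration only. Everything here is an assumption made for the sketch: the table layout, the flag values, the `init_attribtab` helper and the `IS_UPPER` macro are invented, and real C libraries build their tables differently. It also assumes ASCII, where 'A' to 'Z' are contiguous and upper and lower case differ only in bit 5.

```c
#include <stddef.h>

/* Hypothetical attribute flags, one bit per character class. */
enum { UPPERCASE = 1 << 0, WHITESPACE = 1 << 1 };

/* One entry per possible unsigned char value. */
static unsigned char attribtab[256];

/* Fill the table once at startup (ASCII assumed). */
void init_attribtab(void)
{
    const char *ws = " \t\r\n\v\f";
    for (size_t k = 0; ws[k] != '\0'; ++k)
        attribtab[(unsigned char)ws[k]] |= WHITESPACE;
    for (int c = 'A'; c <= 'Z'; ++c)
        attribtab[c] |= UPPERCASE;
}

/* One lookup and one AND, as described above. */
#define IS_UPPER(c) (attribtab[(unsigned char)(c)] & UPPERCASE)

/* Lowercase str in place; only uppercase letters are touched. */
void lcase(char *str)
{
    for (size_t k = 0; str[k] != '\0'; ++k) {
        if (IS_UPPER(str[k]))
            str[k] |= 32;   /* set bit 5: 'A' (0x41) becomes 'a' (0x61) */
    }
}
```

`init_attribtab()` has to be called once before the first `lcase()` call, otherwise the table is all zeros and nothing gets lowercased.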

2005-03-09

[:P][;)][;)][&][I][*][~][G][H][:O]

2005-03-09

'Tekst' happens to be Dutch for 'text'... [8-|]

2005-03-11

Jeff S is right; AndrewVos's xor with 64 (meaning hex 40?) would flip a bit, not lowercase anything.

JamesCurran · 2005-03-14

:
Jeff S is right; AndrewVos's xor with 64 (meaning hex 40?) would flip a bit, not lowercase anything.

Actually, that's not quite true. XORing with 32 (0x20) would convert an uppercase letter to lowercase (e.g. 'A' == 0x41, XOR 0x20 = 0x61 == 'a'). Unfortunately, it also makes lowercase letters uppercase, and just makes a mess out of non-alphabetic characters ('*' == 0x2A, XOR 0x20 = 0x0A == '\n').
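
The arithmetic is easy to check in a couple of lines of C. This is just a sketch under the assumption of ASCII, and the function names are made up for the occasion:

```c
/* Flip bit 5 (0x20): swaps the case of ASCII letters,
   and corrupts everything else. */
char xor_case_bit(char c)
{
    return (char)(c ^ 0x20);
}

/* Set bit 5 (0x20): lowercases ASCII uppercase letters and leaves
   lowercase letters alone, but still changes some non-letters
   (for example '@', 0x40, becomes '`', 0x60). */
char or_case_bit(char c)
{
    return (char)(c | 0x20);
}
```

So OR is the safer of the two, but neither can be applied blindly to arbitrary text without first checking that the character is an uppercase letter.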

2005-03-22

It's called Dutch, which truly is ingenious [;)]

2006-03-09

Anonymous:

What do you think about this then. Which is the right way to do it?

//1
if (condition)
{
 //do some stuff
 valid = true;
}
else
{
 valid = false;
}

//2
if (condition)
{
 //do some stuff
}
valid = condition;

//3
valid = condition;
if (valid)
{
 //do some stuff
}

/Erik

Definitely number 3.

It wins over number 2, because it only tests the condition once, and the condition might be expensive to test, for example, it might involve a complex algorithm or a database hit.

It wins over number 1 by being less complex and more compact.
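
A small C sketch of the "tests the condition once" point. The names are invented for illustration: `expensive_condition` stands in for whatever costly check the real code performs, and `check_calls` exists only to count how often it runs.

```c
#include <stdbool.h>

int check_calls = 0;    /* counts how often the condition is evaluated */

/* Stand-in for an expensive test (complex algorithm, database hit, ...). */
bool expensive_condition(int input)
{
    ++check_calls;
    return input > 0;
}

/* Number 2: the condition expression is evaluated twice. */
bool option_two(int input, int *work_done)
{
    if (expensive_condition(input)) {
        ++*work_done;   /* do some stuff */
    }
    bool valid = expensive_condition(input);
    return valid;
}

/* Number 3: evaluate once, then branch on the stored result. */
bool option_three(int input, int *work_done)
{
    bool valid = expensive_condition(input);
    if (valid) {
        ++*work_done;   /* do some stuff */
    }
    return valid;
}
```

If the condition is just a variable, the two come out the same; the difference only shows up when evaluating it costs something or has side effects.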

PPP · 2006-05-12

Alex Papadimoulis:
You know, because you never know if Trim() will actually trim both upper- and lower-case whitespace.

This coder needs to be told that spaces have been lower-case from the beginning.

2006-05-12

All this talk about whitespace reminds me of the purposefully WTF language called Whitespace.

2008-07-02

:
Ah, crap, I totally have to check all of my code... I've never accounted for lowercase spaces before.

Some of the first programming* I did was in a computer music patching thing called Max/MSP. At one point, I had an absolutely vicious bug which invalidated my patch; some bits seemed to function one way, while other visually similar bits were broken. I read and reread the names I'd typed into the little boxes... all present and correct.

Eventually, I somehow noticed that some invisible escaped characters had made their way in there, as the caret seemed to sometimes stick at the same place when it should be moving on a character... I somehow managed to salvage my patch fairly quickly, although it contained a somewhat non-trivial number of these objects, an arbitrary selection of which were invisibly corrupted...

Oh the joys of plain text. You people don't know you're born.

*I refrain from using inverted commas; but don't tell new users of the software that it's a programming language, they tend to get scared.

The Ingenious DBox with the Double Trim
