• (cs)

    I think you guys are forgetting one ingenious way of dealing with this.

    if(true)  // may have unpredictable results for unusually large values of true
    {
      return false;
    }
    else
    {
      return true;
    }

    Obviously conditionals are much slower than raw booleans and must be avoided for scalable code.

  • (cs) in reply to ftumph
    ftumph:
    GWheeler wrote:

    I'm surprised no one has commented on the silly pattern of

    if condition {
        valid = true;
    }
    else {
        valid = false;
    }

    which obviously can be expressed more succinctly as

    valid = condition;



    You beat me to it.  This is a WTF that I see ALL THE TIME.
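    A minimal C sketch of the two forms being compared, assuming a hypothetical condition (`n > 0`) and hypothetical function names that are not from the original post. It only shows that both produce the same value; it says nothing about what any particular compiler emits.

```c
#include <stdbool.h>

/* Verbose form: branch on the condition, then assign a literal. */
bool is_valid_verbose(int n)
{
    bool valid;
    if (n > 0) {
        valid = true;
    } else {
        valid = false;
    }
    return valid;
}

/* Direct form: the comparison already yields the boolean we want. */
bool is_valid_direct(int n)
{
    bool valid = (n > 0);
    return valid;
}
```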
  • (cs) in reply to AndrewB

    Is that a performance hit of one instruction, or are both implementations compiled identically? Either way, it's a harmless WTF.


    Also, I think that

    [code language="c#"]
    if (value = condition)
    {
    // Do stuff
    }
    [/code]
    is a very dangerous way of doing it; dangerous to the reader, of course. It can SO easily be misread as if (value == condition).
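    To make the difference concrete, here is a small sketch in C rather than C# (the function names are made up for illustration). The first function assigns and then tests the assigned value; the second only compares. At a glance they look almost identical, which is the readability problem.

```c
#include <stdbool.h>

/* Assignment inside the test: stores condition into *value, then tests it.
   The extra parentheses are a common way to signal the assignment is deliberate. */
bool test_with_assignment(int *value, int condition)
{
    if ((*value = condition)) {
        return true;
    }
    return false;
}

/* Comparison: leaves *value untouched and tests for equality. */
bool test_with_comparison(const int *value, int condition)
{
    if (*value == condition) {
        return true;
    }
    return false;
}
```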

  • (cs) in reply to GWheeler
    GWheeler:
    which obviously can be expressed more succinctly as

    valid = condition;

    The problem is, his condition is already unreadable.
  • (unregistered) in reply to RyGuy

    > Text -- Tekst

    Maybe he is a Pole. [;)]

  • (unregistered) in reply to AndrewVos
    AndrewVos:
    err. some anonymouse guy said: maybe it would be faster to trim first then tolower second, hmmmmm. dont both those functions scan chars anyway, u not talking sense son.



    Ummm no. Actually, trim does something like this:

    trim()
    {

    while(str[0] == SPACE || str[0] == '\t' || str[0] == '\r' || str[0] == '\n')
    {
        str.remove (0);
    }

    char c = str[str.length() -1];
    while (c == SPACE || c == '\t' || c == '\r' || c == '\n')
    {
        str.remove (str.length() -1);
    }

    } // trim

    As you can see, trim has to scan characters. This is typically inexpensive, since most of the time you only want to get rid of EOL characters or spaces at the beginning of a line, so you'll (typically) trim about 5 (maybe 10) characters. What is expensive is that every character has to be compared about 4 or 5 times (vertical tabs, etc.). That would be bad if done to a (very) large string.

    LCase does something like this:
    lcase()
    {
        for each char (c) in string
        {
              c = c xor 64; // Note that there are no comparisons here.
        }
    }

    Comparisons can be expensive because the processor does something called branch (if) prediction. If its prediction is wrong, it has to flush the instruction pipeline.

    In conclusion, it might be faster to trim first, but only by a few clock cycles. You would not notice the difference, unless maybe you did 1,000,000,000 trims vs just as many LCases.

    The scary thing about the other piece of code is that he uses a Variant to pass the data (by not declaring sTekst as a String), which causes VB (6 and lower) to internally convert the string he passes in to a Variant on the way into the function, and then convert it back to a String to pass it to the message box function.


    P.s.
    ("Typically" means in a typical string processing program. Don't reply with "but my program trims 500 spaces...".)



  • (unregistered) in reply to
    :
    > What an 'ingenious' way to spell Text -- Tekst

    It is Dutch for text. I'd guess the author is Dutch or Flemish.


    Or Afrikaans.
  • (cs) in reply to
    :
    AndrewVos wrote:
    err. some anonymouse guy said: maybe it would be faster to trim first then tolower second, hmmmmm. dont both those functions scan chars anyway, u not talking sense son.




    Ummm no. Actually, trim does something like this:

    trim()
    {

    while(str[0] == SPACE || str[0] == '\t' || str[0] == '\r' || str[0] == '\n')
    {
        str.remove (0);
    }

    char c = str[str.length() -1];
    while (c == SPACE || c == '\t' || c == '\r' || c == '\n')
    {
        str.remove (str.length() -1);
    }

    } // trim

    As you can see, trim has to scan characters. This is typically inexpensive, since most of the time you only want to get rid of EOL characters or spaces at the beginning of a line, so you'll (typically) trim about 5 (maybe 10) characters. What is expensive is that every character has to be compared about 4 or 5 times (vertical tabs, etc.). That would be bad if done to a (very) large string.

    LCase does something like this:
    lcase()
    {
        for each char (c) in string
        {
              c = c xor 64; // Note that there are no comparisons here.
        }
    }

    I sure hope you aren't the guy who implemented these functions!   The first would be horribly inefficient (str.remove for each whitespace?) and the second doesn't even come close to working.

    (Here are my "poster intention odds":  WTF: 78%; Troll: 20%; Clever Ironic Sarcasm: 2%; Devil's Advocate: 0%)

  • (cs) in reply to

    :

    while(str[0] == SPACE || str[0] == '\t' || str[0] == '\r' || str[0] == '\n')
    {
        str.remove (0);
    }

    What is expensive is that every character has to be compared about 4 or 5 times (vertical tabs, etc.). That would be bad if done to a (very) large string.

    Eeeek! That's bad code.    You really don't understand what's going on.

    It would be more like (in C/C++):

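    size_t i = 0;
    while (isspace((unsigned char)str[i]))   // skip leading whitespace; stops at '\0'
          ++i;
    size_t j = strlen(str);                  // j is one past the last character
    while (j > i && isspace((unsigned char)str[j - 1]))
          --j;

    memmove(str, &str[i], j - i);            // source and destination overlap, so memmove rather than strncpy
    str[j - i] = '\0';                       // terminate the shortened string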

    isspace() would be a macro/inline function which would expand to:

    (attribtab[str[i]] & WHITESPACE)

    attribtab would have one entry for each character, each with flags describing that character so that the entry for 'A' would be something like ALPHA|UPPERCASE|PRINTABLE.

    So, not four comparisons, but one lookup, one AND and one boolean comparison.

    Oh, and we shift the characters just once, instead of once for each character removed.
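    Putting the table lookup and the single shift together, a self-contained sketch might look like the following. The table contents, the `WHITESPACE` bit value, and the names `is_space` and `trim_in_place` are all assumptions for illustration; a real C library's `isspace()` is implemented however that library chooses.

```c
#include <stddef.h>
#include <string.h>

#define WHITESPACE 0x01u   /* hypothetical flag bit */

/* Hypothetical attribute table: one entry per character code.
   Only the whitespace flag is filled in; everything else is zero. */
static const unsigned char attribtab[256] = {
    ['\t'] = WHITESPACE, ['\n'] = WHITESPACE, ['\v'] = WHITESPACE,
    ['\f'] = WHITESPACE, ['\r'] = WHITESPACE, [' ']  = WHITESPACE,
};

/* One lookup and one AND, no chain of comparisons. */
static int is_space(char c)
{
    return attribtab[(unsigned char)c] & WHITESPACE;
}

/* Trim leading and trailing whitespace in place; the remaining
   characters are moved exactly once. Returns str for convenience. */
char *trim_in_place(char *str)
{
    size_t i = 0;
    while (is_space(str[i]))          /* '\0' is not flagged, so this stops at the end */
        ++i;

    size_t j = strlen(str);           /* one past the last character */
    while (j > i && is_space(str[j - 1]))
        --j;

    memmove(str, str + i, j - i);     /* overlapping regions need memmove */
    str[j - i] = '\0';
    return str;
}
```

    Usage would be something like `char buf[] = "  hello \r\n"; trim_in_place(buf);`, which needs a writable buffer rather than a string literal.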

    Next up, there's LCase:

    LCase does something like this:
    lcase()
    {
        for each char (c) in string
        {
              c = c xor 64; // Note that there are no comparisons here.
        }
    }

    This will only work if you can guarantee that the string contains ONLY uppercase characters.  Actually, it won't work even then, because you want to XOR with 32, but we know what you meant. (Actually, we really want to OR with 32 to force everything to lowercase; XORing would flip upper and lower case.)

    In reality LCase would be closer to:

    lcase(string str)
    {
        foreach (char c in str)
        {
              if (isupper(c))
                    c = c or 32; 
        }
    }

    So we really can't get away from a comparison.

    isupper is much like isspace above.
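    For what it's worth, the pseudocode above could be written in real C roughly like this. It is only a sketch: it assumes ASCII and the default "C" locale, and `lcase_in_place` is a made-up name, not a library function.

```c
#include <ctype.h>

/* Lowercase an ASCII string in place. The isupper() test is the
   comparison we cannot avoid: only 'A'..'Z' get bit 0x20 set. */
char *lcase_in_place(char *str)
{
    for (char *p = str; *p != '\0'; ++p) {
        if (isupper((unsigned char)*p))
            *p |= 0x20;   /* OR with 32 forces the letter to lowercase */
    }
    return str;
}
```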

  • (unregistered)

    [:P][;)][;)][&][I][*][~][G][H][:O]

  • (unregistered) in reply to RyGuy

    'Tekst' happens to be Dutch for 'text'... [8-|]

  • (unregistered)

    Jeff S is right; AndrewVos's xor with 64 (meaning hex 40?) would flip a bit, not lowercase anything.

  • (cs) in reply to

    :
    Jeff S is right; AndrewVos's xor with 64 (meaning hex 40?) would flip a bit, not lowercase anything.

    Actually, that's not quite true.  XORing with 32 (0x20) would convert an uppercase letter to lowercase (e.g. 'A' == 0x41, XOR 0x20 = 0x61 == 'a').  Unfortunately, it also makes lowercase letters uppercase, and makes a mess of non-alphabetic characters ('*' == 0x2A, XOR 0x20 = 0x0A == '\n').
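    The arithmetic is easy to check with two one-liners (ASCII assumed; the helper names are invented for this sketch). XOR flips bit 0x20, OR sets it.

```c
/* Flip bit 0x20: swaps the case of ASCII letters, mangles other characters. */
char xor_case(char c)
{
    return (char)(c ^ 0x20);
}

/* Set bit 0x20: lowercases ASCII uppercase letters and leaves lowercase alone,
   but still changes some non-letters, which is why a range check is needed. */
char or_case(char c)
{
    return (char)(c | 0x20);
}
```

    With these, `xor_case('A')` is `'a'`, `xor_case('a')` is `'A'`, and `xor_case('*')` is `'\n'`, while `or_case('@')` turns into a backtick.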

  • Thulack Onzipanter (unregistered) in reply to RyGuy

    It's called Dutch, which truly is ingenious [;)]

  • Stewart (unregistered) in reply to
    Anonymous:

    What do you think about this then. Which is the right way to do it?

    //1
    if (condition)
    {
        //do some stuff
        valid = true;
    }
    else
    {
        valid = false;
    }

    //2
    if (condition)
    {
        //do some stuff
    }
    valid = condition;

    //3
    valid = condition;
    if (valid)
    {
        //do some stuff
    }

    /Erik

    Definitely number 3.

    It wins over number 2, because it only tests the condition once, and the condition might be expensive to test, for example, it might involve a complex algorithm or a database hit.

    It wins over number 1 by being less complex and more compact.
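    The "only tests the condition once" point can be shown with a counter. This is a sketch in C; `expensive_check`, the counters, and the variant names are hypothetical stand-ins for whatever the real condition and "do some stuff" would be.

```c
#include <stdbool.h>

static int check_calls = 0;   /* how many times the condition was evaluated */
static int work_done = 0;     /* stands in for "do some stuff" */

/* Hypothetical stand-in for a costly condition (algorithm, database hit, ...). */
static bool expensive_check(int n)
{
    ++check_calls;
    return n > 0;
}

/* Number 2: the condition is evaluated twice. */
bool variant_two(int n)
{
    if (expensive_check(n)) {
        ++work_done;
    }
    bool valid = expensive_check(n);
    return valid;
}

/* Number 3: the condition is evaluated once and the result reused. */
bool variant_three(int n)
{
    bool valid = expensive_check(n);
    if (valid) {
        ++work_done;
    }
    return valid;
}
```

    Besides the cost, number 2 also risks the two evaluations disagreeing if the condition depends on state that changes in between.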

  • (cs)
    Alex Papadimoulis:

    You know, because you never know if Trim() will actually trim both upper- and lower-case whitespace.

    This coder needs to be told that spaces have been lower-case from the beginning.

  • joey joejoe shabadoo jr. (unregistered) in reply to mugs

    All this talk about whitespace reminds me of the purposefully WTF language called Whitespace

  • EmperorsNewWhitespace (unregistered) in reply to
    :
    Ah, crap, I totally have to check all of my code... I've never accounted for lowercase spaces before.
    Some of the first programming* I did was in a computer music patching thing called Max/MSP. At one point, I had an absolutely vicious bug which invalidated my patch; some bits seemed to function one way, while other visually similar bits were broken. I read and reread the names I'd typed into the little boxes... all present and correct.

    Eventually, I somehow noticed that some invisible escaped characters had made their way in there, as the caret seemed to sometimes stick at the same place when it should be moving on a character... I somehow managed to salvage my patch fairly quickly, although it contained a somewhat non-trivial number of these objects, an arbitrary selection of which were invisibly corrupted...

    Oh the joys of plain text. You people don't know you're born.

    *I refrain from using inverted commas; but don't tell new users of the software that it's a programming language, they tend to get scared.

Leave a comment on “The Ingenious DBox with the Double Trim”
