• Arngrim (unregistered)

    No, I cannot spot the bug, no matter how much I Zoom into the code. Nice normaliZation macro, cannot think of a way to optimiZe it anymore. As a matter of fact, I realiZe right now, that I need such an amaZing macro myself.

  • Patric (unregistered)

    Z is uppercase, so missing lowercase z

  • (cs)

    It doesn't handle lower-case 'z'.  It maps upcase Z to Z.

    Another bug is that it evaluates X multiple times.  It could produce astonishing results if you try TO_UPPER(getchar()).

    gene
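
    A minimal sketch of the multiple-evaluation problem, for anyone who wants to see it happen. The three-letter macro TO_UPPER_ABC is a hypothetical, cut-down stand-in for the article's macro, and next_char with its little input buffer is an illustrative replacement for getchar(), so the side effect is easy to count.

```c
#include <stddef.h>

/* Hypothetical three-letter stand-in for the chained-ternary macro. */
#define TO_UPPER_ABC(x) \
    ('a' == (x) ? 'A' : \
     'b' == (x) ? 'B' : \
     'c' == (x) ? 'C' : (x))

/* Illustrative input source: every call consumes one character. */
static const char stream[] = "cab!";
static size_t pos = 0;   /* next unread position in stream */
static int reads = 0;    /* how many times next_char() was called */

static int next_char(void)
{
    reads++;
    /* Return -1 (an EOF-like value) once the buffer is exhausted. */
    return stream[pos] != '\0' ? stream[pos++] : -1;
}
```

    A single TO_UPPER_ABC(next_char()) on this input consumes four characters and yields '!', not the 'C' a caller would expect: each comparison pulls a fresh character, and so does the final fall-through.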

  • Alex (unregistered)

    The macro also evaluates (x) 26 times.  So if you did something like this:

    TO_UPPER (*(p++))

    It would increment p 26 times.

  • Fabian (unregistered)

    Mark Twain probably would have responded along the lines of:
    And zen vinaly ze drem has com tru!

    Fabian (who probably should not have suggested knowing what Twain might have said...)

  • (cs)

    I wonder why someone would do case conversion like this.  Is it an attempt to be so portable that your code works in the face of ASCII, EBCDIC, or whatever else?  But then why not use C's toupper macro?

    gene

  • vdboor (unregistered)

    There is one other WTF: this thing is not a normal function, but a macro! For the non-C programmers: a macro in your code is replaced with its contents (like an inline function). So everywhere you write TO_UPPER('b'), that code will be replaced with the contents of that macro. In my example, x will be replaced with 'b'.

  • (cs) in reply to vdboor

    Wow, Who'da thunk this code could have been replaced by:

    char ucase_tbl[256] = { .. insert appropriate characters here .. };

    char chrupr(char c)
    {
       return ucase_tbl[c];
    }

    mmph. Actually, the casing table should probably be filled at runtime based on locale. And NO! I don't expect this to work with unicode.
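
    One way the runtime-filled table could look. This is only a sketch: it assumes the standard toupper() from <ctype.h> is an acceptable source for the mapping, and init_ucase_tbl is a hypothetical helper name. The lookup goes through unsigned char so the index stays in range even where plain char is signed.

```c
#include <ctype.h>
#include <limits.h>

static unsigned char ucase_tbl[UCHAR_MAX + 1];

/* Hypothetical helper: fill the table from the current locale's toupper().
   Must be called once before chrupr() is used. */
static void init_ucase_tbl(void)
{
    for (int i = 0; i <= UCHAR_MAX; i++)
        ucase_tbl[i] = (unsigned char)toupper(i);
}

static char chrupr(char c)
{
    /* Cast first: a negative char must not be used as an array index. */
    return (char)ucase_tbl[(unsigned char)c];
}
```

    Calling setlocale() before init_ucase_tbl() would pick up a different locale's mapping; single-byte encodings only, as noted above.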

  • andrey (unregistered)

    Of course, subtracting 32 from the lowercase letter (within appropriate bounds) is far too great of a task to accomplish.

  • (cs) in reply to andrey

    Ignoring all the other stupid things about this, wouldn't a switch statement fit a little better?

  • The letter nazi (unregistered)

    No zed for you!

  • (cs)

    Marco....[^o)]

  • (cs)

    The other classic WTF that you see happen in this sort of situation is a confusion of 'l' and '1'. If performance is that big of a concern, wouldn't you want to sort the characters based on frequency? Of course, when you care that much, don't you move to asm?

  • (cs) in reply to skicow
    skicow:

    Marco....[^o)]



    Polo...?
  • (cs) in reply to Mike R
    Mike R:
    skicow:

    Marco....[^o)]



    Polo...?

    Yes! I knew someone would answer. [:D]

  • (cs)

    How about the WTF of doing it twice, just to be sure...
    TO_UPPER(TO_UPPER('z'))
    still didn't work?  maybe try 3 times!

  • Jan (unregistered) in reply to andrey

    Anonymous:
    Of course, subtracting 32 from the lowercase letter (within appropriate bounds) is far too great of a task to accomplish.

    Aktshully, ASCII was defined so you can do

    int flipCase(int c)
    {
       return c ^ 0x20;
    }

    Pretty cool, uh

  • Greg Miller (unregistered) in reply to Jan

    The above doesn't work for characters greater than 127.

  • Greg Miller (unregistered) in reply to Greg Miller

    Ooops, more specifically it doesn't work for anything not known to be either an upper or lower case character.  { becomes [.
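
    A guarded version of the XOR trick, sketched under the assumption of an ASCII-compatible character set; flip_case_ascii is an illustrative name. Anything that is not a letter passes through unchanged.

```c
/* Assumes ASCII: upper- and lower-case letters differ only in bit 0x20. */
static int flip_case_ascii(int c)
{
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
        return c ^ 0x20;   /* toggle the case bit */
    return c;              /* digits, punctuation, values above 127 */
}
```

    With the range check in place, '{' stays '{' instead of turning into '['.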

  • (cs) in reply to Jan
    Anonymous:

    Anonymous:
    Of course, subtracting 32 from the lowercase letter (within appropriate bounds) is far too great of a task to accomplish.

    Aktshully, ASCII was defined so you can do

    int flipCase(int c)
    {
       return c ^ 0x20;
    }

    Pretty cool, uh



    Both of you with your evil magic numbers.  :@
  • lw (unregistered)

    Capital, not lower-case, 'Z' in the test

  • k4_pacific (unregistered) in reply to Alex

    The macro also evaluates (x) 26 times.  So if you did something like this:

    TO_UPPER (*(p++))

    It would increment p 26 times.

    Actually, p would only increment 26 times if p[25] == 'Z' and p[0] != 'a', p[1] != 'b', p[2] != 'c', etc.

  • Tim Smith (unregistered) in reply to Mike R
    Mike R:

    Wow, Who'da thunk this code could have been replaced by:

    char ucase_tbl[256] = { .. insert appropriate characters here .. };

    char chrupr(char c)
    {
       return ucase_tbl[c];
    }

    mmph. Actually, the casing table should probably be filled at runtime based on locale. And NO! I don't expect this to work with unicode.



    Nor does it work half the time if the platform defines a 'char' as a signed value.
  • (cs) in reply to Alex
    Anonymous:

    The macro also evaluates (x) 26 times.  So if you did something like this:

    TO_UPPER (*(p++))

    It would increment p 26 times.



    This is typical of macros, which is one reason why they are usually in upper case.

    This is a more usual implementation of TO_UPPER:
    #define TO_UPPER( x ) \
    ('a' <= (x) && (x) <= 'z' ? (x) + ('A' - 'a') : (x))
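
    Note that the range-check macro still evaluates its argument two or three times. A possible alternative, sketched here with the illustrative name to_upper_once, is a small inline function (C99), which evaluates the argument exactly once. Like the macro, it assumes the lowercase letters are contiguous, as they are in ASCII.

```c
/* Same arithmetic as the macro, but the argument is evaluated once
   because it is passed by value. Assumes 'a'..'z' are contiguous. */
static inline int to_upper_once(int c)
{
    return ('a' <= c && c <= 'z') ? c + ('A' - 'a') : c;
}
```

    With this, to_upper_once(*p++) advances p by exactly one per call.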

  • (cs) in reply to k4_pacific
    Anonymous:

    The macro also evaluates (x) 26 times.  So if you did something like this:

    TO_UPPER (*(p++))

    It would increment p 26 times.

    Actually, p would only increment 26 times if p[25] == 'Z' and p[0] != 'a', p[1] != 'b', p[2] != 'c', etc.



    Indeed! And if p[i] != 'a' + i (i = 0 to 24) and p[25] != 'Z', then p would be incremented 27 times.
  • Rob Meyer (unregistered)

    Too bad...if only the letters of the alphabet were somehow mapped to numbers. Then if the lower case and upper case letters were in the same order, they'd have a linear relationship, and you could just do some math to move from one to another after testing if they were in the lower case range or not....

    Oh well, maybe someday computers will use some sort of numeric scheme behind the scenes to store characters.

  • (cs) in reply to Rob Meyer

    Classic case of RTFM?  

  • askme (unregistered) in reply to skicow
    skicow:

    Marco....[^o)]



    PONO


    "What's in your wallet?";)

  • Rob (unregistered)

    It's much better just to call a library function so you can do all different kinds of encodings.

  • (cs) in reply to Rob Meyer

    I heard that programmer works for AOL. His (or her) routine is used on some of AOL's mail server and causes ALL MESSAGES FROM AOL ADDRESSES TO BE IN ALL CAPS EXCEPT 4 TEH OCASIONLA LOWERCAES z LOL

    Ahem.

  • (cs) in reply to Greg Miller
    Anonymous:
    The above doesn't work for characters greater than 127.


    But, as I understand what the original macro 'code' is trying to do, the ASCII value of the letters A through Z is lower than 127.

         -dZ.
  • Zahlman (unregistered)

    Irony:

    1. The macro uses the convention of putting constants on the left-hand side of the expression, in order to avoid errors due to typoing '=' instead of '=='. Yet that wouldn't help here since the statements were almost certainly copied and pasted, and one of the problems is a typo (of 'Z' for 'z').

    2. The author is so "careful" to wrap 'x' in parentheses in case it is an expression, but no consideration is given to the effect of that expression being evaluated multiple times.

    ~ zahlman, er, I mean zahlman... dammit... Zahlman... there we go. ;)

    (PS I just tried to sign up here, but my login seems not to work? o_O)
  • Your Name: (unregistered)

    Actually I can think of one case where a macro like this would actually be more efficient than the more usual table based implementation: a constant argument.

    The above macro called on to_upper('y') would be completely constant-folded at compile-time into just a constant 'Y'.

    The normal table based implementation would normally not be eligible for constant folding.

    So for those cases where you have to_upper() calls in the midst of your critical inner loops (you know, for that super-duper unbreakable encryption scheme you cooked up)...
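
    A small sketch of the constant-argument case. TO_UPPER_XY is a hypothetical two-letter stand-in for the chained-ternary macro, and UPPER_Y is an illustrative name. With a character constant as the argument, the whole expansion is an integer constant expression, so the compiler has to evaluate it at compile time wherever a constant is required (C11 for _Static_assert).

```c
/* Hypothetical two-letter stand-in for the chained-ternary macro. */
#define TO_UPPER_XY(x) \
    ('x' == (x) ? 'X' : \
     'y' == (x) ? 'Y' : (x))

/* Both uses below require a compile-time constant; a table lookup
   through an array would not be accepted here. */
_Static_assert(TO_UPPER_XY('y') == 'Y', "evaluated at compile time");

enum { UPPER_Y = TO_UPPER_XY('y') };
```

    Whether a given compiler also folds a table lookup on a constant index is an optimizer question; the language only guarantees the macro form.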

  • (cs)

    I don't really know mutch about C, but wouldn't this function only work on strings of length one? Forcing you to break apart the word before evaluating each character separately?

    So TO_UPPER("Account") would still be "Account"

  • rpresser (unregistered) in reply to Fabian
    Anonymous:
    Mark Twain probably would have responded along the lines of:
    And zen vinaly ze drem has com tru!

    Fabian (who probably should not have suggested knowing what Twain might have said...)


    Indeed, you probably should not have suggested it, because it was George Bernard Shaw who proposed the simplified spelling system to which you are alluding.
  • (cs) in reply to BradC
    BradC:

    I don't really know mutch about C, but wouldn't this function only work on strings of length one? Forcing you to break apart the word before evaluating each character separately?

    So TO_UPPER("Account") would still be "Account"

    True, and TO_UPPER("account") would still be "account" because "account" is not a letter from a-z...

    Doh!

    Drak

  • Dirk (unregistered) in reply to BradC

    "I don't really know mutch about C", indeed.

    That macro is targeted at single characters. To convert a whole string you would have to loop it. Remember that C is often based on primitives where this doesn't reduce practicality. Higher-level libraries and hand hacking are often required for more convenient routines. On the other hand, super-high-level routines like *printf and *scanf are targeted at being maximally practical yet still standard (besides Microserf and GNU extensions for other types).

    The problem with this code is that a perfectly serviceable and much more sensible macro exists in ctype.h, and some environments provide a library function as well. This is one of those "should have looked elsewhere first" cases. How the developer did not pick up on the z/Z mistake is a real testament to carelessness.
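
    For the whole-string case, the loop could be sketched like this, using the standard toupper() from <ctype.h>. The helper name str_to_upper is illustrative only; it is not a standard function.

```c
#include <ctype.h>

/* Illustrative helper: uppercase a NUL-terminated string in place. */
static void str_to_upper(char *s)
{
    for (; *s != '\0'; s++) {
        /* toupper() expects an unsigned char value (or EOF), so cast
           before the call in case plain char is signed. */
        *s = (char)toupper((unsigned char)*s);
    }
}
```

    The buffer has to be writable, so pass a char array rather than a string literal.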

  • (cs)

    This does not convert lowercase spaces to uppercase spaces. Wtf?

  • (cs) in reply to nonDev
    twenty-six ternary operators


    Which is bad because... it keeps every previous one 'open' while it evaluates the current one, making it less efficient than an if else if else if or a case switch?
  • Purplet (unregistered) in reply to Jan

    if ((c>='a')&&(c<='z')) c -= 'a' - 'A';

    It works for every character codification where alphabetic letters are consecutive (so no ASCII/EBCDIC problems).

  • Ian Horwill (unregistered)

    I'm just waiting for Alex to say the uppercase Z was his typo!

  • (cs) in reply to dhromed
    dhromed:
    twenty-six ternary operators


    Which is bad because... it keeps every previous one 'open' while it evaluates the current one, making it less efficient than an if else if else if or a case switch?


    WTF?  Wouldn't the compiler treat them the same?  I mean, it's just a series of JNZ or such operations, just like a bunch of if/else/if/else or a case switch... no?

        dZ.
  • Granma (unregistered) in reply to Rob
    Anonymous:
    It's much better just to call a library function so you can do all different kinds of encodings.

    WELL ISN'T IT NICE THAT SOMEONE IS GOOD ENOUGH CODER TO MAKE A CORRECT VERSION OF A WTF!!!1111oneoneone
  • Granma (unregistered) in reply to Drak
    Drak:
    True, and TO_UPPER("account") would still be "account" because "account" is not a letter from a-z...

    Unless the caller gets lucky and the pointer just happens to be (char *) 'a' - then he'd get a mangled pointer.
  • cp (unregistered) in reply to rpresser

    Anonymous:
    Anonymous:
    Mark Twain probably would have responded along the lines of:
    And zen vinaly ze drem has com tru!

    Fabian (who probably should not have suggested knowing what Twain might have said...)


    Indeed, you probably should not have suggested it, because it was George Bernard Shaw who proposed the simplified spelling system to which you are alluding.

    What puzzles me is why Mark Twain (from Missouri) has a German accent?

  • (cs)

    WTF++

  • (cs)

    Wonder why zat damn function doezn't work?

  • (cs) in reply to andrey

    In fairness, that would only work in ASCII

  • Dave Brosius (unregistered) in reply to Patric

    It also would be especially fun if the parameter was a function say

    ((rand() % 26) + 'a')

     
