• MightyM (unregistered)

    It's code that covers all bases (at least up to base 36)

  • (cs)

    well... atoi takes a null terminated string (rather than a char).

  • slipstream (unregistered)

    atoi() that takes a char and returns a char?!

    atoi() that actually allows LETTERS to be used?!

    ...I have nothing to say here.

  • Roby McAndrew (cs)

    I recall writing very similar code, long long ago, but mine only went up to 'f' or 'F'. This goes all the way up to eleven and beyond !

  • Moi (unregistered) in reply to Roby McAndrew

    You might want to use int64_t instead of "long long".

  • faoileag (unregistered) in reply to Roby McAndrew
    Roby McAndrew:
    I recall writing very similar code, long long ago, but mine only went up to 'f' or 'F'. This goes all the way up to eleven and beyond !
    This is because it is meant to convert a hexacosimal number to a decimal one.

    The hexacosimal system (0..9a..z) is the most compact of the numbering systems - with only 2 digits you can cover up to 1296 numbers!

    Be careful not to mistake hexacosimal numbers with base64 though - there are subtle differences.

  • History Teacher (unregistered) in reply to slipstream
    slipstream:
    atoi() that actually allows LETTERS to be used?!

    ...I have nothing to say here.

    There's nothing strange about that. Many string-to-integer conversion functions take a base and accept bases up to 36. But since this function converts just a single char, the base parameter is not needed. This actually reduces the WTFiness a bit - I think it is a lost opportunity...
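
    For comparison, the standard C strtol() already handles this for whole strings, in bases up to 36 - a minimal sketch (not the article's code):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* strtol accepts bases 2..36; the letters a..z / A..Z stand for digit values 10..35 */
        printf("%ld\n", strtol("z", NULL, 36));   /* prints 35 */
        printf("%ld\n", strtol("10", NULL, 36));  /* prints 36 */
        return 0;
    }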

  • Jo (unregistered)

    That's a perfectly valid base-36 encoder. Which happens to work for everything down to base-1.

    Possible problems:

    • Parameter size may be wrong (no Obj-C knowledge here). Let him who is without sin cast the first stone though, that's not WTF.
    • The error case should throw an exception. Or assert. Or whatever you do in Obj-C. Again, this kind of error is so common that I wouldn't go WTF, just a weary sigh.
  • faoileag (unregistered) in reply to faoileag
    faoileag:
    Roby McAndrew:
    I recall writing very similar code, long long ago, but mine only went up to 'f' or 'F'. This goes all the way up to eleven and beyond !
    This is because it is meant to convert a hexacosimal number to a decimal one.

    The hexacosimal system (0..9a..z) is the most compact of the numbering systems - with only 2 digits you can cover up to 1296 numbers!

    Be careful not to mistake hexacosimal numbers with base64 though - there are subtle differences.

    Of course I meant the hextricontesimal system, not the hexacosimal. Sheesh, these numerical multipliers are difficult :-)
  • Zaitsev (unregistered)

    So the WTF is that it takes a char not an array?

  • Crispy (unregistered) in reply to Zaitsev

    Sorry, the Church of Objective-C-logy states clearly that all data types must be NSSomething, like NSInteger, NSString, NSArray... If there's no NSChar, then you must invent it. Raw C data types are abhorrent, and should never crawl into decent, honest Objective C.

    Captcha: pecus, using char in Objective-C is a sin

  • gnasher729 (unregistered) in reply to Crispy
    Crispy:
    Sorry, the Church of Objective-C-logy states clearly that all data types must be NSSomething, like NSInteger, NSString, NSArray... If there's no NSChar, then you must invent it. Raw C data types are abhorrent, and should never crawl into decent, honest Objective C.

    Captcha: pecus, using char in Objective-C is a sin

    The real Objective-C sin is to pass anything other than a UTF-8 encoded string or NSString* to such a function. The type "char" is basically useless when you use Unicode. For example, the cyrillic small letter es "с" has a Unicode code point of 0x441, which this code would probably interpret as an "A".

    And it seems the comparison (c >= 'A' && c <= 'z') with an uppercase A and lowercase z went unnoticed. Then there's the problem that the legal input '0' and any unrecognised character give the same output 0. All in all, utter rubbish.
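
    For the record, a corrected sketch of what was presumably intended (hypothetical name, assuming an ASCII execution character set, with -1 signalling a rejected character instead of silently returning 0):

    int digit_value(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'Z') return c - 'A' + 10;   /* 'Z', not 'z' */
        if (c >= 'a' && c <= 'z') return c - 'a' + 10;
        return -1;                                       /* not a valid base-36 digit */
    }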

  • gnasher729 (unregistered) in reply to Jo
    Jo:
    That's a perfectly valid base-36 encoder. Which happens to work for everything down to base-1.

    Possible problems:

    • Parameter size may be wrong (no Obj-C knowledge here). Let him who is without sin cast the first stone though, that's not WTF.
    • The error case should throw an exception. Or assert. Or whatever you do in Obj-C. Again, this kind of error is so common that I wouldn't go WTF, just a weary sigh.

    In Objective-C, assert and exceptions are there to identify errors made by the programmer which are fixed by changing the code, not unexpected situations at runtime. It would be pointless for the caller to check the input value, because that check is more than half the work of the function, so I would assume that calling this with non-digit, non-letter chars is meant to be fine. The function should of course return an indication that a character was passed that isn't accepted.

  • Steve The Cynic (cs) in reply to gnasher729
    gnasher729:
    And it seems the comparison (c >= 'A' && c <= 'z') with an uppercase A and lowercase z went unnoticed. Then there's the problem that the legal input '0' and any unrecognised character give the same output 0. All in all, utter rubbish.
    And you forgot the fact that the ranges 'A'..'Z' and 'a'..'z' (to use Pascal notation) are not necessarily internally contiguous. At least one(*) important character encoding has two 6 or 7 character gaps in each range.

    Down with EBCDIC!

    (*) It may look like one encoding to someone who isn't paying attention, but there are many versions of EBCDIC that vary subtly or not so subtly in their encoding of characters that aren't also encodable in strict 7-bit ASCII.

    EDIT: I'm willing to be educated by someone who knows whether the non-contiguity feature of C/C++ characters was inherited by Objective C.
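
    One way to sidestep the contiguity question entirely is a table lookup instead of range arithmetic - a sketch with made-up names:

    #include <ctype.h>
    #include <string.h>

    int digit_value(char c)
    {
        static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
        const char *p = (c != '\0') ? strchr(digits, tolower((unsigned char)c)) : NULL;
        return p ? (int)(p - digits) : -1;   /* -1 for anything that isn't a digit or letter */
    }

    That works in any execution character set, gaps or no gaps.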

  • (cs) in reply to gnasher729
    gnasher729:
    In Objective-C, assert and exceptions are there to identify errors made by the programmer which are fixed by changing the code, not unexpected situations at runtime.
    I've got news for you...

    In C and C++, assert serves that very same purpose.

    Calling a dedicated character(s)-to-integer conversion with unverified characters is a bug that can only be fixed by changing code somewhere.

    In C++, exceptions should(*) in general be reserved for weird conditions, although the "fixed by changing the code" aspect doesn't apply, because "I ran out of memory" is regarded by the definition of operator new() as sufficiently weird to warrant an exception rather than a NULL return. Our good buddy dynamic_cast<>(), when used on references rather than pointers, has no non-exception method of reporting a type incompatibility, so it throws std::bad_cast.

    (*) That's an opinion. C++ exception handling is relatively expensive, and fraught with terminally lethal edge cases. Consider a destructor that can throw (not explicitly disabled by most compilers as far as I know), and what happens if you call it while unwinding the stack because of another exception (i.e. after calling throw and before hitting a catch() block). (Hint: it calls terminate(), and that's the end of that.)

  • Medinoc (cs)

    I think The Real WTF is simply calling it atoi. Otherwise, this function is perfectly valid, and very often used when actually implementing atoi(), strtol() etc.

    It's a useful component block for building a number-parsing function, that's valid for common bases (2, 8, 10 and 16) in ASCII and EBCDIC alike, though in the latter it only goes up to base 19 rather than base 36.
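
    Roughly the kind of loop such a building block ends up inside - a sketch with made-up names, no sign or overflow handling, and bases up to 16 so the letter ranges stay contiguous even in EBCDIC:

    long parse_number(const char *s, int base)
    {
        long result = 0;
        for (; *s; ++s) {
            int d;
            if      (*s >= '0' && *s <= '9') d = *s - '0';
            else if (*s >= 'a' && *s <= 'f') d = *s - 'a' + 10;
            else if (*s >= 'A' && *s <= 'F') d = *s - 'A' + 10;
            else break;                      /* first non-digit ends the number */
            if (d >= base) break;            /* not a digit of this base */
            result = result * base + d;
        }
        return result;
    }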

  • iWantToKeepAnon (unregistered) in reply to Jo
    Jo:
    Let him who is without sin cast the first stone
       if (! person.sin)
          return((int)stone[0]);
    

    done.

  • TRWTF is... (unregistered) in reply to iWantToKeepAnon

    That's the zeroth stone.

  • sztupy (cs) in reply to iWantToKeepAnon

    It should only cast the stone, not return it

  • anonymous (unregistered) in reply to TRWTF is...
    TRWTF is...:
    That's the zeroth stone.
    But it is also the first. Zero is its nominal number and one is its ordinal number.
  • anonymous (unregistered) in reply to sztupy
    sztupy:
    It should only cast the stone, not return it
    A "cast" in C is completely different than a "cast" in natural language. A synonym which is more closely adhering to the original meaning would be "throw".
  • Chelloveck (unregistered) in reply to Medinoc
    Medinoc:
    I think The Real WTF is simply calling it atoi. Otherwise, this function is perfectly valid, and very often used when actually implementing atoi(), strtol() etc.

    Agreed. Apart from the name, the only vaguely WTF-y thing about it is that the error case is indistinguishable from the legal case of passing in '0'. It should return -1 or CHAR_MAX or something similar on errors. Then again, it may be from a problem domain where it's desirable to treat out-of-bounds characters as zero.

    Not a WTF. I demand a replacement Daily WTF, or my money back for today.

  • Anonymous') OR 1=1 (unregistered)

    TRWTF is using an instance method -atoi: instead of a class method +atoi:, since it never accesses the instance of whatever class it is part of:

    @implementation MyClass
    +(char) atoi:(char) a { ... }
    @end
    ...
    // Can now be called outside the class without an instance
    char x = [MyClass atoi:'x'];
    
  • David M (unregistered) in reply to Medinoc
    Medinoc:
    I think The Real WTF is simply calling it atoi. Otherwise, this function is perfectly valid, and very often used when actually implementing atoi(), strtol() etc.

    It's a useful component block for building a number-parsing function, that's valid for common bases (2, 8, 10 and 16) in ASCII and EBCDIC alike, though in the latter it only goes up to base 19 rather than base 36.

    The two hardest problems in programming are naming, cache invalidation, and off by one errors.

  • Mike (unregistered)

    So, where exactly is the WTF here?

    Or is this a site for theoretical perfectionists who have nothing better to do than nit-pick?

    Without knowing the problem domain there is nothing to see here.

  • (cs) in reply to sztupy
    sztupy:
    It should only cast the stone, not return it

    In this way stones are unlike boomerangs. If you cast a boomerang, it is automatically returned.

  • skington (unregistered)

    A proper Unicode-aware atoi implementation should cover non-arabic number systems as well.

  • Scourge of Programmers! (unregistered)

    What is wrong with the code?

  • (cs)

    #define atoi TRWTF

  • F (unregistered) in reply to gnasher729
    gnasher729:
    Crispy:
    Sorry, the Church of Objective-C-logy states clearly that all data types must be NSSomething, like NSInteger, NSString, NSArray... If there's no NSChar, then you must invent it. Raw C data types are abhorrent, and should never crawl into decent, honest Objective C.

    Captcha: pecus, using char in Objective-C is a sin

    The real Objective-C sin is to pass anything other than a UTF-8 encoded string or NSString* to such a function. The type "char" is basically useless when you use Unicode. For example, the cyrillic small letter es "с" has a Unicode code point of 0x441, which this code would probably interpret as an "A".

    And it seems the comparison (c >= 'A' && c <= 'z') with an uppercase A and lowercase z went unnoticed.

    Well yes, of course. It went unnoticed because it isn't there. Along with the unicorns and the president's daughter that also aren't there.

  • gnasher729 (unregistered) in reply to skington
    skington:
    A proper Unicode-aware atoi implementation should cover non-arabic number systems as well.

    That depends. There are plenty of standards where text can contain numbers, but only in very specific formats, so let's say digits in one of the many indian scripts wouldn't be allowed. In other situations, they should be allowed. Which means you can't say "I have a function converting text to a number", you have to say "I have a function converting text to a number according to the following rules...".

    And then you might recognise roman numerals as well, so "Series 2, Episode 4" and "Series II, Episode IV" would be accepted as the same.
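
    If you really did want Roman numerals, the per-character lookup idea carries over - a toy sketch that accepts subtractive pairs but doesn't reject malformed input:

    #include <string.h>

    int roman_value(const char *s)
    {
        static const char sym[] = "IVXLCDM";
        static const int  val[] = { 1, 5, 10, 50, 100, 500, 1000 };
        int total = 0;

        for (; *s; ++s) {
            const char *p = strchr(sym, *s);
            if (!p)
                break;                                /* not a Roman digit */
            const char *q = s[1] ? strchr(sym, s[1]) : NULL;
            if (q && val[q - sym] > val[p - sym])
                total -= val[p - sym];                /* subtractive pair: IV, IX, CM, ... */
            else
                total += val[p - sym];
        }
        return total;                                 /* roman_value("IV") == 4, roman_value("MCMXIV") == 1914 */
    }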

  • gnasher729 (unregistered) in reply to Scourge of Programmers!
    Scourge of Programmers!:
    What is wrong with the code?

    Run the code on your computer. Tell us what you expect when the argument is '['. Tell us what you get when the argument is '['.

  • (cs) in reply to Chelloveck
    Chelloveck:
    Apart from the name, the only vaguely WTF-y thing about it is that the error case is indistinguishable from the legal case of passing in '0'. It should return -1 or CHAR_MAX or something similar on errors. Then again, it may be from a problem domain where it's desirable to treat out-of-bounds characters as zero.

    Not a WTF. I demand a replacement Daily WTF, or my money back for today.

    char moneyback = [myinstance atoi:'$'];
    
  • gnasher729 (unregistered) in reply to Steve The Cynic
    Steve The Cynic:
    And you forgot the fact that the ranges 'A'..'Z' and 'a'..'z' (to use Pascal notation) are not necessarily internally contiguous. At least one(*) important character encoding has two 6 or 7 character gaps in each range.

    Down with EBCDIC!

    This was, as the article said, intended for a "Web API". The internet doesn't care what character set your computer uses, it just sends you byte values that you have to interpret. If you received an ASCII 'A' (byte value 0x41 = 65), and your compiler uses EBCDIC, then your compiler's 'A' will not be the same number 65, so the whole code is rubbish.

    I have in fact seen code that was prepared to run on a machine with EBCDIC character set, and it had huge lists of defines like #define ASCII_A 65 so that it could process ASCII characters.
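
    The portable version of that idea is to interpret the wire bytes by their ASCII values rather than through the host's character literals - a sketch, name made up:

    int ascii_digit_value(unsigned char b)
    {
        if (b >= 0x30 && b <= 0x39) return b - 0x30;        /* ASCII '0'..'9' */
        if (b >= 0x41 && b <= 0x5A) return b - 0x41 + 10;   /* ASCII 'A'..'Z' */
        if (b >= 0x61 && b <= 0x7A) return b - 0x61 + 10;   /* ASCII 'a'..'z' */
        return -1;                                          /* not an ASCII digit or letter */
    }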

  • herby (cs) in reply to Steve The Cynic
    Steve The Cynic:

    Down with EBCDIC!

    While you may be down with EBCDIC, if you understand that not all character sets are ASCII derived, you will be in much better shape. As for EBCDIC, they did one thing correct: The translation table from Card code to EBCDIC was VERY well defined. Sure, you didn't use all of it in normal instances, but it was defined. Oh, and EBCDIC is a TRUE 8 bit code.

    Now why do the numbers collate above the upper case alphabet, which in turn sits above the lower case alphabet.....

  • Jibble (unregistered) in reply to faoileag
    faoileag:
    Roby McAndrew:
    I recall writing very similar code, long long ago, but mine only went up to 'f' or 'F'. This goes all the way up to eleven and beyond !
    This is because it is meant to convert a hexacosimal number to a decimal one.

    Indeed... itoa() takes an extra parameter - the numerical base.

    It makes perfect sense that the complementary atoi() would accept numbers in other bases.

    The real WTF is that somebody thinks this is a WTF.

  • (cs) in reply to faoileag
    faoileag:
    The hexacosimal system (0..9a..z) is the most compact of the numbering systems - with only 2 digits you can cover up to 1296 numbers!

    Yeah, but with 2 digits you'd have to rewrite this function to accept both of them. Better just stick to numbers lower than 36.

  • Z (unregistered) in reply to Mike
    Mike:
    Or is this a site for theoretical perfectionists who have nothing better to do than nit-pick?

    Bravo! You have finally uncovered the true purpose of this website.

  • (cs)

    Yth post!!!

  • anonymous (unregistered) in reply to herby
    herby:
    Steve The Cynic:

    Down with EBCDIC!

    While you may be down with EBCDIC, if you understand that not all character sets are ASCII derived, you will be in much better shape. As for EBCDIC, they did one thing correct: The translation table from Card code to EBCDIC was VERY well defined. Sure, you didn't use all of it in normal instances, but it was defined. Oh, and EBCDIC is a TRUE 8 bit code.

    Now why do the numbers collate above the upper case alphabet, which in turn sits above the lower case alphabet.....

    No... if you understand that all character sets* are ASCII derived, you will be in much better shape.

    *that matter

    If non-ASCII character sets matter to you, I'm very sorry you're forced to work on systems that use non-ASCII character sets. I have the luxury of pretending they don't exist, and I'm quite happy about it.

  • (cs) in reply to gnasher729
    gnasher729:
    skington:
    A proper Unicode-aware atoi implementation should cover non-arabic number systems as well.

    That depends. There are plenty of standards where text can contain numbers, but only in very specific formats, so let's say digits in one of the many indian scripts wouldn't be allowed. In other situations, they should be allowed. Which means you can't say "I have a function converting text to a number", you have to say "I have a function converting text to a number according to the following rules...".

    And then you might recognise roman numerals as well, so "Series 2, Episode 4" and "Series II, Episode IV" would be accepted as the same.

    enterprisey_atoi() will have localization!

  • ph (unregistered) in reply to Steve The Cynic

    In C++, exceptions are used for - exceptional cases! Doh.

    What is an exceptional case? Anything out of the ordinary. One can communicate reaching the end of a search tree this way, if the tree is deep enough that reaching the bottom of it is "uncommon".

    In performance-critical code this is enforced by performance requirements (exceptions are expensive and would slow down execution if used in "common" branches), in other code this is just a convention.

  • superman (unregistered)

    Most retarded post for a WTF ever. The code is fine! WTF!

  • fred (unregistered) in reply to faoileag
    faoileag:
    Roby McAndrew:
    I recall writing very similar code, long long ago, but mine only went up to 'f' or 'F'. This goes all the way up to eleven and beyond !
    This is because it is meant to convert a hexacosimal number to a decimal one.

    The hexacosimal system (0..9a..z) is the most compact of the numbering systems - with only 2 digits you can cover up to 1296 numbers!

    Be careful not to mistake hexacosimal numbers with base64 though - there are subtle differences.

    Why stop at letters and numbers? There are a lot of other printable characters we can use - and we could differentiate between upper and lower case letters too....

    In fact, the entire works of Shakespeare are just representations of some government codes.

  • Mark F (unregistered)

    This isn't a WTF; it's more like a "what" or "huh".

    Calling it atoi() might be a bit confusing to those used to C's atoi() but if anything it's C's atoi() that's named confusingly - why not call that one stoi() since it clearly works on a string, while this one doesn't?

  • Pedant (unregistered) in reply to Crispy

    Sorry, this is just nonsense. Objective-C can take whatever argument types you want to fling at it.

  • Pedant (unregistered) in reply to Mark F

    Agreed.

    In fact it is not called atoi() - if all the armchair quarterbacks here could actually read, they would notice that it is called -atoi: which is a completely different thing.

  • whomsoever (unregistered)

    Of course the proper ObjC method declaration should be:

    - (char)a:(char)a toi;
  • (cs) in reply to anonymous
    anonymous:
    sztupy:
    It should only cast the stone, not return it
    A "cast" in C is completely different than a "cast" in natural language. A synonym which is more closely adhering to the original meaning would be "throw".
    Did you mean this?
    if (! person.has_sin)
       throw (stone[0]);
    
  • fan (unregistered) in reply to gnasher729
    gnasher729:
    In Objective-C, assert and exceptions are there to identify errors made by the programmer which are fixed by changing the code, not unexpected situations at runtime.

    But this case looks exactly like that. It's either a wrong item passed (semantics or encoding, etc.), memory corruption, or unescaped user input - all of which are coding errors that need to be fixed by the coder.

Leave a comment on “Best atoi() Implementation Ever”
