The Daily WTF: Curious Perversions in Information Technology

2013-11-13 Reply Admin

It's code that covers all bases (at least up to base 36)

TheCPUWizard · 2013-11-13 Reply Admin

well... atoi takes a null terminated string (rather than a char).

2013-11-13 Reply Admin

atoi() that takes a char and returns a char?!

atoi() that actually allows LETTERS to be used?!

...I have nothing to say here.

Roby McAndrew · 2013-11-13 Reply Admin

I recall writing very similar code, long long ago, but mine only went up to 'f' or 'F'. This goes all the way up to eleven and beyond !

2013-11-13 Reply Admin

You might want to use int64_t instead of "long long".

2013-11-13 Reply Admin

Roby McAndrew:
I recall writing very similar code, long long ago, but mine only went up to 'f' or 'F'. This goes all the way up to eleven and beyond !

This is because it is meant to convert a hexacosimal number to a decimal one.

The hexacosimal system (0..9a..z) is the most compact of the numbering systems - with only 2 digits you can cover up to 1296 numbers!

Be careful not to mistake hexacosimal numbers with base64 though - there are subtle differences.

2013-11-13 Reply Admin

slipstream:
atoi() that actually allows LETTERS to be used?!
...I have nothing to say here.

There's nothing strange about that. Many string-to-integer conversion functions take the base, and accept bases up to 36. But since this function converts just a single char, it is not needed. This actually reduces the WTFiness a bit, I think it is a lost opportunity...
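For illustration, here is a minimal sketch of that behaviour using the standard C strtol(), which accepts bases from 2 to 36. The wrapper name parse_base36 is made up for this example and is not from the article.

```c
#include <stdlib.h>

/* Hypothetical helper (name invented for this sketch): parse a
   NUL-terminated base-36 string with the standard strtol(), which
   accepts bases 2 through 36 and treats letters case-insensitively. */
long parse_base36(const char *text)
{
    return strtol(text, NULL, 36);
}
```

With this, parse_base36("z") gives 35 and parse_base36("zz") gives 1295, so two digits cover the 1296 values from 0 to 1295.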

2013-11-13 Reply Admin

That's a perfectly valid base-36 encoder. Which happens to work for everything down to base-1.

Possible problems:

Parameter size may be wrong (no Obj-C knowledge here). Let him who is without sin cast the first stone though, that's not WTF.
The error case should throw an exception. Or assert. Or whatever you do in Obj-C. Again, this kind of error is so common that I wouldn't go WTF, just a weary sigh.

2013-11-13 Reply Admin

faoileag:
Roby McAndrew:
I recall writing very similar code, long long ago, but mine only went up to 'f' or 'F'. This goes all the way up to eleven and beyond !
This is because it is meant to convert a hexacosimal number to a decimal one.
The hexacosimal system (0..9a..z) is the most compact of the numbering systems - with only 2 digits you can cover up to 1296 numbers!

Be careful not to mistake hexacosimal numbers with base64 though - there are subtle differences.

Of course I meant the hextricontesimal system, not the hexacosimal. Sheesh, these numerical multipliers are difficult :-)

2013-11-13 Reply Admin

So the WTF is that it takes a char not an array?

2013-11-13 Reply Admin

Sorry, the Church of Objective-C-logy states clearly that all data types must be NSSomething, like NSInteger, NSString, NSArray... If there's no NSChar, then you must invent it. Raw C data types are abhorrent, and should never crawl in decent, honest Objective C.

Captcha: pecus, using char in Objective-C is a sin

2013-11-13 Reply Admin

Crispy:
Sorry, the Church of Objective-C-logy states clearly that all data types must be NSSomething, like NSInteger, NSString, NSArray... If there's no NSChar, then you must invent it. Raw C data types are abhorrent, and should never crawl in decent, honest Objective C.
Captcha: pecus, using char in Objective-C is a sin

The real Objective-C sin is to pass anything other than a UTF-8 encoded string or NSString* to such a function. The type "char" is basically useless when you use Unicode. For example, the Cyrillic small letter es "с" has a Unicode code point of 0x441, which this code would probably interpret as an "A".
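A rough sketch of that truncation, assuming an 8-bit char, an ASCII-compatible character set, and that the code point arrives as a wider integer that gets cut down to its low byte. The helper name low_byte is made up for this example. (A UTF-8 string would instead deliver the two bytes 0xD1 0x81 for this letter, and neither of those is an ASCII letter.)

```c
/* Hypothetical helper: keep only the low 8 bits of a Unicode code
   point, as happens when a wider value is squeezed into an 8-bit char. */
unsigned char low_byte(unsigned int code_point)
{
    return (unsigned char)(code_point & 0xFFu);
}
```

Under those assumptions low_byte(0x441) is 0x41, the ASCII code for 'A'.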

And it seems the comparison (c >= 'A' && c <= 'z') with an uppercase A and lowercase z went unnoticed. Then there's the problem that the legal input '0' and any unrecognised character give the same output 0. All in all, utter rubbish.
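The article's code is not reproduced in this thread, so the following is only a sketch of why a merged check of that shape would be a problem in ASCII. Both function names are invented for the example.

```c
/* One merged range: in ASCII this also accepts the six punctuation
   characters at 0x5B through 0x60, which sit between 'Z' (0x5A)
   and 'a' (0x61). */
int in_merged_range(char c)
{
    return c >= 'A' && c <= 'z';
}

/* Two separate ranges: accepts letters only (ASCII assumed). */
int is_ascii_letter(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}
```

On an ASCII system '[' passes in_merged_range but fails is_ascii_letter.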

2013-11-13 Reply Admin

Jo:
That's a perfectly valid base-36 encoder. Which happens to work for everything down to base-1.
Possible problems:

Parameter size may be wrong (no Obj-C knowledge here). Let him who is without sin cast the first stone though, that's not WTF.

The error case should throw an exception. Or assert. Or whatever you do in Obj-C. Again, this kind of error is so common that I wouldn't go WTF, just a weary sigh.

In Objective-C, assert and exceptions are there to identify errors made by the programmer which are fixed by changing the code, not unexpected situations at runtime. It would be pointless for the caller to check the input value, because that's more than half the work of the function, so I would assume that calling this with non-integer/letter chars is fine. The function should of course return an indication that a character was passed that isn't accepted.

Steve The Cynic · 2013-11-13 Reply Admin

gnasher729:
And it seems the comparison (c >= 'A' && c <= 'z') with an uppercase A and lowercase z went unnoticed. Then there's the problem that the legal input '0' and any unrecognised character give the same output 0. All in all, utter rubbish.

And you forgot the fact that the ranges 'A'..'Z' and 'a'..'z' (to use Pascal notation) are not necessarily internally contiguous. At least one(*) important character encoding has two gaps, of 7 and 8 characters, in each range.

Down with EBCDIC!

(*) It may look like one encoding to someone who isn't paying attention, but there are many versions of EBCDIC that vary subtly or not so subtly in their encoding of characters that aren't also encodable in strict 7-bit ASCII.

EDIT: I'm willing to be educated by someone who knows whether the non-contiguity feature of C/C++ characters was inherited by Objective C.
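Objective-C is a superset of C, so it makes the same guarantee as C does: only '0'..'9' are required to be contiguous. One way to avoid depending on letter contiguity is to look each character up in an explicit alphabet. A minimal sketch follows; the name digit_value_portable and the choice of -1 for errors are assumptions of this example, not taken from the article.

```c
#include <string.h>

/* Hypothetical helper: map one digit character to its value (0..35)
   by its position in an explicit alphabet, so nothing depends on the
   letters being contiguous in the execution character set.
   Returns -1 for anything that is not a digit or a letter. */
int digit_value_portable(char c)
{
    static const char lower[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    static const char upper[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char *p;

    if (c == '\0')
        return -1;              /* strchr would match the terminator */
    if ((p = strchr(lower, c)) != NULL)
        return (int)(p - lower);
    if ((p = strchr(upper, c)) != NULL)
        return (int)(p - upper);
    return -1;
}
```

The lookup is slower than range arithmetic, but it gives the same answers whether the compiler uses ASCII or EBCDIC.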

Steve The Cynic · 2013-11-13 Reply Admin

gnasher729:
In Objective-C, assert and exceptions are there to identify errors made by the programmer which are fixed by changing the code, not unexpected situations at runtime.

I've got news for you...

In C and C++, assert serves that very same purpose.

Calling a dedicated character(s)-to-integer conversion with unverified characters is a bug that can only be fixed by changing code somewhere.

In C++, exceptions should(*) in general be reserved for weird conditions, although the "fixed by changing the code" aspect doesn't apply, because "I ran out of memory" is regarded by the definition of operator new() as sufficiently weird to warrant an exception rather than a NULL return. Our good buddy dynamic_cast<>(), when used on references rather than pointers, has no non-exception method of reporting a type incompatibility, so it throws std::bad_cast.

(*) That's an opinion. C++ exception handling is relatively expensive, and fraught with terminally lethal edge cases. Consider a destructor that can throw (not explicitly disabled by most compilers as far as I know), and what happens if you call it while unwinding the stack because of another exception (i.e. after calling throw and before hitting a catch() block). (Hint: it calls terminate(), and that's the end of that.)

Medinoc · 2013-11-13 Reply Admin

I think The Real WTF is simply calling it atoi. Otherwise, this function is perfectly valid, and very often used when actually implementing atoi(), strtol() etc.

It's a useful component block for building a number-parsing function, that's valid for common bases (2, 8, 10 and 16) in ASCII and EBCDIC alike, though in the latter it only goes up to base 19 rather than base 36.

2013-11-13 Reply Admin

Jo:
Let him who is without sin cast the first stone

   if (! person.sin)
      return((int)stone[0]);

done.

2013-11-13 Reply Admin

That's the zeroth stone.

sztupy · 2013-11-13 Reply Admin

It should only cast the stone, not return it

2013-11-13 Reply Admin

TRWTF is...:
That's the zeroth stone.

But it is also the first. Zero is its nominal number and one is its ordinal number.

2013-11-13 Reply Admin

sztupy:
It should only cast the stone, not return it

A "cast" in C is completely different than a "cast" in natural language. A synonym which is more closely adhering to the original meaning would be "throw".

2013-11-13 Reply Admin

Medinoc:
I think The Real WTF is simply calling it atoi. Otherwise, this function is perfectly valid, and very often used when actually implementing atoi(), strtol() etc.

Agreed. Apart from the name, the only vaguely WTF-y thing about it is that the error case is indistinguishable from the legal case of passing in '0'. It should return -1 or CHAR_MAX or something similar on errors. Then again, it may be from a problem domain where it's desirable to treat out-of-bounds characters as zero.
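A minimal sketch of the -1 option, written as plain C rather than as the article's Objective-C method. The name digit_value_or_error is made up, and the range checks assume an ASCII-compatible character set.

```c
/* Hypothetical variant that returns -1 for unrecognised characters,
   so an error is distinguishable from the valid digit '0'.
   Assumes an ASCII-compatible character set (contiguous letters). */
int digit_value_or_error(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'z')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z')
        return c - 'A' + 10;
    return -1;
}
```

Here '0' yields 0 while '$' yields -1, so the caller can tell the two apart. The return type is int rather than char so that -1 is representable even where char is unsigned.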

Not a WTF. I demand a replacement Daily WTF, or my money back for today.

2013-11-13 Reply Admin

TRWTF is using an ordinary method -atoi: instead of a class method +atoi:, since this never accesses the instance of whatever class this is part of:

@implementation MyClass
+(char) atoi:(char) a { ... }
@end
...
// Can now be called outside the class without an instance
char x = [MyClass atoi:'x'];

2013-11-13 Reply Admin

Medinoc:
I think The Real WTF is simply calling it atoi. Otherwise, this function is perfectly valid, and very often used when actually implementing atoi(), strtol() etc.
It's a useful component block for building a number-parsing function, that's valid for common bases (2, 8, 10 and 16) in ASCII and EBCDIC alike, though in the latter it only goes up to base 19 rather than base 36.

The two hardest problems in programming are naming, cache invalidation, and off by one errors.

2013-11-13 Reply Admin

So, where exactly is the WTF here?

Or is this a site for theoretical perfectionists that have nothing better to do than nit-pick.

Without knowing the problem domain there is nothing to see here.

Roby McAndrew · 2013-11-13 Reply Admin

sztupy:
It should only cast the stone, not return it

In this way stones are unlike boomerangs. If you cast a boomerang, it is automatically returned.

2013-11-13 Reply Admin

A proper Unicode-aware atoi implementation should cover non-arabic number systems as well.

2013-11-13 Reply Admin

What is wrong with the code?

chubertdev · 2013-11-13 Reply Admin

#define atoi TRWTF

2013-11-13 Reply Admin

gnasher729:
Crispy:
Sorry, the Church of Objective-C-logy states clearly that all data types must be NSSomething, like NSInteger, NSString, NSArray... If there's no NSChar, then you must invent it. Raw C data types are abhorrent, and should never crawl in decent, honest Objective C.
Captcha: pecus, using char in Objective-C is a sin

The real Objective-C sin is to pass anything other than a UTF-8 encoded string or NSString* to such a function. The type "char" is basically useless when you use Unicode. For example, the Cyrillic small letter es "с" has a Unicode code point of 0x441, which this code would probably interpret as an "A".

And it seems the comparison (c >= 'A' && c <= 'z') with an uppercase A and lowercase z went unnoticed.

Well yes, of course. It went unnoticed because it isn't there. Along with the unicorns and the president's daughter that also aren't there.

2013-11-13 Reply Admin

skington:
A proper Unicode-aware atoi implementation should cover non-arabic number systems as well.

That depends. There are plenty of standards where text can contain numbers, but only in very specific formats, so let's say digits in one of the many Indian scripts wouldn't be allowed. In other situations, they should be allowed. Which means you can't say "I have a function converting text to a number", you have to say "I have a function converting text to a number according to the following rules...".

And then you might recognise roman numerals as well, so "Series 2, Episode 4" and "Series II, Episode IV" would be accepted as the same.

2013-11-13 Reply Admin

Scourge of Programmers!:
What is wrong with the code?

Run the code on your computer. Tell us what you expect when the argument is '['. Tell us what you get when the argument is '['.

no laughing matter · 2013-11-13 Reply Admin

Chelloveck:
Apart from the name, the only vaguely WTF-y thing about it is that the error case is indistinguishable from the legal case of passing in '0'. It should return -1 or CHAR_MAX or something similar on errors. Then again, it may be from a problem domain where it's desirable to treat out-of-bounds characters as zero.
Not a WTF. I demand a replacement Daily WTF, or my money back for today.

char moneyback = [myinstance atoi:'$'];

2013-11-13 Reply Admin

Steve The Cynic:
And you forgot the fact that the ranges 'A'..'Z' and 'a'..'z' (to use Pascal notation) are not necessarily internally contiguous. At least one(*) important character encoding has two gaps, of 7 and 8 characters, in each range.
Down with EBCDIC!

This was, as the article said, intended for a "Web API". The internet doesn't care what character set your computer uses; it just sends you byte values that you have to interpret. If you received an ASCII 'A' (byte value 0x41 = 65), and your compiler uses EBCDIC, then your compiler's 'A' will not be the same number 65, so the whole code is rubbish.

I have in fact seen code that was prepared to run on a machine with EBCDIC character set, and it had huge lists of defines like #define ASCII_A 65 so that it could process ASCII characters.
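A sketch of what that style might look like when applied to single-digit decoding. Only ASCII_A comes from the comment above; the other constant names and the function ascii_digit_value are invented for this example.

```c
/* Hypothetical wire-format constants: ASCII byte values written as
   numbers, so they stay correct even if the compiler's own character
   set is not ASCII (e.g. EBCDIC). */
#define ASCII_0 48
#define ASCII_9 57
#define ASCII_A 65
#define ASCII_Z 90
#define ASCII_a 97
#define ASCII_z 122

/* Decode one received ASCII byte as a base-36 digit; -1 if invalid. */
int ascii_digit_value(unsigned char byte)
{
    if (byte >= ASCII_0 && byte <= ASCII_9)
        return byte - ASCII_0;
    if (byte >= ASCII_A && byte <= ASCII_Z)
        return byte - ASCII_A + 10;
    if (byte >= ASCII_a && byte <= ASCII_z)
        return byte - ASCII_a + 10;
    return -1;
}
```

Because the comparisons use numeric byte values rather than character literals, the result for a received byte 65 is 10 regardless of what the compiler thinks 'A' is.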

herby · 2013-11-13 Reply Admin

Steve The Cynic:

Down with EBCDIC!

While you may be down with EBCDIC, if you understand that not all character sets are ASCII derived, you will be in much better shape. As for EBCDIC, they did one thing correct: The translation table from Card code to EBCDIC was VERY well defined. Sure, you didn't use all of it in normal instances, but it was defined. Oh, and EBCDIC is a TRUE 8 bit code.

Now why are the numbers collated above the upper case alphabet above the lower case alphabet.....

2013-11-13 Reply Admin

faoileag:
Roby McAndrew:
I recall writing very similar code, long long ago, but mine only went up to 'f' or 'F'. This goes all the way up to eleven and beyond !
This is because it is meant to convert a hexacosimal number to a decimal one.

Indeed... itoa() takes an additional parameter - the numerical base.

It makes perfect sense that the complementary atoi() would accept numbers in other bases.

The real WTF is that somebody thinks this is a WTF.

vt_mruhlin · 2013-11-13 Reply Admin

faoileag:
The hexacosimal system (0..9a..z) is the most compact of the numbering systems - with only 2 digits you can cover up to 1296 numbers!

Yeah, but with 2 digits you'd have to rewrite this function to accept both of them. Better just stick to numbers lower than 36.

2013-11-13 Reply Admin

Mike:
Or is this a site for theoretical perfectionists that have nothing better to do than nit-pick.

Bravo! You have finally uncovered the true purpose of this website.

fennec · 2013-11-13 Reply Admin

Yth post!!!

2013-11-13 Reply Admin

herby:
Steve The Cynic:

Down with EBCDIC!

While you may be down with EBCDIC, if you understand that not all character sets are ASCII derived, you will be in much better shape. As for EBCDIC, they did one thing correct: The translation table from Card code to EBCDIC was VERY well defined. Sure, you didn't use all of it in normal instances, but it was defined. Oh, and EBCDIC is a TRUE 8 bit code.

Now why are the numbers collated above the upper case alphabet above the lower case alphabet.....

No... if you understand that all character sets* are ASCII derived, you will be in much better shape.

*that matter

If non-ASCII character sets matter to you, I'm very sorry you're forced to work on systems that use non-ASCII character sets. I have the luxury of pretending they don't exist, and I'm quite happy about it.

chubertdev · 2013-11-13 Reply Admin

gnasher729:
skington:
A proper Unicode-aware atoi implementation should cover non-arabic number systems as well.

That depends. There are plenty of standards where text can contain numbers, but only in very specific formats, so let's say digits in one of the many Indian scripts wouldn't be allowed. In other situations, they should be allowed. Which means you can't say "I have a function converting text to a number", you have to say "I have a function converting text to a number according to the following rules...".

And then you might recognise roman numerals as well, so "Series 2, Episode 4" and "Series II, Episode IV" would be accepted as the same.

enterprisey_atoi() will have localization!

2013-11-13 Reply Admin

In C++, exceptions are used for - exceptional cases! Doh.

What is an exceptional case? Anything which is out of common. One can communicate reaching the end of a search tree this way if the tree is deep enough that reaching the bottom of it is "uncommon".

In performance-critical code this is enforced by performance requirements (exceptions are expensive and would slow down execution if used in "common" branches), in other code this is just a convention.

2013-11-13 Reply Admin

Most retarded post for a WTF ever. The code is fine! WTF!

2013-11-13 Reply Admin

faoileag:
Roby McAndrew:
I recall writing very similar code, long long ago, but mine only went up to 'f' or 'F'. This goes all the way up to eleven and beyond !
This is because it is meant to convert a hexacosimal number to a decimal one.
The hexacosimal system (0..9a..z) is the most compact of the numbering systems - with only 2 digits you can cover up to 1296 numbers!

Be careful not to mistake hexacosimal numbers with base64 though - there are subtle differences.

Why stop at letters and numbers? There are a lot of other printable characters we can use - and we could differentiate between upper and lower case letters too....

In fact, the entire works of Shakespeare are just representations of some government codes

2013-11-13 Reply Admin

This isn't a WTF it's more like a "what" or "huh".

Calling it atoi() might be a bit confusing to those used to C's atoi() but if anything it's C's atoi() that's named confusingly - why not call that one stoi() since it clearly works on a string, while this one doesn't?

2013-11-13 Reply Admin

Sorry, this is just nonsense. Objective-C can take whatever argument types you want to fling at it.

2013-11-13 Reply Admin

Agreed.

In fact it is not called atoi() - if all the armchair quarterbacks here could actually read, they would notice that it is called -atoi: which is a completely different thing.

2013-11-13 Reply Admin

Of course the proper ObjC method declaration should be:

(char)a:(char)a toi;

Nutster · 2013-11-14 Reply Admin

anonymous:
sztupy:
It should only cast the stone, not return it
A "cast" in C is completely different than a "cast" in natural language. A synonym which is more closely adhering to the original meaning would be "throw".

Did you mean this?

if (! person.has_sin)
   throw (stone[0]);

2013-11-14 Reply Admin

gnasher729:
In Objective-C, assert and exceptions are there to identify errors made by the programmer which are fixed by changing the code, not unexpected situations at runtime.

But this case looks exactly like that. It's either:
- wrong item passed (semantics, encoding, etc.)
- memory corruption
- unescaped user input
All of which seem to be coding errors that need to be fixed by the coder.

Best atoi() Implementation Ever
