Admin
The biggest WTF in the implementation has not been pointed out yet: assembling the result String one character at a time using the + operator. That's one of the most well-known performance No-Nos in Java, since Strings are immutable, which makes this an O(n^2) operation.
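For anyone who hasn't run into this before, here is a rough sketch of the difference. The class and method names are made up for illustration, and the per-character work is just Character.toUpperCase rather than the if-chain from the article, so treat it as a sketch of the concatenation issue only.

```java
// Hypothetical names; this is not the code from the article.
class ConcatDemo {
    // Quadratic overall: every += builds a brand-new String and copies
    // everything accumulated so far into it.
    static String upperWithPlus(String in) {
        String out = "";
        for (int i = 0; i < in.length(); i++) {
            out += Character.toUpperCase(in.charAt(i));
        }
        return out;
    }

    // Linear overall: StringBuilder appends into a growable buffer and only
    // creates the final String once.
    static String upperWithBuilder(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            out.append(Character.toUpperCase(in.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperWithPlus("abc-123"));
        System.out.println(upperWithBuilder("abc-123"));
    }
}
```

Both methods return the same text; only the amount of copying differs, which starts to matter once the input gets long.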
Also, I bet you 1:10 that 95% of all "homebrew" implementations including the original WTF and those posted in the comments do not correctly handle characters like the German ß (which is transformed into TWO characters when uppercased, thus the comments about the result "growing" in the posted Java API implementation) or the Turkish dotted and dotless i (which each have a lowercase and uppercase version).
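A quick way to see both cases with the standard String API. This is only a sketch; the class and helper names are invented for the example, and the non-ASCII characters are written as Unicode escapes so the file encoding does not matter.

```java
import java.util.Locale;

// Hypothetical demo class, not part of any library.
class CaseMappingDemo {
    static final Locale TURKISH = Locale.forLanguageTag("tr");

    // Locale-independent uppercasing.
    static String upperRoot(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    // Uppercasing and lowercasing under Turkish rules.
    static String upperTurkish(String s) {
        return s.toUpperCase(TURKISH);
    }

    static String lowerTurkish(String s) {
        return s.toLowerCase(TURKISH);
    }

    public static void main(String[] args) {
        String strasse = "stra\u00DFe";          // sharp s in the middle
        String upper = upperRoot(strasse);
        // The result is one char longer than the input.
        System.out.println(strasse.length() + " -> " + upper.length());

        // A char-by-char approach cannot do this: the single-char mapping
        // leaves the sharp s unchanged.
        System.out.println(Character.toUpperCase('\u00DF') == '\u00DF');

        // Turkish: i uppercases to dotted capital I (U+0130),
        // and I lowercases to dotless i (U+0131).
        System.out.println(upperTurkish("i").equals("\u0130"));
        System.out.println(lowerTurkish("I").equals("\u0131"));
    }
}
```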
Admin
And then there's collation rules...
Admin
Hmmm, reminds me of the days when we used to check if the letter's ASCII value fell between a certain range, then added (I think) 64 to the value to get the uppercase... or was it the other way round...
Admin
Doing this properly requires you to have the full Unicode specification tables somewhere in memory where you can access them. Java (which this snippet was written in) doesn't need to bother about encoding because Strings are always encoded in UTF-16. All the character conversions are done by following whatever information is available in the Unicode tables (which includes a hell of a lot of transformations: uppercase, lowercase, titlecase (yes, this is different in some languages), normalization, numeric value, directionality, etc.).
So if you really wanted to, you could rewrite the toUpperCase() function with a 7-bit ASCII lookup table and it would be a lot faster than the built-in function. However, you would lose all the benefits of using UTF-16 in the first place...
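Something along these lines, as a minimal sketch. The class name, table, and method are all invented here, and whether it actually beats the built-in on a given JVM is something you would have to measure rather than assume.

```java
// Hypothetical ASCII-only uppercasing via a 128-entry lookup table.
class AsciiUpper {
    // Identity for every 7-bit code point, except 'a'..'z' map to 'A'..'Z'.
    private static final char[] TABLE = new char[128];
    static {
        for (int c = 0; c < 128; c++) {
            TABLE[c] = (char) c;
        }
        for (char c = 'a'; c <= 'z'; c++) {
            TABLE[c] = (char) (c - 32);
        }
    }

    static String toUpperAscii(String in) {
        char[] out = in.toCharArray();
        for (int i = 0; i < out.length; i++) {
            char c = out[i];
            if (c < 128) {
                out[i] = TABLE[c];
            }
            // Anything outside 7-bit ASCII is passed through untouched,
            // which is exactly the Unicode support you give up.
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(toUpperAscii("acct-42b"));
    }
}
```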
Things that seem simple at first sight are actually quite complicated when you enter the international domain... See the Java Character documentation.
Admin
This is utterly retarded. Sure, you can argue that some built-in functions can be a little hard to find or are not named as well as they could be. But if you wanted to uppercase a string, surely you would immediately look for an upper/toUpper/toUpperCase method on the String class. And what do you know, it's right there. Let's hope this developer never needs to do any calculations - there's no way he'd find a cryptically named class like "Math".
If you understand English there is no excuse for moronic re-implementations like this.
Admin
EDIT: geez, I'm an angry young man today! Sorry everyone...
Admin
Hashmap anyone... if he wanted to code a wtf... wouldn't a hashmap make for a more elegant wtf?
Admin
When will people learn how to use else if statements? WTF!!!
Admin
Why are 'a' through 'z' and 'A' through 'Z' all hard-coded?
Why don't they use a data file in XML format to supply the data (as to what lower-case characters there are, and what lower-case character corresponds to what upper-case character)?
Admin
How about calling the unchanging Account.length() function once for every iteration of the loop? Do it once at the beginning instead.
Admin
Damn, I'm slow. Also, pulling the character to be tested out of the array up to ~26 times per loop. Pull it once into a local variable at the top of the loop.
Admin
The only flaw I see is that it does not order the if statements by the relative frequency we expect to see in the language:)
Admin
Actually, that's very badly indented. It would be much easier to read if each "if" was at the same indentation level.
Admin
That's terrible, it doesn't even work right (drops characters that are non-lowercase). Hopefully this is from a data structures course and the student received an F.
Re: the XML lookup table... give it a break, guys. I know XML rears its head in the majority of the WTFs round these parts, but that's particularly asinine.
Admin
I like the second way better, too. The first was relevant in the old VT100 days when you tried to conserve screen height to see more of the functions.
Admin
Actually, I find the first style better to read. But it's just a matter of what you're used to between these two styles.
on the other hand really hurts me.
Admin
It is subtract 32. The ASCII value of uppercase A is 65. The ASCII value of lowercase a is 97.
97-65 is 32.
So to convert lowercase to uppercase, you subtract 32 if it is a lowercase letter.
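In Java that arithmetic comes out roughly like this. It is only a sketch with made-up names, and it deliberately handles nothing beyond the 26 unaccented ASCII letters.

```java
// Hypothetical helper illustrating the "subtract 32" rule for ASCII letters.
class AsciiOffset {
    // 'a' is 97 and 'A' is 65, so the distance between the cases is 32.
    static final int CASE_OFFSET = 'a' - 'A';

    static char toUpperAscii(char c) {
        // Only shift characters in the lowercase range; leave the rest alone.
        return (c >= 'a' && c <= 'z') ? (char) (c - CASE_OFFSET) : c;
    }

    public static void main(String[] args) {
        System.out.println((int) 'A');         // code of uppercase A
        System.out.println((int) 'a');         // code of lowercase a
        System.out.println(CASE_OFFSET);       // difference between them
        System.out.println(toUpperAscii('q'));
    }
}
```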
Admin
In my travels I think I might have bumped into this programmer, except they did a custom IsNumeric checking that each char in the string was not a or not A .....
Admin
Except those cases where the uppercase doesn't translate into one character. Not only the German ß (which becomes SS) but also the single-character ligatures for ff, fl, ffl, fi, ffi, and st, which translate to multiple characters as uppercase.
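The ligature case can be checked against the built-in String.toUpperCase as well. A small sketch with an invented class name; the ligatures are written as Unicode escapes from the Alphabetic Presentation Forms block (U+FB01 is fi, U+FB04 is ffl, U+FB06 is st).

```java
import java.util.Locale;

// Hypothetical demo of one-to-many uppercase mappings for ligatures.
class LigatureDemo {
    static String upper(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String[] ligatures = { "\uFB01", "\uFB04", "\uFB06" };
        for (String lig : ligatures) {
            String up = upper(lig);
            // Each input is a single char; the uppercase form is longer.
            System.out.println(lig.length() + " char -> " + up + " (" + up.length() + " chars)");
        }
    }
}
```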
Admin
You're not going to tell me people actually do that? Indent the bracket itself?
Admin
Because it's not in the code snippet. One thing I have learned in 5 years as a salaried programmer is that assumptions / preconditions are rarely true and must be explicitly verified.
Admin
I actually prefer the style of
someFunction()
    {
    // some code
    }
since it keeps the braces on the same level as the code that they're holding.
However if it's a short one-liner in the braces, I'll often use
if (condition) {someFunction();}
And yes I know that the semi-colon and the braces are redundant in 'C' style languages. I still use them.
Admin
Core? probably not.
Standard Libraries - most of them.
The .Net ones are frickin' HUGE (and there are often many even there that do the same thing - e.g. the plethora of ways to convert an object/value of one data type to another).
Admin
Addendum (2008-12-01 12:24): Oh, it took me a few minutes to find it again, but http://thedailywtf.com/Comments/Argument_About_Argument_Validation.aspx?pg=2#47352 started a nice flame war about indent styles, read and enjoy.
Admin
TRWTF is calling these 'java' and '.net' brace styles when they're a hell of a lot older than either. The first is usually called "K'n'R style", the second is apparently known as the "Allman" or "BSD" style (that's news to me too, I had to look it up).
Oh, and the hideous style mentioned by Ilya in a follow-up is apparently called "Whitesmiths" style, although I also feel it should be called "urrghyuck" or some similarly guttural noise.
Admin
Actually, if you want to handle different charset encodings and everything, that's all the more reason to say that the only sane approach is to use the standard system libraries. Which handle all that stuff for you automatically.
Admin
You may like the first style now, but wait until you inherit some code base that does this:
It's hard to illustrate here, but basically that condition stretched off the page so the first { was NOT visible. Just a big blob of text is what it ended up looking like.
I can see needing a lot of conditionals at times but for god's sake, try and make it readable!!
Admin
is bad enough, but I've yet to see a condition that required worse formatting.
Admin
Here in the real world account numbers don't tend to have language specific latin characters in them.
If your client wants to include ë or ß or Σ in an account number, they are [blank or blanks].
Advise them against it.
If they insist, insist on advising them against it (maybe referring them to this website).
If they still insist, do it, take the money, run (leaving appropriate apology in the code for the poor soul who has to replace you).
Admin
Yeah, I know, but in this case, putting the brace on the next line would have done wonders for readability all by itself, never mind the fact that all the conditions should have been separated out better.
I've managed to show the coder who wrote that originally the light as far as formatting goes though :)
Admin
Flame war starting in 3 ... 2 ... 1 ...
Admin
Those requirements again:
Mine looks like this(*):
Let the system libraries take care of that ugly business with locales and with translating charset encodings to/from UTF-/UCS-whatever on input and output - it's what they're there for. Anything else is a WTF composed of 50% NIH syndrome and 50% reinvented wheels.
(*) - modulo any minor syntax errors; I don't actually speak Java, but the intent should be clear enough.
Admin
Worst. Code formatting. Ever.
Admin
let char_to_upper = function
  | 'a' -> 'A'
  | 'b' -> 'B'
  | 'c' -> 'C'
  ...
  | 'y' -> 'Y'
  | 'z' -> 'Z'
  | other -> other

let string_to_upper s = String.map char_to_upper s
Good: Very simple and readable, very fast, character-set agnostic. Bad: Wordy.