The Daily WTF: Curious Perversions in Information Technology

2008-12-02 Reply Admin

captain obvious:
What percentage of code WTF's are re-coding functions that already exist in the language's core?

I'm bored with re-coding pseudo-WTF's. How many variations on that theme will we see?

I want to see some real WTF's. Code that is so geniously insane or so utterly b0rken that it blows my mind. I want to see heisenbugs and mandelbugs.

dkf · 2008-12-02 Reply Admin

Herby:
Shouldn't the tests be in order of common usage e, t, i, o, n, i, s... (watch Wheel of Fortune).

I see you like the letter ‘i’ very much, but ‘a’ must have done something to offend you.

vr602 · 2008-12-02 Reply Admin

Utterly tedious, both the original post and the comments on it. Do we really believe that anyone would write their own toUpperCase function? No, not really. Come on Alex, scrape the barrel a bit harder please.

2008-12-02 Reply Admin

trwtf is lower and upper case characters

JimM · 2008-12-02 Reply Admin

DaveK:
MoffDub:
The real fail is using a variable name with an upper case first letter. That naming style is discouraged according to ParaSoft!
Then shouldn't it be
That naming style is discouraged according to paraSoft
... ?

No. ParaSoft is clearly a Class name, and therefore starts with an Upper Case letter (like String, Math, MyClass etc.). Were you to instantiate an object of type ParaSoft, you may choose to call it paraSoft, although some would consider that a foolish decision... ;^)

2008-12-02 Reply Admin

rt:
Let's start in a different way: how would YOU write an uppercase transformation function? Come up with some approach.
Now...

Make sure it works with non-English characters (e.g. accented letters).

Make sure it works with non-ascii (ANSI) string encodings (i.e. it works with UTF-xx, UCS-xx, etc).

Optimize.

What does your function look like?

Easy:

    private static String toUpper( String text )
    {
        StringBuilder result = new StringBuilder();
        for( int index = 0; index < text.length(); index++ )
        {
            int codePoint1 = text.codePointAt( index );
            if( Character.isLowerCase( codePoint1 ) )
            {
                for( int codePoint2 = 0; codePoint2 < 0xffff; codePoint2++ )
                {
                    if( Character.toLowerCase( codePoint2 ) == codePoint1 && !Character.isLowerCase( codePoint2 ) )
                    {
                        result.appendCodePoint( codePoint2 );
                        break;
                    }
                }
            }
            else
                result.appendCodePoint( codePoint1 );
        }
        return result.toString();
    }

It exits from the loop as soon as it finds a match, so it's pretty much optimised. Only if you have a weird character set it might take a bit longer, but you can easily solve that by throwing in some extra hardware. This algorithm would be well suited for parallelisation, so imagine the sort of performance you could achieve on a Blue Gene/L.
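For contrast, here is a minimal sketch (not the poster's code) of the same per-code-point walk without the inner search loop. It assumes the JDK's simple 1:1 mapping in `Character.toUpperCase(int)` is good enough; the class name `CodePointUpper` is made up for illustration, and one-to-many mappings such as ß to SS are not handled.

```java
class CodePointUpper {
    // Walks the string one code point at a time and asks the library for the
    // simple (1:1) uppercase mapping instead of searching for it.
    static String toUpper(String text) {
        StringBuilder result = new StringBuilder(text.length());
        for (int index = 0; index < text.length(); ) {
            int codePoint = text.codePointAt(index);
            result.appendCodePoint(Character.toUpperCase(codePoint));
            // Supplementary code points occupy two chars, so step by charCount.
            index += Character.charCount(codePoint);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // "h\u00e9llo w\u00f6rld" is "héllo wörld" written with escapes.
        System.out.println(toUpper("h\u00e9llo w\u00f6rld"));
    }
}
```

Stepping by `Character.charCount` matters for characters outside the Basic Multilingual Plane, which a plain `index++` would visit twice.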

Code Dependent · 2008-12-02 Reply Admin

Scarlet Manuka:
Right, because uppercasing a string that contains numbers and punctuation marks really SHOULD fill it with random control characters instead. And [\]^ is the uppercase version of {|}~ too. Everybody knows that.

Note to self: bad memory segment. Always double-check assertions before relating experiences from early 80s.

Well, that's how it was in my memory, anyway...

2008-12-02 Reply Admin

Yea, another brilliant solution. Why is it that everyone thinks they have the magical solution to this issue that is better than any other possible solution? Why doesn't everyone realize the obvious that anyone in IT should readily understand... If you give 50 programmers a task, they're going to implement it in 50 different ways and they'll each think theirs is the best way.

BTW, the URL in your post returns a "Network Error (tcp_error)" error this morning. There's nothing like relying on an external resource in the constructor of an object that may or may not be responding with the expected data. Frankly, I would love hung threads in an environment with a limited thread pool or the fact that this implementation just flat out wouldn't work if the data input can't be successfully retrieved.

2008-12-02 Reply Admin

The real WTF obviously is all the people on this board assuming an ASCII character set and even opposing UTF-8 as being "bullshit"

2008-12-02 Reply Admin

Init... cond... inc... looks like a job for the for loop!

2008-12-02 Reply Admin

All your solutions are inefficient according to my efficiency rules.

I am paid by the line, and the original solution outperforms your solutions 100:1 - or $100 to $1, to be more specific

2008-12-02 Reply Admin

And from the depths of COBOL...

    000000 05 WS-LOWER PIC X(47) VALUE
    000000 "abcdefghijklmnopqrstuvwxyzáàâäçéèêëíìîïóòôöúùûü".
    000000 05 WS-UPPER PIC X(47) VALUE
    000000 "ABCDEFGHIJKLMNOPQRSTUVWXYZÁÀÂÄÇÉÈÊËÍÌÎÏÓÒÔÖÚÙÛÜ".

    000000 INSPECT STRING-NAME CONVERTING WS-LOWER
    000000 TO WS-UPPER

2008-12-02 Reply Admin

dhasenan:
The obvious way to write it is:

    foreach (c; input) {
        if (contains (lower, c))
            result ~= upper[find(lower, c)];
        else
            result ~= c;
    }

I do believe that is the first D code I've seen here. Hopefully, I won't see any as the article content!

2008-12-02 Reply Admin

rt:
Guybrush Threepwood:
2) Calling substr() up to 26 times per loop iteration
I am scared to think HOW you came up with this "26"...
Guybrush Threepwood:
3) Usage of else { if (..) {} } syntax
What's wrong with that, in this specific case?
Guybrush Threepwood:
8) The fact that somebody got paid for writing this mess
Hmmmm...
Guybrush Threepwood:
Did I forget something?
Quite a lot, it seems.

Kudos on the awesome nick tho.

2008-12-02 Reply Admin

<TopCoderTraineeMode>

In my company we really follow the true UNIX way: if there's a small, efficient tool we use it! Why reinvent the wheel if there's a tool doing already what's necessary?

We'd use some method to use the system (dependent on programming language, of course - .Net is preferred, Java is too old and not from a great company like MicroSoft (yes, we still use the old spelling)) to do sth. like

    echo "<insert string to convert here, preferred by string concatenation with the rest of the command line>" | tr '[a-z]' '[A-Z]'

Use the resulting output as uppercased string.

You say: "That's not enough for non-english languages"? Well, that's their fault, not mine. I use the superior english language!

</TopCoderTraineeMode>

Ilya Ehrenburg · 2008-12-02 Reply Admin

Engywuck:
<TopCoderTraineeMode></TopCoderTraineeMode>

It's an okay start, but you still have a long way to go.

2008-12-02 Reply Admin

Anonymous:
If you give 50 programmers a task, they're going to implement it in 50 different ways and they'll each think theirs is the best way.

More like: if you give 50 programmers a task, they're going to implement it in 75 different ways and they'll each think theirs is the best way.

2008-12-02 Reply Admin

I can see the original programmer thinking: "DAMN! I should have used case..."

(or switch, or whatever it is called in Java)

2008-12-02 Reply Admin

Mr.'; Drop Database --:
More like: if you give 50 programmers a task, they're going to implement it in 75 different ways and they'll each think theirs is the best way.

But 15 of those will look like:

public class JoeBean {
    public text joe = "brillant";
    private string getJoe() {
        /* return joe; */
        /* FIXED [email protected], 03/04 */
        return "brlliant";
    }
}

2008-12-02 Reply Admin

Ilya Ehrenburg:
Engywuck:
<TopCoderTraineeMode></TopCoderTraineeMode>
It's an okay start, but you still have a long way to go.

Well, I think I'm going The Long Way toUpper...

2008-12-02 Reply Admin

Too much circle-jerkery here. The topic is "The Long Way toUpper". There is a Short Way toUpper. Use It. All these digressions about "An Alternate Way toUpper, albeit more clever, but certainly No Less Bogus" are exceedingly tedious.

And the chap who wished cancer upon the family of his opponents in this pointless pissing contest seriously needs a vacation. Read: Professional Counseling.

Alexis de Torquemada · 2008-12-03 Reply Admin

Pedant:

Haskell:

import Data.Char
upperCaseIt = map toUpper

That's a good start, but it's certainly not general (and therefore reusable) enough. I suggest:

module ToUpper where

import Prelude hiding (mapM, foldr)
import Control.Monad (MonadPlus(..))
import Data.Foldable (foldr)
import Data.Maybe (fromJust)
import Data.Traversable (Traversable(..))

msum :: (Traversable t, MonadPlus m) => t (m a) -> m a
msum = foldr mplus mzero

trBySum :: (Traversable t, MonadPlus m) => (a -> b -> Bool) -> t (b,c) -> a -> m c
trBySum eq mapping a =
  msum (traverse (\(b,c) a -> if eq a b then return c else mzero) mapping a)

trBy :: (Traversable t) => (a -> b -> Bool) -> t (b,a) -> a -> a
trBy eq mapping a = fromJust (trBySum eq mapping a `mplus` return a)

tr :: (Eq a, Traversable t) => t (a,a) -> a -> a
tr = trBy (==)

upperCaseChar :: Char -> Char
upperCaseChar = tr $ zip ['a'..'z'] ['A'..'Z']

upperCaseIt :: Functor f => f Char -> f Char
upperCaseIt = fmap upperCaseChar

Now we have immediate holistic-synergistic value-added network effects that are so important for leveraging empowering new paradigms in the successful enterprise. For instance, we can now rewrite that goofy modulus math as:

nice_modulo :: Integral i => i -> i -> i
nice_modulo = flip $ tr . zip (negInterleave [0..]) . cycle . negInterleave . enumFromTo 0 . pred
  where negInterleave = (>>= \x -> [x,-x])

Ilya Ehrenburg · 2008-12-04 Reply Admin

Alexis de Torquemada:

A worthy entry for the Blackholeth IOHCC. Chapeau!

2008-12-04 Reply Admin

rt:
That's... interesting. It really is though I think I find it interesting in a different way than you do.
Let's start in a different way: how would YOU write an uppercase transformation function? Come up with some approach.

Now...

Make sure it works with non-English characters (e.g. accented letters).

Make sure it works with non-ascii (ANSI) string encodings (i.e. it works with UTF-xx, UCS-xx, etc).

Optimize.

What does your function look like?

Table lookup.
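A rough sketch of what a table lookup could look like in Java, as one possible reading of that answer. The class name `TableUpper` is hypothetical, and the table is simply filled from the JDK's own `Character.toUpperCase(char)`, so it covers only 1:1 mappings within the Basic Multilingual Plane. Locale rules and one-to-many mappings are out of scope for this sketch.

```java
class TableUpper {
    // One entry per UTF-16 code unit; built once, then every lookup is O(1).
    private static final char[] TABLE = new char[0x10000];

    static {
        for (int c = 0; c < TABLE.length; c++) {
            TABLE[c] = Character.toUpperCase((char) c);
        }
    }

    static String toUpper(String text) {
        char[] chars = text.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            // Surrogate code units map to themselves, so supplementary
            // characters pass through unchanged.
            chars[i] = TABLE[chars[i]];
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(toUpper("table lookup")); // prints "TABLE LOOKUP"
    }
}
```

The trade-off is 128 KB of table for a branch-free inner loop; characters such as ß, whose uppercase form is two characters, come back unchanged.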

2008-12-04 Reply Admin

Once you have decent strings, a string is plain text and a character set (or "encoding") is only a mapping between byte arrays and plain text.

Oh god, please tell me that you are trolling. Failing that, please tell me that whatever code base you are working on doesn't do anything important.

2008-12-05 Reply Admin

Java is a Unicode beast and uppercasing Unicode characters is a non-trivial exercise. There is no need to reverse-engineer it either, the source code has always been available.

Your comment should be tomorrow's WTF.

2008-12-05 Reply Admin

asifyoucare :
Either the trolls are outnumbering the genuine posters or this site is read mainly by non-programmers. The code is obviously WTF, yet we have many posters defending it and geniuses offering their own solutions to this very tricky problem.

Yeah I agree with you. It's a waste of time even thinking about how to write code for this. The source code for doing this in the JDK is very complex, and impossible to understand unless you spend months studying Unicode. There are hundreds of special cases, and the classes are full of large data arrays, etc, etc.

The original WTF truly is a WTF and is on the same level as a programmer that types a for loop as Ctrl-C, Ctrl-V, Ctrl-V, Ctrl-V, Ctrl-V...etc. Almost total lack of knowledge on the language they are using.

cmccormick · 2008-12-06 Reply Admin

The Java toUpperCase() method is truly 'the long way toUpper', especially since they call Character.toUpperCaseEx. I'm too lazy to decipher why they can't just iterate a char array and use that method, which could hold the various internationalization rules. An interesting WTF: 'the resulting String may be a different length than the original'

/**
     * Converts all of the characters in this String to upper
     * case using the rules of the given Locale. Case mapping is based
     * on the Unicode Standard version specified by the {@link java.lang.Character Character}
     * class. Since case mappings are not always 1:1 char mappings, the resulting
     * String may be a different length than the original String.
     * 
     * Examples of locale-sensitive and 1:M case mappings are in the following table.
     * 
SNIP internationalization stuff
     
     * @param locale use the case transformation rules for this locale
     * @return the String, converted to uppercase.
     * @see     java.lang.String#toUpperCase()
     * @see     java.lang.String#toLowerCase()
     * @see     java.lang.String#toLowerCase(Locale)
     * @since   1.1
     */
    public String toUpperCase(Locale locale) {
	if (locale == null) {
	    throw new NullPointerException();
        }

        int     firstLower;

	/* Now check if there are any characters that need to be changed. */
	scan: {
	    for (firstLower = 0 ; firstLower < count; ) {
		int c = (int)value[offset+firstLower];
		int srcCount;
		if ((c >= Character.MIN_HIGH_SURROGATE) &&
		    (c <= Character.MAX_HIGH_SURROGATE)) {
		    c = codePointAt(firstLower);
		    srcCount = Character.charCount(c);
		} else {
		    srcCount = 1;
		}
		int upperCaseChar = Character.toUpperCaseEx(c);
		if ((upperCaseChar == Character.ERROR) ||
		    (c != upperCaseChar)) {
		    break scan;
		}
		firstLower += srcCount;
	    }
	    return this;
	}

        char[]  result       = new char[count]; /* may grow */
	int     resultOffset = 0;  /* result may grow, so i+resultOffset
				    * is the write location in result */

	/* Just copy the first few upperCase characters. */
	System.arraycopy(value, offset, result, 0, firstLower);

	String lang = locale.getLanguage();
	boolean localeDependent =
            (lang == "tr" || lang == "az" || lang == "lt");
        char[] upperCharArray;
        int upperChar;
        int srcChar;
        int srcCount;
        for (int i = firstLower; i < count; i += srcCount) {
	    srcChar = (int)value[offset+i];
	    if ((char)srcChar >= Character.MIN_HIGH_SURROGATE &&
	        (char)srcChar <= Character.MAX_HIGH_SURROGATE) {
		srcChar = codePointAt(i);
		srcCount = Character.charCount(srcChar);
	    } else {
	        srcCount = 1;
	    }
            if (localeDependent) {
                upperChar = ConditionalSpecialCasing.toUpperCaseEx(this, i, locale);
            } else {
                upperChar = Character.toUpperCaseEx(srcChar);
            }
            if ((upperChar == Character.ERROR) ||
                (upperChar >= Character.MIN_SUPPLEMENTARY_CODE_POINT)) {
                if (upperChar == Character.ERROR) {
                    if (localeDependent) {
                        upperCharArray =
                            ConditionalSpecialCasing.toUpperCaseCharArray(this, i, locale);
                    } else {
                        upperCharArray = Character.toUpperCaseCharArray(srcChar);
                    }
                } else if (srcCount == 2) {
		    resultOffset += Character.toChars(upperChar, result, i + resultOffset) - srcCount;
		    continue;
                } else {
                    upperCharArray = Character.toChars(upperChar);
		}

                /* Grow result if needed */
                int mapLen = upperCharArray.length;
		if (mapLen > srcCount) {
                    char[] result2 = new char[result.length + mapLen - srcCount];
                    System.arraycopy(result, 0, result2, 0,
                        i + resultOffset);
                    result = result2;
		}
                for (int x=0; x<mapLen; ++x) {
                    result[i+resultOffset+x] = upperCharArray[x];
                }
                resultOffset += (mapLen - srcCount);
            } else {
                result[i+resultOffset] = (char)upperChar;
            }
        }
        return new String(0, count+resultOffset, result);
    }

Ilya Ehrenburg · 2008-12-07 Reply Admin

cmccormick:
The Java toUpperCase() method is truly 'the long way toUpper', especially since they call Character.toUpperCaseEx. I'm too lazy to decipher why they can't just iterate a char array and use that method, which could hold the various internationalization rules. An interesting WTF: 'the resulting String may be a different length than the original'

Perhaps because of the reason quoted below?

     * Since case mappings are not always 1:1 char mappings, the resulting
     * String may be a different length than the original String.
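A small sketch of the behaviour that Javadoc sentence describes, using only `String.toUpperCase(Locale)` from the standard library. The class and method names are made up for illustration; the strings are written with Unicode escapes so the source stays ASCII.

```java
import java.util.Locale;

class CaseMappingDemo {
    // Locale-neutral uppercasing.
    static String upperRoot(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    // Uppercasing under Turkish rules, one of the locale-dependent
    // languages ("tr", "az", "lt") checked in the quoted source.
    static String upperTurkish(String s) {
        return s.toUpperCase(Locale.forLanguageTag("tr"));
    }

    public static void main(String[] args) {
        String street = "stra\u00dfe"; // "straße", 6 chars
        String upper = upperRoot(street); // "STRASSE", 7 chars
        System.out.println(street.length() + " -> " + upper.length()); // prints "6 -> 7"

        // Under Turkish rules a plain "i" maps to dotted capital I (U+0130).
        System.out.println(upperTurkish("i").equals("\u0130")); // prints "true"
    }
}
```

Because ß expands to two characters, a loop that writes one output char per input char cannot produce the right result, which is why the library method grows its result array.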

2008-12-22 Reply Admin

regardless if anyone cares about C or not, if I was to do such a function for my own use, it would look something like this, heck most C newbies would make something like this for lower to upper case conversion :P

yay for pointers :D

    const char *upcase(char *var)
    {
        if(!var[0]) return "ERROR";
        printf("given string = %s\n", var);

        char *result = malloc(sizeof(var));
        int delta = 'A' - 'a', i = 0;

        for(; i < strlen(var); i++)
        {
            if(var[i] >= 'a' && var[i] <= 'z')
                result[i] = var[i] + delta;
            else
                result[i] = var[i]; //either not an alphabetic character or it's already upper cased
        }
        return result;
    }

Ilya Ehrenburg · 2008-12-23 Reply Admin

Hirato:
regardless if anyone cares about C or not, if I was to do such a function for my own use, it would look something like this, heck most C newbies would make something like this for lower to upper case conversion :P
yay for pointers :D

    const char *upcase(char *var)
    {
        if(!var[0]) return "ERROR";
        printf("given string = %s\n", var);

        char *result = malloc(sizeof(var));
        int delta = 'A' - 'a', i = 0;

        for(; i < strlen(var); i++)
        {
            if(var[i] >= 'a' && var[i] <= 'z')
                result[i] = var[i] + delta;
            else
                result[i] = var[i]; //either not an alphabetic character or it's already upper cased
        }
        return result;
    }

Wow! I bet it wasn't easy to make it that wrong. Hopefully, if you were to do such a function for anybody else's use, you'd write something a little more useable. The uppercase version of "" is "ERROR"? Four bytes (eight on a 64-bit machine) should be enough for any string? Terminating C-strings with a '\0' is for sissies, real memory is zeroed anyway? Precalculate the offset for speed, then we needn't bother about strlen()? Unicode? Nobody would ever use it, not even stuff like Ümlåütê, would they?

2009-01-18 Reply Admin

I had this as an assignment for an intro level comp sci class. The answer was supposed to look like what I replied to. However, I was asked by several people to check theirs for correctness, and sadly there was more than one guy whose result looked closer to what was posted in the article.

2009-03-14 Reply Admin

Easy way to upper: loop through every character, and if its ASCII value is in the lowercase range (97 to 122), subtract 32 from it.
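A minimal Java sketch of this ASCII-only approach, with a hypothetical class name `AsciiUpper`. In ASCII the lowercase letters are 97 to 122 and the uppercase letters are 65 to 90, so the sketch subtracts 32 from lowercase letters and leaves everything else alone, including accented letters.

```java
class AsciiUpper {
    static String toUpper(String input) {
        char[] chars = input.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            // 'a'..'z' is 97..122; the matching capital is 32 lower.
            if (chars[i] >= 'a' && chars[i] <= 'z') {
                chars[i] -= 32;
            }
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(toUpper("Hello, world 42!")); // prints "HELLO, WORLD 42!"
    }
}
```

The range check is what keeps digits and punctuation such as `{|}~` from being shifted into other characters.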

2009-04-17 Reply Admin

The obvious solution... in C because I don't know Java...

char * to_upper(char * input)
{
    // dep: requires stdio.h
    char * buf = (char *)malloc(strlen(input)+1); // in practice I'd use a dynamically allocating function to avoid overruns but you get the idea
    printf("Please type in the uppercase equivalent of %s: ", input);
    scanf("%s", buf);
    return(buf);
}

Simple! And no locale issues!

2017-04-13 Reply Admin

TRWTF is using a tree of ifs when a switch statement would at least be slightly better

The Long Way toUpper
