• IHasYerCheezburger (unregistered) in reply to captain obvious
    captain obvious:
    What percentage of code WTF's are re-coding functions that already exist in the language's core?
    I'm bored with re-coding pseudo-WTF's. How many variations on that theme will we see?

    I want to see some real WTF's. Code that is so ingeniously insane or so utterly b0rken that it blows my mind. I want to see heisenbugs and mandelbugs.

  • (cs) in reply to Herby
    Herby:
    Shouldn't the tests be in order of common usage e, t, i, o, n, i, s... (watch Wheel of Fortune).
    I see you like the letter ‘i’ very much, but ‘a’ must have done something to offend you.
  • (cs)

    Utterly tedious, both the original post and the comments on it. Do we really believe that anyone would write their own toUpperCase function? No, not really. Come on Alex, scrape the barrel a bit harder please.

  • tp (unregistered)

    trwtf is lower and upper case characters

  • (cs) in reply to DaveK
    DaveK:
    MoffDub:
    The real fail is using a variable name with an upper case first letter. That naming style is discouraged according to ParaSoft!
    Then shouldn't it be
    That naming style is discouraged according to paraSoft
    ... ?
    No. ParaSoft is clearly a Class name, and therefore starts with an Upper Case letter (like String, Math, MyClass etc.). Were you to instantiate an object of type ParaSoft, you may choose to call it paraSoft, although some would consider that a foolish decision... ;^)
  • Peter (unregistered) in reply to rt
    rt:
    Let's start in a different way: how would YOU write an uppercase transformation function? Come up with some approach.

    Now...

    Make sure it works with non-English characters (e.g. accented letters).

    Make sure it works with non-ascii (ANSI) string encodings (i.e. it works with UTF-xx, UCS-xx, etc).

    Optimize.

    What does your function look like?

    Easy:

        private static String toUpper( String text )
        {
            StringBuilder result = new StringBuilder();
            for( int index = 0; index < text.length(); index++ )
            {
                int codePoint1 = text.codePointAt( index );
                if( Character.isLowerCase( codePoint1 ) )
                {
                    for( int codePoint2 = 0; codePoint2 < 0xffff; codePoint2++ )
                    {
                        if( Character.toLowerCase( codePoint2 ) == codePoint1 && !Character.isLowerCase( codePoint2 ) )
                        {
                            result.appendCodePoint( codePoint2 );
                            break;
                        }
                    }
                }
                else
                    result.appendCodePoint( codePoint1 );
            }
            return result.toString();
        }

    It exits from the loop as soon as it finds a match, so it's pretty much optimised. Only if you have a weird character set it might take a bit longer, but you can easily solve that by throwing in some extra hardware. This algorithm would be well suited for parallelisation, so imagine the sort of performance you could achieve on a Blue Gene/L.

  • (cs) in reply to Scarlet Manuka
    Scarlet Manuka:
    Right, because uppercasing a string that contains numbers and punctuation marks really SHOULD fill it with random control characters instead. And [\]^ is the uppercase version of {|}~ too. Everybody knows that.
    Note to self: bad memory segment. Always double-check assertions before relating experiences from early 80s.

    Well, that's how it was in my memory, anyway...

  • Anonymous (unregistered) in reply to mihi

    Yea, another brilliant solution. Why is it that everyone thinks they have the magical solution to this issue that is better than any other possible solution? Why doesn't everyone realize the obvious that anyone in IT should readily understand... If you give 50 programmers a task, they're going to implement it in 50 different ways and they'll each think theirs is the best way.

    BTW, the URL in your post returns a "Network Error (tcp_error)" error this morning. There's nothing like relying on an external resource in the constructor of an object that may or may not be responding with the expected data. Frankly, I would love hung threads in an environment with a limited thread pool or the fact that this implementation just flat out wouldn't work if the data input can't be successfully retrieved.

  • Anonymous coward (unregistered)

    The real WTF obviously is all the people on this board assuming an ASCII character set and even opposing UTF-8 as being "bullshit"

  • Twey (unregistered)

    Init... cond... inc... looks like a job for the for loop!

  • Uhh (unregistered)

    All your solutions are inefficient according to my efficiency rules.

    I am paid by the line, and the original solution outperforms your solutions 100:1 - or $100 to $1, to be more specific

  • Anonymous Coward (unregistered)

    And from the depths of COBOL...

        000000 05 WS-LOWER PIC X(47) VALUE
        000000    "abcdefghijklmnopqrstuvwxyzáàâäçéèêëíìîïóòôöúùûü".
        000000 05 WS-UPPER PIC X(47) VALUE
        000000    "ABCDEFGHIJKLMNOPQRSTUVWXYZÁÀÂÄÇÉÈÊËÍÌÎÏÓÒÔÖÚÙÛÜ".

        000000 INSPECT STRING-NAME CONVERTING WS-LOWER
        000000                     TO WS-UPPER

  • d-man (unregistered) in reply to dhasenan
    dhasenan:
    The obvious way to write it is:

        foreach (c; input)
        {
            if (contains (lower, c))
                result ~= upper[find(lower, c)];
            else
                result ~= c;
        }

    I do believe that is the first D code I've seen here. Hopefully, I won't see any as the article content!

  • Chuck (unregistered) in reply to rt
    rt:
    Guybrush Threepwood:
    2) Calling substr() up to 26 times per loop iteration
    I am scared to think HOW you came up with this "26"...
    Guybrush Threepwood:
    3) Usage of else { if (..) {} } syntax
    What's wrong with that, in this specific case?
    Guybrush Threepwood:
    8) The fact that somebody got paid for writing this mess
    Hmmmm...
    Guybrush Threepwood:
    Did I forget something?
    Quite a lot, it seems.

    Kudos on the awesome nick tho.

  • Engywuck (unregistered)
    <TopCoderTraineeMode>

    In my company we really follow the true UNIX way: if there's a small, efficient tool we use it! Why reinvent the wheel if there's a tool doing already what's necessary?

    We'd use some method to use the system (dependent on programming language, of course - .Net is preferred, Java is too old and not from a great company like MicroSoft (yes, we still use the old spelling)) to do sth. like

        echo "<insert string to convert here, preferred by string concatenation with the rest of the command line>" | tr '[a-z]' '[A-Z]'

    Use the resulting output as uppercased string.

    You say: "That's not enough for non-english languages"? Well, that's their fault, not mine. I use the superior english language!

    </TopCoderTraineeMode>
  • (cs) in reply to Engywuck
    Engywuck:
    <TopCoderTraineeMode></TopCoderTraineeMode>
    It's an okay start, but you still have a long way to go.
  • Mr.'; Drop Database -- (unregistered) in reply to Anonymous
    Anonymous:
    If you give 50 programmers a task, they're going to implement it in 50 different ways and they'll each think theirs is the best way.
    More like: if you give 50 programmers a task, they're going to implement it in 75 different ways and they'll each think theirs is the best way.
  • SH Code (unregistered)

    I can see the original programmer thinking: "DAMN! I should have used case..."

    (or switch, or whatever it is called in Java)

  • Iain Collins (unregistered) in reply to Mr.'; Drop Database --
    Mr.'; Drop Database --:
    More like: if you give 50 programmers a task, they're going to implement it in 75 different ways and they'll each think theirs is the best way.
    But 15 of those will look like:
    public class JoeBean {
        public text joe = "brillant";
        private string getJoe() {
            /* return joe; */
            /* FIXED [email protected], 03/04 */
            return "brlliant";
        }
    }
    
  • Engywuck (unregistered) in reply to Ilya Ehrenburg
    Ilya Ehrenburg:
    Engywuck:
    <TopCoderTraineeMode></TopCoderTraineeMode>
    It's an okay start, but you still have a long way to go.

    Well, I think I'm going The Long Way toUpper...

  • Le Trôle (unregistered)

    Too much circle-jerkery here. The topic is "The Long Way toUpper". There is a Short Way toUpper. Use It. All these digressions about "An Alternate Way toUpper, albeit more clever, but certainly No Less Bogus" are exceedingly tedious.

    And the chap who wished cancer upon the family of his opponents in this pointless pissing contest seriously needs a vacation. Read: Professional Counseling.

  • (cs) in reply to Pedant
    Pedant:
    Haskell:
    import Data.Char
    
    upperCaseIt = map toUpper

    That's a good start, but it's certainly not general (and therefore reusable) enough. I suggest:

    module ToUpper where
    
    import Prelude hiding (mapM, foldr)
    import Control.Monad (MonadPlus(..))
    import Data.Foldable (foldr)
    import Data.Maybe (fromJust)
    import Data.Traversable (Traversable(..))
    
    msum :: (Traversable t, MonadPlus m) => t (m a) -> m a
    msum = foldr mplus mzero
    
    trBySum :: (Traversable t, MonadPlus m) => (a -> b -> Bool) -> t (b,c) -> a -> m c
    trBySum eq mapping a =
      msum (traverse (\(b,c) a -> if eq a b then return c else mzero) mapping a)
    
    trBy :: (Traversable t) => (a -> b -> Bool) -> t (b,a) -> a -> a
    trBy eq mapping a = fromJust (trBySum eq mapping a `mplus` return a)
    
    tr :: (Eq a, Traversable t) => t (a,a) -> a -> a
    tr = trBy (==)
    
    upperCaseChar :: Char -> Char
    upperCaseChar = tr $ zip ['a'..'z'] ['A'..'Z']
    
    upperCaseIt :: Functor f => f Char -> f Char
    upperCaseIt = fmap upperCaseChar
    

    Now we have immediate holistic-synergistic value-added network effects that are so important for leveraging empowering new paradigms in the successful enterprise. For instance, we can now rewrite that goofy modulus math as:

    nice_modulo :: Integral i => i -> i -> i
    nice_modulo = flip $ tr . zip (negInterleave [0..]) . cycle . negInterleave . enumFromTo 0 . pred
      where negInterleave = (>>= \x -> [x,-x])
    
  • (cs) in reply to Alexis de Torquemada
    Alexis de Torquemada:

    That's a good start, but it's certainly not general (and therefore reusable) enough. I suggest:

    module ToUpper where
    
    import Prelude hiding (mapM, foldr)
    import Control.Monad (MonadPlus(..))
    import Data.Foldable (foldr)
    import Data.Maybe (fromJust)
    import Data.Traversable (Traversable(..))
    
    msum :: (Traversable t, MonadPlus m) => t (m a) -> m a
    msum = foldr mplus mzero
    
    trBySum :: (Traversable t, MonadPlus m) => (a -> b -> Bool) -> t (b,c) -> a -> m c
    trBySum eq mapping a =
      msum (traverse (\(b,c) a -> if eq a b then return c else mzero) mapping a)
    
    trBy :: (Traversable t) => (a -> b -> Bool) -> t (b,a) -> a -> a
    trBy eq mapping a = fromJust (trBySum eq mapping a `mplus` return a)
    
    tr :: (Eq a, Traversable t) => t (a,a) -> a -> a
    tr = trBy (==)
    
    upperCaseChar :: Char -> Char
    upperCaseChar = tr $ zip ['a'..'z'] ['A'..'Z']
    
    upperCaseIt :: Functor f => f Char -> f Char
    upperCaseIt = fmap upperCaseChar
    

    Now we have immediate holistic-synergistic value-added network effects that are so important for leveraging empowering new paradigms in the successful enterprise. For instance, we can now rewrite that goofy modulus math as:

    nice_modulo :: Integral i => i -> i -> i
    nice_modulo = flip $ tr . zip (negInterleave [0..]) . cycle . negInterleave . enumFromTo 0 . pred
      where negInterleave = (>>= \x -> [x,-x])
    

    A worthy entry for the Blackholeth IOHCC. Chapeau!

  • tbrown (unregistered) in reply to rt
    rt:
    That's... interesting. It really is though I think I find it interesting in a different way than you do.

    Let's start in a different way: how would YOU write an uppercase transformation function? Come up with some approach.

    Now...

    Make sure it works with non-English characters (e.g. accented letters).

    Make sure it works with non-ascii (ANSI) string encodings (i.e. it works with UTF-xx, UCS-xx, etc).

    Optimize.

    What does your function look like?

    Table lookup.
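A table lookup along the lines suggested here might look like the following minimal sketch. It is only an illustration under stated assumptions: the class name `TableUpper` and the table size of 256 are hypothetical choices, the table is filled from `Character.toUpperCase(int)`, and only one-to-one mappings are handled. Characters outside the table fall back to `Character.toUpperCase(char)`, so supplementary characters (surrogate pairs) and one-to-many mappings such as ß are left unchanged.

```java
public class TableUpper {
    // Hypothetical lookup table: index is a UTF-16 code unit in 0..255,
    // value is its one-to-one uppercase mapping (or the same char if none).
    private static final char[] UPPER = new char[256];

    static {
        for (int c = 0; c < UPPER.length; c++) {
            UPPER[c] = (char) Character.toUpperCase(c);
        }
    }

    // Uppercases each char via the table; chars beyond the table
    // fall back to the library's per-char mapping.
    static String toUpper(String text) {
        char[] out = text.toCharArray();
        for (int i = 0; i < out.length; i++) {
            char c = out[i];
            out[i] = c < UPPER.length ? UPPER[c] : Character.toUpperCase(c);
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(toUpper("table lookup"));
    }
}
```

The table trades a little memory for a branch-free mapping on common characters, but it does not replace `String.toUpperCase`, which also handles locale rules and mappings that change the string length.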

  • gg (unregistered)
    Once you have decent strings, a string is plain text and a character set (or "encoding") is only a mapping between byte arrays and plain text.

    Oh god, please tell me that you are trolling. Failing that, please tell me that whatever code base you are working on doesn't do anything important.

  • Bob (unregistered) in reply to hikari

    Java is a Unicode beast and uppercasing Unicode characters is a non-trivial exercise. There is no need to reverse-engineer it either, the source code has always been available.

    Your comment should be tomorrow's WTF.
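One concrete reason uppercasing is non-trivial is that the result depends on the locale. The sketch below is only an illustration (the class and method names are hypothetical); it uses the standard `String.toUpperCase(Locale)` and `Locale.forLanguageTag` calls to show that a lowercase "i" maps to a dotted capital İ (U+0130) under Turkish rules but to a plain "I" under English rules.

```java
import java.util.Locale;

public class LocaleUpper {
    // Uppercases s using the case rules of the given BCP 47 language tag.
    static String upperIn(String languageTag, String s) {
        return s.toUpperCase(Locale.forLanguageTag(languageTag));
    }

    public static void main(String[] args) {
        System.out.println(upperIn("en", "title"));
        System.out.println(upperIn("tr", "title"));
    }
}
```

This is why code that relies on the default locale can behave differently from one machine to another.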

  • Bob (unregistered) in reply to asifyoucare
    asifyoucare :
    Either the trolls are outnumbering the genuine posters or this site is read mainly by non-programmers. The code is obviously WTF, yet we have many posters defending it and geniuses offering their own solutions to this very tricky problem.

    Yeah I agree with you. It's a waste of time even thinking about how to write code for this. The source code for doing this in the JDK is very complex, and impossible to understand unless you spend months studying Unicode. There are hundreds of special cases, and the classes are full of large data arrays, etc, etc.

    The original WTF truly is a WTF and is on the same level as a programmer that types a for loop as Ctrl-C, Ctrl-V, Ctrl-V, Ctrl-V, Ctrl-V...etc. Almost total lack of knowledge on the language they are using.

  • (cs)

    The Java toUpperCase() method is truly 'the long way toUpper', especially since they call Character.toUpperCaseEx. I'm too lazy to decipher why they can't just iterate a char array and use that method, which could hold the various internationalization rules. An interesting WTF: 'the resulting String may be a different length than the original'

        /**
         * Converts all of the characters in this String to upper
         * case using the rules of the given Locale. Case mapping is based
         * on the Unicode Standard version specified by the {@link java.lang.Character Character}
         * class. Since case mappings are not always 1:1 char mappings, the resulting
         * String may be a different length than the original String.
         *
         * Examples of locale-sensitive and 1:M case mappings are in the following table.
         *
         * SNIP internationalization stuff
         *
         * @param locale use the case transformation rules for this locale
         * @return the String, converted to uppercase.
         * @see java.lang.String#toUpperCase()
         * @see java.lang.String#toLowerCase()
         * @see java.lang.String#toLowerCase(Locale)
         * @since 1.1
         */
        public String toUpperCase(Locale locale) {
            if (locale == null) {
                throw new NullPointerException();
            }

            int firstLower;

            /* Now check if there are any characters that need to be changed. */
            scan: {
                for (firstLower = 0 ; firstLower < count; ) {
                    int c = (int)value[offset+firstLower];
                    int srcCount;
                    if ((c >= Character.MIN_HIGH_SURROGATE) &&
                        (c <= Character.MAX_HIGH_SURROGATE)) {
                        c = codePointAt(firstLower);
                        srcCount = Character.charCount(c);
                    } else {
                        srcCount = 1;
                    }
                    int upperCaseChar = Character.toUpperCaseEx(c);
                    if ((upperCaseChar == Character.ERROR) || (c != upperCaseChar)) {
                        break scan;
                    }
                    firstLower += srcCount;
                }
                return this;
            }

            char[] result = new char[count]; /* may grow */
            int resultOffset = 0; /* result may grow, so i+resultOffset
                                   * is the write location in result */

            /* Just copy the first few upperCase characters. */
            System.arraycopy(value, offset, result, 0, firstLower);

            String lang = locale.getLanguage();
            boolean localeDependent = (lang == "tr" || lang == "az" || lang == "lt");
            char[] upperCharArray;
            int upperChar;
            int srcChar;
            int srcCount;
            for (int i = firstLower; i < count; i += srcCount) {
                srcChar = (int)value[offset+i];
                if ((char)srcChar >= Character.MIN_HIGH_SURROGATE &&
                    (char)srcChar <= Character.MAX_HIGH_SURROGATE) {
                    srcChar = codePointAt(i);
                    srcCount = Character.charCount(srcChar);
                } else {
                    srcCount = 1;
                }
                if (localeDependent) {
                    upperChar = ConditionalSpecialCasing.toUpperCaseEx(this, i, locale);
                } else {
                    upperChar = Character.toUpperCaseEx(srcChar);
                }
                if ((upperChar == Character.ERROR) ||
                    (upperChar >= Character.MIN_SUPPLEMENTARY_CODE_POINT)) {
                    if (upperChar == Character.ERROR) {
                        if (localeDependent) {
                            upperCharArray =
                                ConditionalSpecialCasing.toUpperCaseCharArray(this, i, locale);
                        } else {
                            upperCharArray = Character.toUpperCaseCharArray(srcChar);
                        }
                    } else if (srcCount == 2) {
                        resultOffset += Character.toChars(upperChar, result, i + resultOffset) - srcCount;
                        continue;
                    } else {
                        upperCharArray = Character.toChars(upperChar);
                    }

                    /* Grow result if needed */
                    int mapLen = upperCharArray.length;
                    if (mapLen > srcCount) {
                        char[] result2 = new char[result.length + mapLen - srcCount];
                        System.arraycopy(result, 0, result2, 0, i + resultOffset);
                        result = result2;
                    }
                    for (int x=0; x<mapLen; ++x) {
                        result[i+resultOffset+x] = upperCharArray[x];
                    }
                    resultOffset += (mapLen - srcCount);
                } else {
                    result[i+resultOffset] = (char)upperChar;
                }
            }
            return new String(0, count+resultOffset, result);
        }

  • (cs) in reply to cmccormick
    cmccormick:
    The Java toUpperCase() method is truly 'the long way toUpper', especially since they call Character.toUpperCaseEx. I'm too lazy to decipher why they can't just iterate a char array and use that method, which could hold the various internationalization rules. An interesting WTF: 'the resulting String may be a different length than the original'
    Perhaps because of the reason quoted below?
         * Since case mappings are not always 1:1 char mappings, the resulting
         * String may be a different length than the original String.
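The length change mentioned in the Javadoc can be seen with the German sharp s, which has no single-character uppercase form in the default mapping and becomes "SS". The following is a minimal sketch (the class and method names are hypothetical) that uses `String.toUpperCase(Locale.ROOT)` and prints the lengths before and after.

```java
import java.util.Locale;

public class LengthChange {
    // Locale-neutral uppercasing via the standard library.
    static String upper(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String lower = "stra\u00dfe"; // "straße", 6 chars
        String result = upper(lower);
        System.out.println(lower.length() + " -> " + result.length());
    }
}
```

A per-character loop over a fixed-size `char` array cannot express this mapping, which is one reason the library method allocates a result that may grow.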
    
  • Hirato (unregistered)

    regardless if anyone cares about C or not, if I was to do such a function for my own use, it would look something like this, heck most C newbies would make something like this for lower to upper case conversion :P

    yay for pointers :D

        const char *upcase(char *var)
        {
            if(!var[0]) return "ERROR";
            printf("given string = %s\n", var);

            char *result = malloc(sizeof(var));
            int delta = 'A' - 'a', i = 0;

            for(; i < strlen(var); i++)
            {
                if(var[i] >= 'a' && var[i] <= 'z') result[i] = var[i] + delta;
                else result[i] = var[i]; //either not an alphabetic character or it's already upper cased
            }
            return result;
        }

  • (cs) in reply to Hirato
    Hirato:
    regardless if anyone cares about C or not, if I was to do such a function for my own use, it would look something like this, heck most C newbies would make something like this for lower to upper case conversion :P

    yay for pointers :D

        const char *upcase(char *var)
        {
            if(!var[0]) return "ERROR";
            printf("given string = %s\n", var);

            char *result = malloc(sizeof(var));
            int delta = 'A' - 'a', i = 0;

            for(; i < strlen(var); i++)
            {
                if(var[i] >= 'a' && var[i] <= 'z') result[i] = var[i] + delta;
                else result[i] = var[i]; //either not an alphabetic character or it's already upper cased
            }
            return result;
        }

    Wow! I bet it wasn't easy to make it that wrong. Hopefully, if you were to do such a function for anybody else's use, you'd write something a little more useable. The uppercase version of "" is "ERROR"? Four bytes (eight on a 64-bit machine) should be enough for any string? Terminating C-strings with a '\0' is for sissies, real memory is zeroed anyway? Precalculate the offset for speed, then we needn't bother about strlen()? Unicode? Nobody would ever use it, not even stuff like Ümlåütê, would they?

  • Jacob (unregistered) in reply to amischiefr

    I had this as an assignment for an intro level comp sci class. The answer was supposed to look like the post I replied to. However, I was asked by several people to check theirs for correctness, and sadly there was more than one guy whose result looked closer to what was posted in the article.

  • Dave (unregistered)

    Easy way to upper: loop through every character, and if its ASCII value is between what, 64 and 90 (whatever the lower case range is), add 32 to it.
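For reference, in ASCII the lowercase letters 'a' to 'z' occupy 97 to 122 and the uppercase letters 'A' to 'Z' occupy 65 to 90, so an ASCII-only conversion subtracts 32 from lowercase letters rather than adding it. The sketch below shows that approach under the assumption that only ASCII letters matter; the class name `AsciiUpper` is hypothetical, and accented or other non-ASCII characters pass through unchanged.

```java
public class AsciiUpper {
    // Uppercases ASCII letters only; every other char is copied as-is.
    static String toUpper(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // 'a'..'z' is 97..122; the matching uppercase letter is 32 lower.
            out.append(c >= 'a' && c <= 'z' ? (char) (c - 32) : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toUpper("easy way to upper"));
    }
}
```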

  • A Nonny Moose (unregistered)

    The obvious solution... in C because I don't know Java...

    char * to_upper(char * input)
    {
        // dep: requires stdio.h
        char * buf = (char *)malloc(strlen(input)+1); // in practice I'd use a dynamically allocating function to avoid overruns but you get the idea
        printf("Please type in the uppercase equivalent of %s: ", input);
        scanf("%s", buf);
        return(buf);
    }

    Simple! And no locale issues!

  • B.o.B. (unregistered)

    TRWTF is using a tree of ifs when a switch statement would at least be slightly better
