• (cs)

    ToUpper() anyone ?

  • Domo (unregistered)

    I prefer ToLower(). It saves space!

  • Ed (unregistered)

    I prefer tolower. Or Tolower. Or TOlower. Or TOLower. Or TOWLOwer. Or TOWLOWer. Or TOWLOWEr. Or TOWLOWER....

  • (cs) in reply to Domo
    Domo:
    I prefer ToLower(). It saves space!

    Yepp. :) But Capitals look better.

    By the way: ToLower is often faster than ToUpper. And saving microticks of CPU-Power is very important nowadays.

    Regards

  • (cs)

    Ugh, there's a lot of this kind of code in our product, mostly because it was tossed in around `98, back when Java was the Hot, New Thing, and a lot of the methods didn't actually exit. There was no appropriate sort method, so we had to write one. And create an interface to indicate that things could be sorted. And toss that interface into EVERYTHING.

    There's a lot of ugly, old code that I don't dare touch in the product, because it works just fine.

    Normally, I'd rush to the defense of this code, saying things like "It might be time critical; converting to lowercase might take too long!" and "The string might be several megabytes!" but I'm guessing that this is part of a library for SQL calls, so the strings won't be much longer than 1k.

    Then again, this is TDWTF.

  • (cs) in reply to Domo
    Domo:
    I prefer ToLower(). It saves space!

    Could you possibly be a veteran of the PLATO system? The native word length on the CDC host was 60 bits, so this held 10 six-bit characters. Capital letters were represented with a shift code followed by a letter...so capital letters took up twice as much storage as lower case.

    With more special codes to give a diacritic to the previous letter, half-spaces, backspaces, etc...centering the contents of a string on the screen could be tricky. (In our projects, we determined display length by printing the string in "mode erase" (black-on-black) and then noting the location of the cursor)

  • cezio (unregistered)

    toLoler()

  • my name is missing (unregistered)

    Good thing they weren't looking for "pneumonoultramicroscopicsilicovolcanoconiosis"...

  • Paddington Bear (unregistered)

    Okay, I'm a rusty old programmer, but won't this think that 'intMandy' is part of an 'AND', or is it smarter than that?
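
    A small Java sketch of this concern, assuming the search is a plain case-insensitive substring match. The article's code is not reproduced in this thread, so the class and method names below are hypothetical. A substring test does fire on 'intMandy', whereas a regex with word boundaries does not:

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class KeywordMatch {
    // Case-insensitive pattern that only matches AND as a standalone word.
    private static final Pattern AND_WORD =
            Pattern.compile("\\bAND\\b", Pattern.CASE_INSENSITIVE);

    // Naive check: true whenever the letters a-n-d appear anywhere in the text.
    static boolean containsAndSubstring(String expression) {
        return expression.toLowerCase(Locale.ROOT).contains("and");
    }

    // Stricter check: true only when AND is delimited by word boundaries.
    static boolean containsAndWord(String expression) {
        return AND_WORD.matcher(expression).find();
    }

    public static void main(String[] args) {
        System.out.println(containsAndSubstring("intMandy = 1")); // true (false positive)
        System.out.println(containsAndWord("intMandy = 1"));      // false
        System.out.println(containsAndWord("x = 1 AnD y = 2"));   // true
    }
}
```

    Whether the original code has this problem depends on how it delimits keywords, which this thread does not show.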

  • w (unregistered) in reply to Ed
    Ed:
    I prefer tolower. Or Tolower. Or TOlower. Or TOLower. Or TOWLOwer. Or TOWLOWer. Or TOWLOWEr. Or TOWLOWER....

    I see it grew an extra 'W' at some point there...

  • Eric Hartwell (unregistered)

    This was obviously written in a shop that rated programmer performance by lines of code. Still, the programmer could have boosted his/her performance even more by unrolling those nasty loops and complex table lookups ...

  • (cs)

    Not to mention that the 3 methods do exactly the same thing, and that there is probably an IndexOf equivalent in Java... And case insensitivity is just an extra parameter in .NET when using IndexOf.

  • Jon Skeet (unregistered)

    Calling either ToLower() or ToUpper() is, in general, a flawed way of performing a case-insensitive comparison, although it does depend on the platform you're using. It assumes that there's only one way of upper/lower-casing a string - when in fact it's culture-dependent. For instance (and this is where I first learned about the problem - it bit me) in Java, this code doesn't always return true:

    "mail".toUpperCase().equals("MAIL")

    In Turkey, the upper case version of "i" isn't "I".

    Where possible, it's better to use a case-insensitive comparison instead of equality (.NET is good on this front). Otherwise, you can use the invariant culture (or the equivalent on your platform) to get the same results regardless of the default culture of the environment you're running in.

    i18n has a lot to answer for.
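
    A minimal Java sketch of the point above. The class and method names are made up for illustration; `Locale.forLanguageTag("tr")` stands in for a machine whose default locale is Turkish, and `Locale.ROOT` is used as Java's closest equivalent of an invariant culture:

```java
import java.util.Locale;

final class TurkishCaseDemo {
    // Compares two strings by upper-casing both with the given locale.
    static boolean upperEquals(String a, String b, Locale locale) {
        return a.toUpperCase(locale).equals(b.toUpperCase(locale));
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // Under Turkish rules "i" upper-cases to dotted capital I (U+0130).
        System.out.println(upperEquals("mail", "MAIL", turkish));     // false

        // A locale-neutral conversion gives the same answer on every machine.
        System.out.println(upperEquals("mail", "MAIL", Locale.ROOT)); // true

        // A case-insensitive comparison avoids building converted copies at all.
        System.out.println("mail".equalsIgnoreCase("MAIL"));          // true
    }
}
```

    Calling toUpperCase() with no argument uses the default locale, so the first result is what the no-argument version produces on a Turkish system.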

  • (cs)

    I tend to use ToLowerInvariant() or ToUpperInvariant(), for the Turkish ı->I, i->İ reason mentioned previously.

  • (cs)

    Hey, at least the code has comments!

  • (cs)

    How old is this code? It's hard to believe that in 2007 there are still people who wouldn't know that any language has helper classes for something like searching for strings. Maybe 10-20 years ago it would've surprised a few more people, I dunno... it's a shock to see this. I wonder how many company man-hours are wasted reinventing these wheels.

  • (cs) in reply to Volmarias
    Volmarias:
    Ugh, there's a lot of this kind of code in our product, mostly because it was tossed in around `98, back when Java was the Hot, New Thing, and a lot of the methods didn't actually exit. There was no appropriate sort method, so we had to write one. And create an interface to indicate that things could be sorted. And toss that interface into EVERYTHING.
    Ugh. I shudder at the thought of having to program Java against a pre-1.2 API...
  • bpk (unregistered)

    This is .net code, no? I know in java there is a string compare function that ignores case, so there is no need to call .toUpper or .toLower. My .net skills are a bit rusty; does such a compare function exist?

  • Southern (unregistered)

    I bet this dude is not only a new programmer but also gets paid by line of code

  • (cs)

    Has anyone in the history of programming ever typed "oR"? It's not all caps, all lowercase, or english caps. If I ever saw anyone programming in leet speak, they'd get a shiv in the ribs.

  • Someone (unregistered) in reply to bpk
    bpk:
    This is .net code no?
    No. .net methods are pascal-cased (first letter is a capital).
    bpk:
    I know in java there is a string compare function that ignores case so there is no need to call .toUpper or .toLower.
    Most modern frameworks/libraries are able to compare strings regardless of their case.
    bpk:
    My .net skills are a bit rusty; does such a compare function exist?
    Of course.
  • (cs) in reply to akatherder

    http://www.lolcode.com

    You'd better get your shivs ready.

  • (cs) in reply to Southern
    Southern:
    I bet this dude is not only a *new* programmer but also gets paid by line of code

    I was about to suggest the same thing. Bleh.

    Does anyone actually still get paid by line of code? I'd have thought that procedure was gladly defunct in all but the most retarded of institutions now.

  • NiceWTF (unregistered) in reply to Volmarias
    Volmarias:
    Ugh, there's a lot of this kind of code in our product, mostly because it was tossed in around `98, back when Java was the Hot, New Thing, and a lot of the methods didn't actually exit.

    String.toLowerCase() from a JDK 1.0.2 API reference, copied from the Sun website in June 1996, as the site says. I couldn't actually find the docs on Sun's website anymore :(

    P.S. You probably meant "exist", although I have indeed seen alternative implementations of standard String methods that would (given the right inputs) never "exit", either.

  • Frandsen (unregistered) in reply to Ottokar
    Ottokar:
    And saving microticks of CPU-Power is very important nowadays.

    The Real W ... er ... I mean, I wonder why he doesn't break out of the loops when he finds a match. That would certainly save time.

    Also, he could save the arrays somewhere along with a score for each string, and then increment it on every call. Then he could order the strings by score, and thus increase the likelihood of getting an early match.

  • steve (unregistered) in reply to Someone

    Someone:
    bpk:
    This is .net code no?
    No. .net methods are pascal-cased (first letter is a capital).

    It's not required that they be pascal-cased, and on a site that showcases bad code, can you really assume it's not .net just because it isn't pascal-cased?

  • (cs) in reply to NiceWTF
    NiceWTF:
    P.S. You probably meant "exist", although I have indeed seen alternative implementations of standard String methods that would (given the right inputs) never "exit", either.

    Yes, I did. Sadly, I'm not allowed to edit my post (thanks Alex).

    Someone:
    I know in java there is a string compare function that ignores case so there is no need to call .toUpper or .toLower.

    Most modern frameworks/libraries are able to compare strings regardless of their case.

    This wasn't a simple string compare, it was a "Does this string exist inside of this other string?" for which the appropriate usage is .indexOf (which does not have a case insensitive version). I can easily see this happening with someone who is new to Java, and is spooked at the idea of calling .toLowerCase, for fear of modifying the original string.
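
    For what it's worth, a case-insensitive search can be sketched without lower-casing anything, using String.regionMatches with its ignoreCase flag. The helper name indexOfIgnoreCase is hypothetical, not a JDK method:

```java
final class CaseInsensitiveSearch {
    // Returns the index of the first case-insensitive occurrence of needle
    // in haystack, or -1 if there is none. Neither string is modified.
    static int indexOfIgnoreCase(String haystack, String needle) {
        int lastStart = haystack.length() - needle.length();
        for (int i = 0; i <= lastStart; i++) {
            // regionMatches(true, ...) compares the two regions ignoring case.
            if (haystack.regionMatches(true, i, needle, 0, needle.length())) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfIgnoreCase("a=1 aNd b=2", "AND")); // 4
        System.out.println(indexOfIgnoreCase("a=1 b=2", "AND"));     // -1
    }
}
```

    Strings in Java are immutable, so toLowerCase() returns a new string and would not have modified the original either.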

  • Todd (unregistered) in reply to steve

    final is a Java keyword; if it were sealed, I would say .NET.

  • wigwam (unregistered) in reply to Jon Skeet
    Jon Skeet:
    learned about the problem - it bit me) in Java, this code doesn't always return true:

    "mail".toUpperCase().equals("MAIL")

    So use: "mail".toUpperCase( Locale.UK ).equals("MAIL")

    Or whatever your locale is? It is in the APIs, and any code checking software (FindBugs and the like) will throw an error on it.

  • (cs) in reply to sibtrag

    I am one of those scarred PLATOites. Many idioms had to be used every day in coding, as the original Tutor language was "designed" by "nonprogrammers".

    The wonderful and horrible "Arrow" command-- "judge rejudge", "press Shift-Stop" to end a program, er, "lesson". No subdirectories. Requiring a call to the sysop to have a file, any file, created. A globally-smeared file name space, so if somebody in Iran named a lesson "hangman", nobody else could. Having only 150 60-bit variables worth of RAM. Allowing only 640 characters in each source code page, because once there had been a disk drive with that sector size. No visible cursor. The grab-bag of embed codes. The pneumatic slide selector. The proprietary non-error-checking 1800 baud modems. The low reliability of the 2-megaword ECS memory, which went down for about an hour a day, most days. The slowness of CDCNET connections. The awful Viking 721 keyboard feel. The list is almost endless.

    Then again, saved and shared common was wonderful.

  • (cs) in reply to OmnipotentEntity
    OmnipotentEntity:
    http://www.lolcode.com

    You'd better get your shivs ready.

    That's just too scary for words...

  • (cs)

    We should all remember that posts are often anonymized by showing the code in a different language. The real offending code may well be in a language other than Java.

    Regardless of what the real language is, we all know every language even ten years ago had ways to do this sort of thing in one or two lines. When I read this code, I get the sense that it was written by someone who's used to programming in assembly.

  • (cs) in reply to Ancient_Hacker

    My emacs window is colored orange on black in homage to my PLATO roots.

    OBwtf: The 150 60-bit per-process variables were named n1 through n150 if being used as integers or strings and v1 through v150 if being used as floating point. However, a facility was available to define more meaningful aliases for these variables.

    But it wasn't all bad...ordinary text files could contain animation, the 512x512 screen was pixel addressable or could use a programmable character set (amazing for time period), and variables smaller than 60-bits could be packed (an array of 16 7-bit values would only take 2 of the 60-bit variables).

  • AdT (unregistered)

    This is supercalifragilisticexpialidocious! Or supercalifragilisticexpialidociouS. Or supercalifragilisticexpialidocioUs. Or...

  • Andrew Stein (unregistered) in reply to Frandsen
    Frandsen:
    Ottokar:
    And saving microticks of CPU-Power is very important nowadays.

    The Real W ... er ... I mean, I wonder why he doesn't break out of the loops when he finds a match. That would certainly save time.

    He is looking for the first match (hence the name "min"). Breaking out early will find a match, but not necessarily the first one.

  • TonyaRashcan (unregistered)

    Wait... you mean that this isn't a site to borrow code from? But you are on the front page of Google for most of my coding problems...

  • (cs) in reply to steve
    steve:
    Someone:
    bpk:
    This is .net code no?
    No. .net methods are pascal-cased (first letter is a capital).

    It's not required that they be pascal-cased, and on a site that showcases bad code, can you really assume it's not .net just because it isn't pascal-cased?

    It's still Java.

    For some reason the Java developers decided that, while it was perfectly reasonable to abbreviate "integer" as "int", abbreviating "boolean" to "bool" was out of the question. C# uses the "bool" keyword; this code uses "boolean".

    But more to the point of the post you're replying to, there are calls to built-in functions in string: "int min=expression.length();" for instance. In .NET that would be "expression.Length". So while the programmer could have used camel-casing on his own methods, he still has to use Pascal-casing on the existing ones.

  • Andrew (unregistered)

    C'mon, no one is giving this programmer credit. The combinatorics are right.

    The 3-letter AND and NOT each need 2**3 (yes, I learned Fortran) = 8 variants, and the OR needs 2**2 = 4 variants. At least the arrays cover all the cases.
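
    The count is easy to check with a short Java sketch that enumerates every upper/lower combination of a word. The class and method names are hypothetical, and the sketch assumes a short word made only of letters that each have two distinct cases:

```java
import java.util.ArrayList;
import java.util.List;

final class CaseVariants {
    // Builds all 2^n upper/lower-case spellings of an n-letter word.
    // Bit i of the mask decides whether letter i is upper-cased.
    static List<String> variants(String word) {
        int n = word.length(); // assumed small (well under 31 letters)
        List<String> out = new ArrayList<>();
        for (int mask = 0; mask < (1 << n); mask++) {
            StringBuilder sb = new StringBuilder(n);
            for (int i = 0; i < n; i++) {
                char c = word.charAt(i);
                boolean upper = (mask & (1 << i)) != 0;
                sb.append(upper ? Character.toUpperCase(c) : Character.toLowerCase(c));
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(variants("and").size()); // 8
        System.out.println(variants("or").size());  // 4
    }
}
```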

  • (cs) in reply to w
    w:
    Ed:
    I prefer tolower. Or Tolower. Or TOlower. Or TOLower. Or TOWLOwer. Or TOWLOWer. Or TOWLOWEr. Or TOWLOWER....

    I see it grew an extra 'W' at some point there...

    My guess was for wide chars.

  • (cs) in reply to EvanED
    EvanED:
    For some reason the Java developers decided that, while it was perfectly reasonable to abbreviate "integer" as "int", abbreviating "boolean" to "bool" was out of the question. C# uses the "bool" keyword; this code uses "boolean".
    If it were up to them, "int" wouldn't have been abbreviated "int". But it wasn't up to them: the goal was to be as similar to C++ as possible, because they knew a low learning curve (for C++ developers, at least) meant the best chance of adoption. I thought this was common knowledge.
  • (cs) in reply to VGR
    VGR:
    EvanED:
    For some reason the Java developers decided that, while it was perfectly reasonable to abbreviate "integer" as "int", abbreviating "boolean" to "bool" was out of the question. C# uses the "bool" keyword; this code uses "boolean".
    If it were up to them, "int" wouldn't have been abbreviated "int". But it wasn't up to them: the goal was to be as similar to C++ as possible, because they knew a low learning curve (for C++ developers, at least) meant the best chance of adoption. I thought this was common knowledge.

    Then why didn't they use 'bool'?

    Wikipedia gives the earliest version of Java, 1.0, released Jan 23, 1996. 'bool' was discussed in The Design and Evolution of C++, published in 1995, so it was at least out in the open that it was a likely addition at that point even if it wasn't in the language at that point. (It's not in the ARM, from 1990. I don't know if it was added between then or not until standardization.)

    I also think that if they were concerned about compatibility with the C++ mindset, 'int' vs. 'integer' should have been the least of their worries.

    Addendum (2007-07-02 18:12): Upon further investigation, it seems that (1) bool had been accepted by the standards committee by November 1994, and (2) reviews of D&E were out as early as August 1994.

    In other words, there was really no excuse for Java to not follow in C++'s footsteps if they wanted to. I can't imagine that the inertia behind "boolean" would have been too great to overcome by that point.

  • (cs) in reply to Ed
    Ed:
    I prefer tolower. Or Tolower. Or TOlower. Or TOLower. Or TOWLOwer. Or TOWLOWer. Or TOWLOWEr. Or TOWLOWER....
    Don't forget to bring a towel!
  • [ICR] (unregistered)

    Obviously the code is far too inflexible. They should have used a dictionary with the key of the lower case letter and the value of the upper case letter and used that to build all the combinations of the word you are trying to find the index of. (joke, obviously)

  • (cs) in reply to EvanED
    EvanED:
    VGR:
    EvanED:
    For some reason the Java developers decided that, while it was perfectly reasonable to abbreviate "integer" as "int", abbreviating "boolean" to "bool" was out of the question. C# uses the "bool" keyword; this code uses "boolean".
    If it were up to them, "int" wouldn't have been abbreviated "int". But it wasn't up to them: the goal was to be as similar to C++ as possible, because they knew a low learning curve (for C++ developers, at least) meant the best chance of adoption. I thought this was common knowledge.

    Then why didn't they use 'bool'?

    Because abbreviations suck. Especially an abbreviation that only saves three characters. Readability is far more important, because roughly 90% of a developer's time is spent doing maintenance. (Or as Sun VP Graham Hamilton put it: "It's more important that Java programs be easy to read than to write.")

    The only reason "int" was not expanded to "integer" was to be C/C++ compatible. I wouldn't be surprised if even that was debated at length.

  • root (unregistered)

    Dang! memcmp is case sensitive! :D

  • fauxparse (unregistered)

    Mmm, O(n^2) text search algorithms...

  • BillyBob (unregistered) in reply to EvanED
    EvanED:
    Then why didn't they use 'bool'?

    Wikipedia gives the earliest version of Java, 1.0, released Jan 23, 1996.

    Addendum (2007-07-02 18:12): bool had been accepted by the standards committee by November 1994, and (2) reviews of D&E were out as early as August 1994.

    In other words, there was really no excuse for Java to not follow in C++'s footsteps if they wanted to. I can't imagine that the inertia behind "boolean" would have been too great to overcome by that point.

    No excuse? There were 14 months between Java's release and the acceptance of bool into the C++ standard (by your figures). Do you think that Sun rolled out Java the day after it was built? I doubt it, and the amount of time, effort and money required to make such a subtle and inconsequential change would make it prohibitive anyway.

  • Jon Skeet (unregistered) in reply to wigwam
    wigwam:
    Jon Skeet:
    learned about the problem - it bit me) in Java, this code doesn't always return true:

    "mail".toUpperCase().equals("MAIL")

    So use: "mail".toUpperCase( Locale.UK ).equals("MAIL")

    Or what ever your locale is? It is in the APIs, and any code checking software (find bugs and the like) will throw an error on it.

    Hence the bit of my post which you snipped:

    Otherwise, you can use the invariant culture (or the equivalent on your platform) to get the same results regardless of the default culture of the environment you're running in.

    It's better to use the invariant culture if there is one defined, rather than "UK" - it indicates the intent more clearly, IMO.

    Jon

  • I am your Father.... (unregistered)

    hmm FindOr(), FindNot()....

    try { findAnd(); break Yoda; } catch (WTFException wtf) {}

    Yoda: FindOr(); FindNot(); // there is no Try

  • woohoo (unregistered) in reply to Jon Skeet
    Jon Skeet:
    Calling either ToLower() or ToUpper() is, in general, a flawed way of performing a case-insensitive comparison, although it does depend on the platform you're using. It assumes that there's only one way of upper/lower-casing a string - when in fact it's culture-dependent. For instance (and this is where I first learned about the problem - it bit me) in Java, this code doesn't always return true:

    "mail".toUpperCase().equals("MAIL")

    In Turkey, the upper case version of "i" isn't "I".

    Where possible, it's better to use a case-insensitive comparison instead of equality (.NET is good on this front). Otherwise, you can use the invariant culture (or the equivalent on your platform) to get the same results regardless of the default culture of the environment you're running in.

    i18n has a lot to answer for.

    Ah. And what did you think this is for when it bit you? ;o) (see below) BTW, toLowerCase() without parameter does the same using the default locale...

    from java.lang.String:

    public String toUpperCase(Locale locale)

    Converts all of the characters in this String to upper case using the rules of the given Locale. Case mapping is based on the Unicode Standard version specified by the Character class. Since case mappings are not always 1:1 char mappings, the resulting String may be a different length than the original String.

    Examples of locale-sensitive and 1:M case mappings are in the following table.

     
    Language Code of Locale   Lower Case   Upper Case   Description
    tr (Turkish)              \u0069       \u0130       small letter i -> capital letter I with dot above
    tr (Turkish)              \u0131       \u0049       small letter dotless i -> capital letter I
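
    The "different length" remark in the quoted documentation can be seen with the German sharp s, which upper-cases to two letters. A minimal sketch with a made-up class name, using Locale.ROOT so the result does not depend on the machine's default locale:

```java
import java.util.Locale;

final class LengthChangingUpperCase {
    // Upper-cases with locale-neutral rules.
    static String upperRoot(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String original = "stra\u00dfe"; // "strasse" spelled with sharp s, 6 chars
        String upper = upperRoot(original);
        System.out.println(upper);                                    // STRASSE
        System.out.println(original.length() + " -> " + upper.length()); // 6 -> 7
    }
}
```

    One consequence, as far as I can tell: an index found by searching a case-converted copy is not guaranteed to line up with the same position in the original string.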

Leave a comment on “Extra Sensitive Case Insensitivity”
