• seebs (unregistered)

    I feel vaguely offended, because I submitted exactly this algorithm, in C++, for case-insensitive compares, months ago, and was told it wasn't wacky enough for the WTF. :)

    I was in a QA job at the time, and the code was... Astounding. They had a lot of performance problems related to code much like this.

  • dan (unregistered)

    NAME strcasecmp, strncasecmp - case-insensitive string comparions

    SYNOPSIS #include <strings.h> int strcasecmp(const char *s1, const char *s2); int strncasecmp(const char *s1, const char *s2, size_t n);

    I think you might guess the behaviour of these standard library functions by the names... And the implementation of the code was a quite nice WTF in itself, guess the coder gets payed by the line.

  • Fred (unregistered)

    I like the way the empty string is not equal to itself

  • Benanov (cs)

    i for "case insensitive." Points for not actually mangling the string in-place. Too bad there's a library function!

    I love how it returns 0 if one of the strings is NULL. Is that standard?

    One potential "optimization" this programmer could make, since they've already gone fancy anyway: return -1 if the strings are different lengths. (Of course this probably craps all over i18n; it would return -1 on different representations of the same string that haven't been optimized to the shortest representation in UTF-8.)

    Honestly?

    Toss it, use library functions. You don't get paid for rewriting libraries.

    DISCLAIMER: I work in C#.

  • softwarebear (unregistered)

    I love the way they check for NULL at the bottom before deleting ... you mean p1 or p2 could be NULL before that ?

  • Cyrus (unregistered)
    if (p1 != NULL) delete [] p1; if (p2 != NULL) delete [] p2;

    Oh...my...word...

    I would murder him; death is not good enough though, so I would have to find something even worse.

  • RON (unregistered)

    Wow. I mean really, wow.

    Memory allocation in a string comparison.

    There should be some sort of "computer exam" that people need to take before they're allowed within 100 feet of a computer.

    Seriously.

    Capcha: DOOM.

  • Hans (unregistered)

    Lovely. I especially like the clever use of return codes:

    <0 = one or both of the input strings has length 0, or the left string is less than the right one.

    ==0 = one or both of the input strings is NULL, or the strings are identical.

    1 = the right string is less than the left one.

    I also like the initialize, fill, fill again strategy employed to create the upper-case strings. One just cannot be too sure with memory, and writing three times seems far better than just once. Well done!

    And finally, I haven't yet come across a system that doesn't have strmpi in some form or other, but for all we know this could be part of a compiler distribution, i.e. the very code that hides behind normal strcmpi; so I'm not counting that as a WTF.

  • Hans (unregistered) in reply to Hans
    Hans:
    >1 = the right string is less than the left one.

    Whoops, that should have been '>0'. My mistake.

  • rbowes (cs)

    In the programmer's defense, at least he doesn't leak memory!

  • I'm not a robot (unregistered)

    Why do you write about case sensitive or insensitive? The 'i' means improved.

  • anonymous (unregistered) in reply to rbowes

    Not a WTF to me.

    Defensive programming, and maybe poor unoptimiced code. Maybe ugly on the non-standard return values.

    Other than that, boring code.

  • mav (unregistered) in reply to Cyrus
    Cyrus:
    if (p1 != NULL) delete [] p1; if (p2 != NULL) delete [] p2;

    Oh...my...word...

    I would murder him; death is not good enough though, so I would have to find something even worse.

    Err... I don't see a huge problem with checking for null before deleting something, even if it isn't needed. Maybe I missed some sarcasm here or something...

    I like how this function returns "the strings are equal" if one string or the other is null. Hilarious.

  • JL (unregistered) in reply to softwarebear
    softwarebear:
    I love the way they check for NULL at the bottom before deleting ... you mean p1 or p2 could be NULL before that ?
    Yes, they could. He initializes them to NULL, and he only reassigns them if the strings have greater than zero length (which is not the same as their pointers being NULL). It's nice that this code prevented an invalid memory access, but it looks like the function returns -1 if either string is zero-length, which doesn't seem very useful.
  • snoofle (unregistered) in reply to Cyrus
    Cyrus:
    if (p1 != NULL) delete [] p1; if (p2 != NULL) delete [] p2;

    Oh...my...word...

    I would murder him; death is not good enough though, so I would have to find something even worse.

    Forgetting that the memory was allocated for this in the first place, I don't really have a problem with sanity checking that pointers are not null before freeing whatever they point to; it's defensive programming. Assuming this code was actually necessary in the first place, down the line, after numerous modifications and the inevitable code bloat, there might be intervening code that sets one of the pointers to null, but the subsequent coder forgot to change the delete[]p at the bottom. It happens.

  • Chris (unregistered) in reply to softwarebear
    softwarebear:
    I love the way they check for NULL at the bottom before deleting ... you mean p1 or p2 could be NULL before that ?

    The best bit about that is that delete on a NULL pointer is safe and just doesn't do anything.

  • div (unregistered) in reply to Benanov
    Benanov:
    I love how it returns 0 if one of the strings is NULL. Is that standard?
    Standard behaviour of standard library functions (not syscalls) when faced with unexpected NULL pointers is to SEGV, so no, this is not standard.

    It's not even intuitively right: if either but not both pointers are NULL, we return 0 for equality. It would perhaps be more sane to treat NULL as an empty string (therefore sorting before any non-empty string). Comparison between NULL and "" can either return 0 (two empty strings are equal), or treat the NULL as being "even shorter than empty" so again sorting earlier.

    (Oh, the given code incorrectly compares "foo", "" - returning -1 if either string is non-NULL but empty. This is correct for "", "foo", but should be +1 for the other way round.)

  • Duston (unregistered) in reply to Hans
    Hans:
    Lovely. I especially like the clever use of return codes:
    And returning NULL means FILE_NOT_FOUND, right?
  • Anonymous Tart (unregistered)

    Actually came across a 'strnicmp' in our code - turned out this code originally used to need to compile on windows 3.1 using borland c++ compiler, which didnt come with a strncasecmp. The code now (only) is used on FreeBSD/GCC toolchain, so svn rm'ed :)

    For those not getting why this is a WTF, heres the gist of our impl for comparision

    int strnicmp(const char * s1, const char * s2, size_t len)
    {
        for (ssize_t r = 0, i = 0; i < len; ++i)
            if ((r = (toupper(s1[i]) - toupper(s2[i]))) != 0)
                return r;
        return 0;
    }
    

    Theres probably even a more optimal version :o

  • Ejg (unregistered) in reply to snoofle

    Not to mention that if one ot the strings is 0 length, the pointer is not allocated. thus strcmpi ("wtf", ""); would crash on "delete [] p2;" if the NULL check were left out

  • Anonymous Tart (unregistered) in reply to snoofle
    snoofle:
    Cyrus:
    if (p1 != NULL) delete [] p1; if (p2 != NULL) delete [] p2;

    Oh...my...word...

    I would murder him; death is not good enough though, so I would have to find something even worse.

    Forgetting that the memory was allocated for this in the first place, I don't really have a problem with sanity checking that pointers are not null before freeing whatever they point to; it's defensive programming. Assuming this code was actually necessary in the first place, down the line, after numerous modifications and the inevitable code bloat, there might be intervening code that sets one of the pointers to null, but the subsequent coder forgot to change the delete[]p at the bottom. It happens.

    Checking whether something is not NULL before deleting isnt called 'defensive programming', its called 'I am a retard and never learnt that deleting a NULL is perfectly sane, legal and allowed'.

    Of course you are right, it is ESSENTIAL that we allow for code bloat in strncasecmp.... I'm always finding I have to go rewrite bzero(), add in some bloat......

    You should really

  • Anonymous Tart (unregistered) in reply to snoofle
    snoofle:
    Cyrus:
    if (p1 != NULL) delete [] p1; if (p2 != NULL) delete [] p2;

    Oh...my...word...

    I would murder him; death is not good enough though, so I would have to find something even worse.

    Forgetting that the memory was allocated for this in the first place, I don't really have a problem with sanity checking that pointers are not null before freeing whatever they point to; it's defensive programming. Assuming this code was actually necessary in the first place, down the line, after numerous modifications and the inevitable code bloat, there might be intervening code that sets one of the pointers to null, but the subsequent coder forgot to change the delete[]p at the bottom. It happens.

    Checking whether something is not NULL before deleting isnt called 'defensive programming', its called 'I am a retard and never learnt that deleting a NULL is perfectly sane, legal and allowed'.

    Of course you are right, it is ESSENTIAL that we allow for code bloat in strncasecmp.... I'm always finding I have to go rewrite bzero(), add in some bloat......

  • Anonymous Tart (unregistered) in reply to Ejg
    Ejg:
    Not to mention that if one ot the strings is 0 length, the pointer is not allocated. thus strcmpi ("wtf", ""); would crash on "delete [] p2;" if the NULL check were left out

    The forum software here is quite ace and not at all susceptible to race conditions, oh no sir.

    In C++ it is fine and dandy to delete NULL. Its not defensive to check they are not NULL before deleting, it is retarded. It would be defensive to check they are not NULL before dereferencing them.

    void* foo = NULL;
    delete foo; //legal, not a bug
    delete [] foo; //legal, not a bug
    
  • IV (unregistered)

    I don't see how an empty string is equal to itself. I do see how a char* pointing to NULL isn't equal to itself, but that really isn't the same.

    Thinking on that, though, is it a derivation of "no two NULLs are alike" that not even one NULL is alike to itself? Is that philisophical in some way.

    CAPTCHA: stinky

  • dkf (unregistered) in reply to Anonymous Tart
    Anonymous Tart:
    For those not getting why this is a WTF, heres the gist of our impl for comparision ... Theres probably even a more optimal version :o
    There is indeed a more optimal version that checks for when it comes to the end of the first string as well as counting through the explicit limit, and so refuses to charge off the end of either buffer. :-)
  • dkf (unregistered) in reply to Anonymous Tart
    Anonymous Tart:
    For those not getting why this is a WTF, heres the gist of our impl for comparision ... Theres probably even a more optimal version :o
    There is indeed a more optimal version that checks for when it comes to the end of the first string as well as counting through the explicit limit, and so refuses to charge off the end of either buffer. :-)
  • dkf (unregistered) in reply to Anonymous Tart
    Anonymous Tart:
    For those not getting why this is a WTF, heres the gist of our impl for comparision ... Theres probably even a more optimal version :o
    There is indeed a more optimal version that checks for when it comes to the end of the first string as well as counting through the explicit limit, and so refuses to charge off the end of either buffer. :-)
  • wtf := 'worse than failure?' (unregistered)

    Don't really remember much of my C, so don't really care to comment on this.

    But honestly, Worse Than Failure??

    What the phuk? I'm sure everyone else has already said it, don't really feel like looking for the comments.

    cmon, change it back. worse than failure is... well.. a failure in and of itself as far as naming conventions go... Professional appeal, is that it?

    woopdie doo.

  • DaveAronson (unregistered)

    You don't get paid for rewriting libraries.

    Sure you do, if your PHB's H is P enough.

  • mkb (cs) in reply to Hans
    Hans:
    Lovely. I especially like the clever use of return codes:

    <0 = one or both of the input strings has length 0, or the left string is less than the right one.

    ==0 = one or both of the input strings is NULL, or the strings are identical.

    1 = the right string is less than the left one.

    strcmp is intended to be used for sorting strings as well, not just as a binary comparator.

    And finally, I haven't yet come across a system that doesn't have strmpi in some form or other, but for all we know this could be part of a compiler distribution, i.e. the very code that hides behind normal strcmpi; so I'm not counting that as a WTF.

    Indeed, POSIX (but not ANSI, looks like) defines stricmp and wcsicmp and ISO C++ has _stricmp, _wcsicmp, _mbsicmp, _stricmp_l, _wcsicmp_l, and _mbsicmp_l.

    Well, according to Microsoft anyway.

  • cant get 2 lines :/ (unregistered)

    best i have for now: int strcmpi(const char *s1, const char *s2) { for(; s1 != NULL && *s1 != '\0' && s2 != NULL && *s2 != '\0'; s1++,s2++) { if (toupper(*s1) < toupper(*s2)) return -1; if (toupper(*s1) > toupper(*s2)) return 1; } return 0; }

  • Thuktun (cs)

    "strcmpi" == "String Comparisons For Idiots"?

  • annoynamous (unregistered) in reply to Anonymous Tart
    int strnicmp(const char * s1, const char * s2, size_t len) { for (ssize_t r = 0, i = 0; i < len; ++i) if ((r = (toupper(s1[i]) - toupper(s2[i]))) != 0) return r; return 0; }

    The above fails to check for '\0'. :P

    char foo[WTF_STATIC_BUF_LEN], bar[WTF_STATIC_BUF_LEN];
    
    strcpy(foo, "foobar");
    strcpy(bar, "barfoo");
    
    strcpy(foo, "beer");
    strcpy(bar, "beer");
    
    your_strnicmp(foo, bar, WTF_STATIC_BUF_LEN); /* returns wrong value */
    

    Fair enough, it works correctly for most of the cases, but one will be pulling out hair the one time it does return incorrect values.

  • Eam (unregistered) in reply to div
    div:
    It would perhaps be more sane to treat NULL as an empty string.
    Shhh! If you say that too loud, you could summon... The Oracle!
  • MaW (unregistered) in reply to mkb
    strcmp is intended to be used for sorting strings as well, not just as a binary comparator.

    Yes, yes it is, but this function's ultimate return values aren't useful for sorting, unless your sorting requirements are really really weird.

    The most amusing thing here is not just that it duplicates a widely-available standard library function (which they could find a BSD-licensed implementation of and stick into their code if it really wasn't on the system), but that it does it entirely wrongly.

    It actually makes me quite glad I'm not a commercial programmer anymore.

  • mkb (cs) in reply to cant get 2 lines :/
    cant get 2 lines :/:
    best i have for now: int strcmpi(const char *s1, const char *s2) { for(; s1 != NULL && *s1 != '\0' && s2 != NULL && *s2 != '\0'; s1++,s2++) { if (toupper(*s1) < toupper(*s2)) return -1; if (toupper(*s1) > toupper(*s2)) return 1; } return 0; }

    That still returns 0 (equality) if s1 is NULL and s2 is not, or vice versa. It also returns 0 (equality) if you compare "foo" and "fooo". That's fine for strncmpi with n set to 3, but here it's not.

  • J Random Hacker (unregistered) in reply to dan
    dan:
    strcasecmp, strncasecmp

    I think you might guess the behaviour of these standard library functions by the names...

    Though to be fair, strcmp is a hell of a lot more portable than a BSDism like strcasecmp; it's part of standard C and strcasecmp isn't. I guess it depends on what your definition of 'standard library' is.

  • mkb (cs) in reply to MaW
    MaW:
    Yes, yes it is, but this function's ultimate return values aren't useful for sorting, unless your sorting requirements are really really weird.

    It's necessary if you want to use the function for both sorting AND testing for equality. An STL-style less than comparator can't test for equality! Perl sorts are the same way (the <=> and cmp operators).

  • Chris (unregistered)

    I'm currently a in a computer science 2 class in college and the sad thing is that this is exactly the kind of code crap they try to teach us to do. We rewrote dozens of different library functions for the last exam and now half the class doesn't realize that they can just use the ones in the library.
    I try to correct the professor on insanely making copies of every string for no apparent reason but he insists its better because what if "somebody" alters your string. Thats right, the evil magic program gnome (no not the X11 gui) is going to jump in my program and delete all my origin strings so I had better copy them and use the copies at the start of every function. Better yet we better check several times in the function just to make sure they're still there and have not been scampered away by some unknown force. Many modern computer science classes are being taught by the people who only learned a modern language (to this guy, even c++ is modern) just to teach it. I understand learning how the string functions work, but do we really have to prompt reinventing the wheel in our universities?

  • Tim (unregistered)
    int stricmp(char* s1, char* s2) {
        int comp;
        while (*s1 && *s2) {
            comp = toupper(*s1++) - toupper(*s2++);
            if (!comp) return comp;
        }
        return *s1 == NULL ? !(*s2 == NULL) : -1;
    }
    

    This is just off the top of my head. I'm sure it's loaded with bugs.

    I don't think it's possible to reasonably write this function in just "two lines", though.

  • Wintaki (unregistered) in reply to anonymous

    Ha! Are you stupid or was that a joke??? That is an absolutely horrible implementation! There is no need to copy the strings, they can easily be compared, case-insensitive even, in place!

    Ever heard of toupper()? if(toupper(*s1) != toupper(*s2)) return *s2-*s1

  • AngryRichard (unregistered) in reply to rbowes
    rbowes:
    In the programmer's defense, at least he doesn't leak memory!

    If the allocation for p2 throws std::bad_alloc, then he leaks p1, so no, he's still a tard.

    CAPTCHA: quake -- had nothing on Wolfenstein 3D.

  • mkb (cs) in reply to Chris
    Chris:
    I'm currently a in a computer science 2 class in college and the sad thing is that this is exactly the kind of code crap they try to teach us to do. We rewrote dozens of different library functions for the last exam and now half the class doesn't realize that they can just use the ones in the library. I try to correct the professor on insanely making copies of every string for no apparent reason but he insists its better because what if "somebody" alters your string. Thats right, the evil magic program gnome (no not the X11 gui) is going to jump in my program and delete all my origin strings so I had better copy them and use the copies at the start of every function. Better yet we better check several times in the function just to make sure they're still there and have not been scampered away by some unknown force. Many modern computer science classes are being taught by the people who only learned a modern language (to this guy, even c++ is modern) just to teach it. I understand learning how the string functions work, but do we really have to prompt reinventing the wheel in our universities?

    I fail to see the problem with learning how to implement basic functions at first...?

    The evil magic program gnome is also known as the other threads but making copies willy nilly is not the way to avoid it.

  • mkb (cs) in reply to Wintaki
    Wintaki:
    Ha! Are you stupid or was that a joke??? That is an absolutely horrible implementation! There is no need to copy the strings, they can easily be compared, case-insensitive even, in place!

    Ever heard of toupper()? if(toupper(*s1) != toupper(*s2)) return *s2-*s1

    Wow, that will compare the first two characters.

  • mkb (cs) in reply to Tim
    Tim:
    int stricmp(char* s1, char* s2) {
        int comp;
        while (*s1 && *s2) {
            comp = toupper(*s1++) - toupper(*s2++);
            if (!comp) return comp;
        }
        return *s1 == NULL ? !(*s2 == NULL) : -1;
    }
    

    This is just off the top of my head. I'm sure it's loaded with bugs.

    I don't think it's possible to reasonably write this function in just "two lines", though.

    Yes, yours will return 0 once two characters in the same position match.

    "dvqweF ewrgew" and "ksnw7Fed9 hw" will match in this case.

    Also, *s1 == NULL? Why are you comparing aa char to NULL? Do you mean *s1 == '\0' or s1 == NULL?

  • Anonymous (unregistered)

    At least he didn't do

    if (charA=='A' && charB=='a') || (charA=='a' && charB=='A')
        return true;
    else if (charA=='B' && charB=='b') || (charA=='b' && charB=='B')
        return true;
    [...]
    return false;
  • jman (unregistered) in reply to Anonymous

    Wow, that code is a real WHAT THE FUCK

  • Tim (unregistered)

    Hmm, upon further reflection, maybe I can improve my implementation slightly...

    int stricmp(char* s1, char* s2) {
        while (toupper(*s1) == toupper(*s2) && *s1 && *s2) {
            s1++;
            s2++;
        }
        return *s1 - *s2;
    }
    

    I think this version still works...

  • Tim (unregistered) in reply to mkb
    mkb:
    Tim:
    int stricmp(char* s1, char* s2) {
        int comp;
        while (*s1 && *s2) {
            comp = toupper(*s1++) - toupper(*s2++);
            if (!comp) return comp;
        }
        return *s1 == NULL ? !(*s2 == NULL) : -1;
    }
    

    This is just off the top of my head. I'm sure it's loaded with bugs.

    I don't think it's possible to reasonably write this function in just "two lines", though.

    Yes, yours will return 0 once two characters in the same position match.

    "dvqweF ewrgew" and "ksnw7Fed9 hw" will match in this case.

    Also, *s1 == NULL? Why are you comparing aa char to NULL? Do you mean *s1 == '\0' or s1 == NULL?

    Whoops, you're right, it should have been

    if (comp) return comp;
    instead.

    I thought it was legal in C to compare characters to NULL i.e. *s1 == '\0' is equivalent to *s1 == NULL, but I'll take your word for it.

  • seebs (unregistered) in reply to cant get 2 lines :/
    cant get 2 lines :/:
    best i have for now: int strcmpi(const char *s1, const char *s2) { for(; s1 != NULL && *s1 != '\0' && s2 != NULL && *s2 != '\0'; s1++,s2++) { if (toupper(*s1) < toupper(*s2)) return -1; if (toupper(*s1) > toupper(*s2)) return 1; } return 0; }

    Problems:

    1. Returns 0 for a comparison between "a" and "an".
    2. Checks for NULL over and over.

Leave a comment on “Enough String to Hang Yourself”

Log In or post as a guest

Replying to comment #:

« Return to Article