• (cs) in reply to D-Coder
    D-Coder:
    PedanticCurmudgeon:
    Someone should have brought up the "two problems" thing at least 2 hours ago. What's wrong with you people?
    We're retarded.

    And let me assure you...

    I don't know why, but i feel a strong desire to leave a reply.

    There is some-fin fishy going on here!

  • (cs) in reply to the wise guy
    the wise guy:
    aliquot:
    Wait, what?

    Same address = roommates or family. How many parents can there really be who give more than one of their kids the same name?

    Just think of two men, named John Smith who happen to have the same date of birth living in NY,NY in the same building. One in the 3rd one in the 5th floor.

    Most databases I know don't have an extra field for the number of the appartment you are living in, so what do you do now? ;-)

    So it's John Smith and John Smith, eh?

    A building in NY you say...

    Maybe... ... apply the established 9/11-pattern?

  • (cs) in reply to Suaver
    Suaver:
    swiers:
    Not many, but you often have a father and son with exactly the same name, and no legal "junior", let alone a "the third". I knew I guy who used to get hauled into jail for his father's outstanding warrants on a semi-regular basis.
    Not often a father and son would share the same date of birth, I would think....

    There's a lot of reasons a DB would not store date of birth. For example, an employer may not be legally allowed to ask. And in any case, having a distinct number assigned to a DB record can be handy just because (for example) it allows you to easily reference that record on other tables. Multi-value keys can be a pain that way...

    But yeah, my friends case was a clear example of (willful) human error - I suspect the cops were using it as an excuse to execute un-waranted searches. "Whoops, we arrested the wrong guy. And gee, his car turned up clean, so I guess we can't hold him for more than 24 hours."

  • Ollie (unregistered)

    I can't sea why weed have to keep dredging up the same mold smelly puns. I am whaling from the horror.

    Time to go back to my favorite stock market game: buy low, shell hi.

  • (cs) in reply to davey
    davey:
    This demonstrates why meaningless long integers are the best primary keys to use. As another poster pointed out, truly unique natural keys are very rare. If you think you have one, you don't. If you are sure you have one, you probably still don't.

    If you are 100% double dog dare sure that you have one and have 10 business / DB experts that agree with you, it's STILL better to use integers for performance reasons anyway, since computers cannot compare /anything/ faster than they can compare integers; certainly not composites thereof.

    Using "business keys" as primary keys is a basic error; and one of the few that can sink a project all by itself if requirements change. The design technique is fundamentally flawed.

    What a load of nonsense. So we're going to store IMSIs for customers, but with a unique constraint (because yes, they are unique and no, this requirement won't change) and a meaningless integer as the primary key instead?

    There are many cases where a meaningless integer (or varchar(64), if you've had the pleasure to work with Siebel) is the best choice for a primary key, but to state that anything else is a 'fundamentally flawed design technique' is sweeping, simplistic and incorrect.

  • Olivier (unregistered) in reply to Crumb Mudgeon

    I always knew the French were really communists. And Esperantistoj. And pretty much anyone who uses accents.

  • a highly-placed source (unregistered) in reply to snoofle
    snoofle:
    Nobody knows everything, least of all beginners, so I can understand not knowing that the "unusual" characters are effectively translated-down when using "like" (I didn't realize that either).

    Not knowing enough to ask if there's a better way than what they came up with, which looks sufficiently complicated that there just has to be a better way, is unforgivable in this day and age.

    Snoofle, I like your frequent yarns in the sidebar, but you might have missed the point.

    Collate languuage | case sensitive? | accent sensitive? Collate ... CI AI = ignore case, ignore accent

    "where A LIKE B" is lazy-hand SQL for "where A = B"

    BTW, = is ==, not =. SQL is more equal than other syntax.

  • (cs)

    So many people forget that the "L" in SQL is for LANGUAGE. There are many, many very useful operations that can be performed in SQL. This example illustrates its power.

  • (cs) in reply to no laughing matter
    no laughing matter:
    the wise guy:
    aliquot:
    Wait, what?

    Same address = roommates or family. How many parents can there really be who give more than one of their kids the same name?

    Just think of two men, named John Smith who happen to have the same date of birth living in NY,NY in the same building. One in the 3rd one in the 5th floor.

    Most databases I know don't have an extra field for the number of the appartment you are living in, so what do you do now? ;-)

    So it's John Smith and John Smith, eh?

    A building in NY you say...

    Maybe... ... apply the established 9/11-pattern?

    Hang on, I can see where this is going ...

    Two bad boys bring down the entire government database system by registering with the same name, date of birth and addresses in the same apartment building. The fact that this compromises the critera for a unique primary key causes a key component of the whatever-system to crash. This in turn brings down etc. etc. Now go write the movie.

  • (cs) in reply to Ollie
    Ollie:
    I can't sea why weed have to keep dredging up the same mold smelly puns. I am whaling from the horror.

    Time to go back to my favorite stock market game: buy low, shell hi.

    Selfish!

  • a highly-placed source (unregistered) in reply to xorsyst
    xorsyst:
    Vombatus:
    Robbert:
    I don't think this is very wtf-worthy. If I hadn't accidentally used a accent-insensitive primary key in a project that had both an accented and non-accented version of the same primary key I would have never found out about this feature (until now of course, that's why TDWTF is awesome).

    Some would argue that having anything other than long integers as a primary key is wrong.

    I prefer UUIDs as primary keys. They are automatically unique across tables, can be generated outside the database if desired, and avoid primary key clashes when using multi-master database replication.

    What a fine idea.

    GUIDs are 32 bytes long, and because of the way they are generated, are non-sequential. So adding 1 or a million rows to your table can force a table sort; in some cases, slower than a row-by-row rebuild.

    Plenty of rant on the googlenetwebs: "Why are GUIDs so fucking stupid as PKEYS?"

  • guest (unregistered)

    Someone has to this comment.

  • Lt. Data (unregistered)

    When I first saw a database with GUID IDs I thought WTF?

    Then when I had to merge multiple relational structures from distributed clones for central statistical analysis I thought genius!

    But the best part about GUID IDs is no one is tempted to display them to users, or ask customers to recite them over the phone. This automatically satisfies the "meaningless" part of "permanent, meaningless and unique" that should apply to all primary keys. And of course the U takes care of the unique bit.

  • Cyt (unregistered) in reply to Suaver
    Suaver:
    Not often a father and son would share the same date of birth, I would think....

    Yo - Zuneeeeeesiiiiis!!!

  • (cs) in reply to Toon
    Toon:
    Vombatus:
    Robbert:
    I don't think this is very wtf-worthy. If I hadn't accidentally used a accent-insensitive primary key in a project that had both an accented and non-accented version of the same primary key I would have never found out about this feature (until now of course, that's why TDWTF is awesome).

    Some would argue that having anything other than long integers as a primary key is wrong.

    Some would argue that using long integers as a primary key, when any other field is sufficient for uniquely defining a row, is being like those people who click "apply" before they click "OK", just to be sure.

    TRWTF here is applications which require you to enter "Apply" before pressing "OK" or you lose your changes.

  • (cs) in reply to Brian White
    Brian White:
    Richard:
    Neither query is SARGable [1]; any index on the name or surname columns will be ignored. As the table grows, the query will get slower, which will no doubt be the fault of the database, and used as an argument to go NoSQL [2].

    The simple solution is to change the collation on the name columns and drop the "collate ..." clauses from the query.

    If that won't work for some reason, use a computed column with the correct collation instead. [3]

    [1] http://en.wikipedia.org/wiki/Sargable [2] http://en.wikipedia.org/wiki/NoSQL [3] http://www.sqlteam.com/article/using-indexed-computed-columns-to-improve-performance

    You definitely want to preserve your full data in the actual column, and use a computed column for the lossy collation if you plan on showing the accent characters.

    Illiteracy is shining in this post.

  • ronpaii (unregistered) in reply to mirk

    I can’t think of any business information that is static. It is fun working with a widely used business application that displays company information sorted by the primary key. The users attempt to enter key values to have the sort show the Company names sorted. Then when the company changes its name, the users want to change the key. After about 10 years this application added the ability to sort by other columns, although the primary key is the default.

  • Name is Immaterial (unregistered) in reply to Oded
    I do try to reuse my fish, whenever possible.
    Good idea. Leftover cod makes for a great collation.
  • (cs) in reply to QJo
    QJo:
    Ollie:
    I can't sea why weed have to keep dredging up the same mold smelly puns. I am whaling from the horror.

    Time to go back to my favorite stock market game: buy low, shell hi.

    Shellfish!

    FTFY

  • Jupiter (unregistered) in reply to Toon
    Toon:
    Toon:
    Some would argue that using long integers as a primary key, when any other field is sufficient for uniquely defining a row, is being like those people who click "apply" before they click "OK", just to be sure.

    A case to illustrate my point while I'm up here on my soapbox: people might be defined by a long integer, because there can be two people living on the same address with the same name, sex and date of birth. That, as I sometimes explain to people who say they're "just a number", is why we have social security numbers, employee numbers, etc.

    (captcha: vereor. it sounds like a gurgle!)

    more to the point is that when they move house you don't need to update every single piece of information you have about them in your database.

  • (cs) in reply to DavidTC
    DavidTC:
    Erm, let's assume I don't know anything about collation...why the hell would I use SQL functions? Hell, the code _itself_ asks why they're doing it in SQL. You do the replacement outside the SQL, and pass that variable in. Uh...duh?

    Also, in what universe would this want or need a regexp? Why the fuck would you replace single characters using a regexp?

    To replace characters, you use something like php's strtr. I'm sure most languages have something like this. In fact, there's a example of stripping accents in the PHP documentation for that function:

    $txt = strtr($txt, "\xe1\xc1\xe0\xc0\xe2\xc2\xe4\xc4\xe3\xc3\xe5\xc5". "\xaa\xe7\xc7\xe9\xc9\xe8\xc8\xea\xca\xeb\xcb\xed". "\xcd\xec\xcc\xee\xce\xef\xcf\xf1\xd1\xf3\xd3\xf2". "\xd2\xf4\xd4\xf6\xd6\xf5\xd5\x8\xd8\xba\xf0\xfa\xda". "\xf9\xd9\xfb\xdb\xfc\xdc\xfd\xdd\xff\xe6\xc6\xdf\xf8", "aAaAaAaAaAaAacCeEeEeEeEiIiIiIiInNo". "OoOoOoOoOoOoouUuUuUuUyYyaAso");

    Wow, it's rocket science.

    You mean "in the user-submitted comments on the documentation".

    Sort of like saying "there's a mistake in that textbook on economics" when the mistake is in the notes someone wrote in the margin. Yes, the mistake is in that textbook, but that's not what that means...

  • (cs) in reply to Jupiter
    Jupiter:
    Toon:
    Toon:
    Some would argue that using long integers as a primary key, when any other field is sufficient for uniquely defining a row, is being like those people who click "apply" before they click "OK", just to be sure.

    A case to illustrate my point while I'm up here on my soapbox: people might be defined by a long integer, because there can be two people living on the same address with the same name, sex and date of birth. That, as I sometimes explain to people who say they're "just a number", is why we have social security numbers, employee numbers, etc.

    (captcha: vereor. it sounds like a gurgle!)

    more to the point is that when they move house you don't need to update every single piece of information you have about them in your database.

    Or when they change birthdays

  • (cs) in reply to PedanticCurmudgeon
    PedanticCurmudgeon:
    QJo:
    Ollie:
    I can't sea why weed have to keep dredging up the same mold smelly puns. I am whaling from the horror.

    Time to go back to my favorite stock market game: buy low, shell hi.

    Shellfish!

    FTFY

    Close: what was intended was "Sell fish".

  • Toon (unregistered) in reply to The poop of DOOM
    The poop of DOOM:
    Or when they change birthdays

    Hehe, good one. However, people can, and actually do, change genders. Come to think of it, it's kind of surprising I've never heard of a WTF story about that!

    captcha: bene (molto molto)

  • not Godfried (unregistered) in reply to aliquot

    I once knew a Gottfried Gottfried (the 17th) who was disowned by his family for not naming his 1st born son Gottfried.

    Also, I learned a lot about db collation today and plan to incorporate all the great ideas into my current db project. =D

  • mkl (unregistered) in reply to et

    Also, if the snippet comes from the times in which they didn't, probably didn't have those handy collate statement.

  • (cs) in reply to sabik
    sabik:
    Some would argue that having anything other than long integers as a primary key is wrong.
    And they should be auto-generated and not changeable by the user.

    And they should come from non-overlapping ranges for different tables.

    Why in the hell would this even be a remotely good idea?

    "We can only have 4 billion rows."

    "4 billion users?"

    "No, 4 billion (users + orders + settings + ...)"

  • pc (unregistered) in reply to Toon

    excellent response.

  • Patrick (unregistered) in reply to frits
    /* HACK! Being SQL, there's no concept of RegEx, so we have to this horrible hack */
    "I just completely your horrible hack. Is that better?" - Bob.
  • Patrick (unregistered) in reply to Bob
    Bob:
    Anyone who thinks regexs are a solution to character set issues deserves a good shooting. I can understand not knowing the answer, but thinking regex is the solution is insane.

    Yes, I know: Anyone who thinks regexs are a solution deserves a good shooting. FTFY.

    Hey, that's funny, you sound just like the "anyone who thinks SQL is a solution deserves a good shooting" guys. Just because you don't know how to use it right, doesn't mean it's bad.

  • (cs) in reply to Patrick
    Patrick:
    Bob:
    Anyone who thinks regexs are a solution to character set issues deserves a good shooting. I can understand not knowing the answer, but thinking regex is the solution is insane.

    Yes, I know: Anyone who thinks regexs are a solution deserves a good shooting. FTFY.

    Hey, that's funny, you sound just like the "anyone who thinks SQL is a solution deserves a good shooting" guys. Just because you don't know how to use it right, doesn't mean it's bad.
    Except for regexes

  • Sadly, I've Seen This (unregistered) in reply to Richard
    Richard:
    Neither query is SARGable [1]; any index on the name or surname columns will be ignored. As the table grows, the query will get slower, which will no doubt be the fault of the database, and used as an argument to go NoSQL [2].

    The simple solution is to change the collation on the name columns and drop the "collate ..." clauses from the query.

    If that won't work for some reason, use a computed column with the correct collation instead. [3]

    [1] http://en.wikipedia.org/wiki/Sargable [2] http://en.wikipedia.org/wiki/NoSQL [3] http://www.sqlteam.com/article/using-indexed-computed-columns-to-improve-performance

    Most modern RDBMS allow these computed indexes already as CREATE INDEX syntax. So, the trick in [3] is yet another a hack.

    For example, PostgreSQL allows indexes on a function:

    CREATE INDEX uppercase_person_idx ON person_table (UPPER(last_name));

    NoSQL is for very large map-reducible sets. You index problem is not handled any better by such a a DB.

  • (cs) in reply to Franky
    Franky:
    haha, those are the best ppl, hating something but not actually bothering to check their details first

    captcha: inhibeo - i hinder

    Bigots rarely do. It's so much easier to hate, when the facts are invented to support the prejudice.

  • marcus (unregistered) in reply to Olivier

    Yes, esperantistoj are komunistoj. Never trust anyone who uses more than 26 letters (or accents).

    Eĥoŝanĝo ĉiuĵaŭde al ĉiuj!

  • anon (unregistered) in reply to Severity One
    Severity One:
    Oh yeah, no cod reuse.
    It's people like these, that don't reuse their cods, that put this fish on the brink of extinction.

    Seeing as this is a database, perhaps they meant "no codd reuse".

    Sorry. I'll get back in my box

  • Frank (unregistered) in reply to Johnny come lately
    Johnny come lately:
    A dolphin is not a fish.

    What the hell is it, a bird?

  • Richard (unregistered) in reply to Sadly, I've Seen This
    Sadly:
    NoSQL is for very large map-reducible sets. You index problem is not handled any better by such a a DB.

    I know that; you know that; the people who wrote the original query probably don't. They'll see that their query is running like a three-legged dog in quicksand, blame SQL, and demand that the app be moved the NoSQL because it's more "enterprise-y".

    When NoSQL fails to solve their problems, they'll probably blame Bob for not knowing what he's doing, and insist that the company hire their high-school drop-out nephews who "know all about computers 'n' stuff".

  • surrogate key (unregistered) in reply to Severity One
    Severity One:
    davey:
    This demonstrates why meaningless long integers are the best primary keys to use. As another poster pointed out, truly unique natural keys are very rare. If you think you have one, you don't. If you are sure you have one, you probably still don't.

    If you are 100% double dog dare sure that you have one and have 10 business / DB experts that agree with you, it's STILL better to use integers for performance reasons anyway, since computers cannot compare /anything/ faster than they can compare integers; certainly not composites thereof.

    Using "business keys" as primary keys is a basic error; and one of the few that can sink a project all by itself if requirements change. The design technique is fundamentally flawed.

    What a load of nonsense. So we're going to store IMSIs for customers, but with a unique constraint (because yes, they are unique and no, this requirement won't change) and a meaningless integer as the primary key instead?

    There are many cases where a meaningless integer (or varchar(64), if you've had the pleasure to work with Siebel) is the best choice for a primary key, but to state that anything else is a 'fundamentally flawed design technique' is sweeping, simplistic and incorrect.

    Not only that, but the most suitable choice of surrogate key may not be an integer! A UUID (GUID) as mentioned above, a strong (i.e. highly collision-resistant) hash, or a suitably high-resolution timestamp (I know that we use insert timestamp + a numeric partition identifier as a surrogate key on some tables in the app I work on for my day job) may be a better choice than an auto-incrementing integer surrogate key for some applications. It all depends on the dataset you expect to handle, as with any other decision about primary keys.

  • Anders (unregistered) in reply to wonk
    wonk:
    Clearly, they should have created their own RegEx function in CLR. That way, they would have lots more WTfs.

    Pretty easy. Just use C# RegEx function and expose it in a CLR function.

    And no, I do not have that in production, but we "needed" it to track down a problem one day.

  • Fritz and Sauce (unregistered) in reply to George Foreman
    George Foreman:
    aliquot:
    How many parents can there really be who give more than one of their kids the same name?

    You say that like it's a bad thing.

    The lady from <insert appropriate location, let's default to Texas> who had 6 boys, all named Adam. When asked why, she replied "It's easy, when food is ready, I simply call 'Adam, dinner's ready' and they all come running". The question begged to be asked "What if you need to talk to a specific one?", "Then I just use their surname..."

  • Josef (unregistered) in reply to davey
    davey:
    Toon:
    That, as I sometimes explain to people who say they're "just a number", is why we have social security numbers, employee numbers, etc.

    Even social security numbers aren't good primary identifiers; they can be (and are) reused, and not everyone has one - even if you include only US citizens. Makes me cringe thinking how many incorrectly designed systems there are that use SSNs as primary keys (not to mention the inability to store them securely; encrypted primary keys anyone?)

    This demonstrates why meaningless long integers are the best primary keys to use. As another poster pointed out, truly unique natural keys are very rare. If you think you have one, you don't. If you are sure you have one, you probably still don't.

    If you are 100% double dog dare sure that you have one and have 10 business / DB experts that agree with you, it's STILL better to use integers for performance reasons anyway, since computers cannot compare /anything/ faster than they can compare integers; certainly not composites thereof.

    Using "business keys" as primary keys is a basic error; and one of the few that can sink a project all by itself if requirements change. The design technique is fundamentally flawed.

    Uhm...bits? or booleans, perhaps?

    Captcha: incassum...

  • Smart Arse (unregistered) in reply to the wise guy
    the wise guy:
    aliquot:
    Wait, what?

    Same address = roommates or family. How many parents can there really be who give more than one of their kids the same name?

    Just think of two men, named John Smith who happen to have the same date of birth living in NY,NY in the same building. One in the 3rd one in the 5th floor.

    Most databases I know don't have an extra field for the number of the appartment you are living in, so what do you do now? ;-)

    Are they both John William Smith, or is one John Peter Smith?

  • Suaver (unregistered) in reply to swiers
    swiers:
    Suaver:
    swiers:
    Not many, but you often have a father and son with exactly the same name, and no legal "junior", let alone a "the third". I knew I guy who used to get hauled into jail for his father's outstanding warrants on a semi-regular basis.
    Not often a father and son would share the same date of birth, I would think....

    There's a lot of reasons a DB would not store date of birth. For example, an employer may not be legally allowed to ask. And in any case, having a distinct number assigned to a DB record can be handy just because (for example) it allows you to easily reference that record on other tables. Multi-value keys can be a pain that way...

    But yeah, my friends case was a clear example of (willful) human error - I suspect the cops were using it as an excuse to execute un-waranted searches. "Whoops, we arrested the wrong guy. And gee, his car turned up clean, so I guess we can't hold him for more than 24 hours."

    That may well be, but you're picking and choosing what you argue here...

    the original post (included in your post) said:

    Toon:
    there can be two people living on the same address with the same name, sex and date of birth.
    There's probably also some scope for incorrect entry and 1958 to become 1985 thus making a father and son share a birthday (and a million other things to boot)....
  • Johnny Come Lately (Clone) (unregistered) in reply to Frank
    Frank:
    Johnny come lately:
    A dolphin is not a fish.

    What the hell is it, a bird?

    A mammal?

  • duotrei (unregistered) in reply to a highly-placed source
    a highly-placed source:
    snoofle:
    Nobody knows everything, least of all beginners, so I can understand not knowing that the "unusual" characters are effectively translated-down when using "like" (I didn't realize that either).

    Not knowing enough to ask if there's a better way than what they came up with, which looks sufficiently complicated that there just has to be a better way, is unforgivable in this day and age.

    Snoofle, I like your frequent yarns in the sidebar, but you might have missed the point.

    Collate languuage | case sensitive? | accent sensitive? Collate ... CI AI = ignore case, ignore accent

    "where A LIKE B" is lazy-hand SQL for "where A = B"

    BTW, = is ==, not =. SQL is more equal than other syntax.

    re ["where A LIKE B" is lazy-hand SQL for "where A = B" ] ORLY?

    A) How is writing more characters being lazy B) They don't do the same thing

  • the beholder (unregistered) in reply to PedanticCurmudgeon
    PedanticCurmudgeon:
    me_again:
    fisherman:
    benmurphyx:
    These people always flounder around for a solution
    Looks like we not be gettin many bite today...
    If I were on the hook for this I would have been like a fish out of water.
    I would expect someone to carp about these puns.
    Oh come on, it was just a matter of tuna-ing the DB...
  • big picture thinker (unregistered)

    The real WTF is why wasn't the table created with that collation scheme in the first place? Why would you want to run a collation on every query?

  • typos (unregistered) in reply to The poop of DOOM
    The poop of DOOM:
    Jupiter:

    more to the point is that when they move house you don't need to update every single piece of information you have about them in your database.

    Or when they change birthdays

    Someone entered my birthday wrong when I signed up for a bank account. Now I have to fax them all sorts of stuff to get it fixed.

    (validus: yes it is.)

  • Fishing (unregistered) in reply to Johnny Come Lately (Clone)

    Dolphin is a common name for the Mahi-mahi fish: http://en.wikipedia.org/wiki/Mahi-mahi

  • sabik (unregistered) in reply to pkmnfrk
    pkmnfrk:
    sabik:
    Some would argue that having anything other than long integers as a primary key is wrong.
    And they should be auto-generated and not changeable by the user.

    And they should come from non-overlapping ranges for different tables.

    Why in the hell would this even be a remotely good idea?

    "We can only have 4 billion rows."

    "4 billion users?"

    "No, 4 billion (users + orders + settings + ...)"

    If you're running that close to the limit, use a longer integer. Or UUID/GUID, as someone else suggested.

Leave a comment on “The Anti-SQL Coalition ”

Log In or post as a guest

Replying to comment #:

« Return to Article