• Shadow_x99 (unregistered)

    I have only one thing to say... SHAME!!

  • (cs)

    Sounds like the original developer skimmed over a database design book, learned what a key was, and noticed the test data he had, the address was always unique.

    But this mistake is so short sighted it's hardly forgivable.  Personally, I wish the guy was still there, I'd love to see him get cornered by the current dev team, the marketers and the VP.

  • (cs)

    You should need a license to do database design...this guy obviously belongs to the "no artificial primary keys EVER" club.  Hell, at least he had a primary key...I'm dealing with a 20 table database right now that has NONE.

    And I simply must see a screenshot of the interface.  Hey Alex, how about creating another forum called the "Interface Hall of Shame"?

  • (cs)

    [I]This developer probably had a history in city planning... That would explain all the numbered streets in New York. Hmm [^o)] 

  • (cs) in reply to Taipan
    Taipan:

    [I]This developer probably had a history in city planning... That would explain all the numbered streets in New York. Hmm [^o)] 

    ... Ehrmf... I will never use Emoticons in a post again... I promise
  • (cs) in reply to Taipan

    The worst thing is that he added the uniqueness to the front of address. I would have added whitespace to the end of the address

  • (cs)

    So we don't cover old ground, I'd like to reference http://www.thedailywtf.com/forums/2/49490/ShowPost.aspx

    There are arguments both ways about using natural keys, and that previous thread covers them well, no need to repeat it.

    This is a good argument for the always use an artificial key crowd, though there may in fact be a good natural key, that the other side can point out.

  • (cs) in reply to hank miller

    hank miller:

    This is a good argument for the always use an artificial key crowd, though there may in fact be a good natural key, that the other side can point out.

    Presumably there is a client_id in the table... he must be linking to the client somehow, right?... that's a good primary key for this table.

    If clients can have several addresses, the primary could be client_id+addr_type or something like that.

  • uncool (unregistered) in reply to hank miller

    I'm still just a hobbiest (so please forgive stupidity)

    but wouldn't a phone number be a (always) unique natural key?

  • FoShizzle (unregistered) in reply to uncool

    It depends on the application.  What if you have roommates using the same application that also share a phone number?

  • (cs) in reply to uncool
    Anonymous:

    I'm still just a hobbiest (so please forgive stupidity)

    but wouldn't a phone number be a (always) unique natural key?

    That would really depend on the business requirements. Consider that, instead of creating a Customer Number for new customers, you use their Phone Number. What happens when they change phone numbers? You will have to create a new account for them, or they will use their old account with their old phone number.

    If you're a telphone services compnay, this really isn't an issue, since if they change phone numbers, they change accounts.

  • My Second Post Here (unregistered) in reply to pjabbott
    pjabbott:
    You should need a license to do database design...this guy obviously belongs to the "no artificial primary keys EVER" club.  Hell, at least he had a primary key...I'm dealing with a 20 table database right now that has NONE.

    And I simply must see a screenshot of the interface.  Hey Alex, how about creating another forum called the "Interface Hall of Shame"?


    I have a graphic artist friend who has always wanted to make a "Museum of Programmer Art" online.  Basically an interface hall of shame, where programmers provided the artwork for an interface (or, say, a game) and couldn't understand why their bosses felt the need to get a professional artist to fix it up.

  • Chris (unregistered)

    Developer: That generates an error - the user then rings me up and I generate the customer record.

    Ahhh yes, I like to write code where the user has to call me up when he hits an error. Makes me feel important, and job security and all...

  • (cs) in reply to mlathe
    mlathe:

    hank miller:

    This is a good argument for the always use an artificial key crowd, though there may in fact be a good natural key, that the other side can point out.

    Presumably there is a client_id in the table... he must be linking to the client somehow, right?... that's a good primary key for this table.

    If clients can have several addresses, the primary could be client_id+addr_type or something like that.


    Er, the whole point is that there's no "ids" anywhere. If he was halfway bright enough to think up an id column, there's no way he'd choose something as lame as he did for the key.

    I wonder what the key was for orders... aha! Price! Since it's always negotiated, and charged for different numbers of users, it's always going to be different, right? =D
  • (cs) in reply to uncool
    uncool:

    I'm still just a hobbiest (so please forgive stupidity)

    but wouldn't a phone number be a (always) unique natural key?

    How is the phone number any different than the house address? Typically each house only has one land line, and most people I know won't give out their cell number when asked what their home phone number is. Usually the companies that keep my info have a separate entry for cell number.

    What this guy should have done was use something truly unique like the person's height or favorite color. The primary key for my customer record would be "periwinkle blue". Sorry everyone, you'll have to pick a different color.

  • Rob (unregistered) in reply to uncool
    Anonymous:

    I'm still just a hobbiest (so please forgive stupidity)

    but wouldn't a phone number be a (always) unique natural key?



    Possibly.  Probably, even.  But consider a case where two people from the same household have an account (husband/wife, parent/child, two siblings, roomates, whatever)  If they share the house line then you're SOL.
  • (cs) in reply to uncool

    No, a phone number is not unique enough. Both me and my roomate go to the same hair cutter and they key things by phone number.  I always have to tell them that no, I don't like my hair cut as style 1, that is what my roommate likes. I like style 4.  Also when I go to a local box store for electronics, they ask my phone number.  They then procede to call me by some one elses name who had the number before me.  I don't correct them because I don't want them collecting my shopping habbets under my name.

    A customer card with a unique number would be best in the above cases, although not practical.

  • (cs) in reply to uncool
    Anonymous:

    I'm still just a hobbiest (so please forgive stupidity)

    but wouldn't a phone number be a (always) unique natural key?



    Er, not necessarily.  What if two roommates order from the same company? Different names, but same address and phone number.  Also, what if two people from the same company place an order and they both give the general office number instead of their extension?  I do this all the time so the receptionist can deal with the telemarketers ;-)

    Generally speaking, it is very, very hard to come up with a natural PK when dealing with people.  Name obviously wouldn't work.  SSN might work, but then you might deal with people who are here on visas, international orders, or people who flat out don't want to give you that kind of info.  About the only marginally successful thing I have seen is e-mail address (places like amazon.com use email address as your login), but then you have cascade update issues if they wish to change it.  Best to just stick to an identify field.
  • (cs)

    Well, for the interface, maybe he was just color blind.

    I know a color blind guy who worked on a traffic light management system (with a graphical interface to display their state). Apparently his colleagues had a lot of fun "helping" him to get the colors right.
    The client was kind of wondering why the states were displayed as purple, brown and blue though.

  • (cs) in reply to pjabbott
    pjabbott:
    You should need a license to do database design...


    Amen to that...
  • (cs) in reply to DonMcNellis

    And, as the database grows, an integer or guid is going to be significantly faster for joins than a char*...

  • (cs) in reply to mlathe

    mlathe:
    The worst thing is that he added the uniqueness to the front of address. I would have added whitespace to the end of the address

    While that would be the sane thing to do, I'd almost bet that this was an MSAccess database.  It automatically trims trailing space characters from text fields.

    What I can't figure out is why this database wasn't under scrutiny as soon as the first mailing failed.  Wasn't anyone paying attention?

  • (cs) in reply to aikimark
    aikimark:

    mlathe:
    The worst thing is that he added the uniqueness to the front of address. I would have added whitespace to the end of the address

    While that would be the sane thing to do, I'd almost bet that this was an MSAccess database.  It automatically trims trailing space characters from text fields.

    What I can't figure out is why this database wasn't under scrutiny as soon as the first mailing failed.  Wasn't anyone paying attention?

    meant to be a joke

  • (cs) in reply to hank miller

    hank miller:
    So we don't cover old ground, I'd like to reference http://www.thedailywtf.com/forums/2/49490/ShowPost.aspx

    There are arguments both ways about using natural keys, and that previous thread covers them well, no need to repeat it.

    This is a good argument for the always use an artificial key crowd, though there may in fact be a good natural key, that the other side can point out.

    I would just like to point that I always use artifiicial keys (typically identity or Guid) and never had a problem using them.  Whereas in the rare times in the past where I chose natural keys or was forced into using natural keys, I had problems.

  • (cs) in reply to Alex Papadimoulis
    Alex Papadimoulis:
    Anonymous:

    I'm still just a hobbiest (so please forgive stupidity)

    but wouldn't a phone number be a (always) unique natural key?

    That would really depend on the business requirements. Consider that, instead of creating a Customer Number for new customers, you use their Phone Number. What happens when they change phone numbers? You will have to create a new account for them, or they will use their old account with their old phone number.

    If you're a telphone services compnay, this really isn't an issue, since if they change phone numbers, they change accounts.



    I have worked many years on healthcare and insurance software.  Almost every data guru thinks that "the social" (ala SSN - Social Security Number) is unique.  I don't know how many times I've had to battle that.

    I also worked at a certain (in)famous income tax preparation company and this was the primary key to much of the information stored on their client systems.  Granted, their ancient software was very resilient to the shortcomings to DOS, Netware networking and Btrieve database concurrency (and so were the mainframes they talked to)... yet because of the tie to SSN it was very interesting doing tax forms for non-resident aliens (also known in redneck as "illegals"), which we used to get a lot of in South Texas, and the SSA had already established magic SSNs for those individuals so we could complete the forms... however we had to do it with pencil and photocopier.

    When dealing with a gigantic amount of volitile data (the structure changes drastically from year to year), it does make sense to store based on social if you were using a disk-based method of storing tax returns... which is what we did.  For example, if you keyed in social "061-54-8856" to enter the tax return, the Watcom C program would go down the file structure like so...

    [RETURNS]
        -[56]   (last two digits of social)
          |
          -[061] (first three of social)
               |
               - 061548856.IRS (formatted tax file)

    That worked around file system limitations fairly well and made retrieving returns unbelievably fast.  A seperate RDBMS was used to store/sort and aggregate the personal information on the returns which were not subject to constand DDL change to allow searches and aggregation, but the tax files (which did change format constantly) were worked on by the tax preparers and could also be transformed to/from other formats besides the IRS Image File format.
  • James (unregistered) in reply to aikimark

    Uhm, no.  That would not be the sane thing to do.  The only thing to do is to add a truly unique identifier that doesn't rely on something as cheesy as adding spaces to the address.

    IMHO.

  • (cs) in reply to mlathe
    mlathe:
    aikimark:

    mlathe:
    The worst thing is that he added the uniqueness to the front of address. I would have added whitespace to the end of the address

    While that would be the sane thing to do, I'd almost bet that this was an MSAccess database.  It automatically trims trailing space characters from text fields.

    What I can't figure out is why this database wasn't under scrutiny as soon as the first mailing failed.  Wasn't anyone paying attention?

    meant to be a joke



    Don't worry.  I got it.

  • L-user (unregistered) in reply to James

    This guy probably treats the users like idiots!

  • (cs)

    I don't really want a screenshot of the application, I have a weak stomach. However, I would be interested in seeing the error generated by the program. I picture something along the lines of...

    === Cadastral Error !

    There has been a geographic conflict with another user on the network. Please move to another place or call DeveloperNameHere at PhoneNumberHere

    [Yes] [No]

    PS: No, I did not mean to say "catastrophic".

  • Stephen (unregistered)

    i know this maybe bad practice, but whenever i did relational databases, i always used uid autonumber as my primary unique keys.  so uid/pid/etc would link and work nicely.  but i really never had any databases more than a few relations so i dont think it was that bad in design.  please let me know if uid would been appropiate here, as i learn by example expecially bc seeing awsful code design implemations.

  • Kevin (unregistered)

    I wish I knew the name of that developer because I think he found his way to the project I'm currently working on. He's our 'project manager'. I use the term reluctantly. One of his nicknames is "Donkie".  One of his MANY claims to shame is that he stores dates as intergers (int). Just try and get records from one date to another with that kind of design. Best part is that the project he's mangling is a financial application that manages more than $35 million in assets.

  • (cs) in reply to LuciferSam

    <font size="1" style="font-family: verdana;">

    LuciferSam:
    No, a phone number is not unique enough. Both me and my roomate go to the same hair cutter and they key things by phone number.  I always have to tell them that no, I don't like my hair cut as style 1, that is what my roommate likes. I like style 4.


    <font size="2">The real WTF is that your haircutter doesn't just ask you what you want done when you go there.  And that they number their hairstyles.

    </font>
    </font>

  • (cs) in reply to Kevin
    Anonymous:
    ...One of his MANY claims to shame is that he stores dates as intergers (int)....

    Of course integers would not fit that bill, but if you can be sure to use a conversion method that will behave the same on all the platforms your app is going to run, dates converted to and stored as numbers (like OLE automation dates) that are precise enough could be useful in some situations. Yes, you end up with unfathomable and uneditable values in your tables, but you also avoid a lot of trouble arising from regional issues.

    If you are careful.

  • (cs) in reply to ishmaeel

    Great. My very first quotation attempt and it is botched. I thought I was immune to that.

  • Arachnid (unregistered) in reply to Kevin
    Anonymous:
    I wish I knew the name of that developer because I think he found his way to the project I'm currently working on. He's our 'project manager'. I use the term reluctantly. One of his nicknames is "Donkie".  One of his MANY claims to shame is that he stores dates as intergers (int). Just try and get records from one date to another with that kind of design. Best part is that the project he's mangling is a financial application that manages more than $35 million in assets.


    If this is the worst he does, you're lucky. Assuming he's using UNIX timestamps, you can still do comparisons (and hence filter by a daterange) quite easily. Most DBMSes even include functions to convert to and from UNIX timestamps.  
  • RobK (unregistered) in reply to uncool
    Anonymous:

    I'm still just a hobbiest (so please forgive stupidity)

    but wouldn't a phone number be a (always) unique natural key?



    Phone numbers and addresses change. They are not good primary keys.
  • Kevin (unregistered) in reply to ishmaeel

    I did forget to mention that he's stored dates as strings in some tables of the same db. No kidding. Needless to say, our developers have to code in conversions for all these disparate types. They're going mental.

  • (cs) in reply to mlathe

    Good...

  • (cs) in reply to aikimark
    aikimark:

     

    While that would be the sane thing to do, I'd almost bet that this was an MSAccess database.  It automatically trims trailing space characters from text fields.

    If it were Access, at least it would offer the designer to create the infamous "artificial" key.

  • (cs) in reply to mlathe
    mlathe:
    aikimark:

    mlathe:
    The worst thing is that he added the uniqueness to the front of address. I would have added whitespace to the end of the address

    While that would be the sane thing to do, I'd almost bet that this was an MSAccess database.  It automatically trims trailing space characters from text fields.

    What I can't figure out is why this database wasn't under scrutiny as soon as the first mailing failed.  Wasn't anyone paying attention?

    meant to be a joke

    Good... [8-)]

  • (cs) in reply to Kevin
    Anonymous:
    I did forget to mention that he's stored dates as strings in some tables of the same db. No kidding. Needless to say, our developers have to code in conversions for all these disparate types. They're going mental.

    Ok. now, "ouch!".

  • (cs) in reply to TankerJoe
    TankerJoe:
    mlathe:
    aikimark:

    mlathe:
    The worst thing is that he added the uniqueness to the front of address. I would have added whitespace to the end of the address

    While that would be the sane thing to do, I'd almost bet that this was an MSAccess database.  It automatically trims trailing space characters from text fields.

    What I can't figure out is why this database wasn't under scrutiny as soon as the first mailing failed.  Wasn't anyone paying attention?

    meant to be a joke



    Don't worry.  I got it.

    trying to be witty

  • (cs) in reply to mlathe
    mlathe:
    TankerJoe:
    mlathe:
    aikimark:

    mlathe:
    The worst thing is that he added the uniqueness to the front of address. I would have added whitespace to the end of the address

    While that would be the sane thing to do, I'd almost bet that this was an MSAccess database.  It automatically trims trailing space characters from text fields.

    What I can't figure out is why this database wasn't under scrutiny as soon as the first mailing failed.  Wasn't anyone paying attention?

    meant to be a joke



    Don't worry.  I got it.

    trying to be witty



    thats what you get
  • (cs) in reply to TankerJoe
    TankerJoe:
    mlathe:
    TankerJoe:
    mlathe:
    aikimark:

    mlathe:
    The worst thing is that he added the uniqueness to the front of address. I would have added whitespace to the end of the address

    While that would be the sane thing to do, I'd almost bet that this was an MSAccess database.  It automatically trims trailing space characters from text fields.

    What I can't figure out is why this database wasn't under scrutiny as soon as the first mailing failed.  Wasn't anyone paying attention?

    meant to be a joke



    Don't worry.  I got it.

    trying to be witty



    thats what you get

    next time i'll write

    The worst thing is that he added the uniqueness to the front of address. I would have added whitespace to the end of the address <insert laughter>

  • (cs)

    Clearly the problem was the result of a design flaw with the Post Office's addressing system, I hate it when programmers get blamed for elements beyond their control.



    I worked with a web system that required the user to use the back button when an error occured, which would then show the "Page Expired" error page.  After much hassling the original vendor, they finally said it was a flaw in Netscape, and not their problem (back when IE was barely used). 

    The guy should have taken it up with the Supreme Court to sue the USPS into using properly unique 1st line addresses, then he could be the Patron Saint of Natural Keys.

  • (cs) in reply to Stephen
    Anonymous:
    i know this maybe bad practice, but whenever i did relational databases, i always used uid autonumber as my primary unique keys.  so uid/pid/etc would link and work nicely.  but i really never had any databases more than a few relations so i dont think it was that bad in design.  please let me know if uid would been appropiate here, as i learn by example expecially bc seeing awsful code design implemations.


    This is frowned on for some main reasons:
    • It shows that you are unaware of what the "facts" inside each table are and you probably aren't aware that perhaps a composite key would be the PK.
    • It allows duplicates (you are simply avoiding the key aspect and just using an identity column to number your rows... like a sheet of columnar paper... you can write JOHN SMITH   444 STATE STREET   POUGHKEEPSIE NY five times on that sheet of paper and each row is unique)
    Consider a table of people which is frequently queried on by phone number (say a PBX system queries it as calls come in).

    You could probably be safe with an IDENTITY column for numbering the customers, but you would want your PK to be the phone number PLUS the caller's name.  This identifies a fact that is UNIQUE to each row in the table, prevents duplicates, and it's not bad design.  If you believe in MSSQL, a good fill-factor and a CLUSTERED index, then it makes retrieving a unique record a sub-second operation.

    A nice thing with setting your PKs right is if you feed in data from bulk information services (market  aggregators like Merkle and such).  You can feed in this data during the day (a program to throttle the inserts so OLTP functionality is not affected), and as constraints are dinged the exceptions can be fed into a dupe file or dupe table, which you can go back to your aggregator for a refund.

    What I see most often is some ID field at the beginning that's set IDENTITY CLUSTERED, then the rest of the table is heavy with non-clustered indexes as the table gets very big.  Sad, really.

    For extremely large people tables on MSSQL, I have set the primary key to the phone number plus the SOUNDEX of the name, and seperate index for the SOUNDEX of the address (seperate columns are maintained to store the SOUNDEX values and are updated with triggers).  This makes an address/name search on a table with 255 million+ rows on it come back with results in seconds for LIKE searches.
  • (cs) in reply to mlathe
    mlathe:
    TankerJoe:
    mlathe:
    TankerJoe:
    mlathe:
    aikimark:

    mlathe:
    The worst thing is that he added the uniqueness to the front of address. I would have added whitespace to the end of the address

    While that would be the sane thing to do, I'd almost bet that this was an MSAccess database.  It automatically trims trailing space characters from text fields.

    What I can't figure out is why this database wasn't under scrutiny as soon as the first mailing failed.  Wasn't anyone paying attention?

    meant to be a joke



    Don't worry.  I got it.

    trying to be witty



    thats what you get

    next time i'll write

    The worst thing is that he added the uniqueness to the front of address. I would have added whitespace to the end of the address <insert laughter=""></insert>



    i just wanted it quoted ONCE more.
  • (cs) in reply to christoofar
    christoofar:
    Anonymous:
    i know this maybe bad practice, but whenever i did relational databases, i always used uid autonumber as my primary unique keys.  so uid/pid/etc would link and work nicely.  but i really never had any databases more than a few relations so i dont think it was that bad in design.  please let me know if uid would been appropiate here, as i learn by example expecially bc seeing awsful code design implemations.


    This is frowned on for some main reasons:
    • It shows that you are unaware of what the "facts" inside each table are and you probably aren't aware that perhaps a composite key would be the PK.
    • It allows duplicates (you are simply avoiding the key aspect and just using an identity column to number your rows... like a sheet of columnar paper... you can write JOHN SMITH   444 STATE STREET   POUGHKEEPSIE NY five times on that sheet of paper and each row is unique)
    Consider a table of people which is frequently queried on by phone number (say a PBX system queries it as calls come in).

    You could probably be safe with an IDENTITY column for numbering the customers, but you would want your PK to be the phone number PLUS the caller's name.  This identifies a fact that is UNIQUE to each row in the table, prevents duplicates, and it's not bad design.  If you believe in MSSQL, a good fill-factor and a CLUSTERED index, then it makes retrieving a unique record a sub-second operation.

    A nice thing with setting your PKs right is if you feed in data from bulk information services (market  aggregators like Merkle and such).  You can feed in this data during the day (a program to throttle the inserts so OLTP functionality is not affected), and as constraints are dinged the exceptions can be fed into a dupe file or dupe table, which you can go back to your aggregator for a refund.

    What I see most often is some ID field at the beginning that's set IDENTITY CLUSTERED, then the rest of the table is heavy with non-clustered indexes as the table gets very big.  Sad, really.

    For extremely large people tables on MSSQL, I have set the primary key to the phone number plus the SOUNDEX of the name, and seperate index for the SOUNDEX of the address (seperate columns are maintained to store the SOUNDEX values and are updated with triggers).  This makes an address/name search on a table with 255 million+ rows on it come back with results in seconds for LIKE searches.


    I'm one of these "programmers who knows nothing about databases".  In university I intentionally avoided database courses for fear of getting a job as a DBA and not as a programmer (yes, I know that DBA's do some programming, I don't care).  I wanted to sabotage myself from that field of technology.  So now it's two years since I graduated and I'm programming Java/C++ at work, but I use PHP/MySQL for my website (which is really just a hobby).  I have a blog that users can post comments to, and I use the unix timestamp as the primary key for my blog table, and another unix timestamp as the primary key for my comments table (with a 'post_timestamp' column indicating which post the comment goes with).

    So can a DB expert tell me why/if that is a bad DB design?  I really don't know, I just kind of taught myself SQL from a book (this one, actually: http://www.amazon.com/gp/product/0672326728/qid=1135202114/sr=8-1/ref=pd_bbs_1/102-3871088-4957717?n=507846&s=books&v=glance), and people seem to have strong feelings about this.

  • (cs) in reply to christoofar
    christoofar:
    Consider a table of people which is frequently queried on by phone number (say a PBX system queries it as calls come in).

    You could probably be safe with an IDENTITY column for numbering the customers, but you would want your PK to be the phone number PLUS the caller's name.  This identifies a fact that is UNIQUE to each row in the table, prevents duplicates, and it's not bad design. 

    Ah, more stupid humor. If phone numbers can be duplicates, there is no reason that phone number+name can't be a duplicate too.

    Geez, use an autonumber for the PK to join to other tables, and use non-unique indexes on phone and name.

    What, your database doesn't support using multiple indexes? Bummer.

  • (cs) in reply to kipthegreat
    kipthegreat:

    I'm one of these "programmers who knows nothing about databases".  In university I intentionally avoided database courses for fear of getting a job as a DBA and not as a programmer (yes, I know that DBA's do some programming, I don't care).  I wanted to sabotage myself from that field of technology.  So now it's two years since I graduated and I'm programming Java/C++ at work, but I use PHP/MySQL for my website (which is really just a hobby).  I have a blog that users can post comments to, and I use the unix timestamp as the primary key for my blog table, and another unix timestamp as the primary key for my comments table (with a 'post_timestamp' column indicating which post the comment goes with).

    So can a DB expert tell me why/if that is a bad DB design?  I really don't know, I just kind of taught myself SQL from a book (this one, actually: http://www.amazon.com/gp/product/0672326728/qid=1135202114/sr=8-1/ref=pd_bbs_1/102-3871088-4957717?n=507846&s=books&v=glance), and people seem to have strong feelings about this.


    I really don't think that depriving yourself of extremely useful knowledge (which is also very relevant to many programmers) is a good way of avoiding becoming a DBA.

Leave a comment on “Uniquely Addressing”

Log In or post as a guest

Replying to comment #:

« Return to Article