The Daily WTF: Curious Perversions in Information Technology

ammoQ · 2005-12-21 Reply Admin

Anonymous:
I'm still just a hobbiest (so please forgive stupidity)

but wouldn't a phone number be a (always) unique natural key?

What if your customers don't tell you the phone number? Will you put made-up phone numbers into your database so you have your unique key?

2005-12-21 Reply Admin

Well just wait until you get 2 comments posted at the exact same second...

FrostCat · 2005-12-21 Reply Admin

kipthegreat:

I'm one of these "programmers who knows nothing about databases". In university I intentionally avoided database courses for fear of getting a job as a DBA and not as a programmer (yes, I know that DBA's do some programming, I don't care). I wanted to sabotage myself from that field of technology. So now it's two years since I graduated and I'm programming Java/C++ at work, but I use PHP/MySQL for my website (which is really just a hobby). I have a blog that users can post comments to, and I use the unix timestamp as the primary key for my blog table, and another unix timestamp as the primary key for my comments table (with a 'post_timestamp' column indicating which post the comment goes with).

So can a DB expert tell me why/if that is a bad DB design? I really don't know, I just kind of taught myself SQL from a book (this one, actually: http://www.amazon.com/gp/product/0672326728/qid=1135202114/sr=8-1/ref=pd_bbs_1/102-3871088-4957717?n=507846&s=books&v=glance), and people seem to have strong feelings about this.

Well, a timestamp is fine until two people post comments at the same time.

Aside: a forum WTF: "Something didn't quite work out ... " is not what I call a helpful error.

kipthegreat · 2005-12-21 Reply Admin

Anonymous:
Well just wait until you get 2 comments posted at the exact same second...

Yeah I thought of that, I add one second to it in that case.

ammoQ · 2005-12-21 Reply Admin

kipthegreat:
Anonymous:
Well just wait until you get 2 comments posted at the exact same second...

Yeah I thought of that, I add one second to it in that case.

Honestly, for a simple low-profile application like a single-user blog, there is hardly a database design bad enough to stop it from working.

ammoQ · 2005-12-21 Reply Admin

christoofar:

A nice thing with setting your PKs right is if you feed in data from bulk information services (market aggregators like Merkle and such). You can feed in this data during the day (a program to throttle the inserts so OLTP functionality is not affected), and as constraints are dinged the exceptions can be fed into a dupe file or dupe table, which you can go back to your aggregator for a refund.

What I see most often is some ID field at the beginning that's set IDENTITY CLUSTERED, then the rest of the table is heavy with non-clustered indexes as the table gets very big. Sad, really.

For extremely large people tables on MSSQL, I have set the primary key to the phone number plus the SOUNDEX of the name, and seperate index for the SOUNDEX of the address (seperate columns are maintained to store the SOUNDEX values and are updated with triggers). This makes an address/name search on a table with 255 million+ rows on it come back with results in seconds for LIKE searches.

I prefer working with database systems that allow for more than one unique index for each table. Anyway, when we talk about 255 million rows, I would not want it to break for those 10 million "John Smith" who have no phone.

JamesCurran · 2005-12-21 Reply Admin

This is rather appropriate for me, because at one point in my life, I lived at "18 John St" in one city & state, and worked at "18 John St" in a different city & state.

However, one thing which I haven't seen anyone comment on, is the fact that

1 Forrest St.
1 Forrest Str.
1 Forrest Street
One Forrest St.
One Forrest Str.
One Forrest Street
1 Forrest Ct
1 Forrest Court.
1 Forrest Ave.

And a whole bunch of other variation are (as far as the USPS is concerned) all the same address. So, whether or not the system can find your record depends on how you choose to abbrev your address that day.

Also, say you're the second person at "123 Main St". How are you supposed to know that in the future, when they ask you for you address, you're supposed to say "2 123 Main St"? And, how is the service rep, taken this information over the phone supposed to know that it's "2 123 Main St", and not "2123 Main St"?

That "developer" really shouldn't be allowed near computers (or any electrical device).

2005-12-21 Reply Admin

christoofar:
Anonymous:
i know this maybe bad practice, but whenever i did relational databases, i always used uid autonumber as my primary unique keys. so uid/pid/etc would link and work nicely. but i really never had any databases more than a few relations so i dont think it was that bad in design. please let me know if uid would been appropiate here, as i learn by example expecially bc seeing awsful code design implemations.

This is frowned on for some main reasons:

It shows that you are unaware of what the "facts" inside each table are and you probably aren't aware that perhaps a composite key would be the PK.

It allows duplicates (you are simply avoiding the key aspect and just using an identity column to number your rows... like a sheet of columnar paper... you can write JOHN SMITH 444 STATE STREET POUGHKEEPSIE NY five times on that sheet of paper and each row is unique)

Consider a table of people which is frequently queried on by phone number (say a PBX system queries it as calls come in).

You could probably be safe with an IDENTITY column for numbering the customers, but you would want your PK to be the phone number PLUS the caller's name. This identifies a fact that is UNIQUE to each row in the table, prevents duplicates, and it's not bad design. If you believe in MSSQL, a good fill-factor and a CLUSTERED index, then it makes retrieving a unique record a sub-second operation.

A nice thing with setting your PKs right is if you feed in data from bulk information services (market aggregators like Merkle and such). You can feed in this data during the day (a program to throttle the inserts so OLTP functionality is not affected), and as constraints are dinged the exceptions can be fed into a dupe file or dupe table, which you can go back to your aggregator for a refund.

What I see most often is some ID field at the beginning that's set IDENTITY CLUSTERED, then the rest of the table is heavy with non-clustered indexes as the table gets very big. Sad, really.

For extremely large people tables on MSSQL, I have set the primary key to the phone number plus the SOUNDEX of the name, and seperate index for the SOUNDEX of the address (seperate columns are maintained to store the SOUNDEX values and are updated with triggers). This makes an address/name search on a table with 255 million+ rows on it come back with results in seconds for LIKE searches.

I think you meant for your post to read "This is frowned on *by some people*...".

There is nothing inherently wrong with the design of using artificial keys. There are other constructs for managing uniqueness other than just the primary key - alternate keys and unique indexes - there's nothing special about using the primary key to enforce uniqueness. What is special about the primary key is that 1) it needs to be unique and unchanging, 2) it's values will be propagated to all other tables with foreign key relationships and 3) it will be heavily used in joins.

'Natural' keys tend to suffer some deficiencies here. The unique and unchanging natural data that you select frequently turns out to be neither in practice (people make entry errors, phone numbers change, names change, SS#'s aren't always available for everyone, etc., etc.). When the data does change the fact can't simply be updated in a single place, it has to be updated in every single place referring to the original table as well. Systems using natural keys that I've seen tend to develop key bloat when the primary key of dependent tables includes all the columns from the parent tables original key plus additional columns for the child - I've seen systems with 7 or 8 part primary keys due to a couple layers of nested dependency. Finally the size and efficiency of your primary key index for joins is at the mercy of what your natural key happens to be - is it a simple number (small and fast to compare, many records to an index page) or a concatenation of several text columns (from your example a phone number and a customer name).

Artificial keys suffer none of these deficiencies. The only legitimate argument you made against artificial keys is that you'll have one additional index on your table for the artificial value - though typically that index is going to be better for joins than an index on a natural key is going to be.

Eolianne · 2005-12-21 Reply Admin

That is so hilarious!

2005-12-21 Reply Admin

This is frowned on for some main reasons: It shows that you are unaware of what the "facts" inside each table are and you probably aren't aware that perhaps a composite key would be the PK.

I don't get it, you should be unaware of the "facts" of the table.

What I don't understand about using "natural" data for the Primary Key. How do I relate another table to this database?

Table1 (customertable)has PK with Customer Name and Primary Key Table2 (ordertable) is a list of Orders. One of the fields points back to the custoemrtable. What is this field? A name/phonenumber?

dpm · 2005-12-21 Reply Admin

JamesCurran:

However, one thing which I haven't seen anyone comment on, is the fact that

1 Forrest St.
1 Forrest Str.
1 Forrest Street
One Forrest St.
One Forrest Str.
One Forrest Street
1 Forrest Ct
1 Forrest Court.
1 Forrest Ave.

And a whole bunch of other variation are (as far as the USPS is concerned) all the same address.

The first six are. Why you think "court" and "avenue" are evaluated to be "street" is beyond me, though.

I worked on USPS software for two years and can assure you they aren't. You can also look at

http://www.usps.com/ncsc/lookups/usps_abbreviations.html

ok

dpm

dpm · 2005-12-21 Reply Admin

mthamil:

I really don't think that depriving yourself of extremely useful knowledge (which is also very relevant to many programmers) is a good way of avoiding becoming a DBA.

I just say "I don't want that job." It's worked so far. I also like to say "No, I don't know COBOL" even though I do. Who could blame me?

ok
dpm

Maurits · 2005-12-21 Reply Admin

Not only are duplicate SSNs used illegitimately, but SSNs are legitimately re-used as people die and are born.

This causes problems for companies that need to hold data for a long time, like life insurance companies.

JamesCurran · 2005-12-21 Reply Admin

dpm:
The first six are. Why you think "court" and "avenue" are evaluated to be "street" is beyond me, though.

Because the USPS officially ignores the suffix and considers all of them to be just "1 Forrest". New streets must be unique within the town irrespective of the suffix.

2005-12-21 Reply Admin

Anonymous:
I'm still just a hobbiest (so please forgive stupidity)

but wouldn't a phone number be a (always) unique natural key?

In this case, you can't register a customer without phone number. You might not have a phone number in every case. If the application state that you NEED a phone number to save the record, the user would make something up. These numbers use to always be the same. 6, 66, 69, 666 :)

2005-12-21 Reply Admin

JamesCurran:
dpm:
The first six are. Why you think "court" and "avenue" are evaluated to be "street" is beyond me, though.

Because the USPS officially ignores the suffix and considers all of them to be just "1 Forrest". New streets must be unique within the town irrespective of the suffix.

That's plainly not true. I assure you that 123 8th St and 123 8th Ave are two different locations in Manhatten

2005-12-21 Reply Admin

mlathe:

next time i'll write
The worst thing is that he added the uniqueness to the front of address. I would have added whitespace to the end of the address <insert laughter>

I code all of my IDs by converting them to binary strings where 0s are tabs and 1s are spaces. This encrypts the IDs so that they can't be seen when printed out, because confidentiality is extremely important in our office due to HIPAA.

mlathe · 2005-12-21 Reply Admin

Anonymous:
JamesCurran:

dpm:
The first six are. Why you think "court" and "avenue" are evaluated to be "street" is beyond me, though.

Because the USPS officially ignores the suffix and considers all of them to be just "1 Forrest". New streets must be unique within the town irrespective of the suffix.

That's plainly not true. I assure you that 123 8th St and 123 8th Ave are two different locations in Manhatten

hence he said "new streets"

GoatCheez · 2005-12-21 Reply Admin

I like unique keys. I like to think of databases tables kinda like a huge multidimensional array, with the first dimension being the index/primary key. My tendancy in db programming is to always use a field for the key that autoincrements. No, I don't do alot (if any) of db programming. Bottom line though, is that I've never had any problems with this situation.

dpm · 2005-12-21 Reply Admin

JamesCurran:
dpm:
Why you think "court" and "avenue" are evaluated to be "street" is beyond me, though.

Because the USPS officially ignores the suffix and considers all of them to be just "1 Forrest". New streets must be unique within the town irrespective of the suffix.

I cannot imagine why you say that. At what point in processing do they get "ignored"? I've worked on the code
and the database-generation scripts, so I know they don't --- what have you done?

And if that were true, how would mail get delivered? God knows there are thousands of examples against your claim:
anyone from Atlanta care to count how many "Peachtree" names there are in that area? Jeez. The first google entry
for (atlanta "how many peachtree") is http://www.newcolonist.com/peachtree.html which features a picture of the
signs at the intersection of Peachtree Road and Peachtree Avenue. http://www.newcolonist.com/peachtree.html

ok
dpm

2005-12-21 Reply Admin

> ...but you would want your PK to be the phone number PLUS the caller's name. This identifies a fact that is UNIQUE to each row in the table, prevents duplicates, and it's not bad design.

Actually, for this data you should use a generator to create a unique number that is unrelated to the data. The phone number plus name should be a seperate, unique constraint. This unique constraint is also called a candidate key.
Now the code actually reflects what you're trying to accomplish, and the unique constraint can be changed if the rules change. You may get a case where John Smith is assigned the phone number that used to belong to John Smith.
If you are concerned about retrieval speed, that would be handled by an index based on the query and the potential plans generated by the optimizer.

Rick DeBay

dpm · 2005-12-21 Reply Admin

mlathe:
hence he said "new streets"

Which wouldn't make a damned bit of difference to the USPS databases and applications.
What would it matter if "new" streets are unique, when you still have to deal with the
already existing thousands of duplicates?

ok
dpm

Gene Wirchenko · 2005-12-21 Reply Admin

JamesCurran:
This is rather appropriate for me, because at one point in my life, I lived at "18 John St" in one city & state, and worked at "18 John St" in a different city & state.

Lovely, lovely! Did the cities have different names?

In one newsgroup, there was a thread on addresses, and one poster wrote of classifying addresses by even or odd. (This is especially applicable in Canada since the Postal Code will typically be different for each side of the street.) I, myself, live at 1313 1/2 Railroad Avenue. Yes, that is a one-half. It does cause some doubletakes.

In Vancouver, BC, Canada, there is an address "South Foot of Main Street". It was in the Postal Code guide the last time I checked for it. (No, you can not simply fill in 0. That would be at the other end of Main Street.)

Sincerely,

Gene Wirchenko

loneprogrammer · 2005-12-21 Reply Admin

nbsp:

I code all of my IDs by converting them to binary strings where 0s are tabs and 1s are spaces. This encrypts the IDs so that they can't be seen when printed out, because confidentiality is extremely important in our office due to HIPAA.

So you're to blame!
http://compsoc.dur.ac.uk/whitespace/

Ytram · 2005-12-21 Reply Admin

JamesCurran:
dpm:
The first six are. Why you think "court" and "avenue" are evaluated to be "street" is beyond me, though.

Because the USPS officially ignores the suffix and considers all of them to be just "1 Forrest". New streets must be unique within the town irrespective of the suffix.

Just to add one more vote in favor of that statement being untrue. I have worked for a company that did direct mailing(we had our very own post office), and worked quite a bit with address issues. Street suffixes are not ignored, and if they were it would be a mess. As another poster said, even if the requirement was made that new streets had to be uniquely named, this solves nothing because of all of the pre-existing street with non-unique names.

And another thing: I just moved into a brand new neighborhood, and we have a street named SE 89th Terrace, that runs about 50 yards parallel to SE 89th Street. So if this law/rule/whatever requiring unique name exists, it must be very new.

ammoQ · 2005-12-21 Reply Admin

lance:
The only legitimate argument you made against artificial keys is that you'll have one additional index on your table for the artificial value - though typically that index is going to be better for joins than an index on a natural key is going to be.

Artificial keys require more lookups and therefore often additional joins. And of course they need additional space, though this might be the smallest concern.
In this forum, we've had a discussion about this topic a few weeks ago; and since I'm generally pro-artificial key (many systems use them for every single table), I've kept an eye on problems related to artificial keys.
My perception is that they hurt most where they are most useless.
The prerequisite is that my company makes software that is supposed to be "standardized", but in fact it is heavily customized for each client, to the point where it is almost individually developed. Although all clients generally need the same tables - product, customer, order, order_detail etc., the structure of the natural keys differ. E.g. for some clients, the product number is unique, while another clients needs an additional field, e.g. "company_number", to make the unique key. Using artificial keys for all those tables greatly improved reusability of the code between clients. On the other hand, some tables describe internal features of our software - parameters, code-description-mappings, messages etc. In many cases, there would be a "natural" key, and since these tables are not directly related to the customer's business, so there is the only reason for using the autogenerated IDs as unique key has been our stubbornness (called "consistency"). Anyway, looking back, it was a rather poor decision. Not a "OMG we are doomed" bad one, but definitely something I will do different the next time I have a chance to.

TheDauthi · 2005-12-21 Reply Admin

Ytram:
JamesCurran:
dpm:
The first six are. Why you think "court" and "avenue" are evaluated to be "street" is beyond me, though.

Because the USPS officially ignores the suffix and considers all of them to be just "1 Forrest". New streets must be unique within the town irrespective of the suffix.

Just to add one more vote in favor of that statement being untrue. I have worked for a company that did direct mailing(we had our very own post office), and worked quite a bit with address issues. Street suffixes are not ignored, and if they were it would be a mess. As another poster said, even if the requirement was made that new streets had to be uniquely named, this solves nothing because of all of the pre-existing street with non-unique names.

And another thing: I just moved into a brand new neighborhood, and we have a street named SE 89th Terrace, that runs about 50 yards parallel to SE 89th Street. So if this law/rule/whatever requiring unique name exists, it must be very new.

Again, yes. I live at a Woodstone Place, but the adjoining street is Woodstone drive. There are overlaps in the numeric ranges, as well.

RevMike · 2005-12-21 Reply Admin

fregas:
hank miller:
So we don't cover old ground, I'd like to reference http://www.thedailywtf.com/forums/2/49490/ShowPost.aspx

There are arguments both ways about using natural keys, and that previous thread covers them well, no need to repeat it.

This is a good argument for the always use an artificial key crowd, though there may in fact be a good natural key, that the other side can point out.

I would just like to point that I always use artifiicial keys (typically identity or Guid) and never had a problem using them. Whereas in the rare times in the past where I chose natural keys or was forced into using natural keys, I had problems.

Generally speaking, the artificial versus natural key debate comes down to this: You'll never do anything bad by using artificial keys. There are some cases when you can do better using natural keys and lots of cases when you can do worse using natural keys. If you're not 100% about using a natural key, go with artificial.

RevMike · 2005-12-21 Reply Admin

ammoQ:
kipthegreat:
Anonymous:
Well just wait until you get 2 comments posted at the exact same second...

Yeah I thought of that, I add one second to it in that case.

Honestly, for a simple low-profile application like a single-user blog, there is hardly a database design bad enough to stop it from working.

My thoughts exactly. If you are equipped to handle the special case of incrementing the timestamp on a collision, you should have just incremented a counter in the first place. the hack is just as much work as doing it right.

In general, you have a bad design because you can never scale beyond 60 transactions per second.

Otis Mukinfus · 2005-12-21 Reply Admin

WTF!

I can't type anything in the reply box, only the html box!</>

2005-12-21 Reply Admin

"I've seen systems with 7 or 8 part primary keys due to a couple layers of nested dependency"

Actually, having all of the keys of all of the parent tables can be a help in many situations. So you have a DB structure that involves 7 or 8 levels of nesting, if each table had it's own artificial key and only the key of it's immediate parent, you would have to join between all 7 levels of tables to get down to the detail tables for a specific top level entity. If all of the child tables contain the key of every table above it, it's trivial to skip the intermediate tables and join straight to the details you need.

Mind you this argument works with either natural or artificial keys. Sometimes it can be useful to do both, so each table has it's own single column primary key, and an alternate key that is based on the parent table + whatever is naturally unique.

Otis Mukinfus · 2005-12-21 Reply Admin

Interesting... When logging in with FireFox I can't type in this box and there are two tabs at the bottom.

I ran into this situation at a very large defense contractor who builds state of the art aircraft. Some of their systems allowed users to "decorate" data in fields to make it unique when there was duplication. Needless to say it was a nightmare to parse their data...

2005-12-21 Reply Admin

christoofar:

I have worked many years on healthcare and insurance software. Almost every data guru thinks that "the social" (ala SSN - Social Security Number) is unique. I don't know how many times I've had to battle that.

Yep, me too. I work in criminal justice. We come across non-unique SSNs all the time, not to mention lots of criminals with multiple SSNs.

Inaccurate data aside, most people don't realize that occosionally the same SSN is legitimately assigned to more than one person.

RevMike · 2005-12-21 Reply Admin

Anonymous:
This is frowned on for some main reasons: It shows that you are unaware of what the "facts" inside each table are and you probably aren't aware that perhaps a composite key would be the PK.

I don't get it, you should be unaware of the "facts" of the table.

He's using jargon particularly common in the OLAP subdiscipline of database design. Data is broken into "facts" and "dimensions". Then a "star schema" is built placing the fact table at the center and relating it by foreign key to several dimension tables. The primary key of each record in the fact table is a composite of the foreign keys of all the dimensions.

For instance, say you were to build an OLAP environment for sales data. You might build dimensions for product, sales team, and quarter. Then you'll build a fact table that has, for every combination of product, sales team, and time period, a set of values for gross sales, net sales, discounts, etc. This structure is very efficient for certain types of analysis.

What I don't understand about using "natural" data for the Primary Key. How do I relate another table to this database?

Table1 (customertable)has PK with Customer Name and Primary Key
Table2 (ordertable) is a list of Orders. One of the fields points back to the custoemrtable. What is this field? A name/phonenumber?

The reason you don't understand this is that there is lots of data for which there is not a reasonable natural key, so your example is rigged to fail. To build a more reasonable example, say you were building a system to handle stock trades. A composite key of ticker symbol and exchange would be an appropriate natural key. (However it would not actually be appropriate for a portfolio management app, because tickers symbols can and do change over time.)

RevMike · 2005-12-21 Reply Admin

Anonymous:
> ...but you would want your PK to be the phone number PLUS the caller's name. This identifies a fact that is UNIQUE to each row in the table, prevents duplicates, and it's not bad design.

Actually, for this data you should use a generator to create a unique number that is unrelated to the data. The phone number plus name should be a seperate, unique constraint. This unique constraint is also called a candidate key.
Now the code actually reflects what you're trying to accomplish, and the unique constraint can be changed if the rules change. You may get a case where John Smith is assigned the phone number that used to belong to John Smith.
If you are concerned about retrieval speed, that would be handled by an index based on the query and the potential plans generated by the optimizer.

Rick DeBay

Sure, go ahead and add indexes and constraints willy-nilly. Who cares about insert/update performance.

RevMike · 2005-12-21 Reply Admin

Anonymous:
I wish I knew the name of that developer because I think he found his way to the project I'm currently working on. He's our 'project manager'. I use the term reluctantly. One of his nicknames is "Donkie". One of his MANY claims to shame is that he stores dates as intergers (int). Just try and get records from one date to another with that kind of design. Best part is that the project he's mangling is a financial application that manages more than $35 million in assets.

I know someone like that? Is this Sybase code for a major investment bank in Manhattan?

RevMike · 2005-12-21 Reply Admin

DonMcNellis:
And, as the database grows, an integer or guid is going to be significantly faster for joins than a char*...

Depends on your platform. Some suffer from this deficiency and others don't.

2005-12-21 Reply Admin

To build a more reasonable example, say you were building a system to handle stock trades. A composite key of ticker symbol and exchange would be an appropriate natural key. (However it would not actually be appropriate for a portfolio management app, because tickers symbols can and do change over time.)

Why would stock ticker and exchange be a reasonal natural key?

And that still doesn't answer my question: Table1 has a PK of Stock Ticker and Exchange Table2 indexes back into Table 1...

WHAT value gets placed into this field in Table2? Is it "GOOGNYSE"?

I truley don't know, and am trying to understand what is going on.

sao · 2005-12-21 Reply Admin

loneprogrammer:
nbsp:

I code all of my IDs by converting them to binary strings where 0s are tabs and 1s are spaces. This encrypts the IDs so that they can't be seen when printed out, because confidentiality is extremely important in our office due to HIPAA.

So you're to blame!
http://compsoc.dur.ac.uk/whitespace/

after reading this post, and examining the website it links too, i am now converting all client code to this new language!

AWESOME!

JamesCurran · 2005-12-21 Reply Admin

What would it matter if "new" streets are unique, when you still have to deal with the already existing thousands of duplicates?

Because when you have millions of records -- and growing by thousands a day -- you can treat a couple thousand as a special case.

But, you are looking at it from the wrong point of view. If you have a letter for "1 Forrest St" in a town with no "Forrest St" but does have a "Forrest Ave", do you deliver it there or return it?

ammoQ · 2005-12-21 Reply Admin

Anonymous:
>>> To build a more reasonable example, say you were building a system to handle stock trades. A composite key of ticker symbol and exchange would be an appropriate natural key. (However it would not actually be appropriate for a portfolio management app, because tickers symbols can and do change over time.) Why would stock ticker and exchange be a reasonal natural key? And that still doesn't answer my question: Table1 has a PK of Stock Ticker and Exchange Table2 indexes back into Table 1... WHAT value gets placed into this field in Table2? Is it "GOOGNYSE"? I truley don't know, and am trying to understand what is going on.

Of course you need the same 2 fields in Table2, not a concatenated string. So Table2 probably has a compound key of at least 3 columns.

2005-12-21 Reply Admin

dpm:
JamesCurran:

dpm:
Why you think "court" and "avenue" are evaluated to be "street" is beyond me, though.

Because the USPS officially ignores the suffix and considers all of them to be just "1 Forrest". New streets must be unique within the town irrespective of the suffix.

I cannot imagine why you say that. At what point in processing do they get "ignored"? I've worked on the code
and the database-generation scripts, so I know they don't --- what have you done?

And if that were true, how would mail get delivered? God knows there are thousands of examples against your claim:
anyone from Atlanta care to count how many "Peachtree" names there are in that area? Jeez. The first google entry
for (atlanta "how many peachtree") is http://www.newcolonist.com/peachtree.html which features a picture of the
signs at the intersection of Peachtree Road and Peachtree Avenue. http://www.newcolonist.com/peachtree.html

ok
dpm

That's NOTHING!

Here in .au the austpost are very picky about their addresses
The address:

ClientName
Attn: Contact
123 Some St
Town State PostCode

Will be sent to some magically address linked to ClientName -- sometimes...

Attn: Contact
123 Some St
Town State PostCode

Will be returned saying incomplete address -- sometimes...

ClientName
Attn: Contact
123 Wrong St
Town State PostCode

Will be returned saying incomplete address -- sometimes...

I'm still yet to figure out wtf they are thinking?

Go go Australia Post!

olddog · 2005-12-21 Reply Admin

The MOST interesting part of the conversation was when the OWNER claimed that "MOST" of the marketing post was not delivered. Never mind the stupidity of the developer... what are the odds that this affected "MOST" of the clients? It was either a very small database OR the clients swap addresses often.

2005-12-21 Reply Admin

LuciferSam:
No, a phone number is not unique enough. Both me and my roomate go to the same hair cutter and they key things by phone number. I always have to tell them that no, I don't like my hair cut as style 1, that is what my roommate likes. I like style 4. Also when I go to a local box store for electronics, they ask my phone number. They then procede to call me by some one elses name who had the number before me. I don't correct them because I don't want them collecting my shopping habbets under my name.

A customer card with a unique number would be best in the above cases, although not practical.

Why do you people give someone your phone number just because they ask you for it? Just say no.

Zefrieg · 2005-12-21 Reply Admin

This post should have also displayed the attributes of the record in question. That way we could really see the extent of his poor judgment. It would be HORRIBLE if valid candidate keys existed, but the developer overlooked them.

This guy definately needs to read a book on database design. Mistakes like these just shouldn't happen unless you are a total beginner and have no idea what properties must hold in a primary or candidate key. Though, the company is at fault here too. They had nobody that could validate the work that was being done for them.

I agree that anyone working in the software field should be licensed to do their work. Other engineers must be licensed. Why should software engineering be any different?

2005-12-21 Reply Admin

What's wrong with you? Please don't tell me you are too fucking stupid to use a simple forum, yet still feel competent enough to comment on other's work.

2005-12-21 Reply Admin

JamesCurran:
dpm:
The first six are. Why you think "court" and "avenue" are evaluated to be "street" is beyond me, though.

Because the USPS officially ignores the suffix and considers all of them to be just "1 Forrest". New streets must be unique within the town irrespective of the suffix.

I don't believe this is true. Where I live, there is a point where Chennault Beach Dr. and Chennault Beach Rd. meet. I'm certain that the Post Office would be rather upset if a letter was addressed incorrectly between those two.

~TuxGirl

2005-12-22 Reply Admin

Manni:   An Area Code + Exchange + last four digits is a hell of a lot more unique than "1 Main Street" would ever be.

The heuristics to identify a possible duplicate aren't all that difficult. But you have to put some human imput into the process.      Things that work for a primarliy Anglo Saxon database won't work for slavic names, or asian names. One has to look at results of match pass, and be able to adjust parameters on the fly.

I've been doing this stuff for 20-odd years.   Hardly a day goes by without an "oh, shit"

2005-12-22 Reply Admin

The software analyzes the accompanying image, and treats the response accordingly.

2005-12-22 Reply Admin

"If you don't give me your telephone number, we won't be able to verify your address, and your order may not arrive at its intended destination"

Now Listen Carefully.

Unless you're standing in front of the clerk with a wad of real money in you hand, you're going to have a hard time getting your "stuff" without divulging your telephone number.

It is, after all, the least ambiguous piece of information that your going to give to the person (or software), that (who) is going to be responsible for ensuring that the deal is done. . . i.e. goods are received, AND bankers are satisfied.

Uniquely Addressing

Leave a comment on “Uniquely Addressing”