• Les (unregistered) in reply to Andrew
    Andrew:

    The USPS state code is in the STATES table. A database sequence generated the number 12 for the row ('VA', 'Virginia'). The relation between state codes and keys is documented in the DB, and is not arbitrary!

    In fact, the STATES table, presented as a menu, prevents users from entering invalid state codes in the first place. The database FOREIGN KEY for ORDERS.STATE_ID validates the ORDERS record for you.

    All you have to do is join STATES to the ORDERS table to see the state code in the address. If you can't write such JOIN clauses, then you are not qualified to use SQL at all.

    Wrong again. The two-letter code is what is known as a Natural Key. Adding a numerical value for each state is known as a Surrogate Key. You should never create a Surrogate Key when you have a perfectly good Natural Key. Either way, you still have to join to the State table (to get the full name of the State, and to verify that a given State exists). If you use the Natural Key, your code is much easier to maintain, and your application is easier to support.

  • (cs) in reply to dv
    dv:
    snoofle:
    cpt:
    The three letter codes are Australian states (Australian Capital Territory, New South Wales, QueensLanD.
    A few years back, one of the projects I worked on had a "state" column in the db, and I noticed that it had way more than 50 distinct values in it. It was explained to me that since all US states were 2 letters, and that the business folks were now doing business internationally, that rather than change the db and everything that depended upon it, we'd just be assigning arbitrary 2 character combinations for each new region.

    What they didn't tell me about was the range of allowable characters that the auto-state-name-generation function was designed to use... all printable ASCII characters!

    It sounded reasonable until the auto-state-name-generation function eventually created new state codes like backslash-dot and dot-star, and then the value from the state column got parsed with a regex - it made for some interesting debugging sessions.

    TRWTF is they let you debug apps with regexes, when you obvy don't know nothing about it.

    Don't know nothing?

  • (cs) in reply to Les
    Les:
    You should never create a Surrogate Key when you have a perfectly good Natural Key.

    That sounds great, but it doesn't apply in all cases. The system on which I presently work has alphanumeric keys formed by concatenating several unrelated fields from the record in order to generate a unique key. We have no control over the source data, so we're pretty much stuck with it. However, once it's in the DB, if we do lookups on this long (natural) string PK field, performance suffers. For a sufficiently large data set, the surrogate int key yields much better performance (in the downstream part of the system).

    The nature of the data AND how it will be used should be considered together when making the choice of implementation.

  • (cs) in reply to Outlaw Programmer
    Outlaw Programmer:
    I think everyone is missing TRWTF here. The width of the datatype thing is just a red herring; it doesn't really matter one way or the other. TRWTF is that Chareth's code is so bad that it failed a code review three times. When another developer tries to set him straight, he can only focus on these insignificant details in an attempt to justify that his version is better.

    I'm sure if we had access to the stored procedures we'd see that Chareth's is a total mess and that this column width thing is a total non-issue.

    That is exactly what I got from it.

    Chareth obviously wasn't getting it. Yuriy tossed together a quick solution to try to help him understand what he was doing wrong. Chareth's response was a childish attempt to make himself look good and Yuriy look bad, AND involved the supervisor in a truly technical discussion.

    The boss responded correctly. Discuss the technical inconsistencies all you want, but the point was that the "defects" Chareth pointed out were inconsequential at best. Chareth got told to STFU, politely even.

    No matter how good you think you might be, there's a time and place where you should know your role. Yuriy is a senior developer for a reason. Chareth is a junior developer FOR A REASON. Even if Yuriy is wrong, Chareth should have built the code according to his recommendations and then deal with it later. Instead, he throws a virtual temper tantrum. Nice one, Chareth...good luck getting out of that junior role any time soon.

  • (cs) in reply to AreYouSerious
    AreYouSerious:
    There are obviously some people reading this that have never had to maintain code that was written for "U.S. only" software that eventually expanded into international markets. It's people like Yuriy and the boss that don't have the foresight to look at the 2 character state abbreviation and say "I know we are U.S. only right now, but what if we expand into international markets in the future?" Just do it right the first time. I would much rather be maintaining Chareth's code in the future.

    It sounds like they have expanded internationally, which is why ACT and QLD are there. But they can't tax internationally so the tax stored procedure doesn't need to care about anything except US states.

  • Tool (unregistered)

    The sad thing is that there will never be a shortage of Chareths.

  • Jeanne (unregistered) in reply to Pecos Bill
    Pecos Bill:
    AMerrickanGirl:
    Sakamura:
    *HONK* Wrong!

    US does have 3-letter places. The different armies.

    The address for one of my friend is (replacing identifying info with clearly invalid values)

    ABC (DE) Homer J Simpson ABC-123 Unit 45678 FPO AP 96601-1234

    In our case, the code is FPO instead of CA for California...

    So I guess that boss doesn't want to handle marines. Such a jerk.

    Sorry, you're wrong. FPO, APO, etc. are part of the city field.

    Military "state" codes are AA (Armed Forces America), AE (Armed Forces Europe), AP (Armed Forces Pacific).

    From USPS ("Military" States): Armed Forces Africa AE

    Armed Forces Americas AA (except Canada)

    Armed Forces Canada AE

    Armed Forces Europe AE

    Armed Forces Middle East AE

    Armed Forces Pacific AP

    With the way things are going, maybe the Middle East needs it's own code... How about SH(it)

    Mail for each of these "states" goes through a certain office in the US. If you ever need to enter an APO or FPO address in a form which forces you to choose the abbreviation for one of the 50 states, you should choose the state that that office is in. For APO AE, it's NY.

  • Tool (unregistered) in reply to snoofle

    Ah! The old "don't understand joking"/"too pedantic" trick! Well played!

  • (cs) in reply to TallGuy
    TallGuy:
    Wait...what's Canada, then? A foreign country?!

    America's hat.

  • (cs) in reply to snoofle
    snoofle:
    For a sufficiently large data set, the surrogate int key yields much better performance (in the downstream part of the system).

    Actually, if all the data that the query needs is in the primary key, then you don't even need to go load the actual record out of the database. So, it's a trade-off between index size, and how that affects performance in the context of the queries you run.

    Conversely, if you have a primary key already, and you want to write fast queries, try to not use fields that are not part of the primary key. And there are cases where adding a second field to a perfectly good primary key can actually improve performance (but reduce maintainability).

  • Also a big geek (unregistered) in reply to Marc B
    Marc B:
    occasional reader:
    I've seen stuff like this before - people realising that the world doesn't rotate around the USA - where the 'state' column in addresses are relaxed to include provinces of Canada and abbreviated regions and counties of other countries.

    Maybe the junior developer might have been correct in this - the tax calculation might have been flexible to be used for countries other than the US?

    I'm sure the senior developer is correct in that the US isn't planning to annex another state quite this minute :-)

    If they do, they'd probably assign a two letter abbreviation anyway, as in IQ for Iraq and CA for Canada (aka 'North Maine').

    /facepalm

    Besides, everyone knows the REAL best way to do this would be to partition the customer table into 50 state-specific customer tables: tCustomer_1 for Alabama tCustomer_49 for Wyoming

    Just like an old employer used to.

  • Throknor (unregistered) in reply to Jeanne
    Jeanne:
    Pecos Bill:
    AMerrickanGirl:
    Sakamura:
    *HONK* Wrong!

    US does have 3-letter places. The different armies.

    The address for one of my friend is (replacing identifying info with clearly invalid values)

    ABC (DE) Homer J Simpson ABC-123 Unit 45678 FPO AP 96601-1234

    In our case, the code is FPO instead of CA for California...

    So I guess that boss doesn't want to handle marines. Such a jerk.

    Sorry, you're wrong. FPO, APO, etc. are part of the city field.

    Military "state" codes are AA (Armed Forces America), AE (Armed Forces Europe), AP (Armed Forces Pacific).

    From USPS ("Military" States): Armed Forces Africa AE

    Armed Forces Americas AA (except Canada)

    Armed Forces Canada AE

    Armed Forces Europe AE

    Armed Forces Middle East AE

    Armed Forces Pacific AP

    With the way things are going, maybe the Middle East needs it's own code... How about SH(it)

    Mail for each of these "states" goes through a certain office in the US. If you ever need to enter an APO or FPO address in a form which forces you to choose the abbreviation for one of the 50 states, you should choose the state that that office is in. For APO AE, it's NY.

    Would they even need to calculate sales tax for military bases? Could they have a physical presence in the base that would trigger the law? I'd have thought that bases were more-or-less federal grounds, and the stores on them wouldn't collect sales tax for the surrounding state.

    Also, the name of the stored procedure could be 'spCalculateUSSalesTax', and whether or not the address is in a US state is handled before calling. From what I've seen even Canada's sales taxes differ in their handling. So once you are in and know it's a state you are dealing with using two characters shouldn't matter.

  • Andrew (unregistered) in reply to Les
    Les:
    Andrew:

    The USPS state code is in the STATES table. A database sequence generated the number 12 for the row ('VA', 'Virginia'). The relation between state codes and keys is documented in the DB, and is not arbitrary!

    In fact, the STATES table, presented as a menu, prevents users from entering invalid state codes in the first place. The database FOREIGN KEY for ORDERS.STATE_ID validates the ORDERS record for you.

    All you have to do is join STATES to the ORDERS table to see the state code in the address. If you can't write such JOIN clauses, then you are not qualified to use SQL at all.

    Wrong again. The two-letter code is what is known as a Natural Key. Adding a numerical value for each state is known as a Surrogate Key. You should never create a Surrogate Key when you have a perfectly good Natural Key. Either way, you still have to join to the State table (to get the full name of the State, and to verify that a given State exists). If you use the Natural Key, your code is much easier to maintain, and your application is easier to support.

    Even more wrong. The code is not easier to maintain, or support. You have required that the application does not allow states or provinces outside the USA. You assume that the states need no maintainance.

    The unnatural integer key allows more states to be added. That's the issue. How can we most easily upgrade the application to use more states?

    Granted, If the application specified only United States addresses, then I happily would use 'VA'. It's not wrong to follow specs.

  • Andraax (unregistered) in reply to Rachael
    'm with Chareth on point (1). He says there are 3-letter states in the *existing data*, i.e. there are customers *already in the database* who live in the 3-letter Australian states he gives as examples. Yuriy's char[2] solution won't work for them.

    Even if they ship to Australia, they probably don't collect sales tax.

  • foo (unregistered) in reply to snoofle
    snoofle:
    Les:
    You should never create a Surrogate Key when you have a perfectly good Natural Key.

    That sounds great, but it doesn't apply in all cases. The system on which I presently work has alphanumeric keys formed by concatenating several unrelated fields from the record in order to generate a unique key. We have no control over the source data, so we're pretty much stuck with it. However, once it's in the DB, if we do lookups on this long (natural) string PK field, performance suffers. For a sufficiently large data set, the surrogate int key yields much better performance (in the downstream part of the system).

    The nature of the data AND how it will be used should be considered together when making the choice of implementation.

    Yeah. And I bet you never compared actual performance. You just assumed that the performance would be better with the surrogate key, Chareth. Now get back to fixing fonts.

  • (cs) in reply to Sakamura
    Sakamura:
    *HONK* Wrong!

    US does have 3-letter places. The different armies.

    The address for one of my friend is (replacing identifying info with clearly invalid values)

    ABC (DE) Homer J Simpson ABC-123 Unit 45678 FPO AP 96601-1234

    In our case, the code is FPO instead of CA for California...

    So I guess that boss doesn't want to handle marines. Such a jerk.

    No, as was ALREADY pointed out, FPO is the city; AP is the state. Jerk. I see someone else pointed out that it was already pointed out too.

  • Mike D. (unregistered)

    The only rule #1 I care about is from my days playing X-Wing:

    Rule #1: Don't get shot. Rule #2: Shoot them. Rule #3: Never forget that rule #1 comes before rule #2.

  • (cs) in reply to Capt Obvious
    Capt Obvious:
    This is the sort of thing that turns managers into overbearing, micro-managing asshats.
    They need an excuse now? I thought they just came through the birth canal the wrong way...

    I don't normally say this -- in fact, I've never said it -- but this is a truly lame WTF. It has all the requirements for impedimentia: (1) We don't get to see Chareth's code, even in approximation; as pointed out by various enlightened commenters. (2) The original database looks, to my untutored eyes, to be a piece of indefensible DDL garbage. (3) Yuriy could do with a slap across the face. (4) Nobody sane says "we'll never export (or expand) the system," so the designer of the Sp database was clearly a bit-head. (5) "Please note that existing data has many states like ACT, NSW, QLD etc which are all 3 characters." (6) There is no point six. (7) Everything else.

    As a result, all this one does is allow people to prattle on, more or less entertainingly, about SQL Server, money fields, two character state abbreviations (as a non-Republican in 2000, I truly believed that Florida's abbreviation should have been changed to "FU"), and managers. Plus make many, many absurd deductions about the details that were left out of the story.

    Not so much anonymised as vaporised, this WTF, isn't it?

  • Vulpes (unregistered)

    Agreeing that this is pretty lame. Some people can write efficient stored procedures (junior), others can recognise potential problems before they occur and present solutions in a concise way to management (senior). Perhaps Yuriy and Cherath should swap jobs!

  • Jason (unregistered) in reply to Former Junior Programmer
    Former Junior Programmer:
    TallGuy:
    Wait...what's Canada, then? A foreign country?!

    The 51st state.

    I thought it was America's hat?

  • frustrati (unregistered) in reply to Monkios
    Monkios:
    frustrati:
    Exception is Quebec, from within which Canada is an oppressive federation that has wrongfully annexed the proud and sovereign state of Quebec.
    Canada is my country. Quebec is where I live.
    In that case, you must provide all of your posts in French at a minimum. If you choose to post in both French and English, the French version must be displayed more prominently than the English one.
  • frustrati (unregistered) in reply to Peter
    Peter:
    frustrati:
    ascagnel:
    The US does have 3-letter states. If you're in the military, you use a special 3-letter state code in place of your actual location. The post office is equipped to handle this, and so should every website.
    Yeah, but some of us prefer to normalise the data, so if it OK with you, we'll map your three-letter abbreviations to our internal two-letter versions.

    Btw, three-letter states? I have heard of four-letter states like Ohio and Iowa. Not heard of any three-letter states...

    See comment from "snoofle" above. When you start going International, you have changes to make that get really hairy to map into a char(2) column. If you'll always be a US-only company, that's not a big deal, but once you step outside of the US, you're going to have some changes to make or deal with some really odd mappings.

    You fail at reading comprehension. We were discussing US states.

    Oh, and the country that I live in doesn't even have states. Fwiw, a real address handling database will be more complicated than you most probably realise. We are talking about handling all sorts of abbreviations and localisation.

  • (cs) in reply to frustrati
    frustrati:
    Oh, and the country that I live in doesn't even have states. Fwiw, a real address handling database will be more complicated than you most probably realise. We are talking about handling all sorts of abbreviations and localisation.
    It has been done (not that you said it hadn't). Take a look at SAP. Of course, that particular software wasn't written by USAians, but by people who are aware of different languages and currencies, and phone number, address, number, and date display formats. And yes, it is more complicated than most realise.
  • (cs) in reply to Also a big geek
    Also a big geek:
    Besides, everyone knows the REAL best way to do this would be to partition the customer table into 50 state-specific customer tables: tCustomer_1 for Alabama tCustomer_49 for Wyoming
    Well, that's got 49 of them, anyway, assuming the names involve sequential numbering as suggested. Take your pick for the 50th:

    tCustomer_0 tCustomer_50 tCustomer_3.14159 tCustomer_no_sir_pie_are_round

    And of course, we still have the District of Columbia, Samoa, the Marshalls... etc.

    http://www.usps.com/ncsc/lookups/abbr_state.txt

  • (cs) in reply to Andrew

    Why, that's awfully normal of you.

  • Franz Kafka (unregistered) in reply to frustrati
    frustrati:
    Oh, and the country that I live in doesn't even have states. Fwiw, a real address handling database will be more complicated than you most probably realise. We are talking about handling all sorts of abbreviations and localisation.

    not really; my solution is fairly simple: for the countries where I know the rules, I validate and store address info accordingly (still sort of complex unless you're okay with not being to query by subfield). For other countries, I don't validate and state as much - "you'd better make sure the address is valid"

  • Spartan (unregistered)

    They might have to rethink that currency field if they decide to do business in Zimbabwe.

  • Izzy (unregistered) in reply to Ren
    Ren:
    ... Wait... there's something wrong with the universe IT'S THE HADRON COLLIDER! RUN! ...
    Even now, black holes are eating all the countries with three letter states and non-numeric postal codes.
  • (cs) in reply to JackD

    Dang! That $100 could have bought me... one gallon of gas!

  • DB (unregistered)

    he might be right about one thing,though. Those 3 letter states are from Australia.

    So am I, and I am sick of sites which insist on me entering a US state code without any other options, even n/a.

  • Kwame Kilpatrick (unregistered) in reply to TallGuy
    TallGuy:
    Wait...what's Canada, then? A foreign country?!

    Of course not.

  • blunden (unregistered) in reply to Code Dependent
    Code Dependent:
    ...like another dagger made of salt and lemon juice...
    When life gives you salt and lemon juice, add tequila.
    Haha! So true.
  • Andrew (unregistered)

    I agree with Chareth if this is an international application. The fact Chareth says "Please note that existing data has many states like ACT, NSW, QLD etc which are all 3 characters." suggests that it is used internationally.

    1. Many countries have state or province greater than two characters
    2. US zip codes are longer than most other countries so this should be OK
    3. In some countries (such as the Phillipines) you need about $1,000,000 to buy a loaf of bread - in that case having a $1 trillion limit would make sense. Although why 12 decimal places???
    4. See point 3... but again why 12 decimal places

    I do get a little annoyed with the US-centric attitude (being in Australia). If this is an absolutely positively US-only application it is ok, otherwise it is in fact the boss who is the WTF.

  • (cs) in reply to DaveAronson
    DaveAronson:
    Wait, I thought Rule One was to be very careful when speaking to a small bald wrinkly old man....

    Don't talk to bald-headed wrinkly things...

  • Spider (unregistered)

    The Real WTF (TM) is why they are writing a stored procedure to calculate taxes when they are in fact using Microsoft Dynamics AX (referred in the mail just as AX) which, among other things, can do it just fine.

    Coincidentally, Microsoft Dynamics AX uses a varchar(10) field to store state codes and numeric(28, 12) fields to store amounts.

  • (cs) in reply to Pecos Bill
    Pecos Bill:
    BTW, Japan has the most wordy addresses (when spelled with Roman letters, of course).
    Japan has nothing compared to South Korea.
  • The Fake WTF (unregistered) in reply to occasional reader
    occasional reader:
    I've seen stuff like this before - people realising that the world doesn't rotate around the USA - where the 'state' column in addresses are relaxed to include provinces of Canada and abbreviated regions and counties of other countries.

    To be fair, it said that existing_data had 3 letter codes in there for the "states" in Canada.

    I'd say that's a WTF. And I'd think that any business would want to be able to handle octillion dollar orders, though it might be some time before they get any... :-)

  • Ellie (unregistered) in reply to Andrew
    Everyone is missing the main SQL-design flaw here. The governments' state abbreviations should not be the primary keys.

    The application should use INTEGER primary keys to the STATES table. The table stores the states' names and abbreviations for human readers only.

    Then, only the STATES table has to change when new countries are added. The integer foreign keys referencing existing STATES will continue to be valid.

    What do you really gain from this? Yes, you'll be all set when we rename one of the states. Thank goodness you'll only have to update that data in one place when that happens. Following that logic, you should probably make another table that has all the ZIP codes in it too. Yeah, that would be good! Ooh, and area codes, that way you can split the phone numbers and not store that data redundantly. BTW, don't worry about the performance impacts of the multi-thousand row queries where the devs call dbo.StateLookup(sStateID int) on every row because they can't or won't write the query with multiple aliased left joins to bring in the states table each time. You can just get a faster server. Thank goodness you saved all that space in the database.

    Grr.

    End rant.

    The practical way is to arbitrarily decide some size, like varchar(10), and use the States table to populate the select box for it. If you want to enforce data integrity then you can add a foreign key constraint back to the states table. Doing it this way takes 30 extra seconds of work to set up, and saves a minute or more each time someone writes a query to touch the data.

    Normalization is like salt. Too little and too much are both bad.

    -eg

  • (cs) in reply to cpt
    cpt:
    The three letter codes are Australian states (Australian Capital Territory, New South Wales, QueensLanD.

    That is the real WTF. There are only six Australian states (plus two territories plus a few external ones) yet we need three letter abbreviations!

    I once ordered something from the US and they thought the QLD was part of the postal code.

  • (cs) in reply to Spartan
    Spartan:
    They might have to rethink that currency field if they decide to do business in Zimbabwe.

    One "new" ZWD is 10,000,000,000 "old" ZWD so it's not so much an issue now :) (1 USD is about 65 ZWD now)

  • FatherStorm (unregistered) in reply to Rincewind

    huh? United STATES of America, ... state=state status=status. Maybe a US ting but first thought of state is always te regional type of state.

  • Herby (unregistered)

    Then there is a "real real" WTF! You don't even need state codes. They can be implied from the ZIP code. You can make a table of ZIP code to States very easily. Use the first 3 digits of the ZIP code and index a table of states, and go from there. The problem is if you go "international" and it gums up the works. Countries (you can use 2 or 3 letter abbreviations), and each has wierd postal codes. Some are both letters AND numbers.

  • DJ (unregistered)

    As someone in Australia (which is where those three character states codes are) I am frequently p*ssed at US based web services which don't cater for international addresses properly. Good on him for standing up for catering for existing data. I'd have to agree with his boss that the size of the fields for $ values was a bit too picky though. even if they were using Yen I don't think they'd run into that limit too often.

  • LukeRazor (unregistered) in reply to Rincewind

    ACT, NSW and QLD are all Australian states.

  • DJ (unregistered) in reply to Andrew

    "3. In some countries (such as the Phillipines) you need about $1,000,000 to buy a loaf of bread - in that case having a $1 trillion limit would make sense. Although why 12 decimal places??? " Yes, but the US company would not likely be making contracts or order in Philipino currency. Most international orders were made in USD but many now in EUR now too.

  • Iago (unregistered) in reply to Zap Brannigan

    Re: Rule Number One 2008-09-10 11:26 • by Zap Brannigan Rincewind: Doesn't "State" refer to status? I.E. Act = Active vice one of the United States. And doesn't this indicate the bosses boss doesn't have a clue? I see the confusion, this like our ERP. We store state abbreviation and status in the same column. It keeps our database schema simple.

    wtf?

    TopCodr, is that you?

  • David (unregistered)

    You guys have my sympathy in having to deal with such an overly-complex tax system. Catering for taxes at so many levels, i.e. federal/state/city etc, and also with the variations and exceptions at those levels, must be a real headache.

    Just for comparison, here in Australia, add 10% GST and you're done.

    We used to have state taxes, but they got abolished when the federal brought in the universal GST. We never had city taxes. Maybe they are the real WTF.

  • pookey (unregistered) in reply to Andrew
    Andrew:
    Michael:
    Rachael:
    I'm with Chareth on point (1). He says there are 3-letter states in the *existing data*, i.e. there are customers *already in the database* who live in the 3-letter Australian states he gives as examples. Yuriy's char[2] solution won't work for them.
    Yuriy's solution doesn't have to work for them, because the solution is only for US orders, which will only have US state codes, which will only be 2 letters

    Everyone is missing the main SQL-design flaw here. The governments' state abbreviations should not be the primary keys.

    The application should use INTEGER primary keys to the STATES table. The table stores the states' names and abbreviations for human readers only.

    Then, only the STATES table has to change when new countries are added. The integer foreign keys referencing existing STATES will continue to be valid.

    Why do you need keys? Is the database locked up? And why foreign keys? Can't a local locksmith help you? You sure try to complicate things.

  • Casey (unregistered)

    Thanks to FDR being such a wuss at the end of WWII, we do not collect sales tax from foreign countries.

    So even though Australia has three letter state abreviations, no need to process any of that data in any compute_US_sales_tax procedure.

  • pookey (unregistered) in reply to Zap Brannigan
    Zap Brannigan:
    Can someone point me to an international schema for storing addresses? Most of the systems I've worked on were designed with the US in mind. Canadian address sort of work (not really). Mexico and the rest of the world, not supported at all. Is there an ISO or other standards committee address format that is international?

    Don't worry - the same ISO group that fast-tracked M$ OOXML is working on a schema for addresses (or is that a scheme to get your address? I forget.) that is guaranteed to work as well as OOXML...

Leave a comment on “Rule Number One”

Log In or post as a guest

Replying to comment #:

« Return to Article