• (disco)

    nods head in a sage-like manner


    Filed under: there there, it's okay....

  • (disco)
    Article:
    To complicate things, the table contained constant string values, numeric postal codes and enumerated values. Thus, once you queried the data, you needed to explicitly convert it to the correct type before you could use it.

    You're converting postal codes to numeric values? I'm not from the US but don't you also have leading 0s and dashes in postal codes? I was always taught to save postal codes as varchar since you don't use them for arithmetic operations anyway.

  • (disco) in reply to Michael_Mahn
    Michael_Mahn:
    I'm not from the US but don't you also have leading 0s and dashes in postal codes?

    Definitely leading zeroes. The dashes are from an extended (?) version of the ZIP codes, known as ZIP+4, with 4 more digits added for additional precision.

    https://en.wikipedia.org/wiki/ZIP_code#ZIP.2B4

    You see them used occasionally, but most people don't.

  • (disco) in reply to Michael_Mahn

    Filed under: [XM4 5HQ](http://www.royalmail.com/letters-to-santa)
  • (disco) in reply to Michael_Mahn

    Numeric postal codes? I suppose they have no Canadian etc. customers?
    Well by now they probably have no customers at all so that simplifies things somewhat.

  • (disco) in reply to Michael_Mahn

    I think either way would work, nobody uses ZIP+4 and if they did it would be trivial to programmatically insert a dash before the last 4.

  • (disco) in reply to Michael_Mahn
    Michael_Mahn:
    You're converting postal codes to numeric values? I'm not from the US but don't you also have leading 0s and dashes in postal codes? I was always taught to save postal codes as varchar since you don't use them for arithmetic operations anyway.

    You must be new to software maintenance. I've seen this reasoning in many systems in my time:

    1. The dash is always present, so you can restore it during formatting.
    2. Outside of the dash, all postal codes are numeric digits (in the US).
    3. The government would never do an insane thing like change the code to use letters or increase the code length.
    4. And if it does, we can code around it. A0000-0000 can be coded to go from 1,000,000,000 to 1,099,999,999; that'll leave space for more later.
    5. Storing it as characters, even if we skipped the dash, would take 9 whole bytes...18 bytes in that *!@#$%! Java. If we store it as an int, it takes 4 bytes.
    6. Saving bytes in this day and age is of unbelievable importance.

    Or, at least, so goes the reasoning. This same reasoning can be applied to dates, license numbers of all kinds, Social Security numbers, and street numbers (because no one ever has a street number like 1014½ or 209B).

    The fact that almost every single point of reasoning in that list is wrong doesn't stop idiots from "saving bytes" by doing that. And even if they don't, such as in the systems I work with, Zip code would never be more than 5--wait, no never more than 9--characters, so a fixed-length field will do, right?

    It's insane, but no one has a way of forcing sanity on all software architects yet.
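
    A quick sketch of why points 1 to 3 fall over, in Python purely for illustration. The helper name `roundtrip_as_int` and the sample codes are made up for this example, not taken from any system described above.

```python
def roundtrip_as_int(postal_code: str) -> str:
    """Store a postal code as an integer, then read it back as text."""
    stored = int(postal_code.replace("-", ""))  # what an INT column would hold
    return str(stored)

# A New Jersey ZIP loses its leading zero on the way back out.
print(roundtrip_as_int("08904"))  # prints 8904

# A Canadian-style postal code cannot be stored at all.
try:
    roundtrip_as_int("K1A 0B1")
except ValueError as exc:
    print("rejected:", exc)
```

    Every workaround for either failure ends up re-encoding a string inside a number, which is the original complaint.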

  • (disco) in reply to CoyneTheDup

    Dodged all those, but...

    • I was forced to split "first name" and "last name"
    • I was forced to split addresses not by lines but country/city/street

    I warned against it multiple times. No dice. Waiting for shit to hit the fan sooner or later.

  • (disco)

    We had a system which was modified some time in early October to store the date and an incrementing counter (00 -- 99) in an integer field: "ddmmyyyynn" to be used (among other things) to create a filename. All went well until October 22nd when all the print jobs failed at once. It devolved to me to completely rewrite that aspect of the print queue handler in a day, so that the customers did not get their reports late.
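
    One plausible reading of the October 22nd failure, assuming the field was a signed 32-bit integer (the post does not say so). The function names and the year 2015 are placeholders for illustration only.

```python
INT32_MAX = 2**31 - 1  # 2,147,483,647, the ceiling of a signed 32-bit integer

def job_id(day: int, month: int, year: int, counter: int) -> int:
    """Pack a date and a two-digit counter into one number as ddmmyyyynn."""
    return int(f"{day:02d}{month:02d}{year:04d}{counter:02d}")

def fits_int32(value: int) -> bool:
    """Report whether the value can be held in a signed 32-bit field."""
    return value <= INT32_MAX

# The year is a placeholder; the result is the same for any four-digit year.
print(fits_int32(job_id(21, 10, 2015, 99)))  # True:  2110201599 still fits
print(fits_int32(job_id(22, 10, 2015, 0)))   # False: 2210201500 is too large
```

    Under that assumption, every value starting with 22 or higher exceeds the limit, which would explain all the print jobs failing at once on the 22nd.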

    Another time we discovered that some bright spark had stored the DHL despatch code (? something like that, might not have been DHL, I may be misremembering) as an integer. Worked fine till the despatch code, again, grew from 9 digits to 11 digits. Or something like that. Again, a lightning rewrite job that yours truly had to implement.

    This stuff is programming bread-and-butter. Jenny should keep quiet, keep her head down and tackle the job piece by piece. Getting up in front of a manager complaining about how hard it is to do ain't gonna get her anywhere. As I see this, it's a cushy little number that should provide a cash cow for her for a long time to come. She's just got to learn to communicate effectively (i.e. to dissemble convincingly).

  • (disco)

    This piece gets minus 1 brazillion points right off the top for using "performant" non-ironically.

  • (disco) in reply to Michael_Mahn

    Meh, I've seen worse. For example, I once helped a coder on a forum solve a Python 2.x problem that came from trying to use an integer value to store Social Security numbers, which for those outside of the US consist of nine digits broken up with dashes at two points, and have significant leading zeroes. Did I mention that it was Python, a language which uses multi-precision integers and implicitly converts integers to floats when needed?

    At one point in the code, there was a commented out line that read in said integer using input() instead of raw_input(), which meant that it was evaluating the string as an infix expression. I'd have loved to have been a fly on the wall the day the poster realized why that didn't work.
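
    For anyone who has not been bitten by this: Python 2 documented `input()` as equivalent to `eval(raw_input())`. The sketch below mimics that in Python 3 with a hypothetical helper, `py2_style_input`, and a made-up SSN.

```python
def py2_style_input(typed: str):
    """Mimic Python 2's input(): evaluate the typed text as an expression."""
    return eval(typed)  # never do this with untrusted text

typed = "123-45-6789"  # a made-up SSN, exactly as a user would type it

print(py2_style_input(typed))  # prints -6711: the dashes became subtraction
print(typed)                   # prints 123-45-6789: what raw_input() returned
```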

  • (disco) in reply to boomzilla
    boomzilla:
    You see them used occasionally, but most people don't.

    In my experience, most large commercial mailers use them. I use them if I know the addressee's ZIP+4, or if I have to look up the ZIP code anyway.

    Filed under: "People"? I ain't "people."

  • (disco) in reply to Michael_Mahn
    Michael_Mahn:
    I was always taught to save postal codes as varchar since you don't use them for arithmetic operations anyway.

    I always have the habit of doing this (storing them as a string) for any number which you do not use in computation. I'm not sure if that will earn me a :wtf: award, but it seems to have worked for the last couple of years.
  • (disco)

    What she needed to do:

    1. Find the database query that used up the most of the database's resources. A lot of databases give you the tools and information you need to do that.
    2. Add in the index that is almost certainly missing from their database that causes that query to run a table scan.
    3. Repeat a few times until you are getting noticeable improvements.

    Then go back to the bosses and say "I've taken a few steps that have cut your load times by 15%, but what you really need to do is redesign and replace this bit right here." Then you inch them slowly towards sanity. The reason for this is that until you have shown results, they hear "You need to rewrite the whole thing" as a cop-out.
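
    Steps 1 to 3 can be sketched with SQLite from Python's standard library. This is only an illustration: the `ref_data` table, its columns, and the index name are invented here, and a real database would have its own tooling for finding the expensive queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ref_data (ref_id INTEGER PRIMARY KEY, ref_type TEXT, ref_value TEXT)"
)
conn.executemany(
    "INSERT INTO ref_data (ref_type, ref_value) VALUES (?, ?)",
    [("postal_code", f"{n:05d}") for n in range(1000)],
)

def plan(sql: str) -> str:
    """Return SQLite's query plan for a statement as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[-1] for row in rows)  # last column is the description

query = "SELECT ref_value FROM ref_data WHERE ref_type = 'postal_code'"

before = plan(query)  # without an index, a scan of the whole table
conn.execute("CREATE INDEX idx_ref_type ON ref_data (ref_type)")
after = plan(query)   # with the index in place, a search that names it

print(before)
print(after)
```

    The exact wording of the plan text varies between SQLite versions, so compare the two lines rather than relying on a particular string.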

  • (disco) in reply to EatenByAGrue
    EatenByAGrue:
    Then go back to the bosses and say "I've taken a few steps that have cut your load times by 15%, but what you really need to do is redesign and replace this bit right here."

    Why would we need to do that? You've clearly improved the performance by doing your databaseing magic, just improve it some more. We're not paying you to redesign it, just to make it faster!

  • (disco)

    I know it's open season on everyone here for the trolls but this is a kid just out of college with no guidance...

  • (disco) in reply to DogsB

    Yeah, most of TRWTFing is directed at the management. We're not blaming the kid, he doesn't know any better...

  • (disco) in reply to sloosecannon

    That's when you give 'em (possibly invented) numbers: "I can get you another 3% in 2 weeks doing what I'm doing, but if I spend 4 weeks following my recommendations I can give you a 25% improvement which would give you more bang for your buck."

  • (disco)

    So.. the "highly paid consultant" is unable to do any performance improvements without rewriting the entire code, simply because the database is not structured the way she would have done it? And one of the big issues was that postal codes were stored as strings? And another was that "constant strings used to populate lists" were stored the same way as "enumerated values"? These are the exact same things.

    Perhaps sticking with recent graduates is just as well. At least they don't have the go-to reasoning of "this code doesn't look exactly like the code for the last project I was working on; therefore it must be scrapped and redone from start".

  • (disco) in reply to rc4

    Really? So what is the numeric zip 12345? 12345-0000? 00001-2345?

  • (disco) in reply to DaveN

    o.Õ

    <!--inb4 stealing @PJH's trademark -->
  • (disco) in reply to dcon

    An example would be 20500. If you mail a letter to 1600 Pennsylvania Ave NW, Washington, DC 20500, it'll get delivered without issue. The only time I see ZIP+4 used is when a program looks up the +4 itself. I don't think I've ever seen leading zeroes in a zip code.

  • (disco) in reply to rc4
    rc4:
    I don't think I've ever seen leading zeroes in a zip code.

    New Jersey. I grew up there. Google is your friend.

  • (disco) in reply to rc4
    rc4:
    I don't think I've ever seen leading zeroes in a zip code.

    Valid ZIP codes with leading '0' include Puerto Rico, New England states, and New Jersey. The lowest valid ZIP with a leading '0' appears to be 00601: Rural area including towns of Juan González, Adjuntas, Garzas, Saltillo, PR. The highest appears to be 08904: New Brunswick, NJ
  • (disco) in reply to dcon

    Sorry, I thought you meant extra leading zeroes, like 0000020500.

    If your ZIP code is 06108, and gets truncated to 6108, you could code it to automatically insert leading zeroes until it reaches 5 characters in length.
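
    A minimal sketch of that padding, assuming the stored value is known to be a plain five-digit ZIP with no +4. The function name `restore_zip5` is hypothetical.

```python
def restore_zip5(stored: int) -> str:
    """Re-pad a five-digit ZIP that lost its leading zeroes in an integer field."""
    return str(stored).zfill(5)

print(restore_zip5(6108))  # prints 06108
print(restore_zip5(601))   # prints 00601
```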

  • (disco) in reply to HardwareGeek

    https://what.thedailywtf.com/t/the-graduate/51338/28?u=rc4

  • (disco) in reply to boomzilla

    Vast majority of uses I see, is people using their PO Box for the last 4.

    PO Box 4558 HTown, Tehas 70056-4558

  • (disco) in reply to dcon
    dcon:
    Really? So what is the numeric zip 12345? 12345-0000? 00001-2345?

    Since 0000x is not (currently) a valid ZIP code, it would be "safe" to treat it as 12345-0000. Since -0000 is (probably) not a valid +4, suppress it in the output. Of course, it's still a bad idea, but it could work (which makes it the worst kind of bad idea, because idiots will then think it's a good idea).
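
    For the morbidly curious, that convention could look something like this. It is a sketch only, leaning on the same two assumptions stated above (no valid ZIP starts with 0000, and -0000 is never a real +4); `format_numeric_zip` is a made-up name.

```python
def format_numeric_zip(stored: int) -> str:
    """Render an integer-encoded ZIP under the convention described above."""
    if stored < 100_000:
        # Five digits or fewer: treat as a bare ZIP, i.e. ZIP-0000.
        stored *= 10_000
    zip5, plus4 = divmod(stored, 10_000)
    if plus4 == 0:
        return f"{zip5:05d}"  # suppress the placeholder -0000
    return f"{zip5:05d}-{plus4:04d}"

print(format_numeric_zip(12345))      # prints 12345
print(format_numeric_zip(123456789))  # prints 12345-6789
print(format_numeric_zip(6010001))    # prints 00601-0001
```

    It works right up until either assumption stops being true, at which point nothing in the stored data tells you which reading was intended.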

  • (disco) in reply to xaade
    xaade:
    Vast majority of uses I see, is ~~people~~ USPS using the last 4 digits of their PO Box for the last 4.

    PO Box 124558 ~~HTown~~ BigCity, Tehas 70056-4558

    Where I used to live, the +4 specified a group of four apartments in a large apartment complex.

  • (disco) in reply to HardwareGeek
    HardwareGeek:
    Where I used to live, the +4 specified a group of four apartments in a large apartment complex.

    shouldn't the first digit of your STE say which building.

    My apartment number was 1130, 1 for building 1, 1 for floor 1, 30 for room 30.

  • (disco) in reply to xaade
    xaade:
    STE

    Secure Terminal Equipment? Simplified Technical English? Stockton Terminal and Eastern Railroad? Star Trek: Enterprise?

    xaade:
    shouldn't the first digit of your STE say which building.

    More or less, yes. In my case, the buildings were lettered A - N(?), so C303. For whatever reason, the USPS assigned +4 such that a given +4 would cover C301 – C304. In an area of detached houses, I think a given +4 typically covers a half-dozen, or so, houses.
  • (disco) in reply to rc4
    rc4:
    If your ZIP code is 06108, and gets truncated to 6108, you could code it to automatically insert leading zeroes until it reaches 5 characters in length.

    Or, y'know, you could store it correctly in the first place.
  • (disco) in reply to Zylon

    But then how would you end up on the front page of tdwtf?

  • (disco) in reply to HardwareGeek
    HardwareGeek:
    Since 0000x is not (currently) a valid ZIP code, it would be "safe" to treat it as 12345-0000. Since -0000 is (probably) not a valid +4, suppress it in the output. Of course, it's still a bad idea, but it *could* work (which makes it the worst kind of bad idea, because idiots will then think it's a good idea).

    And that's when you discover the meaning of the word mutable in a const object.

  • (disco) in reply to HardwareGeek

    No, this is why ZIP is a string, and you validate it.
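
    A minimal sketch of "string plus validation" for US ZIP codes. It checks only the shape (five digits, optionally a dash and four more), not whether the code actually exists; the names are made up for the example.

```python
import re

# Five ASCII digits, optionally followed by a dash and four more (ZIP+4).
ZIP_PATTERN = re.compile(r"[0-9]{5}(-[0-9]{4})?")

def is_valid_zip_format(text: str) -> bool:
    """Check the shape of a US ZIP code; says nothing about whether it exists."""
    return ZIP_PATTERN.fullmatch(text) is not None

print(is_valid_zip_format("06108"))       # True
print(is_valid_zip_format("20500-0003"))  # True
print(is_valid_zip_format("6108"))        # False
```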

  • (disco) in reply to xaade
    xaade:
    No, this is why ZIP is a string, and you ~~validate it.~~ send a confirmation mail to it

    Best practices, people!

  • (disco) in reply to xaade
    xaade:
    No,

    I agree. Which part of "it's still a bad idea ... the worst kind of bad idea" did you not understand?

  • (disco)

    If I'm not going to do math on it, it's probably a string.

    I'm not getting from the text what the structure of that MUCK table is. Something like this?

    CREATE TABLE dbo.Ref_Data5 (
        RefID INT NOT NULL PRIMARY KEY,
        RefValue varchar(1000),
        RefType varchar(50),
        Created/UpdatedTimes, etc, etc..
    ) ??

    MUCK tables suck, sure.... but where is he getting numeric postal codes from something like this? Is there seriously a separate "PostalCode" column in this table? How does that field end up numeric if the others are strings?

  • (disco) in reply to Zylon

    Guess you've never heard of devil's advocate.

  • (disco) in reply to sloosecannon
    sloosecannon:
    We're not blaming the kid, he doesn't know any better...

    I don't know how far the embellishments in the article stretch, but possibly the kid should have known better though.

    Proper database normalization should be part of course material and other stuff like using proper bit-columns for booleans is something that's so retardedly basic you should've been able to land on that solution with a little bit of googling.

    Seriously; if you employ a VARCHAR column for booleans and think that's a-okay; if you don't get that nagging little voice in the back of your head that you should really query Google or other search engines because "there has to be a better way to do this", then you are an utter failure as a software developer out the gate.

  • (disco) in reply to Zylon
    Zylon:
    This piece gets minus 1 brazillion points right off the top for using "performant" non-ironically.

    Why? "Performant" is a perfectly cromulent word. :trolleybus:

  • (disco) in reply to ScholRLEA
    ScholRLEA:
    At one point in the code, there was a commented out line that read in said integer using input() instead of raw_input(), which meant that it was evaluating the string as an infix expression.

    To be fair, input() is ridiculously bad design, a function that only 0.1% of programs will ever need named after a very common word that all text-based programs need.

    Like the linux command line where the name you would expect a common command to be is always taken by some unrelated bullshit utility that no one has used since 1980 (yes, I know that's before Linux was made, that's how much it sucks) or an imagemagick command.

  • (disco) in reply to rc4

    If your zip is 06108, maybe it doesn't get stored at all, because that's not a valid octal number. (8?? what's 8?)
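
    The joke holds up if the digits really are parsed as base 8, the way a C-style leading zero would be. A small sketch, with a hypothetical helper name:

```python
def parse_as_octal(text: str) -> int:
    """Parse digits the way a C-style leading zero would: as base 8."""
    return int(text, 8)

print(parse_as_octal("06104"))  # prints 3140, not 6104

try:
    parse_as_octal("06108")
except ValueError:
    print("06108 is not valid octal")  # 8 is not an octal digit
```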

  • (disco) in reply to anonymous234

    Well, yeah, even Guido agrees; he's called it the worst mistake he made in the original Python. Unfortunately, he was stuck with it for a long time afterwards because of backwards compatibility, and while it got fixed in 3.x, the fact that a) it took so long to make that change and b) it still works that way in the current 2.x release is definitely a problem.

    I'm sorry to say that Guido probably was copying Lisp when he decided that input() should evaluate the input string by default; in most Lisps, the default behavior is to read string input using the REPL, which translates the string into a Lisp data structure (either an atom or a list of atoms). So, if you are running a program at the listener, and enter

    >>> (display (read))
    MARY HAD A LITTLE LAMB
    MARY

    it would read the atom 'MARY as a symbol and drop the rest. While this made a certain amount of sense in the original context of symbolic list processing, it was pretty much useless after around 1975 or so, but like Python 2.x's input() it was seen as too widely used to be phased out. Mind you, even in the Lisp languages (modulo a few odd ducks), it didn't actually (eval) the line by default, but treated it as quoted:

    >>> (display (read))
    (+ 2 2)
    (+ 2 2)

    So the original Python input() operator was actually worse than what it copied:

    >>> print input()
    2 + 2
    4

    Fortunately, 3.x does it more sensibly, by reading the input as a string just as raw_input() used to:

    >>> print(input())
    2 + 2
    2 + 2

  • (disco)

    The thing about postal codes makes me wonder - are there any locations to which the assumption "An address is a sequence of 1 or more lines" does not apply?

  • (disco) in reply to EatenByAGrue

    I've never been a highly-paid consultant, but what I would probably do in the case of a FUBAR program is:

    1. Try to explain as best as I could the concept of technical debt, possibly using the metaphor of an old car that breaks down every other week, and while it can still be maintained, it would be better to replace it (or at least do a major rewrite of the engine, but the metaphor does not go that far).
    2. Maybe, depending on the boss' receptiveness, try to explain some of the problems the program has in layman's terms
    3. If he's still yelling at me to "just fix it", okey dokey, just get on it and gradually "fix" the worst parts as best as I can.

    Having said that, the only thing we see in this article is bad table structures, and unless I'm missing something, that can be fixed easily.

  • (disco) in reply to CoyneTheDup
    CoyneTheDup:
    Storing it as characters, even if we skipped the dash, would take 9 whole bytes...18 bytes in that *!@#$%! Java. If we store it as an int, it takes 4 bytes.

    Ooh, packed BCD could cut that down to 5 bytes, anyway, or 3 if you skip the +4 in "zip+4".
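
    The byte counts check out. Here is a sketch of packed BCD in Python, two decimal digits per byte, relying on the fact that the hex digits 0 to 9 are exactly the BCD nibbles; `pack_bcd` is a made-up helper, not part of any library.

```python
def pack_bcd(digits: str) -> bytes:
    """Pack decimal digits two per byte, padding odd lengths with a leading 0."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("digits only")
    if len(digits) % 2:
        digits = "0" + digits
    return bytes.fromhex(digits)  # each hex nibble holds one decimal digit

print(len(pack_bcd("123456789")))  # prints 5
print(len(pack_bcd("12345")))      # prints 3
print(pack_bcd("06108").hex())     # prints 006108
```

    Unlike the plain integer, this at least keeps the leading zero, as long as the field length is fixed and known.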

  • (disco) in reply to xaade
    xaade:
    Vast majority of uses I see, is people using their PO Box for the last 4.

    The other common use is for the delivery routes themselves.

  • (disco) in reply to HardwareGeek
    HardwareGeek:
    Where I used to live, the +4 specified a group of four apartments in a large apartment complex.

    I know I've mentioned this before, but when I was in college, my college had its own zip code, and assigned a unique +4 number to each dorm room. Dorm A was 0101-0199 or whatever, Dorm B was 0200-0299, and so on. I don't remember the specifics beyond being in the first dorm alphabetically on the first floor, so my +4 was "0108".
