Confessions: The Phone Number

  • KattMan 2012-07-17 09:04
    FRIST! and early

    Treat all numbers as characters unless you need to do math on them.
  • Coffee Hound 2012-07-17 09:07
    1. PHP (XML?)
    2. ???
    3. Profit!

    Now we know that step 2 is: store phone number as int (preferably while working in Dallas)... then charge consulting fees b/c the client mucked with YOUR production code... Flawless!

  • Gizzmo 2012-07-17 09:15
    What, and you didn't notice that 2147483647 is (2^31)-1? Sheesh. :)
  • Pista 2012-07-17 09:15
    Well, you seem to have kicked yourself in the butt, but at least you learned the lesson. For yourself and for a lot of us :D
  • Lone Phone Ranger (The bit-masked man) 2012-07-17 09:15
    I guess computers in Dallas adopt wild west slang when they type guess and internally monologue to themselves
  • Manadar 2012-07-17 09:15
    Only in Dallas!

    Quite literally.
  • SODOFFFFF 2012-07-17 09:19
    I must say, I didn't expect this from the first line of the story, good job
  • Kevin D 2012-07-17 09:21
    I live in the 989!!
  • dork 2012-07-17 09:23
    now who's gonna call the max int32 phone number and try to convince them that if they switch to AT&T they will get the full 64 bits of service!
  • Dave the Destroyer 2012-07-17 09:25
    As the first commenter alluded to, isn't the real WTF here using numbers to store telephone numbers?
  • Rick 2012-07-17 09:27
    KattMan:
    FRIST! and early

    Treat all numbers as characters unless you need to do math on them.

    TRWTF - A frist comment that is also correct.

    I learned this lesson about numbers about 3 decades ago.

    The lesson that I learned more recently is "Don't use XML unless you have a schema or at least a dtd."
  • flats 2012-07-17 09:30
    Dave the Destroyer:
    As the first commenter alluded to, isn't the real WTF here using numbers to store telephone numbers?
    I'd excuse him for not reading the whole article, in his rush to be a first poster, but yes, the submitter did realize that.

    My first thought was "How on earth do you not recognize INT_MAX?" but I hadn't considered 1) someone would be living in a 214 area code, where that number would look sane, and 2) (which is much weirder to me) that young developers could have very well never developed on a 32-bit system.
  • Carl 2012-07-17 09:31
    Storing a phone number as an integer is definitely a WTF, but so is having a development server that's configured differently than the production one.
  • ekolis 2012-07-17 09:33
    I feel sorry for whoever has that number. I wonder if his name is Max?
  • KattMan 2012-07-17 09:34
    flats:
    I'd excuse him for not reading the whole article, in his rush to be a first poster, but yes, the submitter did realize that.


    I did read the whole article first, but always felt that simply posting "FRIST" was a bit inane (some would say insane).
  • Shinobu 2012-07-17 09:36
    We've had an extensive discussion about phone numbers before.
    Plus some filler text because Akismet doesn't recognise that this is an internal cross-thread link.
  • jc 2012-07-17 09:37
    you do realize that in php, all integers are signed, hence the one missing bit?
  • Mike 2012-07-17 09:43
    TRWTF: Languages that hide exceptions and return random crapola as data. Nice!
  • notromda 2012-07-17 09:47
    Mike:
    TRWTF: Crapola Languages that hide exceptions and return random crapola as data. Nice!


    FTFY
  • Dion 2012-07-17 09:50
    The explanation explains that there is some magic code converting XML to arrays in a couple of steps. It seems to me that the real WTF is that the XML API he gets the data from classifies the phone numbers as numbers.
  • Geoff 2012-07-17 09:51
    Not a PHP guy so I don't know but, if you try to convert a string to an int and it's too big you get the largest possible signed integer value? It's not an exception, or overflow error?

    What the hell kind of sense does that make?
  • foo 2012-07-17 09:51
    Let's see, TRWTFs are:
    - Type-guessing -- honestly, in all my programming work I never needed to type-guess. Either the input fits the expected type, or it's invalid.
    - Not immediately thinking of 2^31 when seeing a 10-digit number starting with 21... (OK, slight excuses for the Dallas coincidence, but when a certain "magic" value keeps popping up, you better look what kind of value it actually is.)
    - PHP doing saturation rounding, apparently without any warning, in a type-cast
    - PHP itself
    - Testing on a different configuration than production
    - Storing phone numbers (or zip codes etc.) as numbers
    - The US numbering system :) -- many countries' phone numbers start with an initial 0, so the previous mistake would show much earlier.
    - The article:
    * Typo/grammar error in the first sentence
    * "2,147,483,647 is the highest integer value addressable on 32-bit systems". Triply wrong: (a) The highest *signed* integer, (b) representable, not addressable, (c) in a single register (64 bit integers stored in two registers are not that exotic, though maybe in PHP)
    - The comments, including this one
    - Akismet, of course
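
    A rough sketch of the saturating cast from the list above, written in Python rather than PHP. It is not PHP's actual implementation, only an emulation of the clamping behaviour described in the article; the function name and the two sample numbers are made up for illustration.

```python
# Hypothetical emulation of a saturating cast to a signed 32-bit integer.
# This is NOT PHP's implementation; it only mimics the clamping described above.
INT32_MAX = 2**31 - 1   # 2147483647, the largest signed 32-bit value
INT32_MIN = -(2**31)


def saturating_int32(text: str) -> int:
    """Parse a decimal string and clamp the result to the signed 32-bit range."""
    value = int(text)
    return max(INT32_MIN, min(INT32_MAX, value))


# Made-up numbers: one in the 214 area code, one in the 972 area code.
print(saturating_int32("2145551234"))  # fits, so it comes back unchanged
print(saturating_int32("9725551234"))  # too large, so it clamps to INT32_MAX
```

    The first number survives only because a 214 area code happens to sit just under the limit, which is probably why the bug looked sane in Dallas.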
  • Dan 2012-07-17 09:52
    Did anyone try calling the number? If it was available, I'd grab it. I wonder how many calls they get for this reason...
  • The MAZZTer 2012-07-17 09:53
    KattMan:
    FRIST! and early

    Treat all numbers as characters unless you need to do math on them.


    I assume you mean PHONE numbers, otherwise this is just silly.
  • wtf 2012-07-17 09:56
    The real WTF is a 32bit server.
  • @Deprecated 2012-07-17 09:57
    dork:
    now whos gonna call the max int32 phone number and try to convince them if they switch to at&t they will get full 64bits of service!


    Tooooo funny!
    +2^31
  • DonaldK 2012-07-17 09:57
    Whahahaha 214 dialing code... you would have picked up on the problem a lot earlier in another city...

    Thanks this was entertaining.

    Captcha "damnum" ... hiehie that damnum won't fit into a 32 bit int...!
  • PiisAWheeL 2012-07-17 10:06
    Rick:
    The lesson that I learned more recently is "Don't use XML."

    FTFY.
  • Matthew 2012-07-17 10:07
    A fairly weak story... a relatively common error that everyone makes at least once in their life, and nothing really that fantastic or funny happened as a result. In the end, pretty much your run of the mill bugfix.

    Oh, and TRWTF is PHP.
  • Anketam 2012-07-17 10:16
    Gizzmo:
    What, and you didn't notice that 2147483647 is (2^31)-1? Sheesh. :)
    Yea as soon as I saw the 2147 I knew it was a 2^x just had to figure out what x was. That is what you get for storing a string as a number, then again it is php and types are so retro.
  • Ben Jammin 2012-07-17 10:17
    The MAZZTer:
    KattMan:
    FRIST! and early

    Treat all numbers as characters unless you need to do math on them.


    I assume you mean PHONE numbers, otherwise this is just silly.


    I generally do postal codes as strings, too. Granted us Merkins generally only use numbers, whilst other countries mix in some letters (some as close as Canada) so you may not have a number only postal code.
  • Vilx- 2012-07-17 10:24
    [Insert mandatory generic anti-PHP rant here]
  • John Winters 2012-07-17 10:26
    This brings me on to one of my pet peeves - web sites that insist you have to enter your phone number without any punctuation or spaces in it.

    It apparently doesn't matter to the programmer that the number would normally have spaces in it - the web site insists the *user* has to take the spaces out. Two points:

    1) Why insist on storing the numbers in an unconventional format? (perhaps so they can be stored in an int?)
    2) If you must have them that way, then you (the programmer) massage them after input. Don't expect the user to do it for you.
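
    For point 2, a minimal sketch of what "massage them after input" could look like, assuming the site only wants digits plus an optional leading "+". The function name and the accepted punctuation are my assumptions, not anything from the article.

```python
import re


def normalize_phone(raw: str) -> str:
    """Strip common punctuation and whitespace, keeping the result as a string."""
    cleaned = re.sub(r"[\s().\-]", "", raw)
    # Allow an optional leading "+" for international numbers, then digits only.
    if not re.fullmatch(r"\+?[0-9]+", cleaned):
        raise ValueError(f"not a phone number: {raw!r}")
    return cleaned


# The user types the number the way they normally see it (made-up number).
print(normalize_phone("(214) 555-0123"))
```

    The result stays a string, so a leading 0 or "+" is kept and nothing ever has to fit into an int.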

    Captcha: saepius - a very wise comment.
  • The Doctor 2012-07-17 10:27
    It's not often that people post WTFs that they're responsible for here...
  • hymie 2012-07-17 10:31
    The MAZZTer:
    KattMan:
    FRIST! and early

    Treat all numbers as characters unless you need to do math on them.


    I assume you mean PHONE numbers, otherwise this is just silly.


    You don't have numeric postal codes, numeric identification (Social Security) codes, numeric part numbers ... ?
  • toshir0 2012-07-17 10:34
    ekolis:
    I feel sorry for whoever has that number. I wonder if his name is Max?

    Dan:
    Did anyone try calling the number? If it was available, I'd grab it. I wonder how many calls they get for this reason...

    I third that.
    I can picture the guy totally going *cluster F-bomb* at each other call...
  • KattMan 2012-07-17 10:39
    hymie:
    The MAZZTer:
    KattMan:
    FRIST! and early

    Treat all numbers as characters unless you need to do math on them.


    I assume you mean PHONE numbers, otherwise this is just silly.


    You don't have numeric postal codes, numeric identification (Social Security) codes, numeric part numbers ... ?


    I'll just wait for future "numbers that should be strings WTFs" from the Mazzter. I'm sure hilarity will ensue since we have warned him.

    Yes part numbers, SSN's, drivers license numbers, document numbers, customer numbers, etc, should all be strings. Cost, amount to order, inventory levels, days till end, etc should be numbers as you will eventually do math on them.
  • Chuck Lester 2012-07-17 10:40
    Congratulations on being the frist person to report your OWN WTF!!
  • Web Dude 2012-07-17 10:47
    A dubstep joke on TDWTF! How modern!
  • pitchingchris 2012-07-17 10:51
    foo:

    - The US numbering system :) -- many countries' phone numbers start with an initial 0, so the previous mistake would show much earlier.


    It's not the US numbering system at fault, it's the idiot who tried to put a phone number into an int.
  • Rodnas 2012-07-17 10:52
    2,147,483,647th (so not frist ey)
  • willaien 2012-07-17 10:55
    PHP does automatic type casting in some cases, and won't even let you specify a type. So, the big WTF is using PHP in a production system.

    Veekun has an excellent article on why PHP is a bad idea for any project of sufficient size to warrant a language be used.

    Akismet seems to think I'm spamming up the joint, so adding a bit of text below the URL.
  • Cbuttius 2012-07-17 10:57
    Sorry, the number you have called has been changed.

    Please redial 9223372036854775807
  • Spoe 2012-07-17 10:58
    Geoff:
    Not a PHP guy so I don't know but, if you try to convert a string to an int and it's too big you get the largest possible signed integer value? It's not an exception, or overflow error?


    Yep. Much like many overflow conditions in C, e.g.:


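    #include <limits.h>
    #include <stdio.h>

    /* Relies on signed overflow when x == INT_MAX, which is undefined behavior. */
    int foo(int x) {
        return (x + 1) > x;
    }

    int main(void) {
        /* Same comparison twice: once as a constant expression, once via foo(). */
        printf("%d\n", (INT_MAX + 1) > INT_MAX);
        printf("%d\n", foo(INT_MAX));
        return 0;
    }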


    GCC (and several other compilers) with -O2 will output:
    1
    0

    Meaning INT_MAX + 1 is both greater than and less than INT_MAX depending on how it's calculated.

    No error is reported for the overflow. As I understand it, the behavior in this case is valid per the C99 spec since INT_MAX + 1 represents undefined behavior: the compiler can do what it wants.

    I don't see C receiving the same level of disrespect as PHP.
  • A Person 2012-07-17 10:58
    Started at a new job about 6 months ago. In week 2 I am putting together some tables including, you guessed it, a telephone number.

    Boss looks at it later. 'You're storing a phone number as text, why?'

    Little perplexed as boss is supposed to be a programmer.

    'Well, you never need to perform arithmetic on it, so this avoids any potential unintended math operations being performed on it.'

    I expected a sage nod and for him to move on. What actually happened was the 3 other developers and my boss looked at me as if I was insane.

    It got stored as an int, and I realised I'm not going to learn anything here.
  • Spoe 2012-07-17 11:05
    Spoe:

    GCC (and several other compilers) with -O2 will output:
    1
    0


    Correction:
    0
    1
  • Bubbah Ewing 2012-07-17 11:06
    This must have been over 15 years ago.

    Around that time they "divided" Dallas into 2 area codes, 972 and 214 with the old code being the "inner" area. About 4 years after that they added the 469 area code as we consumed all the 972 numbers. Now it's an amalgamated 214/469/972 block where your neighbor could be assigned a different area code than you and you could have a phone number that is "next to" one that is a 1 hour drive away.
  • pantsman 2012-07-17 11:12
    Started at a new job. They are using PHP. Realised I'm not going to learn anything here.
  • callcopse 2012-07-17 11:12
    Lone Phone Ranger (The bit-masked man):
    I guess computers in Dallas adopt wild west slang when they type guess and internally monologue to themselves


    Of course. I think they would look, or see their avatars as, a little like Yosemite Sam. Ooohhh that rackin, frackin PHP ...
  • Cbuttius 2012-07-17 11:13
    A few years ago between jobs I decided to teach myself PHP as I wanted to learn a bit more about it.

    The main WTF as I saw it was that having a large amount of server-side scripting was likely to be slow on the server. However, for a small amount, or for a small intranet web service with a few users, it isn't such a bad thing.

    Of course with any language, there will be WTF uses of it.

    I never got to write a project of any size to discover some of its WTF-ness, although it seemed at the time that it could integrate with C libraries quite well that you could actually write large amounts of your web-service code in C (or C++ with a C interface) and use PHP as a simple thin layer.
  • airdrik 2012-07-17 11:16
    Chuck Lester:
    Congratulations on being the frist person to report your OWN WTF!!

    Because no one has ever reported their own WTF before.
    Alas if only the designer of Akismet would come forth and report the WTF that is Akismet
  • Drtfsxzjkl 2012-07-17 11:17
    willaien:
    Akismet seems to think I'm spamming up the joint, so adding a bit of text below the URL.

    Akismet's algorithm is (or at least includes): if (post.endsWithURL()) then {is_spam = true}. I've yet to see a single counterexample.
  • Andrew 2012-07-17 11:18
    wtf:
    The real WTF is a 32bit server.
    I was going to say the same thing, but then I saw a bunch of in use P3 servers in the closet yesterday.

    Oh snap!
  • Maurits 2012-07-17 11:19
    KattMan:
    hymie:
    The MAZZTer:
    KattMan:
    FRIST! and early

    Treat all numbers as characters unless you need to do math on them.


    I assume you mean PHONE numbers, otherwise this is just silly.


    You don't have numeric postal codes, numeric identification (Social Security) codes, numeric part numbers ... ?


    I'll just wait for future "numbers that should be strings WTFs" from the Mazzter. I'm sure hilarity will ensue since we have warned him.

    Yes part numbers, SSN's, drivers license numbers, document numbers, customer numbers, etc, should all be strings. Cost, amount to order, inventory levels, days till end, etc should be numbers as you will eventually do math on them.


    At the risk of being tautological, if you don't need to do math on the thing, it's not a number, it's just an identifier.

    I'll add ISBNs and PINs to KattMan's list, and subclass part numbers as UPCs and EANs.
  • curtmack 2012-07-17 11:20
    Mike:
    TRWTF: Languages that hide exceptions and return random crapola as data. Nice!


    No matter how much I look at it, I will always see "crapola" as a combination of "crap" and "cola," and so I will always think of it as the flavor of Pepsi Blue. (Or Crystal Pepsi, if you prefer.)
  • foo 2012-07-17 11:23
    A Person:
    Started at a new job about 6 months ago. In week 2 I am putting together some tables including, you guessed it, a telephone number.

    Boss looks at it later. 'You're storing a phone number as text, why?'

    Little perplexed as boss is supposed to be a programmer.

    'Well, you never need to perform arithmetic on it, so this avoids any potential unintended math operations being performed on it.'

    I expected a sage nod and for him to move on. What actually happened was the 3 other developers and my boss looked at me as if I was insane.

    It got stored as an int, and I realised I'm not going to learn anything here.
    The correct answer would have been: "Because it's not a number, just a sequence of digits." Of course, the chances he'd understand the difference are small.

    A more pragmatic answer would be: "So you can preserve the punctuation and don't need extra code to remove it on input and add it on output", or "if you ever want to go international, note that some countries have numbers starting with 0".
  • VictorSierraGolf 2012-07-17 11:24
    well pardner, that don't look like no number I've ever seen, here's the biggest one I got,


    Why did I read it in George Carlin's voice?
  • foo 2012-07-17 11:25
    pitchingchris:
    foo:

    - The US numbering system :) -- many countries' phone numbers start with an initial 0, so the previous mistake would show much earlier.


    It's not the US numbering system at fault, it's the idiot who tried to put a phone number into an int.
    Wow, you managed to get trolled even though I didn't mean to (note the smiley).
  • validus 2012-07-17 11:30
    No, dude. What you should have learned from that is that you suck as a coder, and that your logic is not quite there yet.

    I'm referring to casting that phone number value to numeric. If you're casting to this and that solely on the look of the value, that's where you're making a mistake. Add a metadata value that will mean the type, and cast based on that.
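
    Something like this, as a minimal sketch. The field names and the schema itself are hypothetical; the point is only that the cast comes from declared metadata and never from what the value happens to look like.

```python
# Hypothetical field schema: the names and types are made up for illustration.
SCHEMA = {
    "phone": str,
    "quantity": int,
    "price": float,
}


def cast_record(raw: dict) -> dict:
    """Cast each raw string using its declared type; unknown fields stay strings."""
    return {name: SCHEMA.get(name, str)(value) for name, value in raw.items()}


record = cast_record({"phone": "2145550123", "quantity": "3", "price": "9.99"})
print(record)
```

    The phone field looks numeric but is declared as a string, so it is never converted.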
  • Rick 2012-07-17 11:39
    Perl is also quite interesting with large integers:

    #! /usr/bin/perl

    printf "%d\n", 2 ** 62; #4611686018427387904
    printf "%d\n", 2 ** 63; #-9223372036854775808
    printf "%d\n", 2 ** 64; #-1




  • DidThisOnce 2012-07-17 11:47
    Reminds me of something I did once.

    The purpose of the code was to read a barcode badge that contained a person's SSN and alert them to pending notices. Simple enough, but we needed to provide keyboard input because they may not have their badges just yet.

    I thought I'd take a shortcut and just do a validation check by casting to a number and displaying an error if it didn't match. Well, it also happens that the barcodes append a number to the end depending on what department you work for (1 digit). Making the barcode numbers 10 digits long.

    Well this all worked for testing, because every test badge we had used SSNs printed for people in PA (SSNs start with 1 most of the time). The first time a contractor arrived from out of state (about 1 day after this went live), he had an SSN / badge that started with a 3. Broke everything good.

    Now we have real input validation!
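
    For the curious, a small Python sketch of why the shortcut broke and what a shape check looks like instead. The badge values are made up, and I'm assuming the cast was to a signed 32-bit int, which fits the symptoms.

```python
import re

INT32_MAX = 2**31 - 1  # 2147483647


def fits_int32(digits: str) -> bool:
    """True when the digit string, read as a number, fits a signed 32-bit int."""
    return int(digits) <= INT32_MAX


def is_valid_badge(digits: str) -> bool:
    """Validate the shape (exactly 10 ASCII digits) without numeric conversion."""
    return re.fullmatch(r"[0-9]{10}", digits) is not None


# Made-up badge values: one starting with 1, one starting with 3.
for badge in ("1234567890", "3234567890"):
    print(badge, fits_int32(badge), is_valid_badge(badge))
```

    Both badges pass the shape check, but only the one starting with 1 survives a 32-bit cast.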

  • ¯\(°_o)/¯ I DUNNO LOL 2012-07-17 11:51
    >google for 214-748-3647
    >eyes get big

    John Winters:
    This brings me on to one of my pet peeves - web sites that insist you have to enter your phone number without any punctuation or spaces in it.
    It's when they do that about credit card numbers that bugs me, but I'm sure the credit card companies are super strict about not massaging those numbers or something like that.

    Also, TRWTFIPHP
  • curtmack 2012-07-17 11:57
    This isn't the worst I've seen. Here's how MS SQL Server guesses data types in imported Excel spreadsheets:

    - Look at the top 20-ish rows.
    - Is any single one of them numeric?
    - If so, MUST BE A NUMBER!

    Thankfully, SQL Server does care about cell formatting (unlike Access, another program that has WTF Excel importing - you'd think Office would have pretty good compatibility with itself, but you'd be wrong). So if you format the column as Text, it won't type-guess.

    But as an added bonus, it silently fails to import numeric values into fields set up as varchars. And there's no option to tell it to explicitly cast numbers to strings either. So you get to import the data once to get all of the numeric employee IDs, import it again to get the non-numeric ones, and then merge them together.

    Edit: To be clear here - I mean that, even if it's from a column that you explicitly state is a string, SQL Server will still insist that "24593" is a number, and will fail to import it, presumably because it converts it from a string to a number, and then wonders why it can't put it into a string field.
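
    To illustrate the kind of guessing I mean, here is a toy version in Python. It is emphatically not SQL Server's real algorithm, just the heuristic as I described it; the function names and employee IDs are made up.

```python
def guess_is_numeric(column, sample_size=20):
    """Naive guess: numeric if any of the first `sample_size` values is all digits."""
    return any(value.isdecimal() for value in column[:sample_size])


def import_column(column, sample_size=20):
    """Import as int when guessed numeric; values that do not fit become None."""
    if not guess_is_numeric(column, sample_size):
        return list(column)
    return [int(v) if v.isdecimal() else None for v in column]


employee_ids = ["24593", "10422", "A-1187", "00731"]  # made-up IDs
print(import_column(employee_ids))
```

    One numeric-looking value in the sample is enough to lose the non-numeric ID and the leading zeros of the last one.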
  • ¯\(°_o)/¯ I DUNNO LOL 2012-07-17 12:00
    You know what? After looking over Google results for 214-748-3647, it's obvious that there is some way to game Google to have your search turn up for anything that looks like a phone number. (Not that I hadn't already guessed this from all the crap you get when you type in anything vaguely like an electronic part number.)

    According to the results I get, people all over the country have this number, not just in Dallas! I think I even saw an address from Ontario with this number.
  • Mason Wheeler 2012-07-17 12:06
    Spoe:
    No error is reported for the overflow. As I understand it, the behavior in this case is valid per the C99 spec since INT_MAX + 1 represents undefined behavior: the compiler can do what it wants.

    I don't see C receiving the same level of disrespect as PHP.

    Yeah, C does stupid crap, but it does it quickly, and that makes everything OK! </sarcasm>
  • Mcoder 2012-07-17 12:17
    Spoe:

    int foo ( int x) {
    return ( x+1 ) > x;
    }
    int main ( void ) {
    printf ("%d\n", ( INT_MAX+1 ) > INT_MAX );
    printf ("%d\n", foo ( INT_MAX ));
    return 0;
    }

    GCC (and several other compilers) with -O2 will output:
    1
    0

    Meaning INT_MAX + 1 is both greater than and less than INT_MAX depending on how it's calculated.

    No error is reported for the overflow. As I understand it, the behavior in this case is valid per the C99 spec since INT_MAX + 1 represents undefined behavior: the compiler can do what it wants.

    I don't see C receiving the same level of disrespect as PHP.


    That's because C is a low level language, without the concept of exceptions. You use it when you want a low level language, and don't want the computer to do things when you are not looking. If you are coding in C, you'd better know what comes from that addition, or at least check error.

    That is different from PHP. You choose PHP when you must work on a PHP codebase, or when you are insane.
  • Publius 2012-07-17 12:22
    PiisAWheeL:
    Rick:
    The lesson that I learned more recently is "Don't use XML."

    FTFY.

    +10 irony internets awarded for presenting this advice using an XHTML-based website
  • Harold 2012-07-17 12:28
    Too late on getting that phone number.

    It was taken in the early 90's by a friend of mine.

    He has it going to an Asterisk box. It all goes to voice mail.

    On occasion he listens and says that some of that is pretty funny. Mostly it is drunk guys who have just hit some crazy max score on the arcade game Golden Tee. They think it is a hoot to call the number and leave a message.

    He should save some of those up and post them.... but that would be too much like work.... :)
  • Max Int 2012-07-17 12:38
    Hey why are you spreading my phone number around?
  • Todd 2012-07-17 12:45
    Mike:
    TRWTF: Languages that hide exceptions and return random crapola as data. Nice!

    At least C has the excuse of "it's for the efficiency, this way we can do it in just one assembly instruction" (not that it's a very good excuse though, it makes much more sense to make it a compiler option and enable bounds checking by default).
  • Spoe 2012-07-17 12:45
    Mcoder:
    That's because C is a low level language, without the concept of exceptions. You use it when you want a low level language, and don't want the computer to do things when you are not looking. If you are coding in C, you'd better know what comes from that addition, or at least check error.


    Yet in many situations C sets errno, a rough approximation of an exception, e.g. if using strtol(). Not here.

  • Sea Sharp, Waves Hurt 2012-07-17 12:48
    Drtfsxzjkl:
    Akismet's algorithm is (or at least includes): if (post.endsWithURL()) then {is_spam = true}. I've yet to see a single counterexample.

    Truth of the programming tree is under roots of bark skin tantamount to tranquility. Foresight combines food and lack of oxygen to produce decapodic didacticism. Frisky front-facing foibles frame fantastic folly for few friends.

    http://www.slashdot.org/

    Dogs are animals that fear most stammering stealth but can't reproduce exotic beef jerky. Moira is a name that was given to some girl in a story and didn't have any meaning other than the meaning that it had. Candle-lit dinners are dark.

    Evil isn't good.

    What? The water? I can't hear it because the stand of Douglas Fir is rustling in the wind blown by large mountains falling in the sky to the area left of the wood burning farm for sticky bootlaces. I don't know.

    http://www.yahoo.com/
  • Anon 2012-07-17 12:49
    Geoff:
    Not a PHP guy so I don't know but, if you try to convert a string to an int and it's too big you get the largest possible signed integer value? It's not an exception, or overflow error?

    What the hell kind of sense does that make?


    Welcome to the wonderful world of PHP.

    In the world of PHP, if you ask it to do something stupid that it doesn't understand, it decides that it's better to just do something rather than give you one of those troublesome error messages that stops your program from running. Nobody likes those.
  • Geoff 2012-07-17 12:51
    Well, a more useful comparison would be what does atoi() do when given input too large for an integer. That too appears to be undefined, so your point stands.

    Still unlike C which needs to be simple to best meet many of its use cases, PHP is a high level language with a pretty specific target application. I would expect some error handling from its type conversion library functions.
  • steve 2012-07-17 13:19
    Just ran into something similar.

    Zip codes and social security numbers can start with a zero and look odd when stored as an int.
  • Jay 2012-07-17 13:37
    foo:
    The US numbering system :) -- many countries' phone numbers start with an initial 0, so the previous mistake would show much earlier.


    That seems rather pointless. If all the numbers start with a zero, then the zero is superfluous and could be left off.

    Well, that said, my electric company has all its customer account numbers start with the same three digits, I think it's 973 or some such. So when you call and they ask for your account number, if you give a number that DOESN'T start with 973, they know that you're giving them the wrong number so they don't try to use your phone number or social security number or whatever as an account number and maybe screw up someone else's account. Or if a customer has trouble finding their account number on their bill or whatever paperwork, they can say, "Look for a number that starts with 973." Seems like a not-bad idea; I may use it someday.
  • datachick 2012-07-17 13:40
    If you store ZIPCodes as numbers, you'll lose leading zeros and have to do a bunch of casting/converting/BS to get the right value.

    ProTip: If humans use the word "number" in describing the data, it's never* really a number. VIN, Account Number, Customer Number, etc.

    *close enough to be "never" to be truthy.
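
    A tiny Python illustration of the leading-zero problem, using a made-up ZIP code value. Getting the original back requires knowing the field width, which is exactly the extra casting/converting work mentioned above.

```python
zip_code = "02134"           # example ZIP code with a leading zero

as_number = int(zip_code)    # stored "as a number"
print(as_number)             # the leading zero is gone

# The round trip no longer matches the original text.
print(str(as_number) == zip_code)

# Repair is possible only if you know the width is supposed to be 5.
repaired = str(as_number).zfill(5)
print(repaired)
```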
  • Jay 2012-07-17 13:40
    Gizzmo:
    What, and you didn't notice that 2147483647 is (2^31)-1? Sheesh. :)


    Well, I've been in this business for 30 years, and I must admit that I didn't immediately recognize it. Sure, if you asked me what the 32 bit int max was, I would have said 2.1 something billion. But seeing it formatted as a phone number, it just didn't occur to me. I can see that if you live in or near a 214 area code, it would be even less likely to leap out.

    Indeed, my first thought when I saw that phone number given was, Wow, shouldn't they have anonymized this, put in a 555 number or something? Otherwise a bunch of idiots are liable to call that number.
  • Rootbeer 2012-07-17 13:43
    willaien:
    Veekun has an excellent article on why PHP is a bad idea for any project of sufficient size to warrant a language be used.


    Veekun's the guy who designed the architecture that Facebook runs on, right?

    No?
  • Jay 2012-07-17 13:47
    Apparently the poster isn't the only person to have a problem with this magic phone number.

    http://www.24hourplaces.com/search.php?state=AL&city=&category=0&zip=&miles=1&page=1

  • Jay 2012-07-17 13:53
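    willaien:
    Veekun has an excellent article on why PHP is a bad idea for any project of sufficient size to warrant a language be used.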

    As opposed to projects small enough that you don't need any language at all? I'm not sure what that means. "What language did you write this program in? Java? PHP? Visual Basic?" "No, this was a small project, so I didn't use a language."

    But this is just a short comment, so instead of writing it in English or Spanish or French, I probably should have just written it without using a language.
  • Gurth 2012-07-17 13:54
    Jay:
    foo:
    The US numbering system :) -- many countries' phone numbers start with an initial 0, so the previous mistake would show much earlier.


    That seems rather pointless. If all the numbers start with a zero, then the zero is superfluous and could be left off.

    It could … if you set up the system so that everyone in your country is in the same area code, anyway. It's the area code that starts with a 0, to allow the phone system to differentiate between local and long-distance calls.
  • Ken Snyder 2012-07-17 13:59
    Yes phone numbers are not numbers at all. You sometimes need extension numbers. Or maybe a + to indicate the first part is an international calling code.

    If you are dealing with international phone numbers you will never be able to insert dashes programmatically. There are too many rules and they change too often. In South Korea, for example, the number of digits in the area code varies and the remaining part of the phone number may be 7 or 8 digits. By convention they split it up into 3 and 4 or 4 and 4.
  • yoda 2012-07-17 13:59
    TRWTF is using PHP for anything moderately complex.
  • Jay 2012-07-17 14:00
    ¯\(°_o)/¯ I DUNNO LOL:

    John Winters:
    This brings me on to one of my pet peeves - web sites that insist you have to enter your phone number without any punctuation or spaces in it.
    It's when they do that about credit card numbers that bugs me, but I'm sure the credit card companies are super strict about not massaging those numbers or something like that.


    I agree. How tough would it be to strip out spaces and hyphens so the user can enter his phone number or credit card number the way he is used to seeing it?

    Back in the early days of computers (I don't think so much today), many people would type the letter oh instead of zero and the letter el instead of one, which of course would then screw up numeric inputs. Manuals would routinely lecture users not to do this. Just because it looks right on a typewriter, it doesn't work on the computer, etc.

    My boss years ago came up with a simple solution: He modified our number input function to interpret letter oh as a zero and letter el as a one. When I saw it I thought, zounds, what a simple solution! Instead of lecturing users over and over about the same error, just fix it for them!
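
    I no longer have the original function, so this is only a Python sketch of the idea. Which letters get mapped (upper and lower case oh, lower case el) is my assumption, and the function name is made up.

```python
# Map the look-alike letters onto the digits they were meant to be.
LOOKALIKES = str.maketrans({"O": "0", "o": "0", "l": "1"})


def forgiving_digits(raw: str) -> str:
    """Translate look-alike letters, then insist on digits only."""
    fixed = raw.translate(LOOKALIKES)
    if not fixed.isdecimal():
        raise ValueError(f"not numeric: {raw!r}")
    return fixed


print(forgiving_digits("2l4O"))  # typed with letter el and letter oh
```

    Anything else that isn't a digit still gets rejected, so the fix only forgives the two typewriter habits.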
  • Nagesh 2012-07-17 14:01
    PHP can not be use for serious projects.
  • Anon 2012-07-17 14:45
    Jay:
    My boss years ago came up with a simple solution: He modified our number input function to interpret letter oh as a zero and letter el as a one. When I saw it I thought, zounds, what a simple solution! Instead of lecturing users over and over about the same error, just fix it for them!


    Great. So they can get complacent with your application, then go entering invalid data in all the other applications.

    I would rather use an input mask so they can mash the oh and el keys all they want, but they will not pass validation because they have not typed enough valid characters. Eventually maybe they will get a clue and find that the 0 and 1 keys actually show up on screen.
  • foo 2012-07-17 14:54
    Spoe:
    Mcoder:
    That's because C is a low level language, without the concept of exceptions. You use it when you want a low level language, and don't want the computer to do things when you are not looking. If you are coding in C, you'd better know what comes from that addition, or at least check error.


    Yet in many situations C sets errno, a rough approximation of an exception, e.g. if using strtol(). Not here.

    C != libc
  • foo 2012-07-17 15:03
    Jay:
    ¯\(°_o)/¯ I DUNNO LOL:

    John Winters:
    This brings me on to one of my pet peeves - web sites that insist you have to enter your phone number without any punctuation or spaces in it.
    It's when they do that about credit card numbers that bugs me, but I'm sure the credit card companies are super strict about not massaging those numbers or something like that.


    I agree. How tough would it be to strip out spaces and hyphens so the user can enter his phone number or credit card number the way he is used to seeing it?

    Back in the early days of computers (I don't think so much today), many people would type the letter oh instead of zero and the letter el instead of one, which of course would then screw up numeric inputs. Manuals would routinely lecture users not to do this. Just because it looks right on a typewriter, it doesn't work on the computer, etc.

    My boss years ago came up with a simple solution: He modified our number input function to interpret letter oh as a zero and letter el as a one. When I saw it I thought, zounds, what a simple solution! Instead of lecturing users over and over about the same error, just fix it for them!
    Until a Brit from Oldham (OL) tried to enter his postcode ...
  • bruno 2012-07-17 15:08
    Also, you could use alpha-numeric in phone number for aliases 1-800-WTF-WTF1
  • lopacsqas 2012-07-17 15:10
    Dude, next time treat phone numbers as strings. Why in hell would you treat them differently?

    Noob mistake... I hope you have learned!
  • David C. 2012-07-17 15:25
    Spoe:

    #include <limits.h>
    #include <stdio.h>

    int foo ( int x) {
        return ( x+1 ) > x;
    }
    int main ( void ) {
        printf ("%d\n", ( INT_MAX+1 ) > INT_MAX );
        printf ("%d\n", foo ( INT_MAX ));
        return 0;
    }


    GCC (and several other compilers) with -O2 will output:
    1
    0


    I inlined your correction in the above quote...

    What you're seeing here is a bunch of quirks of the C language and what compilers are allowed to do.

    First off, when I compile this, I get a warning for the INT_MAX+1 expression. So any developer should immediately know something's wrong here.

    The reason for the seemingly contradictory answers is that the compiler isn't actually doing any arithmetic, but is shortcutting the expression, generating constants.

    In the case of the first expression ((INT_MAX+1) > INT_MAX), the compiler knows that you've generated an overflow (producing the warning.) It also knows that on the x86 platform, INT_MAX+1 is INT_MIN. So the comparison returns false. The generated code passes a 0 constant to printf.

    In the case of the second expression (in the function foo), it sees the expression ((x+1)>x), which is true for all values of x other than INT_MAX. Since INT_MAX+1 is undefined behavior (signed overflow), the compiler may assume it never happens and optimize the whole expression to a constant. The generated code passes a 1 constant to printf.

    Some interesting observations (in playing with this):

    1: If you change the parameter to foo (x) to (volatile int x) then the compiler will make no attempt to optimize the expression. It will do the addition, the result will wrap to INT_MIN, and the function will return 0.

    2: If you replace the expression INT_MAX+1 with the result (2147483648), then the compiler will replace the int constant with a long constant, eliminating the warning and causing the constant 1 to be passed to printf, because there's no more overflow condition. (Explicitly casting INT_MAX to long will do the same thing.)
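
    For anyone who wants to poke at the wraparound without a C compiler, here is a rough Python emulation of 32-bit two's complement arithmetic. It only models what the hardware does when the addition is actually performed; it says nothing about what the C standard allows an optimizer to assume. The name wrap_int32 is made up for this sketch.

```python
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def wrap_int32(n: int) -> int:
    """Reduce n modulo 2**32 into the signed 32-bit range (two's complement)."""
    return (n + 2**31) % 2**32 - 2**31


# Python integers never overflow, so the wrap has to be applied by hand.
wrapped = wrap_int32(INT_MAX + 1)
print(wrapped == INT_MIN)   # the sum wraps around to INT_MIN
print(wrapped > INT_MAX)    # so the "greater than" comparison is false
```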
  • da Doctah 2012-07-17 15:28
    I'd hate to be the guy in Dallas trying to start up a business placing guard animals, only to discover the alternative significance of 714-PIT-DOGS.
  • herby 2012-07-17 16:13
    I pity the person (according to a quick search: Wil Stuart) in the Dallas area who actually HAS the number (214)748-3647. Someone else mentioned that he has a recorder on the number. So, let's all try it (not really, he doesn't need the attention!).

    Back when ALL area codes had a 1 or a 0 as the middle digit (before 1995), you COULD actually encode a US/Canada phone number as a 32 bit int. Take the middle digit of the area code as the most significant digit and then use the remaining 9 digits. Some things actually used this technique. Now with the proliferation of things like cell phones (which chew up LOTS of phone numbers) we have scads of area codes, so it doesn't work any more.
    In this encoding scheme, the phone number would be encoded as: 1247483647.
    Trivia: When AT&T first proposed area codes back in the 1940s, they estimated that the scheme with 1 or 0 as the middle digit would last until around 2000. Considering that they were estimating 50 years into the future, being off by 10% (5 years) was a pretty good prediction.
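
    The encoding trick above can be sketched in a few lines of Python. The function names are invented for illustration, and the sketch assumes a 10-digit number whose area code has 0 or 1 as its middle digit, so the encoded value never exceeds 1999999999 and fits in a signed 32-bit int.

```python
INT32_MAX = 2**31 - 1


def encode_nanp(number: str) -> int:
    """Pack an old-style 10-digit US/Canada number into a 32-bit-sized int."""
    digits = "".join(ch for ch in number if ch in "0123456789")
    if len(digits) != 10 or digits[1] not in "01":
        raise ValueError("expected 10 digits with 0 or 1 as the second digit")
    # The middle digit of the area code becomes the most significant digit.
    return int(digits[1] + digits[0] + digits[2:])


def decode_nanp(code: int) -> str:
    """Undo encode_nanp and return the 10 digits as a string."""
    s = f"{code:010d}"
    return s[1] + s[0] + s[2:]


code = encode_nanp("(214) 748-3647")
print(code, code <= INT32_MAX)
print(decode_nanp(code))
```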
  • John Hensley 2012-07-17 16:19
    ANOTHER integer overflow WTF, don't the editors ever get tired of them? Anyone who can't immediately recognize an integer limit value shouldn't be reading this site anyway.
  • Anon 2012-07-17 16:43
    All I learned from this is that your code is shit.

    And there's probably 10 submissions that weren't posted here about how you were casting the number and it broke that stupid language spectacularly.
  • Meep 2012-07-17 16:58
    Drtfsxzjkl:
    willaien:
    Akismet seems to think I'm spamming up the joint, so adding a bit of text below the URL.

    Akismet's algorithm is (or at least includes): if (post.endsWithURL()) then {is_spam = true}. I've yet to see a single counterexample.


    Well, any actual comment spam kinda has to have a link to work. And a lot of comment spam tends to be drawn from other comments, with a link thrown in. It's probably a pretty effective heuristic.
  • David F. Skoll 2012-07-17 17:19
    "I don't see C receiving the same level of disrespect as PHP."

    Experienced programmers realize that C is a glorified assembler language, so they expect it to do things like this. A high-level [sic] language like PHP should offer more protection.
  • Zylon 2012-07-17 18:00
    Mcoder:
    That's because C is a low level language, without the concept of exceptions.

    Is C lower-level than any number of 8K BASICs that supported exception handling (known at the time as "error trapping") perfectly well?
  • FAsl iegh 2012-07-17 18:12
    Carl:
    Storing a phone number as an integer is definitely a WTF, but so is having a development server that's configured differently than the production one.
    yes indeed it would be - but I'm not sure that this article suggests that would have necessarily been the case - only that that was the first point of investigation.

    (although, TBH I've never worked anywhere where the Dev box is exactly the same as the prod box - partly because people have played with config, partly because the box is often lower spec and partly because people are scared that if they make it too real it might accidentally do real stuff)
  • phreddy 2012-07-17 18:14
    And if you have postcodes with leading zeros (we have some in Oz) then storing them as numbers doesn't work.
  • Jimbo Jones 2012-07-17 18:19
    The MAZZTer:
    KattMan:
    FRIST! and early

    Treat all numbers as characters unless you need to do math on them.


    I assume you mean PHONE numbers, otherwise this is just silly.
    Er, no. He means ALL numbers unless you need to do math on them (which is exactly what he said).

    If you don't need to do MATH with it, you likely only need it for display purposes or you might find you need to play string games on it anyway. I can't really think of a case where you'd want to store a number as a number except where there's math involved. Obviously something like a counter would probably be stored as a number, but its incrementing nature means it is being used for math anyway... (and though it may be stored in a DB as a number, it doesn't necessarily need to be treated as a number when we pull it out the DB unless we want to sort based on it later (where numbers might be a little more efficient)).
  • pjt33 2012-07-17 18:21
    hymie:
    You don't have numeric postal codes, numeric identification (Social Security) codes, numeric part numbers ... ?

    My national identification number starts with an X. In the previous country where I lived, my social security number started with a J.

    foo:
    A more pragmatic answer would be: "So you can preserve the punctuation and don't need extra code to remove it on input and add it on output", or "if you ever want to go international, note that some countries have numbers starting with 0".

    Either all foreign countries have numbers starting with 0 or they all have numbers starting with +. The latter is more portable.
  • JKR 2012-07-17 18:26
    Spoe:
    Geoff:
    Not a PHP guy so I don't know, but if you try to convert a string to an int and it's too big you get the largest possible signed integer value? It's not an exception, or overflow error?


    Yep. Much like many overflow conditions in C, e.g.:


    #include <limits.h>
    #include <stdio.h>

    int foo ( int x) {
        return ( x+1 ) > x;
    }
    int main ( void ) {
        printf ("%d\n", ( INT_MAX+1 ) > INT_MAX );
        printf ("%d\n", foo ( INT_MAX ));
        return 0;
    }


    GCC (and several other compilers) with -O2 will output:
    1
    0

    Meaning INT_MAX + 1 is both greater than and less than INT_MAX depending on how it's calculated.

    No error is reported for the overflow. As I understand it, the behavior in this case is valid per the C99 spec since INT_MAX + 1 represents undefined behavior: the compiler can do what it wants.

    I don't see C receiving the same level of disrespect as PHP.
    Assuming that you mean 0,1 not 1,0 this is a compiler optimization, not a language quirk (ie it's a gnu thing not a C thing). Basically the compiler can see that x+1 will always be greater than x so it doesn't bother with the actual calculation at runtime. OTOH INT_MAX is just a macro and we have literal numbers substituted (which the compiler doesn't recognise as being the same) so we have to work out INT_MAX+1 (which overflows silently) to work out the inequality....

    As this is not behaviour defined in the standard, but rather something that happens on a specific compiler with a specific optimization setting it would be rather unfair to show disrespect to the language for it....
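
    For comparison with Geoff's question quoted above, the clamping behaviour the thread attributes to the PHP cast can be emulated like this. This is a Python sketch of the effect only, not PHP's implementation; saturating_int32 is a hypothetical name and the 555 numbers are made up.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def saturating_int32(text: str) -> int:
    """Convert a digit string to an int, clamping it to the signed 32-bit range."""
    value = int(text)
    return max(INT32_MIN, min(INT32_MAX, value))


# A Dallas number below the limit survives the conversion unchanged...
print(saturating_int32("2145551234"))
# ...while any 10-digit number above it silently collapses to the same value.
print(saturating_int32("9895551234"))
print(saturating_int32("3125550000"))
```

    That is why the bug only shows up as one "magic" Dallas number: every number above 2147483647 ends up stored as that same value.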

  • Drtfsxzjkl 2012-07-17 18:26
    Meep:
    Drtfsxzjkl:
    willaien:
    Akismet seems to think I'm spamming up the joint, so adding a bit of text below the URL.

    Akismet's algorithm is (or at least includes): if (post.endsWithURL()) then {is_spam = true}. I've yet to see a single counterexample.


    Well, any actual comment spam kinda has to have a link to work. And a lot of comment spam tends to be drawn from other comments, with a link thrown in. It's probably a pretty effective heuristic.

    But it does have quite a few false positives, given that there are at least three or four people complaining in every post.
  • Mick 2012-07-17 18:41
    foo:
    A Person:
    Started at a new job about 6 months ago. Week 2 I am putting together some tables including, you guessed it, a telephone number.

    Boss looks at it later. 'You're storing a phone number as text, why?'

    Little perplexed as boss is supposed to be a programmer.

    'Well, you never need to perform arithmetic on it, so this avoids any potential unintended math operations being performed on it'

    I expected a sage nod and for him to move on. What actually happened was the 3 other developers and my boss looked at me as if I was insane.

    It got stored as an int, and I realised I'm not going to learn anything here.
    The correct answer would have been: "Because it's not a number, just a sequence of digits." Of course, the chances he'd understand the difference are small.

    A more pragmatic answer would be: "So you can preserve the punctuation and don't need extra code to remove it on input and add it on output", or "if you ever want to go international, note that some countries have numbers starting with 0".
    Phone numbers are tricky, because people will have different conventions (which they sometimes alternate even for the same number).

    For example:
    The international dialing code for Australia is +61
    An area code within Australia is 0X where:
    X=2 = NSW/ACT
    X=3 = VIC (and Tas?)
    X=4 = Mobile
    X=7 = QLD
    X=8 = SA/NT/WA
    There used to (I think they were taken out when the numbers went to 8 digits) be some localised area codes, like country SA was (085) instead of (08), but we digress

    For international callers, we drop the 0.

    There are then 8 digits for the number (which has increased from 6 or 7 about 20 years ago according to certain rules).
    Generally, the first 4 (formerly 2 or 3) of these 8 digits represent the Telephone Exchange, although in the digitalized world this is no longer guaranteed - but we digress.

    Locally, most people used to leave off the area code (you can generally (excepting mobiles, I think) call other numbers within your area code without specifying it).
    Firstly, people seem to write mobiles down differently to local numbers, eg:
    0412 123 123 for mobiles
    8123 1234 (or (08) 8123 1234) for local

    But increasingly people seem to be realising that mobile numbers are the same as local numbers, and it's not uncommon anymore to see:
    (04) 1212 3123 (for the mobile above).

    So far, this isn't a major issue, because we can strip punctuation in comparison, and even cut off area codes if we detect them. It becomes more complex, however, when we realise the following numbers are possibly (but not necessarily) the same:
    +61 8 8123 1234
    011 61 8 8123 1234
    08 8123 1234
    8123 1234 (AFAIK, this could (at least theoretically) exist in an area code other than 08)

    Storing them is no issue as strings, but even as strings we have a problem of comparison - and a lot of takeaway stores seem to use the phone number to uniquely identify a customer....Of course, these issues are not necessarily insurmountable, but phone numbers are horribly complicated things, and are certainly not NUMBERS....
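
    To make the comparison problem concrete, here is a rough Python sketch that reduces the variants listed above to one canonical "+61..." string before comparing. Everything about it is an assumption for illustration: the name canonical_au, the list of international access prefixes it recognises, and especially default_area, which guesses that a bare 8-digit number belongs to the 08 area (the very guess that can be wrong).

```python
import re

# International access prefixes this sketch recognises (longest first).
_ACCESS_PREFIXES = ("0011", "011", "00")


def canonical_au(raw: str, default_area: str = "8") -> str:
    """Reduce an Australian number, as a string, to '+61' plus 9 digits."""
    text = raw.strip()
    digits = re.sub(r"[^0-9]", "", text)
    if text.startswith("+"):
        return "+" + digits
    for prefix in _ACCESS_PREFIXES:
        if digits.startswith(prefix):
            return "+" + digits[len(prefix):]
    if len(digits) == 10 and digits.startswith("0"):
        # National format: drop the leading 0 of the area code.
        return "+61" + digits[1:]
    if len(digits) == 8:
        # Local format: the area code is a guess, see the lead-in.
        return "+61" + default_area + digits
    raise ValueError(f"unrecognised number format: {raw!r}")


variants = ["+61 8 8123 1234", "011 61 8 8123 1234", "08 8123 1234", "8123 1234"]
print({canonical_au(v) for v in variants})
```

    Both the input and the canonical form are strings throughout; at no point does the number need to be an int.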
  • da Doctah 2012-07-17 18:49
    Exercise for the reader: devise a parser that can tell that 60608 is not a number but 6.022141e23 is.
  • Hoolio 2012-07-17 18:52
    curtmack:
    This isn't the worst I've seen. Here's how MS SQL Server guesses data types in imported Excel spreadsheets:

    - Look at the top 20-ish rows.
    - Is any single one of them numeric?
    - If so, MUST BE A NUMBER!

    Thankfully, SQL Server does care about cell formatting (unlike Access, another program that has WTF Excel importing - you'd think Office would have pretty good compatibility with itself, but you'd be wrong). So if you format the column as Text, it won't type-guess.

    But as an added bonus, it silently fails to import numeric values into fields set up as varchars. And there's no option to tell it to explicitly cast numbers to strings either. So you get to import the data once to get all of the numeric employee IDs, import it again to get the non-numeric ones, and then merge them together.

    Edit: To be clear here - I mean that, even if it's from a column that you explicitly state is a string, SQL Server will still insist that "24593" is a number, and will fail to import it, presumably because it converts it from a string to a number, and then wonders why it can't put it into a string field.
    Even Excel doesn't seem to have much compatibility with itself if you want to use file formats other than its native....

    Try this:
    Open a new sheet, and in the first cell put
    "01"
    and in the second put
    =concatenate("0","1")


    Now save it as a csv file (accepting that some formatting may be lost etc).
    Open the csv in something other than Excel, and we see that the string with the quotes is stored """01""" and the one without is 01.
    Now reopen the file in Excel and (you guessed it) the 01 becomes a 1....

    IMHO Excel should be treating everything as a string when dealing with CSV format - especially something that has been created using a String function....
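
    For contrast, Python's standard csv module behaves the way I'd like here: by default every cell comes back as a string and nothing is type-guessed. A small sketch (read_cells is just a throwaway helper name):

```python
import csv
import io


def read_cells(text: str) -> list[list[str]]:
    """Parse CSV text and return every cell as a string, without type guessing."""
    return list(csv.reader(io.StringIO(text)))


rows = read_cells('"01",01,6.022141e23\n')
print(rows)
print(all(isinstance(cell, str) for cell in rows[0]))
```

    The leading zero survives in both the quoted and the unquoted cell, and it is up to the caller to convert a column to a number if it really is one.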
  • Daniel 2012-07-17 18:58
    pjt33:

    Either all foreign countries have numbers starting with 0 or they all have numbers starting with +. The latter is more portable.


    Any country is free to choose whatever they like as their international access code. It is standardised as "00" across Western Europe and "011" in North America. Most other countries go along with one of those two but certainly not all of them. There are a few weird ones out there that don't even begin with "0". Some former Soviet states (including Russia) still use "8" and some other countries use various things beginning with a "1".

    As such, not allowing the user to enter a "+" is just asking for trouble. Who knows what they will enter instead? Maybe something flat out wrong? Maybe something right in their context but not in yours? Maybe something you need to decipher to work out? Maybe something you can't easily work out? Maybe something easily misunderstood as a local number?

    The failure to understand that people do things differently in other countries and that you need to accommodate them if you want their money is an endless source of WTFs. In addition to things that won't take a phone number with a "+" in it there are the forms that won't let you enter a postcode with letters in a (mandatory) "zip code" field and even a few that expect you to pick a US state when you don't live in one (although admittedly I have not seen that particular manifestation of idiocy for quite a while now). You would expect to see this only on forms where it is expected that the users are all local, which is fair enough in many contexts, but sometimes there is also a field for country which implies that they would like to do business overseas if only they knew how to.
  • Frank 2012-07-17 19:13
    da Doctah:
    Exercise for the reader: devise a parser that can tell that 60608 is not a number but 6.022141e23 is.


    /* returns 1 if input string is a number 0 if not and 2 if it is indeterminate*/
    int intParser(char *t)
    {
    if(strncmp(t, "60608", 5) == 0) return 0;
    if(strncmp(t, "6.022141e23", 11) == 0) return 1;
    return 2;
    }


    QED
  • Pizza Boy 2012-07-17 19:16
    No one's mentioned house numbers.

    28A
    1/27
    U3/89
    u2/57A
    27-365
    3 at 14

    etc
  • Tired Rabbit 2012-07-17 19:23
    phreddy:
    And if you have postcodes with leading zeros (we have some in Oz) then storing them as numbers doesn't work.

    Here in the states it is the same.

    The only place I could possibly see using a number as a number is the house number (so if one wanted to sort by house along a street), but even THAT is a WTF because not all house numbers are integers, or numbers.

    PHP is such low-hanging fruit, why does Alex even bother posting them?



    The way I see it, there are only two types of PHP programs:

    1) Initially written by a noob who really didn't know what he (or she) was doing, and managed to make every classic mistake in the book, or

    2) Some sucker who has to maintain the garbage written by #1.

    Face it, if you write a new application using PHP, you're doing it wrong. There is nothing that PHP does better except allow inexperienced people to think they're skilled developers.
  • foo 2012-07-17 19:40
    da Doctah:
    Exercise for the reader: devise a parser that can tell that 60608 is not a number but 6.022141e23 is.
    The magic word you're looking for is context. When you get 60608 in your order-quantity field, it's a number, when you get it in the zip-code field, it's not. The problem (also in the OP, AIUI) is trying to parse and type-convert with too little context.
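
    A minimal sketch of what "context" means in practice: the field decides the type, not the look of the value. The field names and the FIELD_TYPES table are invented for the example.

```python
# Hypothetical schema: each field name maps to the converter for its type.
FIELD_TYPES = {
    "order_quantity": int,
    "measurement": float,
    "zip_code": str,
    "phone": str,
}


def parse_field(name: str, raw: str):
    """Convert raw input according to the declared type of the field."""
    convert = FIELD_TYPES[name]
    return convert(raw.strip())


print(repr(parse_field("order_quantity", "60608")))
print(repr(parse_field("zip_code", "60608")))
print(repr(parse_field("zip_code", "02134")))
```

    The same characters come out as a number in one field and as text in the other, and a value that does not fit its field raises an error instead of being guessed at.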
  • Norman Diamond 2012-07-17 19:41
    Spoe:
    GCC (and several other compilers) with -O2 will output:
    1
    0

    No error is reported for the overflow. As I understand it, the behavior in this case is valid per the C99 spec since INT_MAX + 1 represents undefined behavior: the compiler can do what it wants.

    I don't see C receiving the same level of disrespect as PHP.
    Then you haven't looked very hard. Maybe the amount of abuse of C hasn't kept pace with the amount of abuse of PHP because it's easier for a larger number of less talented lusers to abuse PHP, but C had its share of critics in the old days.

    As for undefined meaning undefined, of course it would be expensive for implementations to diagnose all kinds of undefined behaviour and no one would want to incur that kind of expense all the time. But consider the Pascal standard, where the requirement is not that all kinds of undefined behaviour be diagnosed, the requirement is that implementations MUST OFFER AN OPTION where programmers can choose to require all kinds of undefined behaviour to be diagnosed. Pascal was vilified for providing an option for that amount of bondage and discipline. No one would ever accept such a safety measure in C. If people like that were in charge of automotive safety, seat belts would be illegal. People like that were in charge of nuclear power plant safety and see where that got us.

    C gets the disrespect it deserves, you just need to look harder.
  • Norman Diamond 2012-07-17 20:06
    Jay:
    foo:
    The US numbering system :) -- many countries' phone numbers start with an initial 0, so the previous mistake would show much earlier.
    That seems rather pointless. If all the numbers start with a zero, then the zero is superfluous and could be left off.
    That's the reason why international calls omit that zero between the country code and the remainder of the city code. However, most calls are domestic and the zero is needed to indicate that a full number is being dialled instead of a local (intra-city) number.

    Also, historically, long distance calls even within a single country were considerably more expensive than local (intra-city) calls. Also phones were dialled using dials instead of push buttons. So if a phone were offered for public use by a corner store or other public facing proprietor, they would put a lock in the 9 position in the dial. Digits 1 to 9 could be dialled but 0 couldn't be dialled (except by historical phone phreaks of course). Phone numbers didn't have any 0 digits except the first digit of the city code. So local calls could be dialled but long distance calls could not be dialled.

    Eventually pay phones replaced public use of phones provided by corner stores etc. (and later pay phones became obsolete because everyone except me carries a cell phone but let's not get ahead of ourselves), local numbers started to include embedded 0's the same as other digits, and the costs of calls were computed by some newfangled kind of calculating machine so it was no longer necessary to lock out the riffraff from dialling an initial 0.
  • John 2012-07-17 20:11
    Norman Diamond:
    Jay:
    foo:
    The US numbering system :) -- many countries' phone numbers start with an initial 0, so the previous mistake would show much earlier.
    That seems rather pointless. If all the numbers start with a zero, then the zero is superfluous and could be left off.
    That's the reason why international calls omit that zero between the country code and the remainder of the city code. However, most calls are domestic and the zero is needed to indicate that a full number is being dialled instead of a local (intra-city) number.

    Also, historically, long distance calls even within a single country were considerably more expensive than local (intra-city) calls. Also phones were dialled using dials instead of push buttons. So if a phone were offered for public use by a corner store or other public facing proprietor, they would put a lock in the 9 position in the dial. Digits 1 to 9 could be dialled but 0 couldn't be dialled (except by historical phone phreaks of course). Phone numbers didn't have any 0 digits except the first digit of the city code. So local calls could be dialled but long distance calls could not be dialled.

    Eventually pay phones replaced public use of phones provided by corner stores etc. (and later pay phones became obsolete because everyone except me carries a cell phone but let's not get ahead of ourselves), local numbers started to include embedded 0's the same as other digits, and the costs of calls were computed by some newfangled kind of calculating machine so it was no longer necessary to lock out the riffraff from dialling an initial 0.
    I think the OP wasn't so much talking about getting rid of the need to DIAL 0 as the need to STORE 0....
  • Norman Diamond 2012-07-17 20:13
    Zylon:
    Mcoder:
    That's because C is a low level language, without the concept of exceptions.
    Is C lower-level than any number of 8K BASICs that supported exception handling (known at the time as "error trapping") perfectly well?
    Yes, C is lower level than BASIC, _by_design_.
  • Michael J. Cohen 2012-07-17 20:20
    Sadly, Veekun's article is pretty wrong in and of itself.

    There are multiple in-depth analyses of what he got wrong floating around, but here's the HN thread that I most readily remember reading...

    http://news.ycombinator.com/item?id=4177516

    Akismet thinks I'm spamming, too. Silly software!
  • Nathan Hillery 2012-07-17 22:28
    First name Max.
    Last name Int.
    Middle initial U.
  • foo2 2012-07-17 23:35
    foo:
    - Storing phone numbers (or zip codes etc.) as numbers
    - The US numbering system :) -- many countries' phone numbers start with an initial 0, so the previous mistake would show much earlier.


    Then it gets exported to .CSV, then opened in Excel which handily hacks the leading 0 off. Or converts it to scientific notation. Often both.
  • SDF 2012-07-18 03:14
  • Gurth 2012-07-18 05:00
    Jay:
    Back in the early days of computers (I don't think so much today), many people would type the letter oh instead of zero and the letter el instead of one, which of course would then screw up numeric inputs. Manuals would routinely lecture users not to do this. Just because it looks right on a typewriter, it doesn't work on the computer, etc.

    Many typewriters didn't have a 1 and/or a 0 on the keyboard, because a lowercase l and uppercase O would look much the same, and that way the keys normally used for the 1 and/or 0 could be used for other symbols for which there would otherwise be no room. Anybody used to a typewriter would naturally type the same way on a computer, I'd expect.
  • Gurth 2012-07-18 05:08
    Norman Diamond:
    everyone except me carries a cell phone

    That statement is false — I have never owned a cellphone, and if it's left up to me, probably never will either.
  • Your Name * 2012-07-18 05:12
    Nathan Hillery:
    First name Max.
    Last name Int.
    Middle initial S.


    FTFY
  • foxyshadis 2012-07-18 05:29
    foo:
    Let's see, TRWTFs are:
    - Type-guessing -- honestly, in all my programming work I never needed to type-guess. Either the input fits the expected type, or it's invalid.
    - Not immediately thinking of 2^31 when seeing a 10-digit number starting with 21... (OK, slight excuses for the Dallas coincidence, but when a certain "magic" value keeps popping up, you better look what kind of value it actually is.)
    - PHP doing saturation rounding, apparently without any warning, in a type-cast
    - PHP itself
    - Testing on a different configuration than production
    - Storing phone numbers (or zip codes etc.) as numbers
    - The US numbering system :) -- many countries' phone numbers start with an initial 0, so the previous mistake would show much earlier.
    - The article:
    * Typo/grammar error in the first sentence
    * "2,147,483,647 is the highest integer value addressable on 32-bit systems". Triply wrong: (a) The highest *signed* integer, (b) representable, not addressable, (c) in a single register (64 bit integers stored in two registers are not that exotic, though maybe in PHP)
    - The comments, including this one
    - Akismet, of course


    As a consultant, am I supposed to keep identical copies of every 32-CPU server I'm still supporting and might-be supporting around the house, just in case one of them calls me up to fix something?
  • foxyshadis 2012-07-18 05:57
    Pizza Boy:
    No one's mentioned house numbers.

    28A
    1/27
    U3/89
    u2/57A
    27-365
    3 at 14

    etc

    I pity the poor misguided son of a gun trying to micro-optimize a few bytes out by storing house numbers as integers, but has anyone actually done that? Everyone seems to associate the house number with the street name.
  • Stephen Griffith 2012-07-18 06:00
    You have to store them unconventionally if a user could live in another country.
  • Stephen Griffith 2012-07-18 06:03
    3) Cheap hosting is another reason.
  • My Name 2012-07-18 07:11
    Norman Diamond:
    Zylon:
    Mcoder:
    That's because C is a low level language, without the concept of exceptions.
    Is C lower-level than any number of 8K BASICs that supported exception handling (known at the time as "error trapping") perfectly well?
    Yes, C is lower level than BASIC, _by_design_.

    +1. "Old, limited and probably a bit painful" doesn't imply "low-level".
  • Captain Boolean 2012-07-18 07:23
    All this talk of 'numbers as text, unless you need to do maths (yes, I'm British)'...

    What if you're designing a data warehouse for conspiracy theorists? That'd be a long running project I reckon!
  • pjt33 2012-07-18 08:23
    Daniel:
    pjt33:

    Either all foreign countries have numbers starting with 0 or they all have numbers starting with +. The latter is more portable.


    Any country is free to choose whatever they like as their international access code. It is standardised as "00" across Western Europe and "011" in North America. Most other countries go along with one of those two but certainly not all of them. There are a few weird ones out there that don't even begin with "0". Some former Soviet states (including Russia) still use "8" and some other countries use various things beginning with a "1".

    As such, not allowing the user to enter a "+" is just asking for trouble. Who knows what they will enter instead? Maybe something flat out wrong? Maybe something right in their context but not in yours? Maybe something you need to decipher to work out? Maybe something you can't easily work out? Maybe something easily misunderstood as a local number?

    We seem to be in violent agreement. My comment about beginning with 0 was addressed specifically at a comment which implied that the international access code varies according to the country you're calling to, which would be crazy. I suppose I can't rule out the possibility that there is some country that crazy, but...

    The failure to understand that people do things differently in other countries and that you need to accommodate them if you want their money is an endless source of WTFs. In addition to things that won't take a phone number with a "+" in it there are the forms that won't let you enter a postcode with letters in a (mandatory) "zip code" field and even a few that expect you to pick a US state when you don't live in one (although admittedly I have not seen that particular manifestation of idiocy for quite a while now). You would expect to see this only on forms where it is expected that the users are all local, which is fair enough in many contexts, but sometimes there is also a field for country which implies that they would like to do business overseas if only they knew how to.

    One of the worst examples I've seen of this is the paper form for British citizens who wish to register as overseas voters. So logically the only people who could reasonably fill in this form will have addresses outside the UK, but their addresses will be held on record in the UK. And yet the section in the form for the address doesn't have a space for the name of the country in which the overseas voter is living. WTF?
  • Marc 2012-07-18 08:36
    The MAZZTer:
    KattMan:
    FRIST! and early

    Treat all numbers as characters unless you need to do math on them.


    I assume you mean PHONE numbers, otherwise this is just silly.


    No, it's not. It goes for things such as international standard book numbers, project numbers, employee numbers etc. just as well. Thing is, these are not really numbers, but keys.

    You can't add project 4 to project 6 and expect the outcome to be project 10.

    Incremental, numeric keys are a bad idea, anyway - they're a one-way trip to unscalable applications.
  • dkf 2012-07-18 09:29
    Pizza Boy:
    No one's mentioned house numbers.

    28A
    1/27
    U3/89
    u2/57A
    27-365
    3 at 14

    etc
    Why this insistence on houses having numbers at all? There are other ways of giving properties unique addresses.
  • Anon 2012-07-18 09:51
    dkf:
    Pizza Boy:
    No one's mentioned house numbers.

    28A
    1/27
    U3/89
    u2/57A
    27-365
    3 at 14

    etc
    Why this insistence on houses having numbers at all? There are other ways of giving properties unique addresses.


    Because house numbers are usually, sort of, ordered. If you're looking for 1038 whatsit road and you just passed 1012 and 1014 whatsit road, you know you are moving in (probably) the right direction.

    Unfortunately, it's often broken. My brother-in-law lives in a house with a number in the thousands. A couple of doors up the house numbers are in the tens of thousands. Very weird.
  • Aaron 2012-07-18 10:04
    This is actually a really good, interesting, and useful, WTF. Can we have more of these please?
  • KattMan 2012-07-18 10:17
    Aaron:
    This is actually a really good, interesting, and useful, WTF. Can we have more of these please?


    What, and break with tradition? Are you crazy?
  • Qŭert 2012-07-18 10:24
    foo:
    The correct answer would have been: "Because it's not a number, just a sequence of digits." Of course, the chances he'd understand the difference are small.


    Some human languages have this difference built-in, e.g. in Esperanto. "Because it is not a numbro, just a numero" (same as in French: nombre and numero)
  • Qŭert 2012-07-18 10:28
    numéro in French, I mean.
  • Zemm 2012-07-18 10:33
    Mick:

    There used to (I think they were taken out when the numbers went to 8 digits) be some localised area codes, like country SA was (085) instead of (08), but we digress


    Yes, in the 1990s we went from 50-odd area codes to four, with the first few digits of the local number being of geographic significance. Confusing area codes like Hobart (002 or +61 02) were changed (to (03) 62 or +61 3 62). Most numbers only had a digit inserted, or other minor change. (085) became "(08) 85" which would have since been "overlaid" with "(08) 75". The renumbering also opened up the entire 04 range for mobile phones: 0G and 1G systems had various 00x and 01x prefixes while GSM started in 1992 with 041x. And there's talk of opening up 05 for mobiles too. One hundred million numbers is not enough for our population of 20-odd million!

    Mick:

    There are then 8 digits for the number (which has increased from 6 or 7 about 20 years ago according to certain rules).
    Generally, the first 4 (formerly 2 or 3) of these 8 digits represent the Telephone Exchange, although in the digitalized world this is no longer guaranteed - but we digress.


    I know very few exchanges with a full 10,000 number block: they are mostly in blocks of 1000. You can actually download a list from the Telstra website! (Starting from page 256 of the PDF)

    Mick:

    It becomes more complex, however, when we realise the following numbers are possibly (but not necessarily) the same:
    +61 8 8123 1234
    011 61 8 8123 1234
    08 8123 1234
    8123 1234 (AFAIK, this could (at least theoretically) exist in an area code other than 08)


    Those are all the same number, depending on where you call it from. The "+" stands for the international prefix and 61 is the country code, which removes all ambiguity (AFAIK all mobile phones let you enter a literal "+", so you don't need to know about 00 or 011 or 0011 or 8~10 or whatever applies in the country you are in). 011 is "+" for the USA, Canada, et al. The form with the 08 is valid when calling from anywhere in Australia, and the form without it is valid when calling from anywhere in the 08 area code (WA, NT, SA, and parts of western NSW), like a normal open dialling plan. Indeed, if you called 81231234 from the rest of NSW it could be a valid number in Sydney.
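
    A minimal Python sketch of that idea: reduce whatever was dialled to one canonical "+" form before comparing. The function name, the table, and its two entries are assumptions for illustration, filled in only with prefixes mentioned in this thread; it is not a complete dialling-plan database and it ignores 10-digit dialling in North America.

```python
# Hypothetical per-country dialling context (illustrative, not exhaustive).
DIAL_CONTEXT = {
    "AU": {"country_code": "61", "intl_prefix": "0011", "trunk_prefix": "0"},
    "US": {"country_code": "1", "intl_prefix": "011", "trunk_prefix": "1"},
}


def to_canonical(dialled, caller_country, caller_area_code=""):
    """Return '+<country code><national number>' for a number as dialled."""
    ctx = DIAL_CONTEXT[caller_country]
    has_plus = dialled.strip().startswith("+")
    digits = "".join(ch for ch in dialled if ch.isdigit())
    if has_plus:
        # Already international: nothing to guess.
        return "+" + digits
    # Check the international prefix first, because "0011" also starts
    # with the trunk prefix "0".
    if digits.startswith(ctx["intl_prefix"]):
        return "+" + digits[len(ctx["intl_prefix"]):]
    if digits.startswith(ctx["trunk_prefix"]):
        return "+" + ctx["country_code"] + digits[len(ctx["trunk_prefix"]):]
    # A bare local number only makes sense relative to the caller's area code.
    return "+" + ctx["country_code"] + caller_area_code + digits
```

    Under those assumptions all four spellings above collapse to "+61881231234" when the bare local number is dialled from area code 8, while the same eight digits dialled from area code 2 become a different number.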

    If I were in charge of numbering I'd "close" the numbering plan, remove all "area codes" and give everyone nine digit numbers. Internationally very few numbers would change (only some within +611); local numbers would gain a 2, 3, 7 or 8; while inter-area calls and mobiles would drop their initial 0. And while I'm at that I'd remove the notion of "local call" vs "long distance": calling interstate is no different to calling next door - I already get this from my voice provider.

    Mick:

    Storing them is no issue as strings, but even as strings we have a problem of comparison - and a lot of takeaway stores seem to use the phone number to uniquely identify a customer....Of course, these issues are not necessarily insurmountable, but phone numbers are horribly complicated things, and are certainly not NUMBERS....


    I used to work at a pizza place that did that. All phone numbers were obviously stored as ints, but since the numbering plan is known and with nine significant digits it sort of made sense. I worked in the city which was "(07) 46xx xxxx". If you entered "21" as the phone number it became "07 4600 0021". This was handy because you could just enter the old-style six digit phone number and it would work itself out. If you entered a mobile number as nine digits beginning with 4 (or ten with 04) it would generally work, but they kept having to update the system to handle newer ranges. I remember when the 0438 range was released: entering 0438123456 would actually become "07 4612 3456". Whoops!

    However, at the newest store entering "555123" actually became "07 0055 5123" because numbers beginning with 45 were introduced, which did cause some problems when going between stores. (Another WTF was the older stores had membrane keyboards and the new one had standard 101 keyboards, so you needed to know the codes the membrane keyboards would have sent out, like NAZ for $5.95 voucher or GAR for garlic bread. But I'm digressing; the computer system was replaced with a touchscreen system after I quit in 2005. The old system was a single P166 for the entire store using Wyse terminals)
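
    A guess at the older stores' logic, sketched in Python. It reproduces the two examples above by keeping the last six digits and adding the store's default prefix. The constant and function names are made up, the real system's code is unknown, and the mobile ranges it special-cased are not modelled here.

```python
STORE_PREFIX = 746_000_000  # "(07) 46xx xxxx" as nine significant digits


def expand_old_store(entered):
    """Mimic the older stores: keep the last six digits, add the 07 46 prefix."""
    local_part = int(entered) % 1_000_000
    return STORE_PREFIX + local_part


def display(number):
    """Format nine significant digits as '0A BBBB CCCC'."""
    text = f"{number:09d}"
    return f"0{text[0]} {text[1:5]} {text[5:]}"


print(display(expand_old_store("21")))          # 07 4600 0021
print(display(expand_old_store("0438123456")))  # 07 4612 3456
```

    The second line is the 0438 "Whoops": the mobile prefix is simply thrown away, so two different customers can end up sharing one key.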
  • Zemm 2012-07-18 10:43
    Jay:
    foo:
    The US numbering system :) -- many countries' phone numbers start with an initial 0, so the previous mistake would show much earlier.


    That seems rather pointless. If all the numbers start with a zero, then the zero is superfluous and could be left off.


    Most countries with the initial 0 would be the equivalent of an American saying a phone number is "1212-555-1212". The 0 is the trunk prefix which is not used when dialling from International. But then the NANP is unique in that your trunk prefix of 1 is the same as your country code, so my example includes the trunk prefix; to be completely correct from an international perspective that number should be "+1.2125551212".
  • Zemm 2012-07-18 10:49
    Maurits:

    At the risk of being tautological, if you don't need to do math on the thing, it's not a number, it's just an identifier.

    I'll add ISBNs and PINs to KattMan's list, and subclass part numbers as UPCs and EANs.


    But EANs (of which ISBNs and UPC are subsets) have check digits that you add and modulo to ensure you have read the number correctly. That sounds like maths to me.
  • KattMan 2012-07-18 11:29
    Zemm:
    Maurits:

    At the risk of being tautological, if you don't need to do math on the thing, it's not a number, it's just an identifier.

    I'll add ISBNs and PINs to KattMan's list, and subclass part numbers as UPCs and EANs.


    But EANs (of which ISBNs and UPC are subsets) have check digits that you add and modulo to ensure you have read the number correctly. That sounds like maths to me.


    I wouldn't, because you do not do math with the entire set of digits, rather you pull it apart into its separate digits and then do math on that. It is a series of characters that fall within the range of 0-9 that you will split (string operation) then convert to numbers and perform a checksum calculation on.

    If you saved it as a number field (int, double, float, whatever) you would have to convert it to a string to split it, then convert the pieces back to ints and do the math. Cut out one of those operations, since you are not doing math on the original string of digits.

    Edit:
    I will further mention those UPCs or EANs that begin with 0: save one as a number and you won't have the right number of digits to perform your checksum on; save it as a string and you will preserve the true EAN or UPC value.
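
    To make that concrete, here is a small Python sketch of the check-digit test done on the string form. The function names are illustrative; the weighting (1, 3, 1, 3, ... over the first twelve digits) is the standard EAN-13 scheme, and a 12-digit UPC-A is treated as an EAN-13 with a leading 0.

```python
def ean13_check_digit(first12):
    """Compute the EAN-13 check digit from the first 12 digits (a string)."""
    total = 0
    for position, ch in enumerate(first12):
        weight = 1 if position % 2 == 0 else 3
        total += int(ch) * weight
    return (10 - total % 10) % 10


def is_valid_barcode(code):
    """Validate a 12-digit UPC-A or 13-digit EAN-13 given as a string."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (12, 13):
        return False
    padded = code.zfill(13)  # a UPC-A is an EAN-13 with a leading 0
    return ean13_check_digit(padded[:12]) == int(padded[12])


upc = "036000291452"
print(is_valid_barcode(upc))            # True
print(is_valid_barcode(str(int(upc))))  # False: the leading 0 is gone
```

    The second call shows the problem with number fields: after a round trip through an integer the code is only eleven digits long and can no longer be checked as it stands.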
  • Justin Reese 2012-07-18 11:41
    Hi all. I'm the dummy who both wrote the dumb code and submitted the story. Let me clear up a few misunderstandings.

    foo:
    Type-guessing -- honestly, in all my programming work I never needed to type-guess. Either the input fits the expected type, or it's invalid.


    Absolutely. This was my critical mistake. My reasons for doing this were unsound, but even unsounder was introducing it so deep in the library chain that I didn't even realize the phone number had been touched by it. Immediately after discovering the bug, I removed the type-guesser and apologized to mankind.

    Not immediately thinking of 2^31 when seeing a 10-digit number starting with 21... (OK, slight excuses for the Dallas coincidence...


    No, not a slight excuse, that's the whole reason. If this had been in, say, an "annual salary" field, it may have jumped out faster. But it was showing up in a phone number field, and being massaged by the template into XXX-XXX-XXXX format, and we're from Dallas (where 214 is everywhere).

    ..., but when a certain "magic" value keeps popping up, you better look what kind of value it actually is.)


    Which is what I did. Which is why I found the bug. Hey your hindsight is as good as mine!

    Testing on a different configuration than production


    Alex introduced an error when he massaged my article. The testing and production boxes are identical, and both exhibited the error. It was only on my dev box (64-bit) that things worked properly.

    Speaking of which: you guys know Alex/team re-write the submissions, right? The basic facts of the story, and some of my original language – glad you guys liked the "pardner" – are here, but much of the color and some of the details are his. I didn't actually blame a co-worker or boss, test and production were both 32-bit, and I didn't actually learn anything from this.

    Storing phone numbers (or zip codes etc.) as numbers


    The numbers were stored (and transmitted) as strings; it was my dumb type-guesser in the XML-reading library that broke things.

    validus:
    No, dude. What you should have learned from that is that you suck as a coder, and that your logic is not quite there yet.


    I'll not give you the former, but given that I submitted an embarrassing, self-incriminating story to a website of my peers, I'll give you the latter.
  • Justin Reese 2012-07-18 11:53
    I just realized it doesn't come through in the article, but the phone number was only being int'ed before being sent to the template for rendering. It was a safe and happy string in the database and in the XML.
  • Justin Reese 2012-07-18 11:57
    Also I wasn't doing it specifically to phone numbers, but to all strings that looked like numbers, which really was even stupider but it was a late night and I'd had lots of wine* and nobody checks my code before it goes into production.

    *It was probably 10am and I was fully alert.
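
    For anyone who wants to see the failure in miniature, here is a Python sketch of such a type-guesser. It assumes the 32-bit conversion clamps at 2^31-1 rather than wrapping, which is what the story's output suggests; every name here is invented for illustration and none of it is the original library code.

```python
INT32_MAX = 2**31 - 1  # 2147483647


def saturating_int32(text):
    """Parse digits like a 32-bit conversion that clamps instead of wrapping."""
    return min(int(text), INT32_MAX)


def guess_type(value):
    """The risky idea: silently turn anything that looks numeric into an int."""
    if value.isascii() and value.isdigit():
        return saturating_int32(value)
    return value


def format_phone(value):
    """Render ten digits as XXX-XXX-XXXX, the way a template might."""
    digits = str(value)
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


print(format_phone("9895551212"))              # 989-555-1212
print(format_phone(guess_type("9895551212")))  # 214-748-3647
print(format_phone(guess_type("2145551212")))  # 214-555-1212
```

    The last line is why the bug hides so well in Dallas: plenty of local numbers are below the limit and pass through untouched.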
  • Greeno 2012-07-18 12:20
    The main problem with storing a phone number as a number would be the loss of leading zeros.
  • Mozzis 2012-07-18 12:43
    If I am writing in C, it is at least partly because I don't want all of the hand-holding and its associated overhead that would be implied by a check for overflow every time I added two numbers together. And don't tell me it can be done in "only a couple of assembly instructions" - if I am using C, I expect *maximum performance*, and if results need checking, I will do it myself thank you very much.
  • Ol' Bob 2012-07-18 12:59
    From "The Tao of Software":

    Programming languages which require that developers know and care about the number of bits in a numeric variable are ill-suited for business use.
  • Ol' Bob 2012-07-18 13:03
    dork:
    now whos gonna call the max int32 phone number and try to convince them if they switch to at&t they will get full 64bits of service!


    ...and even now, deep in the bowels of the advertising department, some junior copywriter is running to his boss screaming "HEY! BOSS!! I GOT ME A IDEER!!!".
  • Gurth 2012-07-18 13:10
    Marc:
    You can't add project 4 to project 6 and expect the outcome to be project 10.

    What if project 4 is yellow paint, project 6 is blue paint and project 10 is green paint?
  • Jay 2012-07-18 13:22
    Norman Diamond:
    Jay:
    foo:
    The US numbering system :) -- many countries' phone numbers start with an initial 0, so the previous mistake would show much earlier.
    That seems rather pointless. If all the numbers start with a zero, then the zero is superfluous and could be left off.
    That's the reason why international calls omit that zero between the country code and the remainder of the city code. However, most calls are domestic and the zero is needed to indicate that a full number is being dialled instead of a local (intra-city) number.

    Also, historically, long distance calls even within a single country were considerably more expensive than local (intra-city) calls. Also phones were dialled using dials instead of push buttons. So if a phone were offered for public use by a corner store or other public facing proprietor, they would put a lock in the 9 position in the dial. Digits 1 to 9 could be dialled but 0 couldn't be dialled (except by historical phone phreaks of course). Phone numbers didn't have any 0 digits except the first digit of the city code. So local calls could be dialled but long distance calls could not be dialled.

    Eventually pay phones replaced public use of phones provided by corner stores etc. (and later pay phones became obsolete because everyone except me carries a cell phone but let's not get ahead of ourselves), local numbers started to include embedded 0's the same as other digits, and the costs of calls were computed by some newfangled kind of calculating machine so it was no longer necessary to lock out the riffraff from dialling an initial 0.


    I'm not sure if you're agreeing with the original poster or not.

    I think the real answer here is: No, there is no country where all phone numbers start with "0". There may be countries where all DOMESTIC numbers start with "0", or some other subset, so that the zero indicates which subset. But then of course the whole point of the zero is that it is not superfluous, it is telling you that a certain subset applies, and a number beginning with a non-zero indicates that a different subset applies.

    My post on this was not intended to make fun of the phone system, but of the poster's statement about the phone system.
  • Jay 2012-07-18 13:25
    Gurth:
    Jay:
    Back in the early days of computers (I don't think so much today), many people would type the letter oh instead of zero and the letter el instead of one, which of course would then screw up numeric inputs. Manuals would routinely lecture users not to do this. Just because it looks right on a typewriter, it doesn't work on the computer, etc.

    Many typewriters didn't have a 1 and/or a 0 on the keyboard, because a lowercase l and uppercase O would look much the same, and that way the keys normally used for the 1 and/or 0 could be used for other symbols for which there would otherwise be no room. Anybody used to a typewriter would naturally type the same way on a computer, I'd expect.


    Exactly. When people went from typewriters to computers, many just continued to type the way they were used to, and had trouble making the shift. Even if you told them a dozen times, it's difficult to break long-established habits.
  • Jay 2012-07-18 13:31
    foxyshadis:
    Pizza Boy:
    No one's mentioned house numbers.

    28A
    1/27
    U3/89
    u2/57A
    27-365
    3 at 14

    etc

    I pity the poor misguided son of a gun trying to micro-optimize a few bytes out by storing house numbers as integers, but has anyone actually done that? Everyone seems to associate the house number with the street name.


    Years ago I worked for a Political Action Committee. We got printouts from the Board of Elections that were called "walking lists", intended to be used by people who were going door-to-door passing out campaign literature. These were lists of registered voters, sorted by street, within street by odd vs even house number, and within odd/even by the house number. This was intended to give a list in the order that the houses would be on the street, first one side and then the other. (Do other countries use the same convention as the US? That on one side of the street the house numbers are all even and on the other side they are all odd?)

    This of course required that house number be stored as a separate field.

    They had an additional field for the extra things that sometimes get appended to a house number, like a letter or a fraction. ("101B", "214 1/2", etc)

    So yes, there are times when you want to store the house number separately and be able to manipulate it.
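
    A rough Python sketch of that ordering, with the house number and its extras held in separate fields. The record layout and names are invented for illustration, and putting the odd side first is an assumption; the lists only needed one side of the street to come before the other.

```python
from typing import NamedTuple


class Voter(NamedTuple):
    name: str
    street: str
    house_number: int   # the numeric part only
    house_suffix: str   # extras such as "B" or "1/2", may be empty


def walking_order(voters):
    """Sort by street, then odd side before even side, then house number."""
    return sorted(
        voters,
        key=lambda v: (
            v.street,
            v.house_number % 2 == 0,  # False (odd) sorts before True (even)
            v.house_number,
            v.house_suffix,
        ),
    )
```

    With the number stored as one string such as "101B", neither the odd/even split nor the numeric order would be available without parsing it back out first.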
  • Jay 2012-07-18 13:35
    Ol' Bob:
    From "The Tao of Software":

    Programming languages which require that developers know and care about the number of bits in a numeric variable are ill-suited for business use.


    Hmm, then to the best of my knowledge, there are no programming languages that are well-suited for business use. In what language can all numeric types store the number 5 billion? Maybe there are some out there, but I don't think I've ever used them.
  • Maude 2012-07-18 13:45
    Jay:
    Ol' Bob:
    From "The Tao of Software":

    Programming languages which require that developers know and care about the number of bits in a numeric variable are ill-suited for business use.


    Hmm, then to the best of my knowledge, there are no programming languages that are well-suited for business use. In what language can all numeric types store the number 5 billion? Maybe there are some out there, but I don't think I've ever used them.

    Python you moron. And I'd guess many, many more.
  • no laughing matter 2012-07-18 13:51
    Wait, Dallas?

    Like Dallas, TEXAS?

    Ladies and gentlemen, we finally found out the rule has an exception:

    INTs are not bigger in Texas!
  • no laughing matter 2012-07-18 13:56
    Michael J. Cohen:
    Sadly, Veekun's article is pretty wrong in and of itself.

    There are multiple in-depth analyses of what he got wrong floating around, but here's the HN thread that I most readily remember reading...

    http://news.ycombinator.com/item?id=4177516

    It's worth reading the replies, too!

    Then you will find out that Veekun is not so far off as you suggested.

    And you will notice that the author "wvenable" claims that Java supports pass-by-reference.
  • Jay 2012-07-18 14:01
    foo:
    Let's see, TRWTFs are:
    ...
    - Testing on a different configuration than production
    ...


    Well, having development and production machines with identical configurations eliminates some areas of problems -- like the one brought up here -- but is often impractical.

    Like, I probably want to have a debugger running on my development machine to help identify errors, but not on my production machine because it would hurt performance. There may be other applications running on the production server that I don't have on my development server. The production server typically is more powerful than the development server because it has to support more users. Etc.

    "Configuration" is a big thing. It is not at all easy to ensure that the configuration on two machines is absolutely identical, i.e. that they have exactly the same hardware; exactly the same version of the OS, including all patches; all the same software, including utilities, software we have purchased, and what we have developed ourselves, etc.

    And of course there are some ways in which the production server MUST be different from a development server. We probably don't want the development server talking to the production database server. If we have interfaces to external systems, we probably don't want our development server sending messages to external production systems. Hopefully such differences are just a matter of changing some URLs or the like in some configuration files, but not always.

    For that matter, if we are supporting multiple production servers, does each developer have to have a separate development computer for each production server that any app he works on is deployed on?

  • no laughing matter 2012-07-18 14:02
    Maude:
    Jay:
    Ol' Bob:
    From "The Tao of Software":

    Programming languages which require that developers know and care about the number of bits in a numeric variable are ill-suited for business use.


    Hmm, then to the best of my knowledge, there are no programming languages that are well-suited for business use. In what language can all numeric types store the number 5 billion? Maybe there are some out there, but I don't think I've ever used them.

    Python you moron. And I'd guess many, many more.


    Smalltalk, for example.
  • Jay 2012-07-18 14:13
    Maude:
    Jay:
    Ol' Bob:
    From "The Tao of Software":

    Programming languages which require that developers know and care about the number of bits in a numeric variable are ill-suited for business use.


    Hmm, then to the best of my knowledge, there are no programming languages that are well-suited for business use. In what language can all numeric types store the number 5 billion? Maybe there are some out there, but I don't think I've ever used them.

    Python you moron. And I'd guess many, many more.


    I said "In what language can all numeric types store the number 5 billion?"

    Quoting from http://www.devshed.com/c/a/Python/Data-Types-in-Python/, "A plain integer takes up just a few bytes of memory and its minimum and maximum values are dictated by machine architecture. sys.maxint is the largest positive plain integer available, while -sys.maxint-1 is the largest negative one. On 32-bit machines, sys.maxint is 2147483647." So my statement is not refuted.

    Even granting that Python lets you work with numbers of arbitrary size easily ... well, maybe the point of the original writer was that Python is a "good" language and languages that do not do this are "bad" languages.

    If so, the bad languages still include the languages used for almost all of the development going on in the world: C, C++, C#, Java, Javascript, VB, not to mention Fortran, COBOL, and assembler.
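
    For reference, the quoted limit describes Python 2's plain int, which was promoted to long automatically on overflow. Python 3 has a single int type with arbitrary precision and no sys.maxint at all. A quick check, assuming a Python 3 interpreter:

```python
import sys

five_billion = 5_000_000_000
int32_max = 2**31 - 1

# Python 3 ints grow as needed, so these are exact on 32- and 64-bit builds.
doubled = five_billion * 2
huge = 2**64 + 1

print(five_billion > int32_max)  # True
print(type(huge).__name__)       # int
print(sys.maxsize)               # platform-dependent; not a limit on int values
```

    None of this touches the broader point about fixed-width types elsewhere; it only shows that the integer type here does not care about the word size.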
  • SystemsReady 2012-07-18 15:38
    I dunno, personally I'd have all numbers as strings. You're not doing math with them, and having them as strings with the appropriate formatting in the database will keep you from having to write PHP code to format them every time you spit them out. :)
  • Agention 2012-07-18 17:10
    I only know 32767 and 65535 by heart. 31 & 32 bit numbers look so... big.
  • pjt33 2012-07-18 17:32
    Jay:
    Ol' Bob:
    From "The Tao of Software":

    Programming languages which require that developers know and care about the number of bits in a numeric variable are ill-suited for business use.


    Hmm, then to the best of my knowledge, there are no programming languages that are well-suited for business use. In what language can all numeric types store the number 5 billion? Maybe there are some out there, but I don't think I've ever used them.

    GolfScript. But it's most certainly not suited for business use.
  • Norman Diamond 2012-07-18 19:11
    Jay:
    Norman Diamond:
    Jay:
    foo:
    The US numbering system :) -- many countries' phone numbers start with an initial 0, so the previous mistake would show much earlier.
    That seems rather pointless. If all the numbers start with a zero, then the zero is superfluous and could be left off.
    That's the reason why international calls omit that zero between the country code and the remainder of the city code. However, most calls are domestic and the zero is needed to indicate that a full number is being dialled instead of a local (intra-city) number.

    Also, historically, long distance calls even within a single country were considerably more expensive than local (intra-city) calls. Also phones were dialled using dials instead of push buttons. So if a phone were offered for public use by a corner store or other public facing proprietor, they would put a lock in the 9 position in the dial. Digits 1 to 9 could be dialled but 0 couldn't be dialled (except by historical phone phreaks of course). Phone numbers didn't have any 0 digits except the first digit of the city code. So local calls could be dialled but long distance calls could not be dialled.

    Eventually pay phones replaced public use of phones provided by corner stores etc. (and later pay phones became obsolete because everyone except me carries a cell phone but let's not get ahead of ourselves), local numbers started to include embedded 0's the same as other digits, and the costs of calls were computed by some newfangled kind of calculating machine so it was no longer necessary to lock out the riffraff from dialling an initial 0.
    I'm not sure if you're agreeing with the original poster or not.
    Yes and no. The leading 0 is superfluous when the caller is in another country, so they dial the country code, omit the zero, and continue with the rest of the city code and local number. The leading 0 is not superfluous when the caller is in the same country.

    Jay:
    I think the real answer here is: No, there is no country where all phone numbers start with "0". There may be countries where all DOMESTIC numbers start with "0", or some other subset, so that the zero indicates which subset.
    Well, if you want to play word games by saying that a country's own phone numbers are a subset of this country's phone numbers, and foreign countries' phone numbers are a different subset of this country's phone numbers, so the first subset starts with 0 (domestic numbers including their city codes) while the second subset doesn't start with 0 (foreign countries' country codes) ... ok, you set up your subsets that way, but I'm not going to pretend to understand it.
  • Norman Diamond 2012-07-18 19:25
    Jay:
    foxyshadis:
    Pizza Boy:
    No one's mentioned house numbers.

    28A
    1/27
    U3/89
    u2/57A
    27-365
    3 at 14

    etc
    I pity the poor misguided son of a gun trying to micro-optimize a few bytes out by storing house numbers as integers, but has anyone actually done that? Everyone seems to associate the house number with the street name.
    [...]
    (Do other countries use the same convention as the US? That on one side of the street the house numbers are all even and on the other side they are all odd?)
    [...]
    They had an additional field for the extra things that sometimes get appended to a house number, like a letter or a fraction. ("101B", "214 1/2", etc)
    Thank you for noticing the fractions. Here are a few more examples of countries which don't use the same convention as the US.

    The world's largest country by population, and many other countries near it, have sequences of numbers as the previous poster pointed out:
    1/27
    27-365
    Subdistrict number within a named district, block number (a 2-dimensional area surrounded by roads), lot number (starting with 1 for some randomly chosen lot in the block, and incrementing as you walk around the block by following the roads either clockwise or counterclockwise). As areas get redeveloped, sometimes a merged block gets two or more block numbers, the lot numbers no longer make sense, etc. Also if a lot contains two or more buildings then you'd better include the building name in the address.

    The world's second-largest country by population:
    No house numbers. Building name, apartment number, floor number, name of street, and name of city.

    The US and some other countries:
    Rural route number. No house number, no street name.

    Some of my relatives:

    (The above line is null. No house number, no street name, no rural route number. Just the district name and village name. The postman has to know everybody in the district.)
  • Zemm 2012-07-18 19:36
    KattMan:
    Zemm:
    Maurits:

    At the risk of being tautological, if you don't need to do math on the thing, it's not a number, it's just an identifier.

    I'll add ISBNs and PINs to KattMan's list, and subclass part numbers as UPCs and EANs.


    But EANs (of which ISBNs and UPC are subsets) have check digits that you add and modulo to ensure you have read the number correctly. That sounds like maths to me.


    I wouldn't, because you do not do math with the entire set of digits, rather you pull it apart into its separate digits and then do math on that. It is a series of characters that fall within the range of 0-9 that you will split (string operation) then convert to numbers and perform a checksum calculation on.

    If you saved it as a number field (int, double, float, whatever) you would have to convert it to a string to split it, then convert the pieces back to ints and do the math. Cut out one of those operations, since you are not doing math on the original string of digits.

    Edit:
    I will further mention those UPCs or EANs that begin with 0: save one as a number and you won't have the right number of digits to perform your checksum on; save it as a string and you will preserve the true EAN or UPC value.


    You could still store it as a number after performing your string-based validation. You'd only need to store at most 12 digits, which fits in 40 bits, since the check digit is redundant for storage. If the stored value is >= 100000000000 then it's an EAN barcode; otherwise it's a UPC, and for output you'd just pad it out with zeros to 11 digits. After that, compute the check digit. (All scanners should be compatible with both. A UPC is an EAN beginning with 0.)

    All that is only if you care about saving the few bytes of storage. These days I'd call that field VARCHAR and not worry about it. I've seen barcodes much longer than 13 characters that you may or may not need to care about. Some barcodes can even store non-numerals so a string field would be the only solution in this case.
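
    A sketch of that numeric scheme in Python, assuming the code was already validated as a string before storing. The function names are illustrative, and the check digit uses the standard EAN-13 weighting.

```python
def check_digit(body12):
    """EAN-13 check digit for a 12-digit body string (weights 1,3,1,3,...)."""
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(body12))
    return (10 - total % 10) % 10


def to_stored(code):
    """Drop the check digit and keep the rest as an integer."""
    return int(code[:-1])


def from_stored(number):
    """Rebuild the printable barcode: pad, then re-add the check digit."""
    if number >= 100_000_000_000:
        body = str(number)        # 12 digits: EAN-13 body
    else:
        body = f"{number:011d}"   # UPC-A body, zero-padded to 11 digits
    return body + str(check_digit(body.zfill(12)))


print(from_stored(to_stored("036000291452")))   # 036000291452
print(from_stored(to_stored("4006381333931")))  # 4006381333931
print((10**12 - 1).bit_length())                # 40
```

    The round trip only works because the padding rule puts the leading zero back; the stored integer itself has no idea it was ever there.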
  • oheso 2012-07-19 04:51
    Daniel:
    a few that expect you to pick a US state when you don't live in one (although admittedly I have not seen that particular manifestation of idiocy for quite a while now).


    You're lucky. I still run into it all the time. That and its cousin, the required state that only accepts two characters.
  • SimonP 2012-07-19 08:37
    If you are coerced into doing this you should probably convert the int back into a phone number and check it's the same as the number you intended to store... at least this way when it goes wrong you'll know all about it!
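
    A rough Python sketch of that round-trip check. Python integers do not overflow, so saturating_int32 is a made-up stand-in that assumes the clamping behaviour described in the article:

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def saturating_int32(text: str) -> int:
    """Stand-in for a cast that clamps to the signed 32-bit range (assumed behaviour)."""
    return max(INT32_MIN, min(INT32_MAX, int(text)))


def store_phone_as_int(phone: str) -> int:
    """Convert to int, convert back, and refuse to continue if anything changed."""
    stored = saturating_int32(phone)
    if str(stored) != phone:
        raise ValueError(f"{phone!r} does not survive the round trip (got {stored})")
    return stored
```

    The same comparison also catches numbers with a leading zero, which an integer cannot preserve either.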
  • Danny 2012-07-19 11:22
    How about using a language that has working numbers?
  • curtmack 2012-07-19 11:42
    Ol' Bob:
    From "The Tao of Software":

    Programming languages which require that developers know and care about the number of bits in a numeric variable are ill-suited for business use.


    Programming languages which require that developers know and care about the syntax for creating, handling, and closing filehandles are ill-suited for business use. We should make a programming language that implicitly creates files on disk to store data. Variables are read from, and written to, files with the same name. Scope is handled with directories. Files are automatically closed when variables leave scope. Options set when declaring a variable could force file deletion when it leaves scope, or give it an explicit filename that differs from its variable name. In this way we wouldn't have to use filehandles at all, which helps prevent developer confusion. To write to a disk, simply create a variable with the filename and write to that variable. Reading from a disk is easier, just create a variable and it will be initialized with the current file contents.

    The best part is that we'd avoid using memory entirely. It would be so efficient!
  • caper 2012-07-19 13:34
    Unless they are doing something very dumb, PHP is not the problem:

    $ cat z.php
    #!/usr/local/bin/php
    <?php

    $s = "9895551212"; printout( $s );
    $i = 9895551212 ; printout( $i );

    function printout( $x ) { echo $x . "\n"; }
    ?>

    $ ./z.php
    9895551212
    9895551212

    I'm guessing that they also use MySQL and store the strings as 32 bit signed integers which is where things went wrong.
    http://dev.mysql.com/doc/refman/5.0/en/out-of-range-and-overflow.html
  • big picture thinker 2012-07-19 14:05
    Geoff:
    Not a PHP guy so I don't know, but if you try to convert a string to an int and it's too big, you get the largest possible signed integer value? It's not an exception or an overflow error?

    What the hell kind of sense does that make?


    Well, from an assembly point of view, there are no exceptions for overflows. Since arithmetic is modular, the CPU will just set the carry flag if the result exceeds the maximum value that can fit in the register. The result might not be what you expect, but there is no actual "error" or "exception".

    In some languages, when the carry flag is set, the designer of the language decided to have an exception thrown.

    In other languages, when the carry flag is set, the designer of the language decided to return INT_MAX.

    Saying one makes sense but not the other is kind of arbitrary and subjective.

    In one case, you trap the exception... in the other case, you can compare the result to INT_MAX... either way, you can effectively get the job done.
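
    To make the three policies concrete, here is a small Python sketch. Python's own integers never overflow, so the 32-bit range is simulated, and wrap, saturate and checked are illustrative names rather than anything a particular language provides:

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def wrap(value: int) -> int:
    """Modular behaviour: keep the low 32 bits, read as two's complement."""
    return (value + 2**31) % 2**32 - 2**31


def saturate(value: int) -> int:
    """Clamp to the nearest representable value."""
    return max(INT32_MIN, min(INT32_MAX, value))


def checked(value: int) -> int:
    """Raise instead of returning a changed value."""
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError(f"{value} does not fit in 32 bits")
    return value


phone = 9895551212
print(wrap(phone))      # 1305616620
print(saturate(phone))  # 2147483647
```

    With saturation, comparing against INT32_MAX only tells you something went wrong if INT32_MAX is never a legitimate value.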

    If anything, many people argue that exception-based programming is something that can lead you down a very dangerous path if you don't fully understand it and you don't know how to do it correctly.
  • no laughing matter 2012-07-19 16:34
    curtmack:

    Programming languages which require that developers know and care about the syntax for creating, handling, and closing filehandles are ill-suited for business use. We should make a programming language that implicitly creates files on disk to store data. Variables are read from, and written to, files with the same name. Scope is handled with directories. Files are automatically closed when variables leave scope. Options set when declaring a variable could force file deletion when it leaves scope, or give it an explicit filename that differs from its variable name. In this way we wouldn't have to use filehandles at all, which helps prevent developer confusion. To write to a disk, simply create a variable with the filename and write to that variable. Reading from a disk is easier, just create a variable and it will be initialized with the current file contents.
    Google "Smalltalk image-based persistence": it's a language that has solved both problems with a far simpler approach.

    curtmack:
    The best part is that we'd avoid using memory entirely. It would be so efficient!

    But what would you do on an embedded system without a filesystem?
  • no laughing matter 2012-07-19 16:42
    caper:
    Unless they are doing something very dumb, PHP is not the problem:
    ...
    I'm guessing ...
    Stop guessing, just read the article. Turns out guessing was exactly the problem:
    Justin Reese:

    4. As part of that process, I run XML attributes through a very basic type-guesser I wrote
    5. That type-guesser says "if you're numeric and not float-like, you must be an int, so… I HEREBY DUB THEE SIR (int) $value

    PHP is part of the problem, as it supports the culture of dynamic-typing / duck-typing.

    XML can be written including a schema, no need to "guess" types.
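
    If a guesser is unavoidable, a more cautious one only converts when nothing is lost. The following Python sketch is not the article's code: guess_type and its rules are assumptions made up for illustration.

```python
import re

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
_INT_RE = re.compile(r"0|-?[1-9][0-9]*")   # no leading zeros, no "-0"
_FLOAT_RE = re.compile(r"-?[0-9]+\.[0-9]+")


def guess_type(raw: str):
    """Return an int or float only when the conversion is lossless; else the string."""
    if _INT_RE.fullmatch(raw):
        number = int(raw)
        if INT32_MIN <= number <= INT32_MAX:
            return number
        return raw  # too big for the assumed 32-bit target: leave it alone
    if _FLOAT_RE.fullmatch(raw):
        return float(raw)
    return raw
```

    Even this still turns a phone number such as 2145551212 into an int, which is why declaring the type in a schema beats guessing.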
  • Shark8 2012-07-20 16:27
    Mozzis:
    If I am writing in C, it is at least partly because I don't want all of the hand-holding and its associated overhead that would be implied by a check for overflow every time I added two numbers together. And don't tell me it can be done in "only a couple of assembly instructions" - if I am using C, I expect *maximum performance*, and if results need checking, I will do it myself thank you very much.


    Bad idea; an optimizing compiler can often do a good deal better than an experienced low-level programmer. The reason is that while the programmer may know a few optimizing tricks, he is unlikely to know as many as are built into the optimizing compiler.

    See: http://www.seas.gwu.edu/~adagroup/sigada-website/lawlis.html
  • Shark8 2012-07-20 16:32
    Jay:
    Ol' Bob:
    From "The Tao of Software":

    Programming languages which require that developers know and care about the number of bits in a numeric variable are ill-suited for business use.


    Hmm, then to the best of my knowledge, there are no programming languages that are well-suited for business use. In what language can all numeric types store the number 5 billion? Maybe there are some out there, but I don't think I've ever used them.


    COBOL, LISP, and Ada can all handle business programming. The "weakest" of these according to your criterion is Ada, because the LRM allows a compiler to reject programs it can't compile; if the 5 billion is entered into an Ada program where the compiler is so restricted (think embedded 16-bit processors), that compiler is free to reject the program.
  • skipc 2012-07-21 21:23
    I was on a 5-hour call when I did support at Microsoft. I forget the initial issue, but the problem turned out to be that the guy had named his computer "007" and Windows was plugging that in as IP address 0.0.0.7.
  • Azarien 2012-07-23 06:55
    TRWTF is that the int was silently saturated at INT_MAX. If it became something like -1965481203, the reason would be obvious.
  • QJo 2012-07-25 08:22
    There was a situation where a change of infrastructure meant that files were suddenly automatically sorted on a print queue by order of size. As our reports were generated as front page, body, back page (in separate files - we had our reasons, mainly to do with mailing packages), suddenly finding that all the front pages came out first, then all the back pages, then all the report bodies, was a bit of a nasty shock.

    The solution involved adding a datestamp (in numeric format) to the filename (in itself not a trivial deal because we were limited to 39 characters, IIRC - yes, it was VMS). The bright spark who came up with the answer coded the date into a number (10 digits). (I can't remember the details of this, or why this solved the problem - it's a long time ago.)

    Later that month, on the 22nd to be precise, the report s/w fell over.
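
    The exact encoding is not given above, so the following Python sketch is pure guesswork: it assumes a DDMMYYhhmm stamp held in a signed 32-bit integer, which is one 10-digit layout that would break on the 22nd of any month. The year used in the example is arbitrary.

```python
from datetime import datetime

INT32_MAX = 2**31 - 1  # 2147483647


def stamp(moment: datetime) -> int:
    """Encode a timestamp as the 10-digit number DDMMYYhhmm (assumed layout)."""
    return int(moment.strftime("%d%m%y%H%M"))


def fits_int32(moment: datetime) -> bool:
    """True while the stamp still fits in a signed 32-bit integer."""
    return stamp(moment) <= INT32_MAX


print(fits_int32(datetime(1995, 6, 21, 23, 59)))  # True
print(fits_int32(datetime(1995, 6, 22, 0, 0)))    # False
```

    Under that assumption every stamp from day 22 onward starts with 22 or more and so exceeds 2147483647, while everything up to day 21 stays below it.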
  • Martin 2012-07-27 12:31
    Anon:
    Geoff:
    Not a PHP guy so I don't know, but if you try to convert a string to an int and it's too big, you get the largest possible signed integer value? It's not an exception or an overflow error?

    What the hell kind of sense does that make?


    Welcome to the wonderful world of PHP.

    In the world of PHP, if you ask it to do something stupid that it doesn't understand, it decides that it's better to just do something rather than give you one of those troublesome error messages that stops your program from running. Nobody likes those.

    Heh. I recently learned that an (int)floatingPointNumber does the same in C#. At least C# has the Convert class as an alternative, which will throw an exception on overflow.

    But for a supposedly modern language, C# has an amazing number of features with shoot-yourself-in-the-foot potential. Most of them copied from C.
  • padjo 2012-07-31 08:57
    Hah, I've done that exact thing, including the 64bit/32bit mismatch. 2147483647 is now etched in my memory for ever.
  • The poop of DOOM 2012-08-06 09:55
    Marc:
    Incremental, numeric keys are a bad idea, anyway - they're a one-way trip to unscalable applications.

    Not to mention porting that data to another database. E.g., say you have a kind of "registry" database and need to implement the same functionality in another project, so you say: "Yeah, I've got the code. I'll just add it to that other project" and do a basic export - import... only the keys already exist in the other database. Nice going, incremental, numeric keys!
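
    A toy Python illustration of that import collision, using dictionaries in place of real tables. The merge helper and the sample rows are invented for the example, and random UUIDs are just one common alternative to incremental keys:

```python
import uuid


def merge(target: dict, source: dict) -> list:
    """Copy rows from source into target; return the keys that collided."""
    collisions = [key for key in source if key in target]
    for key, row in source.items():
        if key not in target:
            target[key] = row
    return collisions


# Two databases that each started counting from 1.
incremental_a = {1: "alice", 2: "bob"}
incremental_b = {1: "carol", 2: "dave"}
print(merge(incremental_a, incremental_b))  # [1, 2]

# The same rows keyed by random UUIDs merge without clashing.
uuid_a = {uuid.uuid4(): name for name in ("alice", "bob")}
uuid_b = {uuid.uuid4(): name for name in ("carol", "dave")}
print(merge(uuid_a, uuid_b))  # []
```

    The trade-off is larger keys and no natural ordering, so this is a design choice rather than a free win.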
  • ccj 2013-02-28 13:31
    TRWTF is a 'type guesser' -- why not just alter the XML spec and/or DB schema such that each datum must be strongly typed? (Note: I work in a research-oriented digital signal processing shop and have never had 'business' people as clients... I'm guessing my suggestion might be non-trivial with some blighted non-1337 humans involved) Good catch all the same, kudos!

    captcha: "enim" for when you need an enum type immediately!