Admin
Frist of all, let me tell you that this tutorial is excellent. It shows real world JavaScript Best Practise: bad code.
Admin
Coincidentally, I've been reading Matt Parker's "Humble Pi" these last few days, which explores (among loads of other WTFisms) the mistake of treating data like this, whose characters happen to be numerical digits, as numbers. As it hit the UK bestseller list in fairly recent history, you sort of hope that at least some would-be British programmers may start to approach the question of postcodes and telephone numbers with a little more insight -- but then again they're probably more likely to have read the first chapter of Bentley's "Programming Pearls" and completely misunderstood the context.
Admin
Another thing to consider is that this would fail valid Zip+4 postal codes, which are generally preferred by the USPS these days. Besides, the regular expression for a Zip code is not that difficult:
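One possible expression, offered as a sketch rather than the commenter's own: the names `ZIP_PATTERN` and `isValidZip` are made up for illustration, and the check covers format only, not whether the code is actually assigned to anywhere.

```javascript
// Hypothetical helper, not taken from the original comment.
// Accepts "12345" or "12345-6789". The input stays a string throughout,
// so leading zeros survive.
const ZIP_PATTERN = /^\d{5}(-\d{4})?$/;

function isValidZip(input) {
  return ZIP_PATTERN.test(input.trim());
}

console.log(isValidZip("00501"));      // true
console.log(isValidZip("11742-1234")); // true
console.log(isValidZip("1937"));       // false
```

Because nothing is ever converted to a number, a Holtsville-style code such as 00501 passes without special handling.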
Admin
We can keep pounding this into beginning (and sometimes not even that beginning) programmers, and it will not take. Zips are not numbers. SSNs are not numbers. Telephone numbers are not numbers. Bank account numbers are not numbers; thank heavens IBAN has made that obvious at least in Europe, but even then people insist on treating the "number" part of an IBAN as a number. Hell, house numbers are not numbers - you may think you can add two to get to the next house over, padawan, but you're wrong, and when was the last time you multiplied two addresses together anyway? All those are identifiers, not numbers.
And yet every single hack programmer out there will store a house number as a number plus a suffix. All too often, we (read: I) don't even have a choice, because it's what government APIs expect of us. Beh. Still not a number.
Admin
You can find an updated version of "Please enter a valid zip, dude." on gist.github.com, named "best-practices.markdown". (gist.github.com/PwrSerge/ae45a407879e315240609bea15167d34)
Admin
I will add, this wasn't found on just a gist or a github tutorial, but was published by a large tutorial site that charges money for tutorials. I couldn't confirm that this content was still on that site (and based on domain names, that site seems to have been swallowed up by a DIFFERENT site).
Admin
Seen a lot of this with VBA (for Excel and Access in my case). The so-called "expert" forums are an absolute scream, so much so that if I'm googling for an answer to something I'll always go to UtterExcel or something to cheer myself up a bit.
I mean you might get nonsense on stackoverflow but you can be pretty sure someone will pop up and call it out and the thread will iterate towards a sensible answer. Those sites for beginner languages often end up like echo chambers of dumb-assery.
To that list above we can add VINs, credit card numbers, and IP addresses (well OK, you can do stuff with them, but don't store one as a pair of bloody floats; believe me, I've seen it done).
A similar disease is directly translating existing manual processes into digital ones without stopping to ask if that is actually the best way to get from input to desired output. Yes, we can design input screens that are visual metaphors of older approaches to reduce training, and create reports which match what the old handwritten table looked like, but that doesn't mean the data schema has to look like some sort of scan of the finance director's desk, with munged together version control to create new tables every time we change exchange rates or prices. In the example I'm thinking of, the design was sort of forced upon the developers by the client's board of directors, who between them could neither open a spreadsheet nor type a letter. This was only a decade ago and they still had PAs who would print stuff and bring it to them to be "edited" by red pencil and handed back ... oh, that included email. Yes, that system was a peach.
Admin
Hmm, do I infer that if I were to think of a well known tutorial site, while thinking of a name implying MANY sights, sorry, sites I just might find it there?
Would make sense, that one has very mixed content: some of it is excellent top drawer stuff, other courses, not so much.
Admin
Best JavaScript practice? Find a different language. Seriously, for all the cliches about C making it easy to shoot yourself in the foot, JavaScript comes with a whole battery of BFGs, loaded and ready to fire.
Especially for those new to the profession - learn good general programming practices on a more seasoned language first before diving into the hot mess that is JavaScript.
Admin
Wider use of IPv6(1) will put a stake through the heart of that particular vampire, I hope.
(1) I say that, but statistics from my UTM firewall say that 40% of all IP packets that pass through it are IPv6.
Admin
If you use the 16.16 format, you can get them into a single float. Far more efficient! (Or you can be boring like me and use a string and have everything actually work.)
Admin
TIL.... I live about 20-25 miles from Holtsville and never knew their zip code starts with an odd zero (I actually looked it up!). There are 3 zip codes: 00501, 00544, and 11742 (which is in line with the others in the area).
This code is a WTF for sure, but another WTF is that the original article makes no mention of the entire block of states in New England and also New Jersey, Puerto Rico, the Virgin Islands, and Army and Navy post offices in Europe -- all of which have zip codes that start with zero -- PLUS a few places where they needed an extra zip code and ran out of "local" numbers (Dillingham, Alaska 00001, for instance). Sorry, Remy, I know you gave one particular example, did not in any way suggest that Holtsville, NY was the only place this would be a problem, and I know it was not your intention to indicate there is only one exception. But you should have made mention of the magnitude of this bungle to demonstrate the colossal scope of its WTFedness.
Admin
Having lived through the pointless horrors of Y2K (solution: store the year in nibbles, not bytes. Even Cobol can do this), I suggest that the root problem, even today, is that people are obsessed with "efficiency."
If you think something is a number, you store it as a number. (I've even seen horrors like int16 in .Net, for which there is rarely an excuse.) Save a byte here or there (and the more brillant amongst us will claim knowledge of the memory model, in which a five digit zip string will actually waste three more bytes), and presto, you've got ...
... Well, you've got this horrible thing. I mean, you'd think numerate people could spot the fact that leading zeros are significant. Then again, how many numerate people rely on these stupid tutorials?
Admin
... This is a good example why the validation is NOT that simple. This approach fails for postal codes -- among others -- from the UK.
Admin
You do realise that an IPv4 address is actually a 32-bit number? I'm assuming that your comment was tongue in cheek.
You're right, if you are just reporting or storing that number a string is sufficient - as it is for most of the other examples people have mentioned.
However, if you need to use the bit mask to discover, for example, the first and last addresses in that allocation, to add the whole block to a firewall, you have to treat it as a 32-bit int.
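A rough sketch of that bit-mask arithmetic, assuming the allocation arrives in CIDR notation such as "192.168.1.77/24". The helper names `ipv4ToInt`, `intToIpv4` and `blockRange` are hypothetical.

```javascript
// Hypothetical helpers; names are illustrative only.
function ipv4ToInt(address) {
  const parts = address.split(".");
  const ok = parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!ok) {
    throw new RangeError(`Not a dotted-quad IPv4 address: ${address}`);
  }
  const [a, b, c, d] = parts.map(Number);
  // >>> 0 reinterprets the signed 32-bit result as unsigned.
  return ((a << 24) | (b << 16) | (c << 8) | d) >>> 0;
}

function intToIpv4(value) {
  return [24, 16, 8, 0].map((shift) => (value >>> shift) & 255).join(".");
}

function blockRange(cidr) {
  const [address, prefixText] = cidr.split("/");
  const prefix = Number(prefixText);
  if (!Number.isInteger(prefix) || prefix < 0 || prefix > 32) {
    throw new RangeError(`Bad prefix length in: ${cidr}`);
  }
  // Shift counts are taken modulo 32 in JavaScript, so /0 needs its own case.
  const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
  const first = (ipv4ToInt(address) & mask) >>> 0;
  const last = (first | ~mask) >>> 0;
  return { first: intToIpv4(first), last: intToIpv4(last) };
}

console.log(blockRange("192.168.1.77/24"));
// { first: '192.168.1.0', last: '192.168.1.255' }
```

The address is only treated as an integer for the duration of the calculation; what comes in and goes out is still the dotted string.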
Admin
Not to mention that USA != TheWorld. Canada, among others, includes alpha chars in their postal codes.
Admin
The other benefit of this code is that it allows you to enter your zip code in hexadecimal, which is a very important use case no one here seems to be thinking about!
Admin
What's your problem with British people? Turns out you get awful programmers in all parts of the world.
Admin
So all the WTF's resulting from innumerate programmers trying to do integer comparisons and even arithmetic on strings can be balanced against the WTF's from idiots handling identifiers as numbers?
Admin
Yup. I'd say it's about 50/50. (Don't try to parse that as the decimal representation of a fraction.)
Admin
I can agree with most of that, but not necessarily house numbers. It can be useful to store those as numbers, so that you can sort them correctly (unlike most of your other examples, they're variable-length), and not, say, have your driver discover after he's gone to 123 Any St. and 456 Any St. that he has to double back and go to no. 7.
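For what it's worth, the sorting concern does not require a numeric column. A minimal sketch, assuming an environment with `Intl.Collator` available (the name `sortHouseNumbers` is hypothetical), that keeps house numbers as strings and still orders them by value:

```javascript
// House numbers stay strings; the numeric option makes the collator
// compare runs of digits by value instead of character by character.
const houseNumberOrder = new Intl.Collator("en", { numeric: true });

function sortHouseNumbers(numbers) {
  // Copy first so the caller's array is left untouched.
  return [...numbers].sort(houseNumberOrder.compare);
}

console.log(sortHouseNumbers(["456", "123", "7", "12A", "12"]));
// [ '7', '12', '12A', '123', '456' ]
```

This only gives a sensible numeric ordering of the labels, suffixes included. Whether that ordering matches the physical layout of the street is a separate question.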
Admin
At least with telephone numbers I can see the point of storing them as numeric, to avoid the fact that everyone and their mother will format them differently. Of course that's not the proper way to do it, but that's generally how it works.
Absolutely agree with you though
Admin
i like coding and i try to learn javascript its not that bad "http://thedailywtf.com/" doing good buddy
Remy: Obvious spam link removed here, but I thought the bot promoting another crappy tutorial site was funny.
Admin
Agreed. For some applications, phone numbers are structured data, and not just any text. For example, their equality is not the same as string equality. You probably don't want to let a user register multiple accounts with the same phone number, just formatted differently.
Many of the examples above (credit card number, IP address) have a strictly enforced format and identity semantics. Strings are a far too general type, but I suppose it's not worth our time to find something more efficient. (Validation, and then storing as a string, tends to work)
Admin
Telephone numbers have meaning. If you want to record "the user entered this value in the UI; here, human, figure out what they meant", then just store the raw value as a string. You might be able to help the destination human by putting working validation on the input form.
If you want to compare and/or manipulate phone numbers, then you have to store them in a meaningful way. That includes things like acknowledging that there are multiple formats and there are non-canonical representations. Figure out how to canonicalize, compare, manipulate, serialize, and deserialize. After you've done all that, storage will be trivial.
Same goes for IP addresses. IPv4 addresses were intended to be stored as collections of bits, and doing bitwise operations on them is common, so that may be a storage optimization that makes sense. IPv6 usage is simpler; there's not a whole lot of reason to optimize storage unless you are writing software that runs at the network layer.
In the industry I currently work in, we handle a lot of social security numbers. We've made the mistake of not canonicalizing, so sometimes we store with dashes, sometimes we store without dashes. It creates a huge mess and was a mistake.
Previously, I worked in an industry where drug NDC numbers were used often. These are even worse because there are interesting rules where leading zeros can sometimes be dropped from some of the groups and dashes are often omitted in practice. So, "50242-0140-02" can be written as 5024214002 or 5024201402 (but never 502421402). Both are valid. There are various reasons why collisions don't happen, but comparison logic is a bit challenging.
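A sketch of one way to make those NDC variants comparable, under two loud assumptions: the hyphens are still present in the input, and the canonical layout is the 5-4-2 grouping seen in the "50242-0140-02" example. The function name `canonicalNdc` is hypothetical.

```javascript
// Hypothetical helper: left-pads each hyphen-separated group with zeros
// to the assumed 5-4-2 layout and returns the 11 digits as a string.
function canonicalNdc(hyphenated) {
  const widths = [5, 4, 2];
  const groups = hyphenated.trim().split("-");
  if (groups.length !== widths.length) {
    throw new Error(`Expected three groups in: ${hyphenated}`);
  }
  return groups
    .map((group, i) => {
      if (!/^\d+$/.test(group) || group.length > widths[i]) {
        throw new Error(`Bad group "${group}" in: ${hyphenated}`);
      }
      return group.padStart(widths[i], "0");
    })
    .join("");
}

console.log(canonicalNdc("50242-140-02"));  // "50242014002"
console.log(canonicalNdc("50242-0140-2"));  // "50242014002"
console.log(canonicalNdc("50242-0140-02")); // "50242014002"
```

Once the hyphens are gone from a ten-digit form, the grouping can no longer be recovered from the digits alone, which is the part that makes the comparison logic challenging.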
Admin
Well, that assumes a numeric sort on a sequence of address numbers on the same street maps to the geographic layout of that street. That's usually true, but it's definitely not always true!
Admin
"123 ½ Main Street." These are not common, but plenty of them exist.
Admin
The house#s bit is truth. On my street
25 21 431 429 427 2637 19
OK, technically 2637 is the corner lot and the # is on the OTHER street. But when they laid out the 2nd part of the street with smaller lot sizes, they needed more #s so they inserted the 400s between 21 and 19...
Admin
That's only true because when the NANP (North American Numbering Plan) was set up, it was one of the considerations.
However, the customs of our little village are not the customs of the world.
Admin
Indeed, they have many more than two problems, and should use a regexp to reduce that number to only two.
Admin
This has driven me crazy as an end-user, when I couldn't use a web site because my zip code begins with "0". It was obvious what was going on behind the scenes. Unbelievable.
Admin
Validation rules can be useful when you absolutely know what values are legal. They'll avoid a lot of fat-finger mistakes. Just be sure you absolutely know what's legal--and if you're dealing with anything potentially foreign that assumption is basically sure to fail.
Admin
The right thing to do with things like phone numbers is canonicalize them, e.g. remove all the separator characters. But it's still not appropriate to store the result as a number.
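As a minimal sketch of that idea (the names `canonicalPhone` and `samePhone` are hypothetical, and this deliberately ignores country-code inference, extensions and lettered numbers): strip the separators, keep the result as a string, and compare the canonical forms.

```javascript
// Hypothetical helper: keeps a leading "+" and the digits, drops everything else.
function canonicalPhone(raw) {
  const trimmed = raw.trim();
  const plus = trimmed.startsWith("+") ? "+" : "";
  return plus + trimmed.replace(/\D/g, "");
}

// Two inputs are the same number if their canonical strings match.
function samePhone(a, b) {
  return canonicalPhone(a) === canonicalPhone(b);
}

console.log(samePhone("(555) 010-0199", "555.010.0199")); // true
console.log(canonicalPhone("020 555 0100"));              // "0205550100"
console.log(Number(canonicalPhone("020 555 0100")));      // 205550100
```

The last two lines show why the canonical form still should not be a number: the leading zero is part of the identifier and disappears on conversion.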
Admin
That assumption will screw you up. In many places, numbers running sequentially up the street, odds one side, evens the other, is often true, but around here there are streets where the numbers run up one side and back down the other (so at the "beginning" of the street you are stood with no. 1 on, say, the left and the largest number on your right). I lived on a street where they'd mixed both, and because one side had apartment blocks where each apartment had an individual house number, and the other just 3 storey townhouses, the street numbers ran in evens from 2 to 700 on one side (with around 50 or so having their entrance around the corner on another street, and all of them having vehicular access on other streets), and 1 to 121 on the other in the opposing direction. I have also seen streets where the numbering is completely illogical, possibly the order the houses were built or first occupied, or maybe it got screwed up as the street was realigned or properties rebuilt. And that's before we get into little networks of roads with the same "street" name, streets where the name changes midway along and then changes back, with or without the numbering resetting, and countless other weird iterations.
In short, use geolocation; to do otherwise is to fall into the same sort of trap the original WTF describes.
Admin
"I can agree with most of that, but not necessarily house numbers. It can be useful to store those as numbers, so that you can sort them correctly"
You're assuming, incorrectly, that all houses even have numbers.
Admin
Yeah, but then again, you can't check for everything. Our in-house solution is to check the validity of Dutch post codes (99% of our customers, and 90% of theirs, are in the Netherlands) and not bother if the user chooses another country. That way, we can at least use one simple regex and have still most cases covered. There's no telling what Somewhereelsania might start using in five years' time, anyway.
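A sketch of that kind of policy. The regex here is an assumption on my part (four digits not starting with zero, an optional space, two letters), not the commenter's actual in-house expression, and `isAcceptablePostcode` is a hypothetical name.

```javascript
// Assumed Dutch format: 4 digits (first one non-zero), optional space, 2 letters.
const DUTCH_POSTCODE = /^[1-9]\d{3} ?[A-Za-z]{2}$/;

function isAcceptablePostcode(countryCode, postcode) {
  if (countryCode !== "NL") {
    // Other countries: accept whatever was entered rather than guess at rules.
    return true;
  }
  return DUTCH_POSTCODE.test(postcode.trim());
}

console.log(isAcceptablePostcode("NL", "1012 AB")); // true
console.log(isAcceptablePostcode("NL", "12345"));   // false
console.log(isAcceptablePostcode("US", "00501"));   // true
```

The design choice is to be strict only where the rules are actually known, and to stay out of the way everywhere else.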
Admin
Well, except that in my street that wouldn't work to begin with... the houses are numbered odd on one side and even on the other, but they're not evenly spread. So, say, 2-8 have no opposite neighbours, 3-9 are opposite 12-18, 25-37 have no opposites again, and guess where number 24 is?
Admin
And? Let them. What are you going to do with it except display it back to the user? And then there are those "call 0900-ADVERTISEMENT" numbers you get in the USA.
Admin
Yes using length validation against an integer zip code is a bad idea, but the problem isn't the integer zip code, rather it's the validation. The correct validation would be 0 <= n <= 99999.
Admin
You should never assume anything about house numbers in addresses.
In the UK and Ireland, it's relatively common for house numbers to be 1 to A on one side of the street, and A+1 to B on the other side of the street. Finding the right place can be tough.
It's also very possible for houses to have a name, but no street number. For example, this is a valid Irish address: Fairyhouse, Lock Rd, Lucan, Co. Dublin. How many address validators will barf on that address?
Admin
Ahh for the days when Holtsville was 11742 (Lived there many a year)
Admin
There is an argument to be made that the assumption of a number being a value is baked into programming languages. Zeroes at the beginning of a zip code mean nothing when storing it as an int. Really, to validate a zip code is to check the length and that each character is a digit; it doesn't matter how it is stored in memory. USA != World, but USA == allICareAbout ; )
Admin
1983 called and mentioned that zip+4 is now a thing. Your marketing department called and said they can halve their mailing costs if we collect them.
Admin
Just to make this even better: parseInt() does not always default to radix 10 in JavaScript. Before ES5 it would default to radix 8 for anything starting with a 0, making the example broken codes even more broken.
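A few lines that illustrate the point in a current engine, where leading-zero strings parse as decimal but a "0x" prefix still switches `parseInt` to hexadecimal when no radix is given. The helper `roundTripsThroughParseInt` is a hypothetical name, used only to show what is lost.

```javascript
// Hypothetical helper: true only if nothing was lost by going through parseInt.
function roundTripsThroughParseInt(input) {
  return String(parseInt(input, 10)) === input;
}

console.log(parseInt("00501", 10));              // 501
console.log(roundTripsThroughParseInt("00501")); // false, the zeros are gone
console.log(roundTripsThroughParseInt("11742")); // true
console.log(parseInt("0x1F"));                   // 31, hex prefix honored
console.log(parseInt("0x1F", 10));               // 0, parsing stops at the "x"
console.log(parseInt("12345abc", 10));           // 12345, trailing junk ignored
```

Passing the radix explicitly removes the octal and hex surprises, but it does nothing about the lost zeros or the silently ignored trailing characters.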
Admin
The best practice for Javascript is, "Don't use Javascript".
Admin
Wait, I don't understand. "Phone number" and "social security number" have the word "number" right in the name! How can they not be numbers? Next you'll be telling me that a law proposed by a politician that is called the "Freedom and Truth Protection Act" might not actually be about protecting freedom and truth but just be about giving tax money to his friends and campaign supporters! Or something crazy like that.
Admin
"The correct validation would be 0 <= n <= 99999." I humbly disagree. If someone enters "1937" for his zip code, that should be an error. A zip code is supposed to be 5 characters. MAYBE he means 01937. But more likely he made a typo and left off a digit. Americans understand that a zip code is 5 digits and that leading zeros are not superfluous.
Admin
Yeah, I'm also NL and gave a lengthy explanation but it got inexplicably "Held for moderation", Jammer.
My favourite is streets where the street name changes halfway through (for a square or something) and then changes back with (of course) completely illogical numbering throughout.
But it isn't just NL, every country does stuff like that.