The Daily WTF: Curious Perversions in Information Technology

2011-03-28

This has Nagesh written all over it.

2011-03-28

Just wow...

2011-03-28

данийл:
This has Nagesh written all over it.

Nagesh, take a bow... You are now officially a meme (to the 6 people that frequent this site).

2011-03-28

So, what happens when phone numbers go hex?

2011-03-28

Idea Extend string class to implement "replaceAll" function for language which may or may not have it already.

Quicksilver:
1. Idea number = number.replaceAll("\\D",""); no, too short and who knows \D anyway?

2. Idea number = number.replaceAll("[^\\d]",""); yes, but what if the reader is not that fluent in regexp, got to remove these predefined groups

3. Idea number = number.replaceAll("[^0123456789]",""); OK, but what if the reader does not know about negation...

4. Idea see article

perfect!!!
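
For comparison, the first and third patterns in the list above can be tried side by side. This is only a minimal sketch in Python (an assumption; the quoted `replaceAll` calls look like Java), and the function names are made up for illustration. One caveat: in Python 3, `\d` in a `str` pattern also matches non-ASCII decimal digits unless `re.ASCII` is set, so `\D` and `[^0123456789]` are not quite interchangeable there.

```python
import re


def strip_with_backslash_d(number: str) -> str:
    # \D matches anything that is not a decimal digit.
    # In Python 3 this is Unicode-aware for str patterns.
    return re.sub(r"\D", "", number)


def strip_with_negated_class(number: str) -> str:
    # Explicit ASCII whitelist: everything except 0-9 is removed.
    return re.sub(r"[^0123456789]", "", number)


sample = "(123) 456-7890"
print(strip_with_backslash_d(sample))    # 1234567890
print(strip_with_negated_class(sample))  # 1234567890

# Arabic-Indic digits one, two, three followed by ASCII digits.
mixed = "\u0661\u0662\u0663-456"
print(strip_with_negated_class(mixed))   # 456
```

For plain ASCII input the two give the same result; they only diverge when the input contains digits from other scripts.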

2011-03-28

CodeMonkey:
So, what happens when phone numbers go hex?

You rewrite the code (even this shitty code) to support it. You don't code to prospective realities, you code to the current reality.

Although I imagine that would be a great WTF. "Some junior decided that phone numbers might be in hex someday, so he wrote a bunch of code to support it. He has another few months of beatings to endure."

2011-03-28

Oh. Oh God.

...I thought Python was relatively free of WTF programmers (myself excluded, of course).

2011-03-28

I'd give the guy some credit if he even put it in a "reusable" method for each field...what am i thinking???

Matt Westwood · 2011-03-28

Look hang on, I'm not so sure this will work. I'm not sure about the "replace" function. What if it only replaces the first instance of the character? I'm worried this isn't going to catch every instance of iffy characters (and I haven't got time to read the documentation), so I reckon:

homephone[1] = homephone[1].replace(" ", "").replace("-", "").replace("(", "") : :

homephone[2] = homephone[2].replace(" ", "").replace("-", "").replace("(", "") : :

... and so on ought to do it. Um, how long can this string be, by the way ...?

2011-03-28

Please help me, do you have a unicode version?

2011-03-28

BentFranklin:
programmer = programmer.replace("right away")

LIKE IT!

On a side note: We need a "I like it" button or other way to rate comments

hoodaticus · 2011-03-28

Macro King:
frits:
Even VB can do it better:
        For i = 0 To homephone.Length - 1
You forgot OnError, ResumeNext. And you're going to need that because VB strings are 1 based!

The real WTF etc etc...

Not in .NET AFAIK. But that would be TRWTF if true.

Pr0gramm3r · 2011-03-28

This is only a WTF because the guy didn't use a for loop.

char badCharacters[100];

for(int i = 0; i < 100; i++){
  word = word.replace(badCharacters[i],'');
}

If you saw this code you would definitely think it was bad, but not worthy of a daily wtf. Can we get real articles please?

2011-03-28

Anonymous:
Computers are a great way to simplify tedious manual tasks. Of course, some developers don't fully understand this concept.

of course he understood:

for i in homephone workphone cellphone; do echo ' -()[].`^!@#$%&*+=/<>:;~ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'|sed 's/\(...\)/\1\n/g'|sed 's/\(.\)/.replace("\1", "")/g'|sed "s/^/$i = $i/";echo;done

they even didn’t remove

'"\

to keep the script simple.

2011-03-28

boog:
@Deprecated:
employees.replace("programmer", "monkey");
I'm guessing that's how they ended up with this code in the first place.

No - a monkey wouldn't have the attention span to come up with something this long and repetitive. Besides, they tend to throw the crap at their peers directly, rather than leaving it for the next generation of monkeys to find in a few years time...

2011-03-28

Anonymous:
Mark:
Dude. Google "regular expressions". I swear it will blow your effing mind.
I'm worried about the amount of people who are suggesting regular expressions for such a trivial problem. Regular expressions have their place but they are totally the wrong tool for a simple non-numeric replace like this. Fun fact: in the .NET framework the regular expression handling classes have the highest cyclomatic complexity of any classes in the entire framework (last time I checked, circa 3.0). Yes, I know this code isn't .NET but it highlights an inescapable fact about regexes - they are a seriously expensive operation, in any language. Don't abuse them for silly little jobs like this one.

Depending on the language, the replace() code may actually be using regexes internally. Java's String.replaceAll() method takes a regex as the first parameter, so it's not inconceivable that this unspecified language is already using a spectacularly inefficient number of regular expressions.

2011-03-28

Nice use of fluent programming style.

2011-03-28

Skilldrick:
Silverwizard:
This is MASSIVE duplicates code, you think he'd just do: homephone = makeValid(homephone); workphone = makeValid(workphone); cellphone = makeValid(cellphone);
Yeesh! Everyone knows that methods are useful for saving time!

OMG You're right, I hadn't realised that the duplicate code could be extracted into a separate function - that's definitely the biggest WTF here. [/sarcasm]

Sure, like an outer join to 3 different databases; one for each type of phone number. This allows for flexibility if home/work/cell phone numbers start to have different characteristics. And, if different types of phone numbers are required in the future, just copy a database, some quick editing, and add another join.

2011-03-28

TRWTF: You only wrote six sentences and still managed to make a typo on "regular" and misspell "weird".

2011-03-28

Kevin:
What is particularly brilliant about this is that, rather than call a single function for the replaces, it is repeated three times in so homophone, workphone and cellphone can be customized individually.

Homophone? What like there, their and they're?

2011-03-28

Sobriquet:

eMBee:

in pike i'd do this:

string phonenumber = "(123)456-789/0";
string clean_phonenumber = `-(phonenumber, @Array.uniq(sort( `-(phonenumber, @"1234567890"/"")/"" )));

or

multiset allowed = mkmultiset("1234567890"/"");
Array.filter(phonenumber/"", lambda(string digit){ return allowed[digit]; }) * "";

In Perl it's

tr/0-9//cd

and I'm done.

Yeah, and then a non-perl developer like me comes along later forced to make sense of your perl script and sees "tr/0-9//cd" and says WTF? At least I could look at today's TDWTF code and get an idea of what it does.

2011-03-28

CodeMonkey:
So, what happens when phone numbers go hex?

comment down the bottom

and we will add a nice story for akismet, too. I wonder if it might be because there are too many commas vs characters in the post. How far would I need to go to balance this out? So far, I have tried to post 3 times, but clearly I'm not fooling akismet who knows better. Maybe I will simply write some big long sentences without any punctuation to see whether we can get something through we have already tried to post five times and each time akismet was not happy still I suppose it is far far better to encourage people to go to the effort of writing soliloquies to avoid the spambot rather than risk a couple of ads for nike shoes. Maybe we will try something different I wonder if akismet likes poetry: There was movement at the station for the word had got around that the colt from old regret had got away

Or maybe It was somewhere in the country in a land of rock and scrub that they formed an institution called the Geebung Polo Club

akismet:
Simple, comment out the replaces for A,B,C,D,E,F, a,b,c,d,e,f, x,X

2011-03-28

SomeYoungGuy:
Sobriquet:
eMBee:
in pike i'd do this:
string phonenumber = "(123)456-789/0";
string clean_phonenumber = `-(phonenumber, @Array.uniq(sort( `-(phonenumber, @"1234567890"/"")/"" )));
or
multiset allowed = mkmultiset("1234567890"/"");
Array.filter(phonenumber/"", lambda(string digit){ return allowed[digit]; }) * "";
In Perl it's
tr/0-9//cd
and I'm done.
Yeah, and then a non-perl developer like me comes along later forced to make sense of your perl script and sees "tr/0-9//cd" and says WTF? At least I could look at today's TDWTF code and get an idea of what it does.

Most of the time perl regular expressions look like total garbage and cannot be deciphered. But come on, you couldn't spend all of 2 minutes to look this one up online?
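
For anyone who doesn't read Perl: in `tr/0-9//cd`, the `c` modifier complements the search list and `d` deletes the matched characters, so every character that is not 0-9 is removed. A rough Python counterpart might look like the sketch below; the function name `delete_complement` is hypothetical and only there for illustration.

```python
def delete_complement(text: str, keep: str = "0123456789") -> str:
    # Map every character that is NOT in `keep` to None;
    # str.translate treats None as "delete this character".
    table = {ord(ch): None for ch in set(text) - set(keep)}
    return text.translate(table)


print(delete_complement("(123)456-789/0"))  # 1234567890
```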

2011-03-28

Why are we always so obsessed with Data formats? It seems to go against internationalization and other ideas. I frequently get sick of (online) forms that don't let me proceed without adding some mundane detail that is not relevant to them. Further, by insisting that I enter 'realistic' values in the fields they feel are important, they are making it a lot harder for themselves (somewhere down the track) to verify whether my submission is genuine.

If someone doesn't give a valid looking email address, and you feel that a valid address is important, you can ignore their submission on the backend without bothering them on the front end. There is a principle in Data Matching that raw data usually provides more value and more identification detail than any efforts to scrub, enhance or 'fix' the data.
Telephone numbers come in many formats around the world, and given the way they are structured it is quite possible that the spacing or other punctuation may give some identification as to where it comes from (whereas stripping it might make two totally different numbers from two totally different locales appear similar).

Observe some numbers from different parts of the world (they have been changed from existing numbers I found, and it's possible that some of them are no longer valid after my changes, but hopefully they illustrate the point):

US: +1 847 576 6246
Czech Republic: +52 (55) 5243-6123
Indonesia: (62-21) 571-8855
UK: +971 4 33 15 476
Italy: +39 02545371
Australia: +61 3 9856 7698

Each of them has a different spacing, and some have brackets and dashes, but the format is relevant. What if we remove it?

18475766246
525552436123
62215718855
97143315476
3902545371
61398567698

Of course, the area codes should dictate that we never get duplicates, but what if people don't use full area codes? How many people use full (national and) international prefixes when submitting phone numbers online?

Of course, if the system in question was insignificant and was only for Fr Brian to keep track of parishioners, there's no issue, but if this is on a system that has global reach and might (whether today or in the future) need to be able to store multiple locales' phone numbers, then this solution is naive at best. It might be desirable to validate that phone numbers are in a prescribed format (someone bodgey from Russia {for example} might not know conventions used in the US to write phone numbers) but even this shouldn't necessarily be articulated to the end user (but rather dropped silently into a 'look at later' file).

There is much more that could be said, but ultimately, the real WTF is people who insist on trying to validate or alter potentially global data. Forcing someone to observe a specific format (whether in numbers or addresses or something else) will increase the number of bad results you get because users will try to beat the system with another value. Many (American) sites have options for people from outside the US to enter data, but still have requirements on either state or zipcode fields. Sites like this will get many people from elsewhere enter anything the validator accepts to get over this requirement. The only effect is that the data is suddenly compromised.

But who cares, right?
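
The stripped values listed in this comment can be reproduced in a few lines of Python. The sketch below also keeps the raw input next to the digits-only form, which is one possible way (an assumption, not something the article's code does) to avoid throwing the formatting away. The names `digits_only` and `records` are made up for illustration.

```python
samples = [
    "+1 847 576 6246",
    "+52 (55) 5243-6123",
    "(62-21) 571-8855",
    "+971 4 33 15 476",
    "+39 02545371",
    "+61 3 9856 7698",
]


def digits_only(raw: str) -> str:
    # Keep ASCII digits, drop spaces, brackets, dashes and the plus sign.
    return "".join(ch for ch in raw if ch in "0123456789")


# Store the raw value alongside the normalized one instead of replacing it.
records = [{"raw": raw, "digits": digits_only(raw)} for raw in samples]

for record in records:
    print(f"{record['raw']!r:24} -> {record['digits']}")
```

The digits-only column is handy for matching, while the raw column still shows what the user actually typed.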

2011-03-29

It's good to know that ",__?" is a valid phone-number. Too bad I can't try dialing it - all my phones, and Skype too, are lacking the necessary buttons, unfortunately.

2011-03-29

Both the person who wrote the requirements and the person who implemented them have no understanding of telephone numbers.

It amazes me how many web sites out there validate phone numbers as "Long Integer", or worse, "Long Integer starting with 0". Clueless.

And without the brackets and plus symbol, the user does not know what to dial when using the stored number, which defeats the purpose of storing it in the first place.

2011-03-29

You could kill that with a backslash

2011-03-29

I'd prefer this, regex is for pussy's:

String newPhoneNr = "";
for(int i = 0; i < phoneNr.Length; i++)
    if(phoneNr[i] == '0' || phoneNr[i] == '1' || phoneNr[i] == '2' || phoneNr[i] == '3' || phoneNr[i] == '4'
       || phoneNr[i] == '5' || phoneNr[i] == '6' || phoneNr[i] == '7' || phoneNr[i] == '8' || phoneNr[i] == '9')
        newPhoneNr += phoneNr[i];

2011-03-29

Your Name:
You may want to do other validation for the phone number as well. For the US, for instance, you'll want to make sure it's ten digits. Also, some checking of the actual digits[...]

Brilliant, you found a way to make the code worse. Now you don't just have a cumbersome way to filter out (some) characters from the phone field, you've also arbitrarily restricted yourself to US numbers. Since obviously you'll never have a customer who isn't a US citizen or a US citizen who happens to live abroad...

2011-03-29

What if the user starts inputting Chinese characters? This implementation is a good start, just incomplete; you need to extend this to exclude the whole non-numeric UTF-8 + the Klingon alphabet.

Also insert these characters on a reference data table so that you can adapt the exclusion list without code changes, for complete future proofness and flexibility.

These sloppy developers, always taking the easy way out !

2011-03-29

Anonymous:
Note also that the code calls replace so many time that regex performance issues pale in comparison.

The code quoted in the article uses an unnecessary number of replaces but if you are trying to suggest that regexes are the most efficient way of performing this task then you are sorely mistaken. Ever used a profiler before? You should give it a shot instead of making totally wild guesses about performance (yes, I profile my code if I'm concerned about performance - real performance metrics are far better than guesswork).

blarg:
Are you one of those guys who spends 3 days finding a 'clever' workaround to save 2 clock cycles for a task which is called once an hour and has a disk/network bottleneck anyway?

No, I'm the sort of person who uses the right tool for the job.

boog:
Is execution time the only value by which you measure one tool's "total wrongness" vs. another?

No, but it's certainly high on my list of criteria. More so than complexity to author - within reason, of course (you only author once but you execute many times, so if I can improve execution speed by spending an extra 20 minutes thinking about the most efficient way to write the code, I see that as a success).

I think a lot of these comments nicely highlight a common problem I see when people first learn regexes - they think they're the holy grail to every string manipulation problem out there. "But I can do this with one simple regex"! Simple for you but wasteful and labour intensive for your processor. This isn't about premature optimisation, it is about avoiding hugely costly tasks for simple and repetitive jobs. If you don't understand that, you will never be a good coder.

"Some people, when confronted with a problem, think 'I know, I'll use regular expressions'..."

You know the rest.
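
The performance claims traded in this exchange can be measured rather than guessed. Below is a minimal timing sketch in Python using the standard `timeit` module; the three helper functions and the `BLACKLIST` string are hypothetical stand-ins, not the article's code, and the numbers will vary by machine and interpreter, so no expected timings are shown.

```python
import re
import timeit

DIGITS = "0123456789"
NON_DIGIT = re.compile(r"[^0-9]")
BLACKLIST = " -()+."  # hypothetical: a short list of characters to strip


def with_regex(s: str) -> str:
    # One precompiled regex pass.
    return NON_DIGIT.sub("", s)


def with_filter(s: str) -> str:
    # Walk the string and keep only whitelisted characters.
    return "".join(ch for ch in s if ch in DIGITS)


def with_chained_replace(s: str) -> str:
    # One str.replace call per blacklisted character.
    for ch in BLACKLIST:
        s = s.replace(ch, "")
    return s


sample = "+1 (847) 576-6246"

for fn in (with_regex, with_filter, with_chained_replace):
    seconds = timeit.timeit(lambda: fn(sample), number=100_000)
    print(f"{fn.__name__}: {seconds:.3f}s for 100,000 calls")
```

Whichever approach wins on a given setup, the absolute difference per call is what matters when deciding whether it is worth caring about.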

2011-03-29

Simon:
Depending on the language, the replace() code may actually be using regexes internally. Java's String.replaceAll() method takes a regex as the first parameter, so it's not inconceivable that this unspecified language is already using a spectacularly inefficient number of regular expressions.

Sorry Simon, I left you out in my previous reply. What you're basically saying is true, it's not inconceivable that <some language> uses regex underneath replace. But I would expect this to be extremely rare for simple pattern matching replacements like string.Replace(thisString, withThisString) because regex is actually quite ill-suited to this task (as well as being very expensive). There are simpler ways of doing it, plain and simple, so to underpin this type of replace with regex is just foolhardy (or lazy, and I'll concede that there are some lazy language designers out there).

2011-03-29

All that and he still got it massively wrong.

Replacing + with an empty string is going to screw it up for any sort of international phonenumber of the +61234567890 format.
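
One way to avoid that, sketched in Python under the assumption that only a leading `+` carries meaning: remember whether the input started with a plus, strip everything that is not a digit, then put the plus back. The function name `normalize_keep_plus` is made up for this example.

```python
def normalize_keep_plus(raw: str) -> str:
    stripped = raw.strip()
    # A leading "+" marks an international prefix, so it is preserved.
    prefix = "+" if stripped.startswith("+") else ""
    digits = "".join(ch for ch in stripped if ch in "0123456789")
    return prefix + digits


print(normalize_keep_plus("+61234567890"))      # +61234567890
print(normalize_keep_plus(" +61 2 3456 7890"))  # +61234567890
print(normalize_keep_plus("(020) 1234-5678"))   # 02012345678
```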

2011-03-29

Please do not attempt to clean up any phone numbers! Either they match your pattern, or return them to the user for review, as there's probably a typo ... p.e. (JS):

var matches = /^\s*(\+[1-9])?([0-9\ \-]+)(\s*\/\s*[0-9]+)?\s*$/.exec(phoneNumber);
if (matches) {
  // return int. prefix, phone number, extension (unmatched optional groups are undefined)
  return (matches[1] || '') + matches[2] + (matches[3] || '');
}
// no match, return empty string for invalid, do not attempt to clean up
return '';
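
The same validate-or-reject idea can be sketched in Python. This is an illustration only: the pattern mirrors the one above with the plus sign escaped, `validate_phone` is a hypothetical name, and returning `None` instead of an empty string for invalid input is a choice made for this sketch.

```python
import re
from typing import Optional

# Optional one-digit "+" prefix, digits with spaces or dashes,
# and an optional "/ extension" part.
PHONE = re.compile(r"\s*(\+[1-9])?([0-9 \-]+)(\s*/\s*[0-9]+)?\s*")


def validate_phone(raw: str) -> Optional[str]:
    match = PHONE.fullmatch(raw)
    if match is None:
        # No match: hand the value back to the user, do not clean it up.
        return None
    # Optional groups that did not take part in the match become "".
    prefix, number, extension = match.groups(default="")
    return prefix + number + extension


print(validate_phone("+61 3 9856 7698"))  # +61 3 9856 7698
print(validate_phone("call me maybe"))    # None
```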

2011-03-29

That routine needs a rewrite! Not all unicode characters are properly replaced.

2011-03-29

Jerry:
caper:
The WTF is the requirement "Only the numbers should be saved."
Oh is that what he was trying to do?
Easy. Add zero. If it isn't a number you'll get an error.

Bzzzt. Wrong.

If you want to check if it's a number, sure that might work. Although I'm guessing that all languages will have a better method.

But that's not the goal. He's stripping out all the non-numeric characters. So '(020)-1234-5678' becomes '02012345678'. We don't care if the starting point isn't a number, we're trying to make it into one.

2011-03-29

false:
JavaScript .replace() replaces only first occurrence of the needle, so I guess this is TRWTF here.

You must have a stranger javascript interpreter then as it replaces all of them for me... even gives you an example with more than one occurrence on the W3C site. :S

2011-03-29

hold up.... looks like i have the strange interpreter.... wtf?? it shouldnt work but does?? back to bed me thinks, try again tomorrow
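
Which behaviour you get depends on the language. In JavaScript, `replace` with a plain string pattern replaces only the first occurrence, and a global regex is needed to replace them all. Python's `str.replace` works the other way round: it replaces every occurrence unless a count is passed. A small Python sketch (Python is an assumption here, based on an earlier comment that the article's code is Python):

```python
number = "555-123-4567"

# Default: every occurrence is replaced.
all_replaced = number.replace("-", "")

# With a count of 1, only the first occurrence is replaced.
first_only = number.replace("-", "", 1)

print(all_replaced)  # 5551234567
print(first_only)    # 555123-4567
```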

WthyrBendragon · 2011-03-29

Gads! It's a limited length string! Just walk the fscking thing! Even as a PL/SQL Function this is elementary.

Execute a toNumber() on each character and if the cast succeeds you keep the character. The exception trap just discards the character so you can proceed next.

A final viability check for length against known/acceptable dialed number formats and you're pretty much done.
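
The walk described above might look roughly like this in Python rather than PL/SQL. It is a sketch under assumptions: `keep_ascii_digits` is a made-up name, and the ASCII guard is an addition, because Python's `int()` also accepts decimal digits from other scripts. The final length check is left out, since the acceptable lengths depend on the numbering plans you care about.

```python
def keep_ascii_digits(raw: str) -> str:
    kept = []
    for ch in raw:
        # int() also accepts non-ASCII decimal digits, so filter those first.
        if not ch.isascii():
            continue
        try:
            int(ch)   # the "cast": raises ValueError for non-digits
        except ValueError:
            continue  # the "exception trap": discard and move on
        kept.append(ch)
    return "".join(kept)


print(keep_ascii_digits("(020)-1234-5678"))  # 02012345678
```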

2011-03-29

TRWTF is so many people acting smart about fixing code that they read on a site about terrible code.

Look at me! I'm slightly more intelligent than a retard! I WIN!

2011-03-29

Stevie D:
Jerry:
Oh is that what he was trying to do?
Easy. Add zero. If it isn't a number you'll get an error.
Bzzzt. Wrong.

If you want to check if it's a number, sure that might work. Although I'm guessing that all languages will have a better method.

But that's not the goal. He's stripping out all the non-numeric characters. So '(020)-1234-5678' becomes '02012345678'. We don't care if the starting point isn't a number, we're trying to make it into one.

In my math book, adding zero to (020)-1234-5678 (as suggested), yields -6892, not 02012345678!

2011-03-29

Insanitisation shurely?

2011-03-29

This is absolutely a lesson that should be taught to programming students: When you're filtering user inputs, a whitelist is pretty much always more secure than a blacklist.
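
A small Python sketch of the difference, assuming a blacklist of punctuation and letters (the `BLACKLIST` below is a hypothetical, shortened stand-in, not the article's actual list). Anything the blacklist author did not think of slips through, while the whitelist only ever lets digits pass.

```python
import string

BLACKLIST = set(" -()." + string.ascii_letters)  # hypothetical, shortened
WHITELIST = set(string.digits)


def strip_blacklist(raw: str) -> str:
    # Remove only the characters someone remembered to list.
    return "".join(ch for ch in raw if ch not in BLACKLIST)


def keep_whitelist(raw: str) -> str:
    # Keep only the characters that are explicitly allowed.
    return "".join(ch for ch in raw if ch in WHITELIST)


# An en dash (U+2013), a comma, underscores and a question mark:
# none of them appear in the blacklist.
tricky = "555\u20131234,__?"

print(strip_blacklist(tricky))  # unchanged: nothing in it was blacklisted
print(keep_whitelist(tricky))   # 5551234
```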

2011-03-29

I'd say walking the value is almost certainly the correct way of doing it. Using a regexp for this task is just stupid and inefficient.

boog · 2011-03-29

Anonymous:
boog:
Is execution time the only value by which you measure one tool's "total wrongness" vs. another?
No, but it's certainly high on my list of criteria. More so than complexity to author - within reason, of course (you only author once but you execute many times, so if I can improve execution speed by spending an extra 20 minutes thinking about the most efficient way to write the code, I see that as a success).

If you spend an extra 20 minutes thinking about the most efficient way to write code, and all that thinking saves you ~1 millisecond of execution time per execution, and that code executes only 10000 times (or even 100000) in its lifetime, I don't know how anyone could see that as a success.

Anonymous:
I think a lot of these comments nicely highlight a common problem I see when people first learn regexes - they think they're the holy grail to every string manipulation problem out there.

Knowing when optimization is necessary is far more important to me than always choosing the most-optimized solution. I believe a good solution now is better than the best solution later. My customers won't notice 1 extra millisecond of execution time, but they will notice if I miss a deadline.

Anonymous:
Simple for you but wasteful and labour intensive for your processor. This isn't about premature optimisation, it is about avoiding hugely costly tasks for simple and repetitive jobs.

You think telephone entry validation is a repetitive job?

Anonymous:
If you don't understand that, you will never be a good coder.

Congrats to you! It takes a lot of arrogance to tell others that if they don't agree with you then they will never be good coders. I didn't know you had it in you.

Anonymous:
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions'..."
You know the rest.

In my experience, people who spout that crap tend to have a fairly poor grasp of regular expressions in general.

Aren't bullshit anecdotes fun?

2011-03-29

Anonymous:
blarg:
Are you one of those guys who spends 3 days finding a 'clever' workaround to save 2 clock cycles for a task which is called once an hour and has a disk/network bottleneck anyway?
No, I'm the sort of person who uses the right tool for the job.

All evidence suggests otherwise if you think wasting time optimizing this is a worthwhile thing to do. Perhaps your time and your company's money is worth a lot less than mine, which explains why you would be happy to work on optimizations that have absolutely no noticeable benefit. Parsing a telephone number is a very different scenario to doing high frequency trading or low latency messaging etc - the right tool for the job is not going to be the same. Boog is completely right.

matthewr81 · 2011-03-29

Icould od much better:
Rob:
This guy was paid by the line, right?
No. He wouldn't do 3 replaces per line if it was so.

Not sure about that... he might have been running his "buy one line, get two lines free" promotion

2011-03-29

Anonymous:
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions'..."
You know the rest.

"Premature optimizations are the root of ..."

You know the rest.

2011-03-29

boog:
If you spend an extra 20 minutes thinking about the most efficient way to write code, and all that thinking saves you ~1 millisecond of execution time per execution, and that code executes only 10000 times (or even 100000) in its lifetime, I don't know how anyone could see that as a success.

What the hell? I work on flight controllers that process input signals hundreds or even thousands of times every second. I cycle through 100000 calculations in just a couple of minutes of runtime.

boog:
Knowing when optimization is necessary is far more important to me than always choosing the most-optimized solution. I believe a good solution now is better than the best solution later. My customers won't notice 1 extra millisecond of execution time, but they will notice if I miss a deadline.

OK, so we're obviously coming from seriously different backgrounds here. In my line of work, 1 extra millisecond of execution time equates to tens of missed input signals and the redundancy in an active flight control system can only handle so many missed inputs before the system becomes unstable and, you know, planes start crashing. My customers will definitely notice that.

boog:
You think telephone entry validation is a repetitive job?

Well, my comments are not specifically related to the article code, they are related to my own beliefs and working practices. See above for why my working practices are so stringent.

boog:
Congrats to you! It takes a lot of arrogance to tell others that if they don't agree with you then they will never be good coders. I didn't know you had it in you.

No disrespect boog, but I obviously work in a way more rigorous environment than you. Have you ever worked in any safety-critical environment? Don't bother answering that because I can tell immediately that you haven't.

boog:
In my experience, people who spout that crap tend to have a fairly poor grasp of regular expressions in general.

So I'll give you this one, because I actually hate that expression - regexes have so many legitimate uses that to blithely state that they automatically introduce a new problem is just stupid. It seemed like an appropriately pithy way to sign off on a rant about misusing regular expressions but yes, it's a crap sentiment.

2011-03-29

Design Pattern:
Anonymous:
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions'..."
You know the rest.
"Premature optimizations are the root of ..."

You know the rest.

I'll just quote the post you obviously didn't bother reading:

Anonymous:
This isn't about premature optimisation, it is about avoiding hugely costly tasks for simple and repetitive jobs. If you don't understand that, you will never be a good coder.

But to be fair, I shouldn't have bothered with that Zawinski quote in the first place; I don't even agree with it.

No Letters Allowed!
