• данийл (unregistered)

    This has Nagesh written all over it.

  • Not of this Earth (unregistered) in reply to данийл

    Just wow...

  • C-Octothorpe (unregistered) in reply to данийл
    данийл:
    This has Nagesh written all over it.

    Nagesh, take a bow... You are now officially a meme (to the 6 people that frequent this site).

  • CodeMonkey (unregistered)

    So, what happens when phone numbers go hex?

  • blarg (unregistered) in reply to Quicksilver
    1. Idea Extend string class to implement "replaceAll" function for language which may or may not have it already.
    Quicksilver:
    1. Idea number = number.replaceAll("\\D",""); no, too short, and who knows \D anyway?
    1. Idea number = number.replaceAll("[^\\d]",""); yes, but what if the reader is not that fluent in regexp? Got to remove these predefined groups.

    2. Idea number = number.replaceAll("[^0123456789]",""); OK, but what if the reader does not know about negation...

    3. idea see article

    perfect!!!

  • Jellineck (unregistered) in reply to CodeMonkey
    CodeMonkey:
    So, what happens when phone numbers go hex?

    You rewrite the code (even this shitty code) to support it. You don't code to prospective realities, you code to the current reality.

    Although I imagine that would be a great WTF. "Some junior decided that phone numbers might be in hex someday, so he wrote a bunch of code to support it. He has another few months of beatings to endure."

  • MK|C (unregistered)

    Oh. Oh God.

    ...I thought Python was relatively free of WTF programmers (myself excluded, of course).

  • Anonymous Bastard's Son (unregistered)

    I'd give the guy some credit if he even put it in a "reusable" method for each field... what am I thinking???

  • (cs)

    Look hang on, I'm not so sure this will work. I'm not sure about the "replace" function. What if it only replaces the first instance of the character? I'm worried this isn't going to catch every instance of iffy characters (and I haven't got time to read the documentation), so I reckon:

    homephone[1] = homephone[1].replace(" ", "").replace("-", "").replace("(", "") : :

    homephone[2] = homephone[2].replace(" ", "").replace("-", "").replace("(", "") : :

    ... and so on ought to do it. Um, how long can this string be, by the way ...?

  • Zapp Brannigan (unregistered)

    Please help me, do you have a unicode version?

  • Nick (unregistered) in reply to BentFranklin
    BentFranklin:
    programmer = programmer.replace("right away")

    LIKE IT!

    On a side note: We need a "I like it" button or other way to rate comments

  • (cs) in reply to Macro King
    Macro King:
    frits:
    Even VB can do it better:
            For i = 0 To homephone.Length - 1
    

    You forgot OnError, ResumeNext. And you're going to need that because VB strings are 1 based!

    The real WTF etc etc...

    Not in .NET AFAIK. But that would be TRWTF if true.
  • (cs)

    This is only a WTF because the guy didn't use a for loop.

    char badCharacters[100];
    
    for(int i = 0; i < 100; i++){
      word = word.replace(badCharacters[i],'');
    }
    

    If you saw this code you would definitely think it was bad, but not worthy of a daily wtf. Can we get real articles please?

  • m (unregistered) in reply to Anonymous
    Anonymous:
    Computers are a great way to simplify tedious manual tasks. Of course, some developers don't fully understand this concept.
    of course he understood:
    for i in homephone workphone cellphone; do echo ' -()[].`^!@#$%&*+=/<>:;~ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'|sed 's/\(...\)/\1\n/g'|sed 's/\(.\)/.replace("\1", "")/g'|sed "s/^/$i = $i/";echo;done
    they didn't even remove
    '"\
    to keep the script simple.
  • Simon (unregistered) in reply to boog
    boog:
    @Deprecated:
    employees.replace("programmer", "monkey");
    I'm guessing that's how they ended up with this code in the first place.

    No - a monkey wouldn't have the attention span to come up with something this long and repetitive. Besides, they tend to throw the crap at their peers directly, rather than leaving it for the next generation of monkeys to find in a few years' time...

  • Simon (unregistered) in reply to Anonymous
    Anonymous:
    Mark:
    Dude. Google "regular expressions". I swear it will blow your effing mind.
    I'm worried about the amount of people who are suggesting regular expressions for such a trivial problem. Regular expressions have their place but they are totally the wrong tool for a simple non-numeric replace like this. Fun fact: in the .NET framework the regular expression handling classes have the highest cyclomatic complexity of any classes in the entire framework (last time I checked, circa 3.0). Yes, I know this code isn't .NET but it highlights an inescapable fact about regexes - they are a seriously expensive operation, in any language. Don't abuse them for silly little jobs like this one.

    Depending on the language, the replace() code may actually be using regexes internally. Java's String.replaceAll() method takes a regex as the first parameter, so it's not inconceivable that this unspecified language is already using a spectacularly inefficient number of regular expressions.

  • LoveKnuckle (unregistered)

    Nice use of fluent programming style.

  • Gary Olson (unregistered) in reply to Skilldrick
    Skilldrick:
    Silverwizard:
    This is MASSIVE duplicates code, you think he'd just do: homephone = makeValid(homephone); workphone = makeValid(workphone); cellphone = makeValid(cellphone);

    Yeesh! Everyone knows that methods are useful for saving time!

    OMG You're right, I hadn't realised that the duplicate code could be extracted into a separate function - that's definitely the biggest WTF here. [/sarcasm]

    Sure, like an outer join to 3 different databases; one for each type of phone number. This allows for flexibility if home/work/cell phone numbers start to have different characteristics. And, if different types of phone numbers are required in the future, just copy a database, some quick editing, and add another join.

  • Mark (unregistered)

    TRWTF: You only wrote six sentences and still managed to make a typo on "regular" and misspell "weird".

  • wha (unregistered) in reply to Kevin
    Kevin:
    What is particularly brilliant about this is that, rather than call a single function for the replaces, it is repeated three times in so homophone, workphone and cellphone can be customized individually.

    Homophone? What like there, their and they're?

  • SomeYoungGuy (unregistered) in reply to Sobriquet
    Sobriquet:
    eMBee:
    in pike i'd do this:
    string phonenumber = "(123)456-789/0";
    string clean_phonenumber = `-(phonenumber, @Array.uniq(sort( `-(phonenumber, @"1234567890"/"")/"" )));
    or
    multiset allowed = mkmultiset("1234567890"/"");
    Array.filter(phonenumber/"", lambda(string digit){ return allowed[digit]; }) * "";

    In Perl it's

    tr/0-9//cd
    and I'm done.

    Yeah, and then a non-perl developer like me comes along later forced to make sense of your perl script and sees "tr/0-9//cd" and says WTF? At least I could look at today's TDWTF code and get an idea of what it does.

  • jane (unregistered) in reply to CodeMonkey
    CodeMonkey:
    So, what happens when phone numbers go hex?

    comment down the bottom

    and we will add a nice story for akismet, too. I wonder if it might be because there are too many commas vs characters in the post. How far would I need to go to balance this out? So far, I have tried to post 3 times, but clearly I'm not fooling akismet who knows better. Maybe I will simply write some big long sentences without any punctuation to see whether we can get something through we have already tried to post five times and each time akismet was not happy still I suppose it is far far better to encourage people to go to the effort of writing soliloquies to avoid the spambot rather than risk a couple of ads for nike shoes. Maybe we will try something different I wonder if akismet likes poetry: There was movement at the station for the word had got around that the colt from old regret had got away

    Or maybe It was somewhere in the country in a land of rock and scrub that they formed an institution called the Geebung Polo Club

    akismet:
    Simple, comment out the replaces for A,B,C,D,E,F, a,b,c,d,e,f, x,X
  • Hater (unregistered) in reply to SomeYoungGuy
    SomeYoungGuy:
    Sobriquet:
    eMBee:
    in pike i'd do this:
    string phonenumber = "(123)456-789/0";
    string clean_phonenumber = `-(phonenumber, @Array.uniq(sort( `-(phonenumber, @"1234567890"/"")/"" )));
    or
    multiset allowed = mkmultiset("1234567890"/"");
    Array.filter(phonenumber/"", lambda(string digit){ return allowed[digit]; }) * "";

    In Perl it's

    tr/0-9//cd
    and I'm done.

    Yeah, and then a non-perl developer like me comes along later forced to make sense of your perl script and sees "tr/0-9//cd" and says WTF? At least I could look at today's TDWTF code and get an idea of what it does.

    Most of the time perl regular expressions look like total garbage and cannot be deciphered. But come on, you couldn't spend all of 2 minutes to look this one up online?
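
    For anyone who doesn't want to look it up: in tr/0-9//cd the c flag complements the set 0-9 and the d flag deletes whatever matches, so everything that is not a digit goes away. A rough Python equivalent, as a sketch only (the function name is made up for illustration):

```python
DIGITS = "0123456789"

def tr_0_9_cd(text: str) -> str:
    """Rough equivalent of Perl's tr/0-9//cd (hypothetical helper name).

    c: take the complement of the set 0-9
    d: delete every character in that complemented set
    """
    return "".join(ch for ch in text if ch in DIGITS)

# The sample number from the Pike example quoted above.
print(tr_0_9_cd("(123)456-789/0"))  # 1234567890
```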

  • Devil's Advocate (unregistered)

    Why are we always so obsessed with Data formats? It seems to go against internationalization and other ideas. I frequently get sick of (online) forms that don't let me proceed without adding some mundane detail that is not relevant to them. Further, by insisting that I enter 'realistic' values in the fields they feel are important, they are making it a lot harder for themselves (somewhere down the track) to verify whether my submission is genuine.

    If someone doesn't give a valid-looking email address, and you feel that a valid address is important, you can ignore their submission on the backend without bothering them on the front end. There is a principle in Data Matching that raw data usually provides more value and more identification detail than any efforts to scrub, enhance or 'fix' the data.
    Telephone numbers come in many formats around the world, and given the way they are structured it is quite possible that the spacing or other punctuation may give some identification as to where it comes from (whereas stripping it might make two totally different numbers from two totally different locales appear similar).

    Observe some numbers from different parts of the world (they have been changed from existing numbers I found, and it's possible that some of them are no longer valid after my changes, but hopefully they illustrate the point):

    US: +1 847 576 6246
    Czech Republic: +52 (55) 5243-6123
    Indonesia: (62-21) 571-8855
    UK: +971 4 33 15 476
    Italy: +39 02545371
    Australia: +61 3 9856 7698

    Each of them has a different spacing, and some have brackets and dashes, but the format is relevant. What if we remove it?

    18475766246
    525552436123
    62215718855
    97143315476
    3902545371
    61398567698

    Of course, the area codes should dictate that we never get duplicates, but what if people don't use full area codes? How many people use full (national and) international prefixes when submitting phone numbers online?

    Of course, if the system in question was insignificant and was only for Fr Brian to keep track of parishioners, there's no issue, but if this is on a system that has global reach and might (whether today or in the future) need to be able to store multiple locales' phone numbers, then this solution is naive at best. It might be desirable to validate that phone numbers are in a prescribed format (someone bodgey from Russia {for example} might not know conventions used in the US to write phone numbers) but even this shouldn't necessarily be articulated to the end user (but rather dropped silently into a 'look at later' file).

    There is much more that could be said, but ultimately, the real WTF is people who insist on trying to validate or alter potentially global data. Forcing someone to observe a specific format (whether in numbers or addresses or something else) will increase the number of bad results you get because users will try to beat the system with another value. Many (American) sites have options for people from outside the US to enter data, but still have requirements on either state or zipcode fields. Sites like this will get many people from elsewhere entering anything the validator accepts to get over this requirement. The only effect is that the data is suddenly compromised.

    But who cares, right?

  • A. Meiburg (unregistered)

    It's good to know that ",__?" is a valid phone-number. Too bad I can't try dialing it - all my phones, and Skype too, are lacking the necessary buttons, unfortunately.

  • Darth Paul (unregistered)

    Both the person who wrote the requirements and the person who implemented them have no understanding of telephone numbers.

    It amazes me how many web sites out there validate phone numbers as "Long Integer", or worse, "Long Integer starting with 0". Clueless.

    And without the brackets and plus symbol, the user does not know what to dial when using the stored number, which defeats the purpose of storing it in the first place.

  • Ari (unregistered)

    You could kill that with a backslash

  • Deresen (unregistered)

    I'd prefer this, regex is for pussy's:

    String newPhoneNr = "";
    for(int i = 0; i < phoneNr.Length; i++)
      if(phoneNr[i] == '0' || phoneNr[i] == '1' || phoneNr[i] == '2' || phoneNr[i] == '3' || phoneNr[i] == '4' ||
         phoneNr[i] == '5' || phoneNr[i] == '6' || phoneNr[i] == '7' || phoneNr[i] == '8' || phoneNr[i] == '9')
        newPhoneNr += phoneNr[i];

  • st0815 (unregistered) in reply to Your Name
    Your Name:
    You may want to do other validation for the phone number as well. For the US, for instance, you'll want to make sure it's ten digits. Also, some checking of the actual digits[...]

    Brilliant, you found a way to make the code worse. Now you don't just have a cumbersome way to filter out (some) characters from the phone field, you've also arbitrarily restricted yourself to US numbers. Since obviously you'll never have a customer who isn't a US citizen or a US citizen who happens to live abroad...

  • unregistered (unregistered)

    What if the user starts inputting Chinese characters? This implementation is a good start, just incomplete: you need to extend this to exclude the whole non-numeric UTF-8 + the Klingon alphabet.

    Also insert these characters into a reference data table so that you can adapt the exclusion list without code changes, for complete future proofness and flexibility.

    These sloppy developers, always taking the easy way out!

  • Anonymous (unregistered) in reply to blarg
    Anonymous:
    Note also that the code calls replace so many time that regex performance issues pale in comparison.
    The code quoted in the article uses an unnecessary number of replaces but if you are trying to suggest that regexes are the most efficient way of performing this task then you are sorely mistaken. Ever used a profiler before? You should give it a shot instead of making totally wild guesses about performance (yes, I profile my code if I'm concerned about performance - real performance metrics are far better than guesswork).
    blarg:
    Are you one of those guys who spends 3 days finding a 'clever' workaround to save 2 clock cycles for a task which is called once an hour and has a disk/network bottleneck anyway?
    No, I'm the sort of person who uses the right tool for the job.
    boog:
    Is execution time the only value by which you measure one tool's "total wrongness" vs. another?
    No, but it's certainly high on my list of criteria. More so than complexity to author - within reason, of course (you only author once but you execute many times, so if I can improve execution speed by spending an extra 20 minutes thinking about the most efficient way to write the code, I see that as a success).

    I think a lot of these comments nicely highlight a common problem I see when people first learn regexes - they think they're the holy grail to every string manipulation problem out there. "But I can do this with one simple regex"! Simple for you but wasteful and labour intensive for your processor. This isn't about premature optimisation, it is about avoiding hugely costly tasks for simple and repetitive jobs. If you don't understand that, you will never be a good coder.

    "Some people, when confronted with a problem, think 'I know, I'll use regular expressions'..."

    You know the rest.
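
    Since "measure, don't guess" is the point being argued, here is a minimal sketch of how one could time the two approaches with Python's standard timeit module. The sample input and function names are made up for illustration, and no result is claimed here: the numbers depend entirely on the interpreter, the input, and the machine.

```python
import re
import timeit

SAMPLE = "+1 (847) 576-6246"        # made-up sample input
NON_DIGIT = re.compile(r"[^0-9]")   # precompiled so compilation is not timed

def with_regex(text: str) -> str:
    """Strip non-digits with a single regex substitution."""
    return NON_DIGIT.sub("", text)

def with_walk(text: str) -> str:
    """Strip non-digits by walking the string character by character."""
    return "".join(ch for ch in text if ch in "0123456789")

def compare(runs: int = 100_000) -> dict:
    """Return the total seconds each approach takes over `runs` calls."""
    return {
        "regex": timeit.timeit(lambda: with_regex(SAMPLE), number=runs),
        "walk": timeit.timeit(lambda: with_walk(SAMPLE), number=runs),
    }

if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds:.4f}s")
```

    Both functions return the same string, so whichever one the measurement favours, the choice is purely about cost and readability.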

  • Anonymous (unregistered) in reply to Simon
    Simon:
    Depending on the language, the replace() code may actually be using regexes internally. Java's String.replaceAll() method takes a regex as the first parameter, so it's not inconceivable that this unspecified language is already using a spectacularly inefficient number of regular expressions.
    Sorry Simon, I left you out in my previous reply. What you're basically saying is true, it's not inconceivable that <some language> uses regex underneath replace. But I would expect this to be extremely rare for simple pattern matching replacements like string.Replace(thisString, withThisString) because regex is actually quite ill-suited to this task (as well as being very expensive). There are simpler ways of doing it, plain and simple, so to underpin this type of replace with regex is just foolhardy (or lazy, and I'll concede that there are some lazy language designers out there).
  • Turs (unregistered)

    All that and he still got it massively wrong.

    Replacing + with an empty string is going to screw it up for any sort of international phone number of the +61234567890 format.
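
    If stripping really is the requirement, one way around this is to remember whether the input started with a plus before throwing the punctuation away. A minimal Python sketch, assuming a leading "+" is the only non-digit worth keeping (the function name is made up):

```python
def strip_keep_plus(raw: str) -> str:
    """Keep digits only, but preserve a leading '+' (international prefix).

    Assumption for this sketch: a '+' matters only at the very start of
    the (whitespace-trimmed) input; any other '+' is dropped.
    """
    text = raw.strip()
    prefix = "+" if text.startswith("+") else ""
    digits = "".join(ch for ch in text if ch in "0123456789")
    return prefix + digits

print(strip_keep_plus("+61 234 567 890"))   # +61234567890
print(strip_keep_plus("(020)-1234-5678"))   # 02012345678
```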

  • noland (unregistered)

    Please do not attempt to clean up any phone numbers! Either they match your pattern, or you return them to the user for review, as there's probably a typo ... e.g. (JS):

    // the "+" has to be escaped, otherwise the pattern is a syntax error
    var matches = /^\s*(\+[1-9])?([0-9 \-]+)(\s*\/\s*[0-9]+)?\s*$/.exec(phoneNumber);
    if (matches) {
      // return int. prefix, phone number, extension (unmatched optional groups are undefined)
      return (matches[1] || '') + matches[2] + (matches[3] || '');
    }
    // no match, return empty string for invalid, do not attempt to clean up
    return '';
    
  • Mikołaj (unregistered)

    That routine needs a rewrite! Not all unicode characters are properly replaced.

  • Stevie D (unregistered) in reply to Jerry
    Jerry:
    caper:
    The WTF is the requirement "Only the numbers should be saved."
    Oh is that what he was trying to do?

    Easy. Add zero. If it isn't a number you'll get an error.

    Bzzzt. Wrong.

    If you want to check if it's a number, sure that might work. Although I'm guessing that all languages will have a better method.

    But that's not the goal. He's stripping out all the non-numeric characters. So '(020)-1234-5678' becomes '02012345678'. We don't care if the starting point isn't a number, we're trying to make it into one.

  • Jonesy (unregistered) in reply to false
    false:
    JavaScript .replace() replaces only first occurrence of the needle, so I guess this is TRWTF here.

    You must have a stranger JavaScript interpreter then, as it replaces all of them for me... even gives you an example with more than one occurrence on the W3C site. :S

  • Jonesy (unregistered) in reply to Jonesy

    hold up.... looks like I have the strange interpreter.... wtf?? it shouldn't work but does?? back to bed methinks, try again tomorrow

  • (cs)

    Gads! It's a limited length string! Just walk the fscking thing! Even as a PL/SQL Function this is elementary.

    Execute a toNumber() on each character and if the cast succeeds you keep the character. The exception trap just discards the character so you can proceed next.

    A final viability check for length against known/acceptable dialed number formats and you're pretty much done.
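
    The walk described above, sketched in Python rather than PL/SQL. The length bounds are placeholders picked for illustration (the upper one follows the 15-digit maximum of E.164), and the function names are made up:

```python
MIN_DIGITS = 7    # assumed lower bound, purely illustrative
MAX_DIGITS = 15   # assumed upper bound (E.164 allows at most 15 digits)

def walk_digits(raw: str) -> str:
    """Walk the string; keep a character only if casting it to a number works."""
    kept = []
    for ch in raw:
        try:
            int(ch)              # the "toNumber()" cast
        except ValueError:
            continue             # exception trap: discard and move on
        # Caveat: Python's int() also accepts non-ASCII decimal digits,
        # so restrict to ASCII to end up with a diallable 0-9 string.
        if ch.isascii():
            kept.append(ch)
    return "".join(kept)

def plausible_length(digits: str) -> bool:
    """Final viability check against the assumed length bounds."""
    return MIN_DIGITS <= len(digits) <= MAX_DIGITS

number = walk_digits("(020)-1234-5678")
print(number, plausible_length(number))  # 02012345678 True
```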

  • Sudo (unregistered)

    TRWTF is so many people acting smart about fixing code that they read on a site about terrible code.

    Look at me! I'm slightly more intelligent than a retard! I WIN!

  • Design Pattern (unregistered) in reply to Stevie D
    Stevie D:
    Jerry:
    Oh is that what he was trying to do?

    Easy. Add zero. If it isn't a number you'll get an error.

    Bzzzt. Wrong.

    If you want to check if it's a number, sure that might work. Although I'm guessing that all languages will have a better method.

    But that's not the goal. He's stripping out all the non-numeric characters. So '(020)-1234-5678' becomes '02012345678'. We don't care if the starting point isn't a number, we're trying to make it into one.

    In my math book, adding zero to (020)-1234-5678 (as suggested), yields -6892, not 02012345678!

  • Membrane (unregistered) in reply to Dan

    Insanitisation shurely?

  • Spoom (unregistered)

    This is absolutely a lesson that should be taught to programming students: When you're filtering user inputs, a whitelist is pretty much always more secure than a blacklist.
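
    To make the difference concrete, here is a small Python sketch. The blacklist below is a made-up, deliberately short list standing in for the article's chain of replace() calls; the point is only that anything nobody thought to list slips through, while the whitelist cannot let anything unexpected in.

```python
BLACKLIST = set(" -()[].")        # illustrative: characters someone remembered to ban
WHITELIST = set("0123456789")     # the only characters we actually want

def blacklist_filter(text: str) -> str:
    """Remove the characters we thought of; everything else survives."""
    return "".join(ch for ch in text if ch not in BLACKLIST)

def whitelist_filter(text: str) -> str:
    """Keep only the characters we explicitly allow."""
    return "".join(ch for ch in text if ch in WHITELIST)

sample = "(020) 1234-5678 ext\t9?"
print(repr(blacklist_filter(sample)))  # '02012345678ext\t9?'
print(repr(whitelist_filter(sample)))  # '020123456789'
```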

  • Someone who can't be bothered to login from work (unregistered)

    I'd say walking the value is almost certainly the correct way of doing it. Using a regexp for this task is just stupid and inefficient.

  • (cs) in reply to Anonymous
    Anonymous:
    boog:
    Is execution time the only value by which you measure one tool's "total wrongness" vs. another?
    No, but it's certainly high on my list of criteria. More so than complexity to author - within reason, of course (you only author once but you execute many times, so if I can improve execution speed by spending an extra 20 minutes thinking about the most efficient way to write the code, I see that as a success).
    If you spend an extra 20 minutes thinking about the most efficient way to write code, and all that thinking saves you ~1 millisecond of execution time per execution, and that code executes only 10000 times (or even 100000) in its lifetime, I don't know how anyone could see that as a success.
    Anonymous:
    I think a lot of these comments nicely highlight a common problem I see when people first learn regexes - they think they're the holy grail to every string manipulation problem out there.
    Knowing when optimization is necessary is far more important to me than always choosing the most-optimized solution. I believe a good solution now is better than the best solution later. My customers won't notice 1 extra millisecond of execution time, but they will notice if I miss a deadline.
    Anonymous:
    Simple for you but wasteful and labour intensive for your processor. This isn't about premature optimisation, it is about avoiding hugely costly tasks for simple and repetitive jobs.
    You think telephone entry validation is a repetitive job?
    Anonymous:
    If you don't understand that, you will never be a good coder.
    Congrats to you! It takes a lot of arrogance to tell others that if they don't agree with you then they will never be good coders. I didn't know you had it in you.
    Anonymous:
    "Some people, when confronted with a problem, think 'I know, I'll use regular expressions'..."

    You know the rest.

    In my experience, people who spout that crap tend to have a fairly poor grasp of regular expressions in general.

    Aren't bullshit anecdotes fun?

  • blarg (unregistered) in reply to Anonymous

    Anonymous:
    blarg:
    Are you one of those guys who spends 3 days finding a 'clever' workaround to save 2 clock cycles for a task which is called once an hour and has a disk/network bottleneck anyway?
    No, I'm the sort of person who uses the right tool for the job.

    All evidence suggests otherwise if you think wasting time optimizing this is a worthwhile thing to do. Perhaps your time and your company's money is worth a lot less than mine, which explains why you would be happy to work on optimizations that have absolutely no noticeable benefit. Parsing a telephone number is a very different scenario to doing high frequency trading or low latency messaging etc - the right tool for the job is not going to be the same. Boog is completely right.

  • (cs) in reply to Icould od much better
    Icould od much better:
    Rob:
    This guy was paid by the line, right?
    No. He wouldn't do 3 replaces per line if it was so.

    Not sure about that... he might have been running his "buy one line, get two lines free" promotion

  • Design Pattern (unregistered) in reply to Anonymous
    Anonymous:
    "Some people, when confronted with a problem, think 'I know, I'll use regular expressions'..."

    You know the rest.

    "Premature optimizations are the root of ..."

    You know the rest.

  • Anonymous (unregistered) in reply to boog
    boog:
    If you spend an extra 20 minutes thinking about the most efficient way to write code, and all that thinking saves you ~1 millisecond of execution time per execution, and that code executes only 10000 times (or even 100000) in its lifetime, I don't know how anyone could see that as a success.
    What the hell? I work on flight controllers that process input signals hundreds or even thousands of times every second. I cycle through 100000 calculations in just a couple of minutes of runtime.
    boog:
    Knowing when optimization is necessary is far more important to me than always choosing the most-optimized solution. I believe a good solution now is better than the best solution later. My customers won't notice 1 extra millisecond of execution time, but they will notice if I miss a deadline.
    OK, so we're obviously coming from seriously different backgrounds here. In my line of work, 1 extra millisecond of execution time equates to tens of missed input signals and the redundancy in an active flight control system can only handle so many missed inputs before the system becomes unstable and, you know, planes start crashing. My customers will definitely notice that.
    boog:
    You think telephone entry validation is a repetitive job?
    Well, my comments are not specifically related to the article code, they are related to my own beliefs and working practices. See above for why my working practices are so stringent.
    boog:
    Congrats to you! It takes a lot of arrogance to tell others that if they don't agree with you then they will never be good coders. I didn't know you had it in you.
    No disrespect boog, but I obviously work in a way more rigorous environment than you. Have you ever worked in any safety-critical environment? Don't bother answering that because I can tell immediately that you haven't.
    boog:
    In my experience, people who spout that crap tend to have a fairly poor grasp of regular expressions in general.
    So I'll give you this one, because I actually hate that expression - regexes have so many legitimate uses that to blithely state that they automatically introduce a new problem is just stupid. It seemed like an appropriately pithy way to sign off on a rant about misusing regular expressions but yes, it's a crap sentiment.
  • Anonymous (unregistered) in reply to Design Pattern
    Design Pattern:
    Anonymous:
    "Some people, when confronted with a problem, think 'I know, I'll use regular expressions'..."

    You know the rest.

    "Premature optimizations are the root of ..."

    You know the rest.

    I'll just quote the post you obviously didn't bother reading:

    Anonymous:
    This isn't about premature optimisation, it is about avoiding hugely costly tasks for simple and repetitive jobs. If you don't understand that, you will never be a good coder.
    But to be fair, I shouldn't have bothered with that Zawinski quote in the first place; I don't even agree with it.

Leave a comment on “No Letters Allowed!”
