• Daniel T (unregistered)

    First?

  • Garfunkalow (unregistered)

    regex anyone?
    that whole nasty thing can be shrunk to a nice little regex
    \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)* I think that would do it, maybe not... but still.

  • Jonathan (unregistered)

    Wow!

    He must be paid per line b/c I tend to do something like this:

    Function isValidEmail(myEmail)
      dim isValidE
      dim regEx
     
      isValidE = True
      set regEx = New RegExp
     
      regEx.IgnoreCase = False
     
      regEx.Pattern = "^[a-zA-Z][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$"
      isValidE = regEx.Test(myEmail)
     
      isValidEmail = isValidE
    End Function%>

  • (cs) in reply to Garfunkalow

    I've met a bunch of "programmers" who don't even know what a Regular Expression is.  Functions like this one don't surprise me in the least.

  • Daniel T (unregistered) in reply to Daniel T

    there_is_a_at_simbol - best variable name ever :-)

    --Daniel T

  • (cs)
    Alex Papadimoulis:
    Function isEmail(ByRef Invalue)
      Dim valueStr
      valueStr = trim(CStr(Invalue))
      If (Not(detectedEmailBadCharacters(valueStr))) Then
        If (Not(detectDoubleDotInARow(valueStr))) Then
          If (Not(detectDoubleAtSimbol(valueStr))) Then
            If (isUserNameOk(valueStr)) Then
              If (isExtensionOK(valueStr)) Then
                If (isHostNameOK(valueStr)) Then
                  isEmail = True
                Else
                  isEmail = False
                End If
              Else
                isEmail = False
              End If
            Else
              isEmail = False
            End If
          Else
            isEmail = False
          End If
        Else
          isEmail = False
        End If
      Else
        isEmail = False
      End If
    End Function

    'Functions snipped



    I'm not going to attempt an exhaustive list, but there is no reason to nest each of those validations.  And if I were writing something like this, I would keep the sense of all my functions the same, so I wouldn't need the "Not" keyword in front of some but not others.  Fairly ugly.

  • (cs)
    Alex Papadimoulis:
    Function isEmail(ByRef Invalue)
      Dim valueStr
      valueStr = trim(CStr(Invalue))
      If (Not(detectedEmailBadCharacters(valueStr))) Then
        If (Not(detectDoubleDotInARow(valueStr))) Then
          If (Not(detectDoubleAtSimbol(valueStr))) Then
            If (isUserNameOk(valueStr)) Then
              If (isExtensionOK(valueStr)) Then
                If (isHostNameOK(valueStr)) Then
                  isEmail = True
                Else
                  isEmail = False
                End If
              Else
                isEmail = False
              End If
            Else
              isEmail = False
            End If
          Else
            isEmail = False
          End If
        Else
          isEmail = False
        End If
      Else
        isEmail = False
      End If
    End Function



    At the very least, couldn't he have simplified this function to this, and gotten rid of all those nested if statements?

    Function isEmail(ByRef Invalue)
    Dim valueStr
    valueStr = trim(CStr(Invalue))
    isEmail = (Not(detectedEmailBadCharacters(valueStr))) And _
    (Not(detectDoubleDotInARow(valueStr))) And _
    (Not(detectDoubleAtSimbol(valueStr))) And _
    (isUserNameOk(valueStr)) And _
    (isExtensionOK(valueStr)) And _
    (isHostNameOK(valueStr))
    End Function

  • Dave (unregistered)

    Another case of not using regular expressions.....  Unfortunately this is much too common...

  • asdf (unregistered) in reply to einstruzende

    Word to the wise is that regular expressions and email addresses don't mix all that well.

    http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html

  • Mutt (unregistered)

    Function detectDoubleAtSimbol(strInput)
    Dim ch, there_is_a_at_simbol, i
    ch = ""
    there_is_a_at_simbol = False
    For i = 1 To Len(strInput)
    ch = mid(strInput,i,1)
    If((ch = "@") And (there_is_a_at_simbol = False)) Then
    there_is_a_at_simbol = True
    ElseIf((ch = "@") And (there_is_a_at_simbol = True)) Then
    detectDoubleAtSimbol = True
    Exit Function
    End If
    Next
    detectDoubleAtSimbol = False
    End Function


    So the WTF is that he misspelled Symbol, right?

  • (cs)

    This is certainly no shining example of good code but these past two WTFs have really just been bad code IMHO.  I guess there really are a finite number of WTF patterns out there and we're getting close to having seen them all.

  • (cs) in reply to Jonathan
    Anonymous:
    Wow!

    He must be paid per line b/c I tend to do something like this:

    Function isValidEmail(myEmail)
      dim isValidE
      dim regEx
     
      isValidE = True
      set regEx = New RegExp
     
      regEx.IgnoreCase = False
     
      regEx.Pattern = "^[a-zA-Z][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$"
      isValidE = regEx.Test(myEmail)
     
      isValidEmail = isValidE
    End Function%>


    So you set IgnoreCase = False, then use [a-zA-Z] in every location.  That is a nice WTF.

    And let's look at that pattern.  Start of string, followed by one alpha character, followed by zero or more word characters, dots, or dashes, followed by a single alphanumeric character, followed by the strudel, followed by exactly one alphanumeric character, followed by zero or more word characters, dots, or dashes, followed by one alphanumeric, followed by a dot, followed by exactly one alpha, followed by zero or more alphas or dots, followed by exactly one alpha, and the end of string.

    So [email protected] is not a valid email?

    You might want to look at this for details on using a regex to validate an RFC822 email address.
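
The walkthrough above can be checked directly. Here is a minimal JavaScript sketch that runs the quoted pattern against a few addresses. The pattern is copied from the quoted post; the name `jonathanPattern` and the sample addresses are illustrative assumptions, not taken from the thread.

```javascript
// Pattern copied from the quoted post; only the variable name is invented here.
const jonathanPattern =
  /^[a-zA-Z][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$/;

// Hypothetical sample addresses, chosen to exercise each part of the pattern.
const jonathanSamples = [
  "jo@example.com",   // two-character local part starting with a letter
  "j@example.com",    // one-character local part
  "1abc@example.com", // local part starting with a digit
  "jo@ex.c",          // one-letter final label
];

for (const address of jonathanSamples) {
  console.log(address, jonathanPattern.test(address));
}
```

Because the pattern demands a letter followed by at least one more character before the "@", any single-character local part fails, as does any local part that starts with a digit.
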
  • (cs) in reply to RevMike

    forgot the link

  • Eduardo Habkost (unregistered) in reply to Dave

    It would be really nice if the only problem was it is not using regular expressions. :)

  • (cs) in reply to RevMike

    http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html

    Software is acting funny for me.  :(

  • (cs) in reply to Jonathan
    Anonymous:
    Wow!

    He must be paid per line b/c I tend to do something like this:

      regEx.Pattern = "^[a-zA-Z][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$"

    Stop doing that.  You are not accepting these valid addresses:
    abigail @example.com
    *@example.net
    "\""@foo.bar
    fred&[email protected]
    [email protected]
    "127.0.0.1"@[127.0.0.1]
    Use a real RFC822 compliant validator like Mail::RFC822::Address

  • brad (unregistered) in reply to kipthegreat

    The nesting of the if statements is actually for performance... VB is nice enough not to short circuit like that, which is a WTF itself.

    brad

  • (cs) in reply to loneprogrammer

    Here's a very simple email validation: ^.+?@.+\..+?$. It doesn't get much simpler.
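
For a feel of how permissive that pattern is, here is a small JavaScript sketch. The pattern is the one posted above; the name `simplePattern` and the test strings are assumptions made up for illustration.

```javascript
// Pattern from the post above: something, "@", something, ".", something.
const simplePattern = /^.+?@.+\..+?$/;

// Hypothetical inputs: a normal-looking address, one without a dot
// after the "@", and a string made mostly of "@" characters.
const simpleSamples = ["a@b.c", "a@b", "@@@.@"];

for (const input of simpleSamples) {
  console.log(JSON.stringify(input), simplePattern.test(input));
}
```

The pattern only insists on an "@" and a later dot with at least one character around each, so it catches gross typos while letting nearly everything else through.
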

  • (cs) in reply to brad

    Anonymous:
    The nesting of the if statements is actually for performance... VB is nice enough not to short circuit like that, which is a WTF itself.

    brad

    If I remember correctly, VB can short-circuit; it just does not do so by default.  I believe the operators are AndAlso and OrElse.

  • (cs)

    I'm beginning to think that Regular Expressions ought to be the very next thing taught to computer science students after the prerequisite Hello World app.

  • brad (unregistered) in reply to DrJames

    AndAlso and OrElse weren't introduced until VB.NET, and that appears to be old-school VB/ASP (prolly ASP)

    brad.

  • Anonymous (unregistered) in reply to DrJames

    I like how none of his variables are typed.

    (AndAlso and OrElse are only available in VB.NET; my guess is that this is VB6.)

  • (cs) in reply to loneprogrammer
    loneprogrammer:
    Anonymous:
    Wow!

    He must be paid per line b/c I tend to do something like this:

      regEx.Pattern = "^[a-zA-Z][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$"

    Stop doing that.  You are not accepting these valid addresses:
    abigail @example.com
    *@example.net
    "\""@foo.bar
    fred&[email protected]
    [email protected]
    "127.0.0.1"@[127.0.0.1]
    Use a real RFC822 compliant validator like Mail::RFC822::Address



    I use something like this on my own website (I am reconstructing this from memory.. it's something like this though..)-
    /^[a-z0-9_]+(-[a-z0-9_]+)*@[a-z0-9]+(-[a-z0-9]+)*(\.[a-z0-9]+(-[a-z0-9]+)*)*\.[a-z]{2,3}$/i

    Maybe everyone can tell me how wtf'ed it is... it's a personal site, so it's not like I have to worry about losing money if valid e-mail addresses are rejected.
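
One way to answer that question is simply to run the pattern. The JavaScript sketch below uses the pattern exactly as posted; the name `kipPattern` and the sample addresses are hypothetical, picked to probe the local part and the final `{2,3}` label.

```javascript
// Pattern as posted above (its author reconstructed it from memory).
const kipPattern =
  /^[a-z0-9_]+(-[a-z0-9_]+)*@[a-z0-9]+(-[a-z0-9]+)*(\.[a-z0-9]+(-[a-z0-9]+)*)*\.[a-z]{2,3}$/i;

// Hypothetical sample addresses.
const kipSamples = [
  "kip@example.com",        // plain address
  "first.last@example.com", // dot in the local part
  "kip+wtf@example.com",    // plus sign in the local part
  "kip@example.info",       // four-letter top-level domain
];

for (const address of kipSamples) {
  console.log(address, kipPattern.test(address));
}
```

The local part allows only letters, digits, underscores, and interior hyphens, so dots and plus signs are rejected, and the final label is capped at three letters.
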
  • David Wolever (unregistered) in reply to brad

    Because I'm sure performance was the first thing on this guy's mind ;)

  • (cs)

    Function isUserNameOk(strInput)
      Dim intIn
      intIn = InStr(strInput,"@")
      If intIn = 0 Then
        isUserNameOk = False
      Else  
        isUserNameOk = True
      End If
    End Function

    Hmmmm ... so, according to the above function, @jeff.com is a valid email address?  (Instr(..) returns 1, which is not zero)

    The whole thing is pretty horrible; even if you don't know reg expressions, don't have them available, or don't want to use them, there is no excuse for writing sloppy string checking code like this.
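
A stricter version of that check is short. This is a minimal JavaScript sketch, not the original author's code: it reuses the name `isUserNameOk` for comparison only, and it relies on `indexOf` being 0-based, whereas VB's InStr is 1-based.

```javascript
// Hypothetical stricter variant: the "@" must exist and must not be
// the first character, so there is at least one character before it.
function isUserNameOk(input) {
  const atIndex = input.indexOf("@"); // 0-based; -1 when "@" is absent
  return atIndex > 0;
}

console.log(isUserNameOk("@jeff.com"));
console.log(isUserNameOk("jeff@jeff.com"));
```

The first call reports false because the "@" sits at index 0, which the posted function would have accepted.
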

     

  • (cs)
    Alex Papadimoulis:
    Function isEmail(ByRef Invalue)
    Dim valueStr
    valueStr = trim(CStr(Invalue))
    If (Not(detectedEmailBadCharacters(valueStr))) Then


    I know, he forgot to write:
      If (Not(IsTrue(detectedEmailBadCharacters(valueStr)))) Then
    

  • (cs) in reply to nordyj2001
    nordyj2001:
    I'm beginning to think that Regular Expressions ought to be the very next thing taught to computer science students after the prerequisite Hello World app.


    That'd be fine with me.  I was at least 3 months or so out of college before I ever even heard the term "RegEx."  Granted, I minored rather than majored in computer science.


  • (cs)

    Wow. Lack of regular expressions is only the tip of the iceberg. Even if there was no such thing as RegEx there's plenty of WTFs to be found here. For example, witness detectDoubleDotInARow, detectDoubleAtSimbol [sic]: 20 lines of code to do

    If InStr(strInput, "..") > 0 Or InStr(strInput, "@@") > 0 Then Return True

     

     

  • Joe (unregistered)

    Since there seems to be a large number of people that write email validators in this thread let me say this really loud:

    The plus sign ('+') is a valid character in email addresses!

    In fact just about every ASCII character is valid. 

    Read the RFC.  It's important!  +, %, $, !, ., etc 

    Gmail has this really neat feature where you can go [email protected] and it goes to "myname" and you can filter it for "dailywtf" and automatically give it a tag or trash it (good for killing spam).
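
The tagging trick described above can be sketched in a few lines of JavaScript. The function name `splitSubaddress` and its return shape are invented for illustration, and how a given mail provider actually treats the tag is provider-specific.

```javascript
// Splits a local part such as "myname+dailywtf" into the mailbox name
// and the tag after the first "+". Illustrative helper only.
function splitSubaddress(localPart) {
  const plusIndex = localPart.indexOf("+");
  if (plusIndex === -1) {
    return { mailbox: localPart, tag: null };
  }
  return {
    mailbox: localPart.slice(0, plusIndex),
    tag: localPart.slice(plusIndex + 1),
  };
}

console.log(splitSubaddress("myname+dailywtf"));
```

A validator that rejects "+" makes this kind of filtering impossible for the user.
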

  • (cs)

    Fantastic.  After all that, there are still countless ways to break the damn thing... when you start going down this path writing code, shouldn't you start wondering, "Hmm, is this how everyone has to do it?!?  This really sucks."

  • (cs) in reply to Joe
    Anonymous:

    Since there seems to be a large number of people that write email validators in this thread let me say this really loud:

    The plus sign ('+') is a valid character in email addresses!

    In fact just about every ASCII character is valid. 

    Read the RFC.  It's important!  +, %, $, !, ., etc 

    Gmail has this really neat feature where you can go [email protected] and it goes to "myname" and you can filter it for "dailywtf" and automatically give it a tag or trash it (good for killing spam).

    This actually isn't unique to gmail. It's an old trick for mailer daemons.

    But yes, it's surprising how many web forms reject perfectly valid email addresses. Especially when code exists on most every web language specifically to validate email addresses using fast functionality.

    Rolling your own email validation is a WTF all by itself.

     

  • some1 (unregistered) in reply to ItsAllGeekToMe

    I want to thank thedailywtf for providing us with clever techniques, tricks and code samples over the years. I am going to make sure that this advanced email validation algorithm will be implemented in our production system. Thank you thedailyWTF!!!!

  • Kelly (unregistered)

    bad_chars = "!$%^""&*()+={[}]:;'~#<,>?/|\ " 'string of Nasty characters to be checked for.
    This is the best part.  As if the variable name bad_chars doesn't say exactly what the comment says.  A WTF I see every day :)
  • (cs)

    FWIW, here's what I use:

    // is_valid_email
    // argument: email
    //  this is a string that may or may not be
    //  a syntactically valid email address
    // return value
    //  returns true if the email is syntactically valid
    //  returns false if the email is not syntactically valid

    function is_valid_email(email)
    {    var syntax_ok;
        var at_position = email.indexOf("@");

        if (at_position == -1)
        {    syntax_ok = false;
        }
        else
        {    var local_part = email.substring(0, at_position);
            var domain_part = email.substring(at_position + 1, email.length);

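            var local_part_pattern =
                /^[a-zA-Z0-9!#$%&'*+\-\/=?^_`{|}~]+(\.[a-zA-Z0-9!#$%&'*+\-\/=?^_`{|}~]+)*$/;
            // Domain labels: alphanumeric at each end, hyphens allowed inside
            // (assumed form; the posted pattern lost its brackets).
            var domain_part_pattern =
                /^[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?(\.[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?)+$/;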

            if (    local_part.match(local_part_pattern)
                && domain_part.match(domain_part_pattern)
            )
            {    syntax_ok = true;
            } else
            {    syntax_ok = false;
            }
        }

        return syntax_ok;
    }

  • Enough Already (unregistered) in reply to stevekj

    ALRIGHT. WE GET IT. STOP USING THIS 'JOKE' FOR EVERY WTF.

  • Enough Already (unregistered) in reply to stevekj

    Some douche wrote:

    "I know, he forgot to write: If (Not(IsTrue(detectedEmailBadCharacters(valueStr)))) Then"

    ALRIGHT. WE GET IT. STOP USING THIS 'JOKE' FOR EVERY WTF.

  • Fleagle (unregistered) in reply to A Wizard A True Star
    A Wizard A True Star:

    Wow. Lack of regular expressions is only the tip of the iceberg. Even if there was no such thing as RegEx there's plenty of WTFs to be found here. For example, witness detectDoubleDotInARow, detectDoubleAtSimbol [sic]: 20 lines of code to do

    If InStr(strInput, "..") > 0 Or InStr(strInput, "@@") > 0 Then Return True



    Actually, detectDoubleAtSimbol just looks for two at sImbols in the string, not two in a row. At least the guy got the name of this unnecessary function correct. Well, not counting spelling.
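
The difference between the two checks is easy to see side by side. This JavaScript sketch uses two hypothetical helper names, `hasMoreThanOneAt` and `hasAdjacentAts`; neither appears in the original code.

```javascript
// True when the string contains more than one "@" anywhere, which is
// what detectDoubleAtSimbol effectively tests.
function hasMoreThanOneAt(input) {
  return input.split("@").length - 1 > 1;
}

// True only when two "@" characters are adjacent, which is all the
// quoted InStr one-liner would catch.
function hasAdjacentAts(input) {
  return input.includes("@@");
}

console.log(hasMoreThanOneAt("a@b@c"), hasAdjacentAts("a@b@c"));
```

For "a@b@c" the first check fires and the second does not, so the one-liner is not a drop-in replacement.
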

  • (cs)

    He should have had each of the functions he calls from the if statements throw an exception.  That way he could catch them instead of letting them return a possibly troublesome "false" value.

  • (cs) in reply to JohnO

    JohnO:
    This is certainly no shining example of good code but these past two WTFs have really just been bad code IMHO.  I guess there really are a finite number of WTF patterns out there and we're getting close to having seen them all.

    Or so you think.  There is a saying that goes "As soon as you think you've made something idiot-proof someone comes along and makes a better idiot"

  • llxx (unregistered)

       The nesting is awful. Ignoring all those WTF'd functions, the best way would be to invert some and AND them together...

  • Paul O (unregistered) in reply to RevMike

    Maybe the obscure reference to a music professor at the University of Southern North Dakota at Hoople (google if necessary) is what makes RFC822 so difficult for folks to understand.

    But I would expect any code to acknowledge the requirement(s) - either RFC822 or exceptions (or acceptable email forms).

    As for "validating" email addresses, why bother?  "[email protected]" is perfectly valid, syntactically.  But it's no more effective at delivering mail than anything these "validators" might reject.

  • llxx (unregistered) in reply to llxx

    I'd say leave to the user to put a valid e-mail in and just check it with a tiny regex like

    +@+.+
  • (cs) in reply to Otto
    Otto:
    Rolling your own email validation is a WTF all by itself.
    When confronted with existing validators that are as broken as the above-cited ones, reading the RFC and building a compliant validator is not a WTF.  Rolling your own email validation based on what you've seen in normal use usually is.
  • (cs) in reply to einstruzende

    einstruzende:
    I've met a bunch of "programmers" who don't even know what a Regular Expression is.  Functions like this one don't surprise me in the least.

    And how, brother. Several years ago, my wife was working on an ASP project with a team that was... let's say, somewhat out of their league. After her workday got to be twelve to fifteen hours long, I got sick and tired of being without my wife and informed them that I'd be joining their team until the project was under control. (Note: this was the first ASP project I had ever worked on.) Every morning, I'd go to my day job, knock off around 5, and head over to my wife's office to work for them for an additional four or five hours.

    After replacing  two convoluted functions similar to these with a pair of regular expressions (one for e-mail addresses, one for phone numbers), I got a phone call from my wife, who told me the project manager decided that he wanted to change the format of the phone number, and promptly broke my validation code. When I checked, it appeared that he'd attempted to convert the regular expression back into some kind of awful hybrid of regexp and an "iterate over the string one character at a time" beast that... I don't know, I blocked it out. But I fixed it, and left strict instructions that the PM was not to touch my code again.

    As a side note: it can be pretty liberating to work for a guy who can't fire you.

  • (cs) in reply to llxx
    Anonymous:
    +@+.+


    I'm assuming that by "+" you meant something representing one or more characters, rather than the more standard ".+".

    a@b is a perfectly valid email address in certain contexts.  Granted, on the Internet at large it won't work, but if you craft an application that rejects it when you know that your SMTP server will be able to handle it, you've tied the hands of one of your users for no reason.
  • (cs) in reply to sammybaby
    sammybaby:

    ...After replacing  two convoluted functions similar to these with a pair of regular expressions (one for e-mail addresses, one for phone numbers), ...

    As a side note: it was acceptable in this situation to not accept the whole range of addresses described by RFC 822, so the regular expression wasn't the type of thing likely to make your eyeballs fall out.

  • Andir (unregistered) in reply to sammybaby

    To the guys AND'ing...(and myself)

    I actually thought this at first:

    Function isEmail(ByVal Invalue as String) as Boolean
      Invalue  = trim(Invalue) 
      isEmail = (Not(detectedEmailBadCharacters(Invalue))) AND _
                (Not(detectDoubleDotInARow(Invalue))) AND _
                (Not(detectDoubleAtSimbol(Invalue))) AND _
                (isUserNameOk(Invalue)) AND _
                (isExtensionOK(Invalue)) AND _
                (isHostNameOK(Invalue))
    End Function
    Function isHostNameOK(ByVal strInput as String) as Boolean
      isHostNameOK = (InStr(strInput," ") = 0) AND _
                     (InStr(strInput,"@") <> 1) AND _
                     (InStr((Len(strInput)-1),strInput,"@") = 0) AND _
                     (InStr((Len(strInput)-1),strInput,".") = 0) AND _
                     (InStr((Len(strInput)-1),strInput,"_") = 0)
    End Function

    Then I thought... well shit, which is faster?  So I ran 10-second iterations of each method and found the nested If statements were actually 10% faster, with a +/- 0.1% error margin.  I completely ignored fast code for a clean look.  I forgot that VB evaluates EVERY And condition before making up its mind.  You can test this by changing it to:

    Function isEmail(ByVal Invalue as String) as Boolean
      Invalue  = trim(Invalue) 
      isEmail = (Not(detectedEmailBadCharacters(Invalue))) AND _
                (Not(detectDoubleDotInARow(Invalue))) AND _
                (Not(detectDoubleAtSimbol(Invalue))) AND _
                (isUserNameOk(Invalue)) AND _
                (isExtensionOK(Invalue)) AND _
                (isHostNameOK(Invalue)) AND msgbox("test")
    End Function

    All-in-all, it's pretty bad the way it is.  I did like the spelling of symbol.
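
The evaluation-order point can be illustrated without VB. In this JavaScript sketch, `&&` plays the role of a short-circuit operator and bitwise `&` on booleans stands in for classic VB's And, which evaluates both operands. The helper `check` and the counter are hypothetical stand-ins for the validation functions.

```javascript
let calls = 0;

// Hypothetical stand-in for one validation helper; it records each call.
function check(result) {
  calls += 1;
  return result;
}

// Short-circuit AND: the second check is skipped once the first fails.
calls = 0;
const shortCircuit = check(false) && check(true);
const shortCircuitCalls = calls;

// Eager AND (bitwise & on booleans): both operands are always evaluated.
calls = 0;
const eager = Boolean(check(false) & check(true));
const eagerCalls = calls;

console.log(shortCircuit, shortCircuitCalls, eager, eagerCalls);
```

Both forms produce the same result; only the number of helper calls differs, which is where the nested version gets its speed advantage.
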

  • awdball (unregistered)

    My favorite line in this whole mess is in detectedEmailBadCharacters.

       counter = 5

  • (cs)

    I once got this RegExp for email validation from some semi-official XSchema resource, in an attempt to define an email schema type.
    ^[A-Za-z0-9!#-'*+-/=?^_\{-~]+(\.[A-Za-z0-9!#-'\*\+\-/=\?\^_{-~]+)@[A-Za-z0-9!#-'*+-/=?^_\{-~]+(\.[A-Za-z0-9!#-'\*\+\-/=\?\^_{-~]+)$

    Not having read the email RFC, I don't know exactly how compliant this is, but I believe it handles every e-mail address that does not contain outlandish features such as comments.

    To the one who asked, why bother to validate?
    As a convenience. You do not validate e-mail addresses to prevent fraudulent addresses; that's what confirmation mails are for. You validate them to catch typos. You validate them not to prevent some H4X0R from abusing your system, but to offer some assistance to the poor sod who has no clue about computers and can't type, either.

  • (cs) in reply to Enough Already
    Anonymous:
    Some douche wrote:

    "I know, he forgot to write: If (Not(IsTrue(detectedEmailBadCharacters(valueStr)))) Then"

    ALRIGHT. WE GET IT. STOP USING THIS 'JOKE' FOR EVERY WTF.


    A brillant observation!

Leave a comment on “There's More Than One Way To Validate An Email”
