• div0 (unregistered) in reply to RayMarron

    This should be decided at random because there should be no bias.

    BTW, real email address verification using regexes works like this: http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html (yes, you probably have expected this coming)

  • (cs)

    I've found the problem. It should have been:

    Dim ch, there_is_an_at_symbol, i

    not

    Dim ch, there_is_a_at_simbol, i

    It would have worked much better if s/he could spell.

    I am he of whom enquiries may be made…

  • (cs) in reply to asdf
    Anonymous:
    Word to the wise is that regular expressions and email addresses don't mix all that well.

    http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html


    True, but this guy could accomplish everything he's doing with just a few regular expressions. (Theoretically, he could probably do it all with just one - I don't know how powerful VB6's engine is - but that would probably make it hard-to-read.) The bigger issue is that ad-hoc validation of e-mail addresses - whether by using regexes, by emulating finite state machines, or by fucking the what - doesn't tend to be very tolerant. But if you're going to do it, you should use regexes.
  • A chicken passeth by (unregistered) in reply to Ruakh

    To tell you the truth, I don't like regexes, 'cos they're not very maintainable and are only readable by experienced programmers (gotta think of the poor guy who takes over my code).

    That said, whoever came up with this probably forgot all about InStr and Split. In lieu of regexes I find them quite useful for verifying anything with a given set pattern of symbols.

  • A chicken passeth by (unregistered) in reply to A chicken passeth by

    Ok, he did use InStr. My bad. -.-

  • andrey (unregistered) in reply to einstruzende
    einstruzende:
    I've met a bunch of "programmers" who don't even know what a Regular Expression is.  Functions like this one don't surprise me in the least.

    I'm a CS student, and I know that I went through several programming courses before anyone even mentioned regular expressions.  IMHO, this should be taught in the very first programming class a student attends, not a year or two down the road.
  • (cs)

    Unreal.

    Here are a couple I haven't seen mentioned yet:

    Alex Papadimoulis:
    Dim bad_chars, foundOne, counter, strIn, i, ch, a, bad_char


    Ouch, "bad_chars" and "bad_char" just beg to be confused. Clearly it's a good thing to run with Option Explicit and then give your variables similar names. (I can tell that Option Explicit is somewhere at the top by the way the Dim statements list the variables in the order they are used in the code and lack any type -- was probably added in as an afterthought because he kept tripping over his lame variable names.)

    Also, it looks like some of the sub-validation routines (or sub-par validation routines, as it were) are passed a reference to the object, while others are passed the value. I'm not a pro coder (sysadmin hat), so would there be any justifiable reason for doing this in some cases and not in others? (And in general it seems kind of stupid to pass a reference to a string object to a validation routine -- maybe he intends to add some code to "fix" invalid addresses instead of just returning false?)
  • llxx (unregistered) in reply to Andir
    Anonymous:

    found the nested if statements were actually 10% faster.  +/- 0.1%  error margin.  I completely ignored fast code for a clean look.  I forgot that VB evaluates EVERY AND condition before making up its mind.

    Then isn't that a WTF in the compiler itself? Or does VB not allow short-circuit evaluation? I don't know much since I never really use VB much at all...

  • (cs) in reply to asdf
    Anonymous:
    Word to the wise is that regular expressions and email addresses don't mix all that well.

    http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html

    This is for RFC822 validation, which yields addresses much more complicated than 99.9% of the mail validators need.

    DrJames:

    Here's a very simple email validation: ^.+?@.+\..+?$. It doesn't get much simpler.

    Allows you to pass many things you're not supposed to, like "gibberish@@@foo@.;bar"
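
    To make that concrete, here is a minimal sketch in Python (my choice of language, not the original poster's; the `accepts` helper is a made-up name) that runs the quoted pattern unchanged against the gibberish string. It assumes Python's `re` treats this particular pattern the same way the engine in question would.

```python
import re

# The pattern quoted above, unchanged.
SIMPLE = re.compile(r"^.+?@.+\..+?$")


def accepts(address: str) -> bool:
    """Return True if the quoted pattern matches the string."""
    return SIMPLE.match(address) is not None


for candidate in ("gibberish@@@foo@.;bar", "user@example.com", "no-at-sign.example.com"):
    print(candidate, accepts(candidate))
```

    The pattern only demands "something, an @, something, a dot, something", so extra @ signs and punctuation sail through.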

    Anonymous:

    Then I thought...well shit, which is faster, so I ran 10 second iterations of each method and found the nested if statements were actually 10% faster.  +/- 0.1%  error margin.  I completely ignored fast code for a clean look.  I forgot that VB evaluates EVERY AND condition before making up its mind.

    Ya, and you have to use a specific AND operator for VB to short-circuit, probably 'cause some MS monkey considered that making the basic AND short-circuiting would break the existing code and unleash hell on earth (he probably forgot that VB had been in existence for some years...)

  • (cs) in reply to Fleagle
    Anonymous:
    Actually, detectDoubleAtSimbol just looks for two at sImbols in the string, not two in a row. At least the guy got the name of this unnecessary function correct. Well, not counting spelling.

    That'd be something like

    If InStr(strInput,"@") == InStrR(strInput,"@") Then Return True

    (please correct me; it's a long time since I've touched VB, and I can't remember what reverse Instr is called - I think it has a different argument order to the normal one too).
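
    The same first-position-equals-last-position idea, sketched in Python rather than VB. Here `str.find` and `str.rfind` stand in for the forward and reverse InStr calls; the function name is hypothetical, and the extra "not found" check is my own addition, since both searches return the same value when there is no @ at all.

```python
def has_exactly_one_at(address: str) -> bool:
    """True when the first and last "@" are the same character."""
    first = address.find("@")   # forward search, -1 if absent
    last = address.rfind("@")   # reverse search, -1 if absent
    # Without the -1 check, a string with no "@" would also compare equal.
    return first != -1 and first == last


print(has_exactly_one_at("user@example.com"))
print(has_exactly_one_at("user@@example.com"))
```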

  • A chicken passeth by (unregistered) in reply to makomk

    inStrRev.


    Back on topic: What about using this instead?

    *note: the following code is pseudo and may not work on your compiler without tweaking.


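    Dim isValid = True
    Dim iCharArr = Array(",", "|", ">", "etc", "etc", "...")
    Dim checkStr() As String

    ' Array() is zero-based by default, so start at LBound to include the first entry
    For counter = LBound(iCharArr) To UBound(iCharArr)
        checkStr = Split(<insert Email here>, iCharArr(counter))

        ' More than one piece means the delimiter occurs somewhere in the string
        If (UBound(checkStr) > 0) Then
            isValid = False
            Exit For
        End If
    Next counter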

    Yes, you'll still have to check for the presence of @ and . in other places (they use different logic, UBound(checkStr) = 1...).
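
    For anyone who wants to try the split-and-count idea without a VB compiler handy, here is a rough Python equivalent. The character list and both function names are made up for illustration; `len(pieces) > 1` plays the role of `UBound(checkStr) > 0`, and "exactly two pieces" corresponds to `UBound(checkStr) = 1`.

```python
# Hypothetical list of unwanted characters; the original list is elided.
BAD_CHARS = [",", "|", ">", "<", ";", " "]


def has_bad_chars(address: str, bad_chars=BAD_CHARS) -> bool:
    """True if splitting on any bad character yields more than one piece."""
    for ch in bad_chars:
        if len(address.split(ch)) > 1:
            return True
    return False


def has_single_at(address: str) -> bool:
    """Exactly one "@" splits the string into exactly two pieces."""
    return len(address.split("@")) == 2


print(has_bad_chars("user,name@example.com"), has_single_at("user@example.com"))
```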

  • Adriano (unregistered) in reply to andrey
    I'm a CS student, and I know that I went through several programming courses before anyone even mentioned regular expressions.  IMHO, this should be taught in the very first programming class a student attends, not a year or two down the road.
    If you add up everything that should be taught in the very first programming class a student attends, you get 2 or 3 years' worth of lessons.

  • (cs)

    This is mostly just a really bad case of “I don’t understand this boolean stuff.” The string processing ineptitude is just minor, in comparison.

    FWIW, almost every single hand-rolled email validation routine out there is really bad (although usually they’re not WTFs). Email validation is very hard stuff, much less innocent a problem than it looks. Find a library that offers you functions to do it correctly – in Perl-land, that would be Email::Valid, f.ex.

  • Sean (unregistered)

    It's funny how this site mostly shows horrible .NET/VBScript/C#/etc code.

    Just more proof that people who use .NET and other proprietary technologies, have NO CLUE of what they are doing. =p

  • Thomas Magle Brodersen (unregistered)

    I knew a guy who had this e-mail address: p@dk (he worked for the dk top level domain...)
    He never could get his address to validate :-)

  • (cs) in reply to A chicken passeth by
    Anonymous:
    To tell you the truth, I don't like regexes, 'cos they're not very maintainable and are only readable by experienced programmers

    The biggest problem with regular expressions is that people write them without caring about future readers.

    For everything but the simplest regex, any decent programmer should use IgnorePatternWhitespace. It performs two vital functions. It allows comments to be written inline and it also allows the regex to be formatted in a readable manner.

    int Fact(int a){if(a==0)return 1;else return a*Fact(a-1);}

    If a programmer formatted code like the above it would be considered a big WTF, but somehow it becomes acceptable when using regexes.
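
    As an illustration, Python's `re.VERBOSE` flag does the same job as the option mentioned above: whitespace in the pattern is ignored and `#` starts a comment. The pattern below is a deliberately simplistic "something@label.label" shape chosen for the example, not a real address validator, and both names are hypothetical.

```python
import re

# One line, no comments: the style being complained about.
COMPACT = re.compile(r"^[^@\s]+@[^@\s.]+(?:\.[^@\s.]+)*$")

# The same pattern, laid out and commented.
READABLE = re.compile(
    r"""
    ^
    [^@\s]+              # local part: anything except "@" and whitespace
    @                    # the separator
    [^@\s.]+             # first domain label
    (?: \. [^@\s.]+ )*   # zero or more further labels, each after a dot
    $
    """,
    re.VERBOSE,
)

for candidate in ("p@dk", "user@example.com", "a@@b", "user@example..com"):
    same = bool(COMPACT.match(candidate)) == bool(READABLE.match(candidate))
    print(candidate, bool(READABLE.match(candidate)), same)
```

    Both objects describe the same language; only the second one can be reviewed without a decoder ring.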

    It doesn't help either that today's editors aren't equipped to handle regex syntax. Even regex editors usually don't provide any formatting capabilities and instead focus almost exclusively on writing the regex instead of making it readable.

  • (cs) in reply to Mutt
    Anonymous:
    So the WTF is that he misspelled Symbol, right?


    No, that's the cherry on top.

  • (cs) in reply to DrJames
    DrJames:

    If I remember correctly VB can short circuit it just does not do this by default.  I believe the operators are: AndAlso and OrElse



    AndAlso and OrElse are from Visual Fred. They don't have anything to do with Visual Basic.
  • (cs) in reply to llxx
    Anonymous:

    I'd say leave to the user to put a valid e-mail in and just check it with a tiny regex like

    +@+.+


    That's typically written as .+@.+\..+

  • (cs) in reply to awdball
    Anonymous:

    My favorite line in this whole mess is in detectedEmailBadCharacters.

       counter = 5



    You never know when you'll need a counter initialized to 5. When the day finally comes, it's good to already have one lying around.

  • (cs) in reply to Alexis de Torquemada
    Alexis de Torquemada:
    DrJames:

    If I remember correctly VB can short circuit it just does not do this by default.  I believe the operators are: AndAlso and OrElse



    AndAlso and OrElse are from Visual Fred. They don't have anything to do with Visual Basic.

    Check out http://visualbasic.about.com/od/usingvbnet/l/bldykvbnetlogop.htm

    But please don't ask me what they were smoking.

    All this regex validation is pure wankery. The only thing you really need is to make sure there's just one @, and if  you really care, use javascript to attempt to resolve the domain. Otherwise you either have to use the 100-line regex from hell or accept that you will leave out some addresses. (Although a number of mail servers, mailers, and clients, to say nothing of domains, will throw a fit if you try to use an otherwise "valid" rfc822 address.) Especially since normal regexes are almost guaranteed to ignore international characters. [^@]+@[^@]+ will let all valid rfc822 addresses through even if it does let through others.

    If it's so hard and so unrewarding, just DON'T BOTHER.
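
    For reference, the "just make sure there's one @" check is about this much code. A sketch in Python with a made-up function name; it covers only the pattern quoted above and makes no attempt at the domain lookup.

```python
import re

# The permissive pattern from the post: something, one "@", something.
ONE_AT = re.compile(r"[^@]+@[^@]+")


def has_one_at(address: str) -> bool:
    """True if the whole string is non-empty text, a single "@", non-empty text."""
    return ONE_AT.fullmatch(address) is not None


print(has_one_at("p@dk"), has_one_at("a@@b"))
```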
  • (cs) in reply to asdf
    Anonymous:
    Word to the wise is that regular expressions and email addresses don't mix all that well.

    http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html


    It's immediately obvious that this regex is highly self-similar. For example, I (rather arbitrarily) singled out the substring

    (?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>

    and found 27 occurrences of it in the regex. So this regex was probably assembled from a simpler set of much more simple recurring regexes. It's surely entertaining to put it all in one 6343 character monster and post that to a website, but a full-fledged regular grammar shouldn't be hard to understand at all and would be a good deal more maintainable.
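
    A small sketch of what "assembled from simpler recurring regexes" can look like, in Python. The building blocks here (`ATOM`, `DOT_ATOM`, `DOMAIN_LITERAL` and so on) are hypothetical and heavily simplified; they are not the grammar behind the linked regex and do not implement RFC 822.

```python
import re

# Hypothetical, simplified building blocks; NOT the real RFC 822 grammar.
ATOM = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+"
DOT_ATOM = rf"{ATOM}(?:\.{ATOM})*"          # atoms joined by single dots
DOMAIN_LITERAL = r"\[[0-9.]+\]"             # e.g. [127.0.0.1]
DOMAIN = rf"(?:{DOT_ATOM}|{DOMAIN_LITERAL})"
ADDR_SPEC = re.compile(rf"{DOT_ATOM}@{DOMAIN}")


def matches(address: str) -> bool:
    """True if the whole string fits the simplified addr-spec above."""
    return ADDR_SPEC.fullmatch(address) is not None


# The expanded pattern repeats the same fragment, just as observed above.
print(len(ADDR_SPEC.pattern), ADDR_SPEC.pattern.count(ATOM))
print(matches("*@example.net"), matches("a..b@example.com"))
```

    The expanded string is what ends up on a web page; the five named pieces are what a maintainer would actually want to read.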
  • Paul O (unregistered) in reply to Alexis de Torquemada

    Ah, yes.  Always entertaining when someone doesn't bother to actually download and read the source code, but thinks instead that they can read the encoding of the same.

    Here's a hint:  download the file and unzip it.

  • Adam Heath (unregistered)

    Use a regex.

    $RFC822PAT = <<'EOF';
    [\040\t](?:([^\\x80-\xff\n\015()](?:(?:\[^\x80-\xff]|([^\\x80-<br> xff\n\015()](?:\[^\x80-\xff][^\\x80-\xff\n\015()])))[^\\x80-\xf
    f\n\015()]
    ))[\040\t])(?:(?:[^(\040)<>@,;:".\[]\000-\037\x80-\x
    ff]+(?![^(\040)<>@,;:".\[]\000-\037\x80-\xff])|"[^\\x80-\xff\n\015
    "]
    (?:\[^\x80-\xff][^\\x80-\xff\n\015"])")[\040\t](?:([^\\x80-<br> xff\n\015()](?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\015()](?:\[^\x80
    -\xff][^\\x80-\xff\n\015()]
    )))[^\\x80-\xff\n\015()]))[\040\t]
    )(?:.[\040\t](?:([^\\x80-\xff\n\015()](?:(?:\[^\x80-\xff]|([^<br> \x80-\xff\n\015()](?:\[^\x80-\xff][^\\x80-\xff\n\015()])))[^\<br> x80-\xff\n\015()]))[\040\t])(?:[^(\040)<>@,;:".\[]\000-\037\x8
    0-\xff]+(?![^(\040)<>@,;:".\[]\000-\037\x80-\xff])|"[^\\x80-\xff\n
    \015"](?:\[^\x80-\xff][^\\x80-\xff\n\015"])")[\040\t](?:([^\\x
    80-\xff\n\015()](?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\015()](?:\[^
    \x80-\xff][^\\x80-\xff\n\015()])))[^\\x80-\xff\n\015()]))[\040
    \t]))@[\040\t](?:([^\\x80-\xff\n\015()](?:(?:\[^\x80-\xff]|([
    ^\\x80-\xff\n\015()]
    (?:\[^\x80-\xff][^\\x80-\xff\n\015()])))[^<br> \x80-\xff\n\015()]))[\040\t])(?:[^(\040)<>@,;:".\[]\000-\037<br> x80-\xff]+(?![^(\040)<>@,;:".\[]\000-\037\x80-\xff])|[(?:[^\\x80-
    \xff\n\015[]]|\[^\x80-\xff])])[\040\t](?:([^\\x80-\xff\n\015()
    ](?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\015()](?:\[^\x80-\xff][^\<br> x80-\xff\n\015()])))[^\\x80-\xff\n\015()]))[\040\t])(?:.[\04
    0\t](?:([^\\x80-\xff\n\015()](?:(?:\[^\x80-\xff]|([^\\x80-\xff<br> n\015()](?:\[^\x80-\xff][^\\x80-\xff\n\015()])))[^\\x80-\xff\n<br> 015()]))[\040\t])(?:[^(\040)<>@,;:".\[]\000-\037\x80-\xff]+(?!
    [^(\040)<>@,;:".\[]\000-\037\x80-\xff])|[(?:[^\\x80-\xff\n\015[<br> ]]|\[^\x80-\xff])
    ])[\040\t](?:([^\\x80-\xff\n\015()](?:(?:\[^<br> x80-\xff]|([^\\x80-\xff\n\015()](?:\[^\x80-\xff][^\\x80-\xff\n\01
    5()]
    )))[^\\x80-\xff\n\015()]))[\040\t]))|(?:[^(\040)<>@,;:".
    \[]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\[]\000-\037\x80-\xff]
    )|"[^\\x80-\xff\n\015"](?:\[^\x80-\xff][^\\x80-\xff\n\015"])")[^
    ()<>@,;:".\[]\x80-\xff\000-\010\012-\037]
    (?:(?:([^\\x80-\xff\n\0
    15()](?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\015()](?:\[^\x80-\xff][
    ^\\x80-\xff\n\015()])))[^\\x80-\xff\n\015()]))|"[^\\x80-\xff<br> n\015"](?:\[^\x80-\xff][^\\x80-\xff\n\015"])")[^()<>@,;:".\[]<br> x80-\xff\000-\010\012-\037])<[\040\t](?:([^\\x80-\xff\n\015()](?
    :(?:\[^\x80-\xff]|([^\\x80-\xff\n\015()]
    (?:\[^\x80-\xff][^\\x80-
    \xff\n\015()])))[^\\x80-\xff\n\015()]))[\040\t])(?:@[\040\t]
    (?:([^\\x80-\xff\n\015()]
    (?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\015
    ()](?:\[^\x80-\xff][^\\x80-\xff\n\015()])))[^\\x80-\xff\n\015()
    ]
    ))[\040\t])(?:[^(\040)<>@,;:".\[]\000-\037\x80-\xff]+(?![^(\0
    40)<>@,;:".\[]\000-\037\x80-\xff])|[(?:[^\\x80-\xff\n\015[]]|\
    [^\x80-\xff])
    ])[\040\t](?:([^\\x80-\xff\n\015()](?:(?:\[^\x80-<br> xff]|([^\\x80-\xff\n\015()](?:\[^\x80-\xff][^\\x80-\xff\n\015()]
    )))[^\\x80-\xff\n\015()]))[\040\t])(?:.[\040\t](?:([^\\x80
    -\xff\n\015()](?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\015()](?:\[^\x
    80-\xff][^\\x80-\xff\n\015()])))[^\\x80-\xff\n\015()]))[\040\t
    ])(?:[^(\040)<>@,;:".\[]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\
    []\000-\037\x80-\xff])|[(?:[^\\x80-\xff\n\015[]]|\[^\x80-\xff])
    ])[\040\t](?:([^\\x80-\xff\n\015()](?:(?:\[^\x80-\xff]|([^\\x
    80-\xff\n\015()]
    (?:\[^\x80-\xff][^\\x80-\xff\n\015()])))[^\\x80
    -\xff\n\015()]))[\040\t]))(?:,[\040\t](?:([^\\x80-\xff\n\015(
    )](?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\015()](?:\[^\x80-\xff][^\
    \x80-\xff\n\015()])))[^\\x80-\xff\n\015()]))[\040\t])@[\040\t
    ](?:([^\\x80-\xff\n\015()](?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\0
    15()](?:\[^\x80-\xff][^\\x80-\xff\n\015()])))[^\\x80-\xff\n\015
    ()]
    ))[\040\t])(?:[^(\040)<>@,;:".\[]\000-\037\x80-\xff]+(?![^(
    \040)<>@,;:".\[]\000-\037\x80-\xff])|[(?:[^\\x80-\xff\n\015[]]|
    \[^\x80-\xff])
    ])[\040\t](?:([^\\x80-\xff\n\015()](?:(?:\[^\x80
    -\xff]|([^\\x80-\xff\n\015()](?:\[^\x80-\xff][^\\x80-\xff\n\015()
    ]
    )))[^\\x80-\xff\n\015()]))[\040\t])(?:.[\040\t](?:([^\\x
    80-\xff\n\015()](?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\015()](?:\[^
    \x80-\xff][^\\x80-\xff\n\015()])))[^\\x80-\xff\n\015()]))[\040
    \t])(?:[^(\040)<>@,;:".\[]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".
    \[]\000-\037\x80-\xff])|[(?:[^\\x80-\xff\n\015[]]|\[^\x80-\xff
    ])])[\040\t](?:([^\\x80-\xff\n\015()](?:(?:\[^\x80-\xff]|([^\
    \x80-\xff\n\015()]
    (?:\[^\x80-\xff][^\\x80-\xff\n\015()])))[^\\x
    80-\xff\n\015()]))[\040\t]))):[\040\t](?:([^\\x80-\xff\n\015
    ()]
    (?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\015()](?:\[^\x80-\xff][^<br> \x80-\xff\n\015()])))[^\\x80-\xff\n\015()]))[\040\t]))?(?:[^
    (\040)<>@,;:".\[]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\[]\000-
    \037\x80-\xff])|"[^\\x80-\xff\n\015"]
    (?:\[^\x80-\xff][^\\x80-\xff<br> n\015"])")[\040\t](?:([^\\x80-\xff\n\015()](?:(?:\[^\x80-\xff]|
    ([^\\x80-\xff\n\015()](?:\[^\x80-\xff][^\\x80-\xff\n\015()])))
    [^\\x80-\xff\n\015()]
    ))[\040\t])(?:.[\040\t](?:([^\\x80-\xff
    \n\015()](?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\015()](?:\[^\x80-\x
    ff][^\\x80-\xff\n\015()])))[^\\x80-\xff\n\015()]))[\040\t])(
    ?:[^(\040)<>@,;:".\[]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\[]<br> 000-\037\x80-\xff])|"[^\\x80-\xff\n\015"](?:\[^\x80-\xff][^\\x80-<br> xff\n\015"])")[\040\t](?:([^\\x80-\xff\n\015()](?:(?:\[^\x80-\x
    ff]|([^\\x80-\xff\n\015()]
    (?:\[^\x80-\xff][^\\x80-\xff\n\015()])
    ))[^\\x80-\xff\n\015()])
    )[\040\t]))@[\040\t](?:([^\\x80-\x
    ff\n\015()](?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\015()](?:\[^\x80-
    \xff][^\\x80-\xff\n\015()])))[^\\x80-\xff\n\015()]))[\040\t])
    (?:[^(\040)<>@,;:".\[]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\[<br> ]\000-\037\x80-\xff])|[(?:[^\\x80-\xff\n\015[]]|\[^\x80-\xff])]
    )[\040\t]
    (?:([^\\x80-\xff\n\015()](?:(?:\[^\x80-\xff]|([^\\x80-
    \xff\n\015()]
    (?:\[^\x80-\xff][^\\x80-\xff\n\015()])))[^\\x80-\x
    ff\n\015()]))[\040\t])(?:.[\040\t](?:([^\\x80-\xff\n\015()](
    ?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\015()](?:\[^\x80-\xff][^\\x80
    -\xff\n\015()]
    )))[^\\x80-\xff\n\015()]))[\040\t])(?:[^(\040)<
    >@,;:".\[]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\[]\000-\037\x8
    0-\xff])|[(?:[^\\x80-\xff\n\015[]]|\[^\x80-\xff])
    ])[\040\t](?:
    ([^\\x80-\xff\n\015()]
    (?:(?:\[^\x80-\xff]|([^\\x80-\xff\n\015()]
    (?:\[^\x80-\xff][^\\x80-\xff\n\015()])))[^\\x80-\xff\n\015()])
    )[\040\t]))>)
    EOF

    $RFC822PAT =~ s/\n//g;

  • Anonymous Coward (unregistered)

    the best regex email validator:

    function validate_email(){
     return (ereg('^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$', $_POST['email']));
    }

  • (cs)

    OK, what makes this soooo much worse is that the guy obviously does have some knowledge of regular expressions because he uses them in his functions. How can you not stop and think that perhaps something is wrong when you have to write the guts of 100 lines of code just to validate an email address!

  • (cs) in reply to Adam Heath
    Anonymous:
    Use a regex.


    We've seen this joke a couple hundred times now, thank you.

  • Malhar (unregistered) in reply to kipthegreat

    kipthegreat:

    At the very least, couldn't he have simplified this function to this, and gotten rid of all those nested if statements?

    Function isEmail(ByRef Invalue)
        Dim valueStr
        valueStr = trim(CStr(Invalue))
        isEmail = (Not(detectedEmailBadCharacters(valueStr))) And _
                  (Not(detectDoubleDotInARow(valueStr))) And _
                  (Not(detectDoubleAtSimbol(valueStr))) And _
                  (isUserNameOk(valueStr)) And _
                  (isExtensionOK(valueStr)) And _
                  (isHostNameOK(valueStr))
    End Function


     

    Remember, this is VB!! So can't use "And". "And" does not follow the rules of short-circuiting boolean conditions. You must use "AndAlso" or "OrElse" to achieve!!.

     

    - Malhar

  • (cs) in reply to Malhar

    When You'Re On Crack, It's Clear That Inventing Something Like "Option _
    ShortCircuit Off" Would Be A Total Travesty! Better To Just Invent Another _
    Boolean Operator (Hah, George Boole Was Stupid For Not Having Thought _
    Of Short-Circuiting!)

    (When I'm Nazi World Dictator, I'm Gonna Have My Stormtroopers Pile Up All _
    VB CD-ROMs And Set Fire To It. Can't Risk Shooting 'Em Into Space, Some _
    Pissed Off Aliens Might Come Destroy Earth As A Way Of Saying 'Thank You'.)

  • rp (unregistered) in reply to loneprogrammer
    loneprogrammer:
    Stop doing that.  You are not accepting these valid addresses:
    abigail @example.com
    *@example.net
    "\""@foo.bar
    fred&[email protected]
    [email protected]
    "127.0.0.1"@[127.0.0.1]
    Use a real RFC822 compliant validator like Mail::RFC822::Address

    I don't want to accept any of those, since when I grep text (such as a body of e-mail messages) for e-mail addresses, what I want to find is actual e-mail addresses, which any occurrences of such extremities are very unlikely to be. They are more likely to be quotes from the referenced article, for one thing.

    I submit that it would usually be wrong to recognise as an e-mail address any substring that matches the parsing rules in RFC 822.

  • (cs) in reply to masklinn

    masklinn:

    Allows you to pass many things you're not supposed to, like "gibberish@@@foo@.;bar"

    Anonymous:

    Then I thought...well shit, which is faster, so I ran 10 second iterations of each method and found the nested if statements were actually 10% faster.  +/- 0.1%  error margin.  I completely ignored fast code for a clean look.  I forgot that VB evaluates EVERY AND condition before making up its mind.

    Ya, and you have to use a specific AND operator for VB to short-circuit, probably 'cause some MS monkey considered that making the basic AND short-circuiting would break the existing code and unleash hell on earth (he probably forgot that VB had been in existence for some years...)

    I've always been of the opinion that email validators aren't overly useful in 90% of applications.
     
    I'm usually submitting my email address to gain access to something, wherein the confirmation that I've entered a valid email address is when I reply to an email sent to my submitted address. While email validators can ensure that I don't enter gibberish@@@foo@.;bar as an email address (I take it the forum software doesn't validate email addresses too hard either), they can't, without a lot of effort, correct the real problems, wherein a typo sees my validation email being sent to [email protected]. If I enter a clearly bollocks email, it's obvious I didn't want to receive the email.

    I'd also like to take this time to have a rant about people who ask me to type my email address twice to confirm. Surely, SURELY, they know that I'm going to copy and paste it??? It's a useless field!

    Actually, thinking of those 10% of useful cases, I can't think of any examples. Can anyone enlighten me?

  • (cs) in reply to nordyj2001

    nordyj2001:
    I'm beginning to think that Regular Expressions ought to be the very next thing taught to computer science students after the prerequisite Hello World app.

    yeah, go ahead and scare them away. try to explain how xml/xslt work to a newbie, that could be easier? at least it doesn't look so cryptic and understanding what goes on is a bit easier than figuring out a regular expression.

    after someone learns the basics of C++, Pascal and Java, perhaps the person will know enough about computers not to be scared too much.

  • (cs) in reply to Cyresse

    Good point masklinn. Actually the only way to validate an email is to send a confirmation message to it and receive a response. else any other "valid" email address may be invalid when you try to send a mail to it - like [email protected]
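
    A bare-bones sketch of that confirm-by-reply idea, in Python. Everything here is hypothetical (the class name, the in-memory storage, the token length), and actually sending the message is left out; the point is only that an address counts as valid once its owner presents the token that was mailed to it.

```python
import secrets


class PendingConfirmations:
    """Tracks addresses that have been mailed a token but not yet confirmed."""

    def __init__(self):
        self._pending = {}      # token -> address awaiting confirmation
        self.confirmed = set()  # addresses whose owner returned the token

    def start(self, address: str) -> str:
        """Create a one-time token; a real app would mail it to the address."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = address
        return token

    def confirm(self, token: str) -> bool:
        """Mark the matching address as confirmed; unknown tokens are rejected."""
        address = self._pending.pop(token, None)
        if address is None:
            return False
        self.confirmed.add(address)
        return True


registry = PendingConfirmations()
token = registry.start("user@example.com")
print(registry.confirm("not-the-token"), registry.confirm(token))
```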

     

    Although the code was actually full of WTF from the beginning to the end. Looks to me like a newbie with basic knowledge of visual basic and almost no knowledge of the libraries wrote this code. What's sad is that if this guy (or girl) spent 15 minutes with google he (or she) would find much better code with far less effort...

  • Welcome To The Machine (unregistered) in reply to Malhar
    Anonymous:

    kipthegreat:

    At the very least, couldn't he have simplified this function to this, and gotten rid of all those nested if statements?

    ...

    Remember, this is VB!! So can't use "And". "And" does not follow the rules of short-circuiting boolean conditions. You must use "AndAlso" or "OrElse" to achieve!!.

     

    - Malhar

    Sheesh - all of you guys keep banging on about how you can use AndAlso etc... that's only in VB.net - and those operators were added to allow short-circuiting without changing the current OR/AND functionality, this is because some people (a WTF in itself) require all parts of the IF statement to be executed regardless of their values, otherwise their code doesn't work.
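
    The difference is easy to see with operands that have side effects. The sketch below is Python, used purely as an analogy: `&` on two booleans evaluates both sides (like the eager And described above), while `and` stops as soon as the answer is known (like AndAlso). The helper names are made up.

```python
calls = []


def check(name: str, result: bool) -> bool:
    """A stand-in validation step that records that it was executed."""
    calls.append(name)
    return result


def eager_and():
    """Both operands run, even though the first one is already False."""
    calls.clear()
    outcome = check("first", False) & check("second", True)
    return outcome, list(calls)


def short_circuit_and():
    """The second operand is skipped once the first one is False."""
    calls.clear()
    outcome = check("first", False) and check("second", True)
    return outcome, list(calls)


print(eager_and())
print(short_circuit_and())
```

    Code that depends on the second check always running gets the same boolean result from both forms, but only the eager one performs its side effect.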

    What bemuses me is that the coder here knew how to use InStr and yet didn't use it for finding all those pesky double dots and ATs!

  • Lee (unregistered) in reply to CornedBee

    I like counter = 5, then never used again.  But better still foundOne=false, then set it to false each time round the loop, but we never set it to true and never test it anywhere? 

  • Kefer (unregistered) in reply to Enough Already

    What is this ´JOKE´ you´re speaking of?

    ;oP

  • zootm (unregistered) in reply to RevMike
    RevMike:
    http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html

    Software is acting funny for me.  :(

    Hahaha, I wanted to post that one :)
  • cdg (unregistered) in reply to Alexis de Torquemada
    Alexis de Torquemada:
    Anonymous:

    I'd say leave to the user to put a valid e-mail in and just check it with a tiny regex like

    +@+.+



    That's typically written as .+@.+\..+

    and is wrong, since blaah@au (example) should qualify :P

     

  • (cs)

    I also like how detectedEmailBadCharacters twice sets foundOne to False but never assigns anything else (or reads it again) before exiting. A bit like how isExtensionOK sets the there_is_a_dot function to false and never refers to it again.

  • Robert (unregistered) in reply to einstruzende

    Companies deserve what they get when they hire the cheapest programming talent they can find--and have poor interviewing processes.

     

  • c3o (unregistered) in reply to Robert

    Two people here have posted their own validation regexp containing .[a-z]{2,3}$
    Never heard of .info, .name, .aero or .museum, I take it?
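
    A quick check of that complaint, in Python. The pattern is copied from the `ereg` call posted earlier in the thread; it is run through Python's `re` module here on the assumption that this pattern behaves the same in both engines, and the `accepted` helper is a made-up name.

```python
import re

# Pattern copied from the ereg() call earlier in the thread.
EREG_STYLE = re.compile(
    r"^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$"
)


def accepted(address: str) -> bool:
    """True if the posted pattern accepts the address."""
    return EREG_STYLE.match(address) is not None


for candidate in ("user@example.com", "user@example.info", "user@example.museum", "p@dk"):
    print(candidate, accepted(candidate))
```

    Anything whose last label is longer than three letters is turned away, and so is the dotless address mentioned earlier in the thread.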

  • (cs) in reply to Paul O

    As for "validating" email addresses, why bother?  "[email protected]" is perfectly valid, syntactically.  But it's no more effective at delivering mail than anything these "validators" might reject.

    Yay! Finally somebody who thinks like I do on this subject. My email "validator" is two email address entry boxes and a note beneath which says something like "Please make sure you enter your email address correctly. If you don't, we won't be able to send you email."

    Fingers crossed that this posts correctly.

  • (cs) in reply to Garfunkalow

    Anonymous:
    regex anyone?
    that whole nasty thing can be shrunk to a nice little regex
    \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)* i think that would do it maybe not.. but still.... 

    Yay, another regular expression jockey. Look at me! I can write a regex!

    :|

    At any rate, your regex doesn't validate email addresses properly.

  • (cs) in reply to rogthefrog

    Another bit - validating email address syntax is pointless in web apps. When you say "hay your email address isn't valid, dude" people type in [email protected], so you get a syntactically valid email address that's still bad, and you miss a good opportunity to improve your post-hoc email filtering with one positively bad token.

    Everybody does post-process user data, right? Or does everybody use user-provided data as-is and then wonder whyTF their mass mailer is so goddamn slow?

  • (cs) in reply to rogthefrog

    One more thing. If I ever see [A-Za-z] again I will smack a random programmer upside the head. No wonder databases are full of crap data if nobody takes the time to lowercase data that should be in lowercase.

  • Reginald (unregistered) in reply to rogthefrog
    rogthefrog:
    One more thing. If I ever see [A-Za-z] again I will smack a random programmer upside the head.


    Then you'd better get to slappin'. Cause I'm guilty as charged!
  • (cs) in reply to Reginald

    Anonymous:
    rogthefrog:
    One more thing. If I ever see [A-Za-z] again I will smack a random programmer upside the head.


    Then you'd better get to slappin'. Cause I'm guilty as charged!

    Consider yourself SMACKED, smacked, sMaCkEd, totallySmacked, totally_smacked, and TotallySmacked.

  • Jeff (unregistered)

    Brilliant use of variants.  I can't think of a variable more suitable to store a boolean value.  My favorite routine, was probably the 'detectDouble*' routines.  I also enjoyed the fact that the 'foundOne' variable in detectbademailcharacters is never set to true.  There's nothing like seeing nested for loops, as well.  If only this developer could add another for loop, and put a database call of some sort into it, that didn't hit indexes.  Then it would be just about perfect.

    Thank you once again for the wonderful Monday reading!

  • (cs) in reply to Jeff

    The only thing this guy got right was to put each If statement on its own line. VB doesn't short-circuit, so he's at least saving some processing time by bailing early on the comparisons.

    That being said, it's probably one of my biggest pet peeves when VB coders don't explicitly declare data types on their variables, or the return types on functions. The guy obviously knows you can put more in the parameter list besides the variable names, he knows about ByRef. How about you add "As String" so we know what to expect?

    I skimmed through the responses, but I didn't notice anyone catching this particular WTF. He passes in the "Invalue" by reference, and then his first step is to make a copy of it in a locally-declared variable. Why bother?! Pass the variable in ByVal instead of ByRef, and you've saved yourself the trouble!

    One final note: You can save a few lines of code and potential indenting problems with If/Else/End If by simply setting isEmail to False at the very start, and only try to set the value if you're changing it to True.

Leave a comment on “There's More Than One Way To Validate An Email”
