• Validate This (unregistered) in reply to socknet
    socknet:
    jonnyq:
    One would hope that the person providing the email address would have access to it for confirmation...

    I mean seriously... two people this dumb?

    In my opinion, if you're just trying to prevent typing mistakes: anything, followed by an @, followed by anything, followed by a ., followed by anything. That's all I ever use. If you're trying to validate that an email address is real - send a confirmation email. Anything in between is prone to mistakes and probably isn't helping anyone.

    Please provide contact details for your legal representation:

    Name: _____ Phone: _____ Email: ______

    Please provide contact details for your IT Support team:

    Phone: _____ Email: ______

    Please provide your preferred email address to be created:

    Email: ______

    etc.

    There are MANY cases where you might ask someone to provide an email address which they may not have access to.

    Sending real emails to these addresses is pretty silly. 99.9% of the time, the simple validation will suffice. Whether or not you care about that missing 0.1%, and how you do so, would be completely dependent on context.

    So how are you validating that they're entering the correct name and phone number? Or are you assuming that any (insert country specific format) sequence of digits is the correct phone number for that person?

    In each of those cases the difference between entering a validly formatted but incorrect e-mail address and an invalidly formatted e-mail address is zilch. You still have garbage data either way.

  • (cs)

    All I can say is, these are both quite a bit less ugly than Perl's version: http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html

  • Yazeran (cs) in reply to Splognosticus
    Splognosticus:
    (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

    So easy a child could do it.

    Fail (according to the perl module Mail::RFC822::Address: regexp-based address validation):

    (?:(?:\r\n)?[ \t])(?:(?:(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t] )+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?: \r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:( ?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\0 31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)
    ](?:(?:\r\n)?[ \t])
    )(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+ (?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?: (?:\r\n)?[ \t])))|(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z |(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n) ?[ \t]))<(?:(?:\r\n)?[ \t])(?:@(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:
    r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n) ?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t] )))(?:,@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])* )(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t] )+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))) :(?:(?:\r\n)?[ \t]))?(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+ |\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r \n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?: \r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t ]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031 ]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)]( ?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(? :(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(? :\r\n)?[ \t])))>(?:(?:\r\n)?[ \t]))|(?:[^()<>@,;:\".[] \000-\031]+(?:(? :(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)? [ \t]))"(?:(?:\r\n)?[ \t])):(?:(?:\r\n)?[ \t])(?:(?:(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]| \.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<> @,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|" (?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t] )(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\ ".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(? 
:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[ ]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))|(?:[^()<>@,;:\".[] \000- \031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|( ?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))<(?:(?:\r\n)?[ \t])(?:@(?:[^()<>@,; :\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([ ^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\" .[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[
    ]\r\]|\.)](?:(?:\r\n)?[ \t])))(?:,@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".
    [] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]
    r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\] |\.)](?:(?:\r\n)?[ \t])))):(?:(?:\r\n)?[ \t]))?(?:[^()<>@,;:\".[] \0 00-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\ .|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@, ;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(? :[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t]) (?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\". []]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[ ^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[] ]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))>(?:(?:\r\n)?[ \t]))(?:,\s( ?:(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\ ".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:( ?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[ ["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t ])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t ])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(? :.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+| \Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))|(?: [^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[
    ]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))<(?:(?:\r\n) ?[ \t])(?:@(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[[" ()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n) ?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<> @,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))(?:,@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@, ;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t] )(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\ ".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))):(?:(?:\r\n)?[ \t]))? (?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\". []]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?: \r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[[ "()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]) ))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t]) +|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:
    .(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z |(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))>(?:( ?:\r\n)?[ \t]))))?;\s*)

    And even that is only with comments removed......

  • Rick (cs) in reply to Paul
    Paul:
    Try and enumerate a few, please. ...

    Why reinvent the wheel by doing some other stupid "validation" beforehand?

    ...

    In case you really don't know the answer, algorithmic validation can return a response in milliseconds without a context switch for the user. Full validation takes minutes.
  • redundantman (cs)

    The better way is to pop up not one, but TWO confirmation dialogs.

    #1 Are you sure that "hugmy@bum.com" is your e-mail address?

    #2 Are you REEAALLY sure?

    And then the user will be all like "zOMG I can't believe I typed it wrong" and they'll totally fix it.

    /true story

  • trtrwtf (unregistered) in reply to jonnyq
    jonnyq:
    N. Tufnel:
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    In what sense does that validate anything, if you don't have access to that address's mailbox?

    One would hope that the person providing the email address would have access to it for confirmation...

    I mean seriously... two people this dumb?

    In my opinion, if you're just trying to prevent typing mistakes: anything, followed by an @, followed by anything, followed by a ., followed by anything. That's all I ever use. If you're trying to validate that an email address is real - send a confirmation email. Anything in between is prone to mistakes and probably isn't helping anyone.

    I think there's some confusion about the meaning of "validate" here. Some people seem to think it means "make sure this email address is really the one that belongs to this person"; others believe (correctly) that it means "make sure that this email address is a well-formed address".

    The former case might be a common scenario, but when you "validate" something you're checking to see if it's valid - not whether it's correct.
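
    For what it's worth, the loose check jonnyq describes (anything, an @, anything, a ., anything) can be sketched in a few lines of Python. The names LOOSE_EMAIL and looks_like_email are made up for illustration, and the pattern is an assumption about what he means, not an RFC-conformant validator. It only tells you the input is shaped like an address, not that it is the right one.

```python
import re

# Loose well-formedness check: something, "@", something, ".", something.
# Illustration only; this is not the RFC grammar.
LOOSE_EMAIL = re.compile(r".+@.+\..+")


def looks_like_email(text: str) -> bool:
    """Return True if text has the shape anything@anything.anything."""
    return LOOSE_EMAIL.fullmatch(text) is not None
```

    Note that jeo.smith@foo.com passes just as happily as joe.smith@foo.com, which is exactly the valid-versus-correct distinction.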

  • trtrwtf (unregistered) in reply to Rick
    Rick:
    Paul:
    Try and enumerate a few, please. ...

    Why reinvent the wheel by doing some other stupid "validation" beforehand?

    ...

    In case you really don't know the answer, algorithmic validation can return a response in milliseconds without a context switch for the user. Full validation takes minutes.

    And if you happen to be doing something other than real-time interaction with a single user checking a single email address, validation by send-an-email-and-see-if-they-bother-to-reply is not going to do you a lot of good.

    But of course, nobody would ever have occasion to check a list of addresses, or anything silly like that.

  • socknet (unregistered) in reply to Validate This
    Validate This:
    So how are you validating that they're entering the correct name and phone number? Or are you assuming that any (insert country specific format) sequence of digits is _the_ correct phone number for that person?

    Not really relevant to this discussion on email validation, but perhaps there are common methods which people use for names and phone numbers. Feel free to google them if you are interested.

    Validate This:
    In each of those cases the difference between entering a validly formatted but incorrect e-mail address and an invalidly formatted e-mail address is zilch. You still have garbage data either way.

    Correct, but that doesn't really add anything to the conversation. As mentioned, most of the time you don't have to prove with absolute certainty that an email address is correct; being 'reasonably sure' is usually close enough (asking people to input an email address twice seems to be common nowadays and probably catches a lot of entry mistakes). When you do need to be 100% sure of the email address, that is the time to do things such as sending validation emails which require a response.

  • socknet (unregistered) in reply to redundantman
    redundantman:
    The better way is to pop up not one, but TWO confirmation dialogs.

    #1 Are you sure that "hugmy@bum.com" is your e-mail address?

    #2 Are you REEAALLY sure?

    And then the user will be all like "zOMG I can't believe I typed it wrong" and they'll totally fix it.

    /true story

    It is even better if you have a 3rd box which says: "so you are saying your email is hugym@bum.com ?" and then if they click 'yes', you can have a 4th saying "liar!"

  • Optimus Dime (unregistered) in reply to trtrwtf
    trtrwtf:
    Rick:
    Paul:
    Try and enumerate a few, please. ...

    Why reinvent the wheel by doing some other stupid "validation" beforehand?

    ...

    In case you really don't know the answer, algorithmic validation can return a response in milliseconds without a context switch for the user. Full validation takes minutes.

    And if you happen to be doing something other than real-time interaction with a single user checking a single email address, validation by send-an-email-and-see-if-they-bother-to-reply is not going to do you a lot of good.

    But of course, nobody would ever have occasion to check a list of addresses, or anything silly like that.

    What, exactly, would you be 'checking' for on this mythical occasion?

    captcha: acsi - The only true and validated character set.

  • JB (unregistered)

    How come the domain part doesn't allow a final dot like URIs?

    http://www.google.com./search?q=uri

  • Splognosticus (unregistered) in reply to Yazeran
    Yazeran:
    Splognosticus:
    (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

    So easy a child could do it.

    Fail (according to the perl module Mail::RFC822::Address: regexp-based address validation):

    [snip]

    And even that is only with comments removed......

    Heyyyy... Hard-coding every possible email address in an obfuscated expression is cheating.

  • SlainVeteran (unregistered)

    domain_help+td-wtf@tk is a perfectly valid email address. I doubt many email address validators in the wild would OK it.
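
    To make that concrete, here is a small Python sketch comparing two hypothetical checks. DOT_REQUIRED stands in for the common "there must be a dot after the @" rule (my assumption about what validators in the wild tend to do), and at_sign_only_check is a made-up looser alternative that only wants something on each side of the @.

```python
import re

# Hypothetical "needs a dot somewhere after the @" rule.
DOT_REQUIRED = re.compile(r".+@.+\..+")


def dot_required_check(address: str) -> bool:
    """Return True only if there is a dot somewhere after the @."""
    return DOT_REQUIRED.fullmatch(address) is not None


def at_sign_only_check(address: str) -> bool:
    """Return True if there is a non-empty part on each side of the last @."""
    local, at, domain = address.rpartition("@")
    return bool(at and local and domain)
```

    The dot-requiring rule turns the address away because the domain is a bare TLD, while the looser one lets it through.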

  • trtrwtf (unregistered) in reply to Optimus Dime
    Optimus Dime:
    trtrwtf:
    Rick:
    Paul:
    Try and enumerate a few, please. ...

    Why reinvent the wheel by doing some other stupid "validation" beforehand?

    ...

    In case you really don't know the answer, algorithmic validation can return a response in milliseconds without a context switch for the user. Full validation takes minutes.

    And if you happen to be doing something other than real-time interaction with a single user checking a single email address, validation by send-an-email-and-see-if-they-bother-to-reply is not going to do you a lot of good.

    But of course, nobody would ever have occasion to check a list of addresses, or anything silly like that.

    What, exactly, would you be 'checking' for on this mythical occasion?

    captcha: acsi - The only true and validated character set.

    One scenario might be checking data entry - someone types in a bunch of email addresses (from a handwritten sign-up sheet, perhaps) - and you want to verify that they haven't fat-fingered any of the addresses. Granted, you won't catch jeo.smith@foo.com, but you'd get joe,smith

    Or you might want to scan a document for potential email addresses (to automatically make them mailto: links, or to suck addresses into an address book, or whatnot). Being able to recognize a valid email address might be useful in that circumstance, no?

    Or you might want to make sure a user isn't just mashing the keyboard when "email address" is a required field on your form. Again, they could just enter joe.smith@foo.com, but you make them work a bit more. (This is not a bulk validation case, I know.)

    Point is, "validate" does not mean the same thing as "verify".
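
    A rough Python sketch of the first two scenarios, for the curious. Both patterns are deliberately loose assumptions of mine rather than the RFC grammar, and flag_malformed and find_candidates are made-up names.

```python
import re

# Deliberately loose patterns for illustration; neither follows the RFC grammar.
WELL_FORMED = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
CANDIDATE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def flag_malformed(addresses):
    """Return the entries that do not even have the shape of an address."""
    return [a for a in addresses if WELL_FORMED.fullmatch(a.strip()) is None]


def find_candidates(text):
    """Return the substrings of text that look like email addresses."""
    return CANDIDATE.findall(text)
```

    Run over a typed-up sign-up sheet, flag_malformed picks out joe,smith but lets jeo.smith@foo.com through, as expected. No mail gets sent to anyone in the process.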

  • Anonymous (unregistered) in reply to dtobias
    dtobias:
    I've run into sites that refuse to accept my perfectly valid .name and .info addresses because they think that TLDs shouldn't be more than three letters.

    I've seen code that rejected two-letter ccTLDs, since all TLDs are three letters long and no one has heard of places called France or the United Kingdom ;)
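
    A hypothetical reconstruction of that kind of rule, in Python. I haven't got the offending code, so THREE_LETTER_TLD is only my guess at its shape, and ANY_ALPHA_TLD is a made-up looser alternative that still wouldn't cope with internationalised (xn--) TLDs.

```python
import re

# Guess at the over-strict rule: the TLD must be exactly three letters.
THREE_LETTER_TLD = re.compile(r"[^@\s]+@[^@\s]+\.[A-Za-z]{3}")

# Looser alternative: any alphabetic TLD of two or more letters.
ANY_ALPHA_TLD = re.compile(r"[^@\s]+@[^@\s]+\.[A-Za-z]{2,}")


def strict_check(address: str) -> bool:
    """Return True only when the final label is exactly three letters."""
    return THREE_LETTER_TLD.fullmatch(address) is not None


def looser_check(address: str) -> bool:
    """Return True when the final label is two or more letters."""
    return ANY_ALPHA_TLD.fullmatch(address) is not None
```

    The strict version is happy with me@example.com and nothing else in the list of complaints so far: .info, .name, .fr and .co.uk all bounce.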

  • trtrwtf (unregistered) in reply to Splognosticus
    Splognosticus:
    Yazeran:
    Splognosticus:
    (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

    So easy a child could do it.

    Fail (according to the perl module Mail::RFC822::Address: regexp-based address validation):

    [snip]

    And even that is only with comments removed......

    Heyyyy... Hard-coding every possible email address in an obfuscated expression is cheating.

    It's perl. If it's not obfuscated, what's the point?

  • (cs) in reply to JB
    JB:
    How come the domain part doesn't allow a final dot like URIs?

    http://www.google.com./search?q=uri

    Because it's covered by a different RFC?

  • Paul (unregistered) in reply to trtrwtf
    trtrwtf:
    And if you happen to be doing something other than real-time interaction with a single user checking a single email address, validation by send-an-email-and-see-if-they-bother-to-reply is not going to do you a lot of good.

    But of course, nobody would ever have occasion to check a list of addresses, or anything silly like that.

    See-if-they-bother-to-reply is an added bonus feature of sending a confirmation email, which lets you know that it's a real mailbox.

    The first response, which will be only marginally slower, comes from your email-sending function, stating that it managed to parse the address and extract enough information to be able to send to it.

    One advantage of this, over a regular expression, is that the validation will only pass for valid addresses, and will only fail for invalid addresses.

    Another advantage is that it actually checks against real-world usage. See my post about reinventing the wheel. If you insist on using a regular expression that does not conform to the RFC, how do you guarantee that it matches the foibles of your mail-sending application?

    Look at it this way - If you have to check a list of addresses, what is the point of running it through a validator that doesn't work?

  • Dan (unregistered) in reply to XXXXX

    Still better than just guessing the rules.

    The most annoying thing is unsubscribe pages that use different validation rules than the form that got you on the list - they tell you you can't unsubscribe because you are not giving them a valid email address, for an address they are sending daily emails to.

  • PoPSiCLe (unregistered)

    Hm. I've been using this (found somewhere, origin unknown) with a fair amount of success on several websites - I haven't really looked too closely, and it's probably failing some valid emails.

    function check_email_address($email) {
        // First, we check that there's one @ symbol, and that the lengths are right
        if (!ereg("^[^@]{1,64}@[^@]{1,255}$", $email)) {
            // Email invalid because wrong number of characters in one section, or wrong number of @ symbols.
            return false;
        }
        // Split it into sections to make life easier
        $email_array = explode("@", $email);
        $local_array = explode(".", $email_array[0]);
        for ($i = 0; $i < sizeof($local_array); $i++) {
            if (!ereg("^(([A-Za-z0-9!#$%&'*+/=?^_`{|}~-][A-Za-z0-9!#$%&'*+/=?^_`{|}~\.-]{0,63})|(\"[^(\\|\")]{0,62}\"))$", $local_array[$i])) {
                return false;
            }
        }
        // Check if domain is IP. If not, it should be valid domain name
        if (!ereg("^\[?[0-9\.]+\]?$", $email_array[1])) {
            $domain_array = explode(".", $email_array[1]);
            if (sizeof($domain_array) < 2) {
                return false; // Not enough parts to domain
            }
            for ($i = 0; $i < sizeof($domain_array); $i++) {
                if (!ereg("^(([A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9])|([A-Za-z0-9]+))$", $domain_array[$i])) {
                    return false;
                }
            }
        }
        return true;
    }

  • trtrwtf (unregistered) in reply to Paul

    Paul:
    The first response, which will be marginally slower than comes from your email-sending function stating that it managed to parse the address to extract enough information to be able to send it. [snip] Look at it this way - If you have to check a list of addresses, what is the point of running it through a validator that doesn't work?

    So the mail processing software manages to parse an email address, and always does it correctly, but no other software is capable of this task? Hm. I smell magic.

    I agree with you that "I'll just whip up an email validator" is probably wtf thinking, but this is one of those functions that ought to live in a library. I'm pretty sure I don't like requiring all email validation to generate spam as a side effect.

  • trtrwtf (unregistered) in reply to PoPSiCLe
    PoPSiCLe:
    Hm. I've been using this (found somewhere, origin unknown) with a fair amount of success on several websites - I haven't really looked too closely, and it's probably failing some valid emails.

    function check_email_address($email) {
      // First, we check that there's one @ symbol, and that the lengths are right
      if (!ereg("^[^@]{1,64}@[^@]{1,255}$", $email)) {
        // Email invalid because wrong number of characters in one section, or wrong number of @ symbols.
        return false;
      }
      // Split it into sections to make life easier
      $email_array = explode("@", $email);
      $local_array = explode(".", $email_array[0]);
      for ($i = 0; $i < sizeof($local_array); $i++) {
        if (!ereg("^(([A-Za-z0-9!#$%&'*+/=?^_{|}~-][A-Za-z0-9!#$%&'*+/=?^_{|}~\.-]{0,63})|(\"[^(\\|\")]{0,62}\"))$", $local_array[$i])) {
          return false;
        }
      }
      if (!ereg("^\[?[0-9\.]+\]?$", $email_array[1])) { // Check if domain is IP. If not, it should be valid domain name
        $domain_array = explode(".", $email_array[1]);
        if (sizeof($domain_array) < 2) {
          return false; // Not enough parts to domain
        }
        for ($i = 0; $i < sizeof($domain_array); $i++) {
          if (!ereg("^(([A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9])|([A-Za-z0-9]+))$", $domain_array[$i])) {
            return false;
          }
        }
      }
      return true;
    }

    Correct me if I'm wrong but doesn't this pass something like joe.smith@12243234.345345.4563465.34524353.23452354.2345?

  • (cs)

    In some email systems they sometimes use the ' character to screw things up.

    Just ask Sharon D'Souza!!!

  • Validate This (unregistered) in reply to trtrwtf
    trtrwtf:
    Correct me if I'm wrong but doesn't this pass something like joe.smith@12243234.345345.4563465.34524353.23452354.2345?

    Numerical domain names are valid

  • (cs)
    public static boolean isValidEmailAddress(String aEmailAddress){
        if (aEmailAddress == null) return false;
        boolean result = true;
        try {
          InternetAddress emailAddr = new InternetAddress(aEmailAddress);
          if ( ! hasNameAndDomain(aEmailAddress) ) {
            result = false;
          }
        }
        catch (AddressException ex){
          result = false;
        }
        return result;
      }
    
      private static boolean hasNameAndDomain(String aEmailAddress){
        String[] tokens = aEmailAddress.split("@");
        return 
         tokens.length == 2 &&
         Util.textHasContent( tokens[0] ) && 
         Util.textHasContent( tokens[1] ) ;
      }
      
      //..elided
    }
     

    Here's a simple function that will work for everyone.

  • O'Brien (unregistered) in reply to Nagesh
    Nagesh:
    In some email systems they sometimes use the ' character to screw things up.

    Just ask Sharon D'Souza!!!

    You! You're the incompetent retard that makes half the damn websites on the Internet fuck up my name!

  • Eevee (unregistered)

    Given that "#"@"#"@[IPv6:::ffff:173.230.158.172] is a perfectly valid email address, I'd say the most reliable way to verify validity is:

    $email =~ /@/

    And then, yeah, just send an email to it.
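
    For what it's worth, the "check for an @, then send a mail" flow might look roughly like this in Python. It is only a sketch under assumptions: the `pending` table, the function names and the token length are invented here, and the step that actually sends the message is left out.

```python
import secrets

# Hypothetical in-memory table of pending confirmations: token -> address.
pending = {}

def looks_like_email(address):
    # Same idea as the /@/ check above: just require an at-sign somewhere.
    return "@" in address

def start_confirmation(address):
    # Returns a token to embed in the confirmation link, or None if the
    # input does not even contain an "@". Sending the mail is not shown.
    if not looks_like_email(address):
        return None
    token = secrets.token_urlsafe(16)
    pending[token] = address
    return token

def confirm(token):
    # Returns the confirmed address, or None for an unknown or used token.
    return pending.pop(token, None)
```

    The address only counts as verified once `confirm` hands it back, i.e. once somebody with access to the mailbox has clicked the link.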

  • Worf (unregistered) in reply to SlainVeteran
    SlainVeteran:
    domain_help+td-wtf@tk is a perfectly valid email address. I doubt many email address validators in the wild would OK it.

    Anyone remember ye olde uucp format as well with the ! in the email addresses? And I think # is also valid as well, but I suspect not too many validators accept it...

  • (cs) in reply to socknet
    socknet:
    Please provide contact details for your legal representation:

    Name: _____ Phone: _____ Email: ______

    This and the next case may make sense: you're giving the system an email address to be used later.

    Please provide your preferred email address to be created

    Email: ______

    This, however, doesn't IMO. After all, I can't put 'billg@microsoft.com' in there, and that'll pass your validation.

    First, it would almost always be better to just put a 'username' box there, and add on the @domain yourself. If there's a choice of multiple domains, then give a drop-down box of them. Only if there are a ton of choices (e.g. you'll let them create an arbitrary subdomain too) does it make sense to have them provide a full email address.

    Second, 'valid email' doesn't imply a valid entry for that box either. It has to not exist -- so you have the same problem as before, except the reverse. If you have to check that anyway, why not just feed whatever the user enters to the system and let it fail? Why are you doing extra work? (Of course, you have to if you want to impose constraints that your backing system wouldn't, e.g. you want to disallow . from your email addresses or something. But then you're doing something different anyway.)

    99.9% of the time, the simple validation will suffice. Whether or not you care about that missing 0.1%, and how you do so, would be completely dependent on context.

    See, I disagree. I think it'd be much closer to 99.9% the other way. How many times have you entered someone else's email address in a form somewhere vs your own? And almost all the time you enter your own, I think it makes sense to send a confirmation email.
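
    To make the "username box plus domain drop-down" idea concrete, here is a small Python sketch. The domain list, the function name and the rule about what a username may contain are all assumptions for illustration; checking whether the mailbox already exists is left to the backing system, as argued above.

```python
# Hypothetical list of domains the service is willing to create mailboxes on.
ALLOWED_DOMAINS = ("example.com", "example.org")

def build_address(username, domain):
    # The domain comes from a drop-down, so anything outside the list is
    # a tampered request rather than a typo.
    if domain not in ALLOWED_DOMAINS:
        raise ValueError("unknown domain: " + domain)
    # Assumption: the service only wants simple local parts, so an "@" or
    # whitespace in the username box is rejected outright.
    if not username or "@" in username or any(c.isspace() for c in username):
        raise ValueError("invalid username")
    return username + "@" + domain
```

    With this shape, 'billg' plus a listed domain works, while 'billg' plus 'microsoft.com' cannot even be expressed through the form.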

  • Eevee (unregistered) in reply to Worf
    Worf:
    Anyone remember ye olde uucp format as well with the ! in the email addresses? And I think # is also valid as well, but I suspect not too many validators accept it...
    I tried, briefly, to use # as the local-part delimiter in my email address. It doesn't need quoting and it's not reserved for anything.

    I gave up when even TDWTF rejected it.

  • (cs) in reply to trtrwtf
    trtrwtf:
    jonnyq:
    N. Tufnel:
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    In what sense does that validate anything, if you don't have access to that address's mailbox?

    One would hope that the person providing the email address would have access to it for confirmation...

    I mean seriously... two people this dumb?

    In my opinion, if you're just trying to prevent typing mistakes: anything, followed by an @, followed by anything, followed by a ., followed by anything. That's all I ever use. If you're trying to validate that an email address is real - send a confirmation email. Anything in between is prone to mistakes and probably isn't helping anyone.

    I think there's some confusion about the meaning of "validate" here. Some people seem to think it means "make sure this email address is really the one that belongs to this person", others believe (correctly) that it means "make sure that this email address is a well-formed address".

    The former case might be a common scenario, but when you "validate" something you're checking to see if it's valid - not whether it's correct.

    That's because there are two different groups that have this problem. One group is going to sell the address to a data-farmer and they want to guard against the user lying to them. The other group is collecting email addresses for the user's benefit and they want to help the user fill out the form accurately. Every solution proposed by one group will be rejected by the other.

    Let's just face the simple fact that email validation isn't a big deal in the general case and probably doesn't even deserve a library method. In the specific case of knowing that the email address is truly valid and controlled by that person, a library method wouldn't be sufficient.

    Doing it "right" is almost always wrong. Almost nobody cares if an email address meets RFC2822. They either care that it connects to a person or that there is a hint that the address may have been mistyped. A verification email solves the former and a half-assed validation regex solves the latter. "Proper" validation does neither as many bad email addresses are technically valid and almost all valid email addresses are not in use.
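
    As an illustration of the half-assed kind of check, the quoted "anything, @, anything, a dot, anything" rule fits in a couple of lines of Python. This is a sketch of a typo hint only (the names `LOOSE_EMAIL` and `probably_mistyped` are invented), not a claim about what any particular site does:

```python
import re

# "anything, @, anything, a dot, anything" from the quoted suggestion.
# This is a typo hint only; it says nothing about whether the mailbox exists.
LOOSE_EMAIL = re.compile(r".+@.+\..+")

def probably_mistyped(address):
    # True when the input does not even have the rough shape of an address.
    return LOOSE_EMAIL.fullmatch(address) is None
```

    A reasonable use is to show a "did you mean...?" warning rather than to reject the form outright, since a rule this loose will still flag the occasional legitimate address with a dotless domain.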

  • XXXXX (unregistered) in reply to Nagesh
    Nagesh:
        try {
          InternetAddress emailAddr = new InternetAddress(aEmailAddress);
        }
        catch (AddressException ex){
          result = false;
        }
     

    Here's a simple function that will work for everyone.

    Because everyone uses Java & JavaMail. On any other language/platform/API it won't work very well.

  • Meep (unregistered) in reply to Paul
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    Oh, great, so you're going to get an email address like:

    little bobby mails
    DATA
    .
    MAIL From: sirspamsalot@auto
    RCPT To: yourmother@ho.com
    DATA
    ...
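
    A cheap guard against that kind of input is to refuse anything containing control characters before it gets near the sending code. The Python below is a sketch with invented function names; many mail libraries already do something similar, but that is an assumption you would want to check for yours.

```python
def has_control_chars(address):
    # CR and LF are what let extra SMTP commands ride along in the example
    # above; rejecting every ASCII control character is a slightly wider net.
    return any(ord(ch) < 32 or ord(ch) == 127 for ch in address)

def safe_to_hand_to_mailer(address):
    # Hypothetical gate to run before the address reaches any sending code.
    return "@" in address and not has_control_chars(address)
```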
    
  • Machtyn (unregistered)

    Correct me if I'm wrong, but it appears that the first validator is trying to block the use of special characters. However, if that character is the first character in the address, it will be accepted. (i.e. *myname@domain.com will pass). Funny the difference between x<0 to x<=0 and x>0 to x>=0.

  • socknet (unregistered) in reply to EvanED
    EvanED:
    99.9% of the time, the simple validation will suffice. Whether or not you care about that missing 0.1%, and how you do so, would be completely dependent on context.

    See, I disagree. I think it'd be much closer to 99.9% the other way. How many times have you entered someone else's email address in a form somewhere vs your own? And almost all the time you enter your own, I think it makes sense to send a confirmation email.

    You misunderstood my statement here. I am saying that 99.9% of the time, a very basic validation check will be sufficient to ensure an email address is valid and that it is only 0.1% of the time that you have oddball cases (such as '+' characters in the address).

    I was in no way referring to the likelihood that a person is entering their own email address vs the email address of another user.

  • socknet (unregistered) in reply to Machtyn
    Machtyn:
    Correct me if I'm wrong, but it appears that the first validator is trying to block the use of special characters. However, if that character is the first character in the address, it will be accepted. (i.e. *myname@domain.com will pass). Funny the difference between x<0 to x<=0 and x>0 to x>=0.

    looks like "@...@@@@" would be valid too..

  • pfft (unregistered) in reply to socknet
    socknet:
    EvanED:
    99.9% of the time, the simple validation will suffice. Whether or not you care about that missing 0.1%, and how you do so, would be completely dependent on context.

    See, I disagree. I think it'd be much closer to 99.9% the other way. How many times have you entered someone else's email address in a form somewhere vs your own? And almost all the time you enter your own, I think it makes sense to send a confirmation email.

    You misunderstood my statement here. I am saying that 99.9% of the time, a very basic validation check will be sufficient to ensure an email address is valid and that it is only 0.1% of the time that you have oddball cases (such as '+' characters in the address).

    I was in no way referring to the likelihood that a person is entering their own email address vs the email address of another user.

    So what you're saying is that your validation code doesn't validate anything and possibly excludes valid email addresses?

  • (cs)

    http://code.google.com/p/isemail/source/browse/trunk/is_email.php?r=6

    That is a link to what is quite possibly the only truly correct email validator out there. It takes into account rfc3696, rfc2822, rfc5322, rfc5321, rfc4291, and rfc1123, including errata. Rather than try to regex the whole thing (which is provably impossible), it separates everything into pieces and validates components. It also contains a number of flags to allow you to block emails that are probably incorrect, or decide to allow everything valid, if nonsensical. For example, x@x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x23456789.x234 is a valid email address, but anyone who purports to have such an address is either lying or on a subnet. Also, an address such as pope@va is valid (and was, at some point in the past, in use).
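
    To give a flavour of the "split into pieces and validate components" approach, here is a tiny Python sketch that covers only the length limits. It is nowhere near what the linked validator does, the function name is invented, and it assumes ASCII input so that character counts equal octet counts.

```python
MAX_LOCAL = 64    # RFC 5321 limit on the local part, in octets
MAX_DOMAIN = 255  # RFC 5321 limit on the domain, in octets
MAX_LABEL = 63    # DNS limit on a single dot-separated label

def check_lengths(address):
    # Split at the LAST "@", since a quoted local part may contain one.
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        return False
    if len(local) > MAX_LOCAL or len(domain) > MAX_DOMAIN:
        return False
    # Address literals such as [IPv6:...] are not split into labels.
    if domain.startswith("["):
        return True
    # Every label must be non-empty and at most 63 characters long.
    return all(0 < len(label) <= MAX_LABEL for label in domain.split("."))
```

    Each component gets its own small, testable rule, which is the main attraction over one giant regex.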

  • socknet (unregistered) in reply to pfft
    pfft:
    socknet:
    EvanED:
    99.9% of the time, the simple validation will suffice. Whether or not you care about that missing 0.1%, and how you do so, would be completely dependent on context.

    See, I disagree. I think it'd be much closer to 99.9% the other way. How many times have you entered someone else's email address in a form somewhere vs your own? And almost all the time you enter your own, I think it makes sense to send a confirmation email.

    You misunderstood my statement here. I am saying that 99.9% of the time, a very basic validation check will be sufficient to ensure an email address is valid and that it is only 0.1% of the time that you have oddball cases (such as '+' characters in the address).

    I was in no way referring to the likelihood that a person is entering their own email address vs the email address of another user.

    So what you're saying is that your validation code doesn't validate anything and possibly excludes valid email addresses?

    No, most people could see that's not what I'm saying - not sure what logic you are using to think that. Please elaborate on how you reach that conclusion from my statement.

  • dnm (unregistered) in reply to wow

    Why is this dumb?

  • dnm (unregistered) in reply to wow
    wow:
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

    Why is this dumb?

  • pfft (unregistered) in reply to socknet
    socknet:
    pfft:
    socknet:
    EvanED:
    99.9% of the time, the simple validation will suffice. Whether or not you care about that missing 0.1%, and how you do so, would be completely dependent on context.

    See, I disagree. I think it'd be much closer to 99.9% the other way. How many times have you entered someone else's email address in a form somewhere vs your own? And almost all the time you enter your own, I think it makes sense to send a confirmation email.

    You misunderstood my statement here. I am saying that 99.9% of the time, a very basic validation check will be sufficient to ensure an email address is valid and that it is only 0.1% of the time that you have oddball cases (such as '+' characters in the address).

    I was in no way referring to the likelihood that a person is entering their own email address vs the email address of another user.

    So what you're saying is that your validation code doesn't validate anything and possibly excludes valid email addresses?

    No, most people could see that's not what I'm saying - not sure what logic you are using to think that. Please elaborate on how you reach that conclusion from my statement.

    Can you please cite this "most people" study?

  • csharptest.net (unregistered)
    There are a lot of different ways to validate an email address input field. The easiest – and mostly correct – method is to use a regular expression.

    That couldn't be farther from the truth. Regex is NOT a valid way to verify an email address. You might as well use this regex:

    .+@.+

    That's as close as you can come. Anything else is just plain wrong. You might as well do an if index of '@' and be done with it. What? You don't believe me?

    http://tools.ietf.org/html/rfc2822#section-3.4.1

    3.4.1. Addr-spec specification

    An addr-spec is a specific Internet identifier that contains a locally interpreted string followed by the at-sign character ("@", ASCII value 64) followed by an Internet domain.

    The definition of "locally interpreted" is: you can't interpret it.

    Despite that opening remark, I am LMAO at the validation attempts there.

  • Christopher (unregistered)

    Dice.com is the worst offender of email address validation. They don't just validate your address; they strip out any "invalid" characters without telling you that they did anything to it! For example, I used the '+' sign in my address ("myaddress+dice@gmail.com"), and they silently changed it to "myaddressdice@gmail.com". (I happened to go back into the user settings to change something else and noticed my stripped email address).

    It's doubleplus ungood that you have to log into the site using your email address, so if you ever change your address to something that they don't like for some arbitrary reason, make sure that it's correct in their system before you log out!

  • (cs) in reply to dnm
    dnm:
    wow:
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

    Why is this dumb?

    Because you've not taken into account any anti-spam methods that may be in place. What you can do is confirm that the domain exists and accepts mail. However, if the server implements a greet pause as part of its policy, it's going to be a very slow process. Alternatively, a DNS lookup of the MX record for that domain (whilst technically incorrect) may also suffice.

  • Paul (unregistered) in reply to trtrwtf
    trtrwtf:
    So the mail processing software manages to parse an email address, and always does it correctly, but no other software is capable of this task? Hm. I smell magic.
    1. Not "No Other Software", "A Regular Expression" (i.e. the thing being touted as the right way to validate emails). A regular expression complex enough to do the job will be a pig to get right and a nightmare to debug. As I mentioned in an earlier post, I've never seen one in use that works.

    2. Not Magic. Coordination.

    If your validator and your mail processing software are both perfectly compliant, then there is no problem.

    If, on the other hand, your mail processing software has foibles, then you should also ensure that your validator has exactly the same foibles, else you run the risk that your validator will pass a compliant address that your mail processing software cannot handle. That strikes me as rather difficult, particularly if your mail processor is a black box to you. That said, you may be content to suffer that if your validator is perfectly RFC-compliant.

    I do concede that if you are validating addresses from a list, rather than immediate user input, an email shouldn't actually be sent out to the recipient's mailbox. However, the DRY way to validate is still to use the same tool that will actually be sending the mail (but offline).
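
    As a rough illustration of "validate with the same tool that sends the mail, but offline", here is what that could look like if the sending side happened to be Python's standard `email` package. That choice is purely an assumption for the sketch, and the function name is invented; the idea is just to ask the library whether it can parse the address, without sending anything.

```python
from email import errors
from email.headerregistry import Address

def mail_library_accepts(addr_spec):
    # Ask the same standard-library package that would build the message
    # whether it can parse the address. Nothing is sent anywhere.
    if not addr_spec:
        return False
    try:
        Address(addr_spec=addr_spec)
    except (ValueError, errors.HeaderParseError):
        return False
    return True
```

    Whatever foibles the library's parser has, the validator now shares them by construction, which is the coordination being argued for.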

  • socknet (unregistered) in reply to pfft
    pfft:
    socknet:
    pfft:
    socknet:
    EvanED:
    99.9% of the time, the simple validation will suffice. Whether or not you care about that missing 0.1%, and how you do so, would be completely dependent on context.

    See, I disagree. I think it'd be much closer to 99.9% the other way. How many times have you entered someone else's email address in a form somewhere vs your own? And almost all the time you enter your own, I think it makes sense to send a confirmation email.

    You misunderstood my statement here. I am saying that 99.9% of the time, a very basic validation check will be sufficient to ensure an email address is valid and that it is only 0.1% of the time that you have oddball cases (such as '+' characters in the address).

    I was in no way referring to the likelihood that a person is entering their own email address vs the email address of another user.

    So what you're saying is that your validation code doesn't validate anything and possibly excludes valid email addresses?

    No, most people could see that's not what I'm saying - not sure what logic you are using to think that. Please elaborate on how you reach that conclusion from my statement.

    Can you please cite this "most people" study?

    so you are saying you're not a person and neither are your parents?

    </pfft logic>

  • pfft (unregistered) in reply to socknet
    socknet:
    pfft:
    socknet:
    pfft:
    socknet:
    EvanED:
    99.9% of the time, the simple validation will suffice. Whether or not you care about that missing 0.1%, and how you do so, would be completely dependent on context.

    See, I disagree. I think it'd be much closer to 99.9% the other way. How many times have you entered someone else's email address in a form somewhere vs your own? And almost all the time you enter your own, I think it makes sense to send a confirmation email.

    You misunderstood my statement here. I am saying that 99.9% of the time, a very basic validation check will be sufficient to ensure an email address is valid and that it is only 0.1% of the time that you have oddball cases (such as '+' characters in the address).

    I was in no way referring to the likelihood that a person is entering their own email address vs the email address of another user.

    So what you're saying is that your validation code doesn't validate anything and possibly excludes valid email addresses?

    No, most people could see that's not what I'm saying - not sure what logic you are using to think that. Please elaborate on how you reach that conclusion from my statement.

    Can you please cite this "most people" study?

    so you are saying you're not a person and neither are your parents?

    </pfft logic>

    I'm a person and I understand me. Therefore, most people can understand me.

    </socknet_logic>

  • (cs)

    In addition to sending mail, the true way of verification should also include one phone call made to the person who has entered the mail address. This is so the person does not use a spam filter to block the mail sent.

  • ICANNOT (unregistered) in reply to Anonymous

    Well, TLD obviously means "Three Letter Domain".
