• Someone (unregistered)
  • NYCNetworker (unregistered)

    hmmm

    guess I'm gonna change my email address to

    hello?????@???.??

  • XXXXX (unregistered)

    So what you're saying is, that I should return "CHAIN"?

    I suppose maintaining this code is better than getting stuck on a ^([l]{2,60})([@])([A-Za-z0-9.|-|_]{1,60})(.)([A-Za-z]{2,5})$ gang

  • (cs) in reply to NYCNetworker
    NYCNetworker:
    hmmm

    guess I'm gonna change my email address to

    hello?????@???.??

    Why not simply ".@ "?

  • Splognosticus (unregistered)
    (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

    So easy a child could do it.

  • (cs)

    OK, maybe those of us who don't administer email systems haven't heard of plus-addressing. But not ever encountering a valid email address with a hyphen in it? Or more than two periods?

  • RobY (unregistered) in reply to operagost

    Not sure about before the @, but most school systems in the area I work all follow the common pattern of [email protected] .

  • (cs) in reply to RobY
    RobY:
    Not sure about before the @, but most school systems in the area I work all follow the common pattern of [email protected] .

    <firstName>.<lastName>@<domain>.com is pretty popular as well

  • Bryan the K (unregistered)

    The WTF is that he used REGEX, right?

  • alnite (unregistered)

    looks like somebody did not know what a regex is.

  • kongr45gpen (unregistered)

    This is the most correct way of proving if an e-mail address is valid or not:

    '([^\\x00-\\x20\\x22\\x28\\x29\\x2c\\x2e\\x3a-\\x3c'.
    '\\x3e\\x40\\x5b-\\x5d\\x7f-\\xff]+|\\x22([^\\x0d'.
    '\\x22\\x5c\\x80-\\xff]|\\x5c[\\x00-\\x7f])*\\x22)'.
    '(\\x2e([^\\x00-\\x20\\x22\\x28\\x29\\x2c\\x2e'.
    '\\x3a-\\x3c\\x3e\\x40\\x5b-\\x5d\\x7f-\\xff]+|'.
    '\\x22([^\\x0d\\x22\\x5c\\x80-\\xff]|\\x5c[\\x00'.
    '-\\x7f])*\\x22))*\\x40([^\\x00-\\x20\\x22\\x28'.
    '\\x29\\x2c\\x2e\\x3a-\\x3c\\x3e\\x40\\x5b-\\x5d'.
    '\\x7f-\\xff]+|\\x5b([^\\x0d\\x5b-\\x5d\\x80-\\xff'.
    ']|\\x5c[\\x00-\\x7f])*\\x5d)(\\x2e([^\\x00-\\x20'.
    '\\x22\\x28\\x29\\x2c\\x2e\\x3a-\\x3c\\x3e\\x40'.
    '\\x5b-\\x5d\\x7f-\\xff]+|\\x5b([^\\x0d\\x5b-'.
    '\\x5d\\x80-\\xff]|\\x5c[\\x00-\\x7f])*\\x5d))*'
    
    (http://www.iamcal.com/publish/articles/php/parsing_email/)
  • Anon (unregistered)

    When I was in grad school I had an e-mail address with a + in it. I often found websites that would barf on accepting that as a valid e-mail address.

  • Paul (unregistered)

    The easiest, and most correct way to validate an email address is to send an email to it.

    The most incorrect way is to use some trivial little regex written by someone who hasn't even heard of RFC822, and just intuits what they think an email address might be.

    I have never seen a regex in the wild that correctly validates email addresses.

    I would much rather read through and debug the version in the post, isValidEmailAddress, than a regular expression complex enough to be comprehensive.

  • (cs)

    I've run into sites that refuse to accept my perfectly valid .name and .info addresses because they think that TLDs shouldn't be more than three letters.

  • wow (unregistered) in reply to Paul
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

  • Jeff (unregistered)

    Well, yes, sending a test email (or nuking from orbit) is the only way to be sure. But short of that, I think you could just send an XML message to a web service somewhere and let it all be Somebody Else's Problem.

  • Bill G. (unregistered) in reply to dtobias
    dtobias:
    I've run into sites that refuse to accept my perfectly valid .name and .info addresses because they think that TLDs shouldn't be more than three letters.
    Back when I invented the internet, 3 letters was enough for anybody.
  • Someone (unregistered)

    A surprising number of websites also barf on [email protected] style addresses. I mean, .name has only been around 10 years, so I can see how they haven't had time to adjust to non 3-character TLDs. Lord knows that if it doesn't conform to [a-z]+@[a-z]+.[a-z]{3} then it must be invalid!
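
To make the complaint above concrete, here is a rough Python sketch of what a pattern like that does to real-world addresses. It assumes the dot in the quoted pattern was meant as a literal "." (so it is escaped here); the helper name `naive_accepts` and the sample addresses on example domains are made up for illustration.

```python
import re

# The naive pattern quoted above, with the dot escaped so that it
# only matches a literal "." (an assumption about the intent).
NAIVE = re.compile(r"[a-z]+@[a-z]+\.[a-z]{3}")


def naive_accepts(address: str) -> bool:
    """Return True only if the whole address matches the naive pattern."""
    return NAIVE.fullmatch(address) is not None


# Hypothetical sample addresses; only the first one gets through.
for sample in (
    "bob@example.com",        # plain address, 3-letter TLD
    "bob@example.name",       # 4-letter TLD
    "bob.frank@example.com",  # period in the local part
    "bob+tag@example.com",    # plus-addressing
):
    print(sample, naive_accepts(sample))
```

Everything except the first sample is rejected, which is the behaviour people in this thread keep running into.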

  • (cs) in reply to NYCNetworker
    NYCNetworker:
    hmmm

    guess I'm gonna change my email address to

    hello?????@???.??

    A polite troll leaves a good hint for his target. With that in mind, this email would be a better choice:

    u.fail@regex

  • (cs)

    Email validation by regex is practically a canonical example of some problems just not having a good solution that everyone can agree on. If you validate down to the letter of the RFCs, you will accept addresses that are technically correct but possibly unreachable, like those based on internal subdomains and not fully qualified. Even once you get past the hurdle of defining exactly what constitutes validity there are still many weird gotchas and it is hard to guarantee you've considered them all.

    PHP to the rescue, I guess. Every standard library needs an equivalent to: filter_var($x, FILTER_VALIDATE_EMAIL)

  • Pat L (unregistered) in reply to wow
    wow:
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

    Rare is the situation where you only care if an email address is theoretically valid, and not whether it actually exists and can be sent mail.

  • Justin (unregistered)

    So I guess I'm safe with my aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa@bbb.cc?

  • (cs) in reply to wow
    wow:
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

    Why? I think it is a useful idea: you ask the user to enter an e-mail address and confirm it, then you send an activation/validation mail to finish whatever process the app must do. It is less annoying than a script or method that tells you the address is incorrect even though it does exist.

  • (cs)

    Ah, OK. TRWTF is that the first example isn't really appropriate for Java. Here's a more OO version.

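    import java.util.StringTokenizer;

    // IllegalEmailAddressException is assumed to be a checked exception defined elsewhere.
    public final class EmailAddress {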
    	private final String address;
    
    	public EmailAddress (String address) throws IllegalEmailAddressException {
    		
    		// a null string is invalid
    		if (address == null)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		// a string without a "@" is an invalid email address
    		if (address.indexOf("@") < 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		// a string without a "."  is an invalid email address
    		if (address.indexOf(".") < 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (lastEmailFieldTwoCharsOrMore(address) == false)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (address.indexOf("!") > 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (address.indexOf("#") > 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (address.indexOf("$") > 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (address.indexOf("%") > 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (address.indexOf("&") > 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (address.indexOf("*") > 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (address.indexOf("+") > 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (address.indexOf("-") > 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (address.indexOf("~") > 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (address.indexOf("ä") > 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (address.indexOf("ö") > 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (address.indexOf("å") > 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    
    		if (address.indexOf(";") > 0)
    		    throw new IllegalEmailAddressException("This is an invalid email address.");
    		
    
    		this.address = address;
    
    	}
    
    	private static final boolean lastEmailFieldTwoCharsOrMore(String emailAddress) {
    		if (emailAddress == null)
    			return false;
    		StringTokenizer st = new StringTokenizer(emailAddress, ".");
    		String lastToken = null;
    		while (st.hasMoreTokens()) {
    			lastToken = st.nextToken();
    		}
    
    		if (lastToken.length() >= 2) {
    			return true;
    		} else {
    			return false;
    		}
    	}
    
    	public String getAddress() {
    		return this.address;
    	}
    }
    
  • SeySayux (unregistered)

    Actually, indexOf usually is a lot faster than regexes. This'd be my first stab at it, without regexes:

    bool isValidMail(String s) {
        return ((sidx_t i = s.indexOf('@')) != -1 && 
            s.indexOf('.',i) != -1 && 
            s.lastIndexOf('.') inr(s.length-1,s.length-3) &&
            [&s](){ 
                for(uchar c : s) { 
                    if(!(isalpha(c) || isnum(c) ||
                        c ina({'@','.','+'}))) return false;
                }
                return true; 
            }());
    }
    

    Of course, this one wouldn't prevent addresses such as [email protected], but hey, it'd make a great T-shirt!(*) :P

    • SeySayux

    (*) Not that that regex wouldn't.

  • XXXXX (unregistered) in reply to Paul
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    The most incorrect way is to use some trivial little regex written by someone who hasn't even heard of RFC822, and just intuits what they think an email address might be.

    I have never seen a regex in the wild that correctly validates email addresses.

    I would much rather read through and debug the version in the post, isValidEmailAddress, than a regular expression complex enough to be comprehensive.

    I hope no one is still writing code against RFC 822. It was superseded 10 years ago by 2822.

  • Paul (unregistered) in reply to wow

    Try and enumerate a few, please.

    If you are actually writing the actual email-sending part of an actual email-sending application, then I agree, you can't send an email to validate the address.

    If you are doing anything else, then the only reliable (and DRY) way to validate an email address is to try and use it as an email address.

    Why reinvent the wheel by doing some other stupid "validation" beforehand? You make the wheel less and less round every time you reinvent it. What if (as is normally the case) your validation incorrectly discards good addresses that your server can cope with? What if your validation is correct, but your email server is wrong? What if your validation deliberately matches your email server's faulty validation, but someone else fixes the server, leaving you with a pointlessly faulty validation?

    Even if you do get your syntactic validation correct (which appears highly unlikely, given the prevalence of incorrect validation in the world), all you know is that it looks like an email address; you have no idea whether it is an email address (i.e. one that resolves to a mailbox).

  • (cs)

    For those who don't already know, here's one way to verify whether an email address exists: http://www.labnol.org/software/verify-email-address/18220/. All this nonsense about input validation and sending actual messages isn't much better than the code in the article.

  • Paul (unregistered) in reply to XXXXX
    XXXXX:
    I hope no one is still writing code against RFC 822. It was superseded 10 years ago by 2822.

    Whoops, Good point.

  • riiiiiiight (unregistered) in reply to Justin
    Justin:
    So I guess I'm safe with my spaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaace@spa.ce?

    FTFY

  • SG_01 (unregistered) in reply to dtobias
    dtobias:
    I've run into sites that refuse to accept my perfectly valid .name and .info addresses because they think that TLDs shouldn't be more than three letters.

    I think you need to try with the .museum and .travel TLDs then :D

  • grzlbrmft (unregistered) in reply to wow
    wow:
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

    In case you can count that far, name three, please.

  • dude (unregistered) in reply to frits
    frits:
    For those who don't already know, here's one way to verify whether an email address exists: <snip>. All this nonsense about input validation and sending actual messages isn't much better than the code in the article.

    The only difference between sending an email and what was done in the article is actually sending data to be delivered. It's a lot more work than just sending an email, and you have no way of verifying that the user actually got the email because you didn't send one...

  • Erik (unregistered)

    Hmm, in the past I've had email addresses with ! and % in them.

    I've got domains with - in them. I use + all the time (and still see breakage).

    However, I was once <user>@<tld> . That one was fun. :)

  • [email protected] (unregistered)
    else {
        return "C-C-C-COMBO BREAKER!";
    }
    
  • N. Tufnel (unregistered) in reply to Paul
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    In what sense does that validate anything, if you don't have access to that address's mailbox?

  • (cs) in reply to wow
    wow:
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

    I was going to say this, but about Alex's regex comment. As others have pointed out, checking all possible valid email addresses using a regex is borderline impossible.

    http://www.regular-expressions.info/email.html

    Also fuck you Akismet, this isn't spam.

  • (cs) in reply to dude
    dude:
    frits:
    For those who don't already know, here's one way to verify whether an email address exists: <snip>. All this nonsense about input validation and sending actual messages isn't much better than the code in the article.

    The only difference between sending an email and what was done in the article is actually sending data to be delivered. It's a lot more work than just sending an email, and you have no way of verifying that the user actually got the email because you didn't send one...

    How are you verifying again?

  • Paul (unregistered) in reply to N. Tufnel
    N. Tufnel:
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    In what sense does that validate anything, if you don't have access to that address's mailbox?

    It validates that whatever you are using to actually send mail can parse the address.

    It validates that the address can be used to reach a mail server.

    Depending on the setup of the receiving server, it validates that mail can be sent to the specified mailbox (rather than returning an error). If there is a catch-all, then this doesn't work.

    As others have mentioned, sending a validation message ensures that the address entered reaches a mailbox that the intended recipient can read.

  • Andrew (unregistered)

    It seems that 90% of sites restrict valid emails. I usually use something like this:

    .[^ ]+.@[^ ]+.[^ ]

  • (cs) in reply to N. Tufnel
    N. Tufnel:
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    In what sense does that validate anything, if you don't have access to that address's mailbox?

    One would hope that the person providing the email address would have access to it for confirmation...

    I mean seriously... two people this dumb?

    In my opinion, if you're just trying to prevent typing mistakes: anything, followed by an @, followed by anything, followed by a ., followed by anything. That's all I ever use. If you're trying to validate that an email address is real - send a confirmation email. Anything in between is prone to mistakes and probably isn't helping anyone.
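
For what it's worth, the "anything, @, anything, ., anything" check described above fits in a couple of lines. This is only a sketch in Python under that reading of the comment; the function name `looks_like_email` is invented for illustration, and it is a typo catcher rather than a validator.

```python
import re

# "anything, followed by an @, followed by anything,
#  followed by a ., followed by anything"
LOOSE = re.compile(r".+@.+\..+")


def looks_like_email(text: str) -> bool:
    """Catch obvious typos only; says nothing about deliverability."""
    return LOOSE.fullmatch(text) is not None
```

It happily accepts plus-addressing, hyphens and long TLDs, and rejects things like a missing "@" or a domain with no dot. Whether the address is real is still a job for the confirmation email.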

  • (cs) in reply to Paul
    Paul:
    N. Tufnel:
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    In what sense does that validate anything, if you don't have access to that address's mailbox?

    It validates that whatever you are using to actually send mail can parse the address.

    It validates that the address can be used to reach a mail server.

    Depending on the setup of the receiving server, it validates that mail can be sent to the specified mailbox (rather than returning an error). If there is a catch-all, then this doesn't work.

    As others have mentioned, sending a validation message ensures that the address entered reaches a mailbox that the intended recipient can read.

    [email protected]

    Mailinator's whole purpose in life is to accept mail and junk it after a very short time. EMail address validation is stupid as a general idea, but makes sense if you have a specific purpose for doing so. Examples:

    If you need to validate that the email address belongs to the user -- Send a one-off email that the user must respond to.

    If you are trying to help a user enter data on a form -- do any half-assed validation and throw up a kindly worded warning if it doesn't match. Allow the user to continue on validation failure.

    Notice in both cases that it is not necessary to have a robust validation procedure. I can't think of a reason, other than while writing mail router software, to strictly validate the format of an email address.

  • David Martensson (unregistered) in reply to SeySayux

    This one is actually better than most I have seen.

    I have actually tried to build a test that was 100% RFC compliant and the result was horrifyingly large, and we still found existing, working emails that were denied ;)

    We finally went with a commercial component that so far has correctly handled every address we feed it.

    For any lesser ambition, just check the very simplest format and if you need more, use a verification where you actually send an email to the address and require them to click a link.

    With the new Asian letters coming I would not even try to build my own again =)

  • (cs) in reply to Paul
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    The most incorrect way is to use some trivial little regex written by someone who hasn't even heard of RFC822, and just intuits what they think an email address might be.

    I have never seen a regex in the wild that correctly validates email addresses.

    I would much rather read through and debug the version in the post, isValidEmailAddress, than a regular expression complex enough to be comprehensive.

    I agree 100%. The best way is to contact the purported email server and see if the account is on there.

    Of course, you still have to parse out the server from the email address.
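
The "parse out the server" step mentioned above is the easy part. A minimal Python sketch, assuming all you want is the text after the last "@" (the name `split_address` and the sample addresses are hypothetical); looking up and contacting the mail server for that domain is a separate step not shown here.

```python
from typing import Tuple


def split_address(address: str) -> Tuple[str, str]:
    """Split into (local part, domain) at the LAST "@".

    The last "@" is used because a quoted local part may itself
    contain an "@" character.
    """
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        raise ValueError(f"no local@domain structure in {address!r}")
    return local, domain


print(split_address("bob.frank@example.com"))
print(split_address('"odd@local"@example.org'))
```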

  • (cs)

    (I had to add a bunch of spaces to this to make the damn spam filter let me post, argh.)

    Gmail lets you arbitrarily use periods in your email address:

    bob dot frank at gmail dot com is the same inbox as bobfrank at gmail.com and b dot o dot b dot f dot r dot a dot n dot k at gmail dot com.

    Gmail also lets you use plus signs to assign labels to incoming messages - bob dot frank + noodles at gmail dot com will assign the message the label 'noodles' in your gmail inbox.

    I use it all the time for spam filtering (myaddress + nameofoffendingsite at gmail dot com) but I have run into many, many instances where it thinks my plus sign invalidates the address. Sigh.

    In the mid 90s I had an email address from a free webmail service that allowed you to choose your domain name. I think the site was "My Own Email" or something.

    I signed up with my.name at imatrekkie dot com (yeah yeah) and used it for a few months, signed up for a lot of newsletters. Then I had it forward all my messages to my new-fangled pre-Microsoft Hotmail account.

    Then My Own Email had a major overhaul. Their new software didn't accept email addresses with periods in them. My address was still valid and receiving newsletters and forwarding them to my Hotmail account... but I couldn't log in.

    My old email then got on a spam list and my Hotmail account was absolutely overwhelmed with forwarded crap... and I had no way to log in to delete my account, turn off forwarding, or anything. It was pretty special.
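
The Gmail behaviour described in this comment (periods ignored, everything after a plus sign treated as a label) can be sketched as a normalisation function. This is a rough Python illustration based only on the description above, not on any official specification; `gmail_canonical` is a made-up name and the addresses are placeholders.

```python
def gmail_canonical(address: str) -> str:
    """Collapse Gmail-style variants of an address to one inbox name.

    Assumes, per the comment above, that Gmail ignores periods in the
    local part and treats "+label" as a label. Other domains are
    returned unchanged.
    """
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        raise ValueError(f"not an address: {address!r}")
    if domain.lower() != "gmail.com":
        return address
    base = local.split("+", 1)[0]   # drop the "+label" suffix, if any
    base = base.replace(".", "")    # periods do not matter
    return f"{base}@gmail.com"
```

A site that wants to notice that bob.frank+noodles and bobfrank are the same inbox could compare the canonical forms, while still storing and mailing the address exactly as the user typed it.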

  • (cs) in reply to N. Tufnel
    N. Tufnel:
    Paul:
    The easiest, and most correct way to validate an email address is to send an email to it.

    In what sense does that validate anything, if you don't have access to that address's mailbox?

    You don't know much about SMTP, do you?

  • coop (unregistered) in reply to Paul

    This assumes the e-mail address is accessible on the internet, or is at least on the same network. Consider the case of networks that are completely disconnected from each other...

  • socknet (unregistered) in reply to jonnyq
    jonnyq:
    One would hope that the person providing the email address would have access to it for confirmation...

    I mean seriously... two people this dumb?

    In my opinion, if you're just trying to prevent typing mistakes: anything, followed by an @, followed by anything, followed by a ., followed by anything. That's all I ever use. If you're trying to validate that an email address is real - send a confirmation email. Anything in between is prone to mistakes and probably isn't helping anyone.

    Please provide contact details for your legal representation:

    Name: _____ Phone: _____ Email: ______

    Please provide contact details for your IT Support team:

    Phone: _____ Email: ______

    Please provide your preferred email address to be created:

    Email: ______

    etc.

    There are MANY cases where you might ask someone to provide an email address which they may not have access to.

    Sending real emails to these addresses is pretty silly. 99.9% of the time, the simple validation will suffice. Whether or not you care about that missing 0.1%, and how you do so, would be completely dependent on context.
