The Daily WTF: Curious Perversions in Information Technology

2011-04-27 Reply Admin

pfft:
socknet:
pfft:
socknet:
pfft:
socknet:
EvanED:

99.9% of the time, the simple validation will suffice. Whether or not you care about that missing 0.1%, and how you do so, would be completely dependent on context.

See, I disagree. I think it'd be much closer to 99.9% the other way. How many times have you entered someone else's email address in a form somewhere vs your own? And almost all the time you enter your own, I think it makes sense to send a confirmation email.

You misunderstood my statement here. I am saying that 99.9% of the time, a very basic validation check will be sufficient to ensure an email address is valid and that it is only 0.1% of the time that you have oddball cases (such as '+' characters in the address).

I was in no way referring to the likelihood that a person is entering their own email address vs the email address of another user.
So what you're saying is that your validation code doesn't validate anything and possibly excludes valid email addresses?

No, most people could see that's not what I'm saying - not sure what logic you are using to think that. Please elaborate on how you reach that conclusion from my statement.

Can you please cite this "most people" study?

So you are saying you're not a person and neither are your parents?

</pfft logic>

I'm a person and I understand me. Therefore, most people can understand me.

</socknet_logic>

So you are saying that cats can understand you?

</pfft logic>

(I am still waiting for you to explain the logical steps which you took to take the statement "a simple validation will work most of the time" and conclude "no validation will be performed, but some valid entries will be rejected")

2011-04-27 Reply Admin

A string without a "@" is an invalid email address
A string without a "." is an invalid email address
Partyin' Partyin' Partyin' Partyin'
Fun Fun Fun FUN
Tomorrow is Thursday
And Friday comes after... wards
FRIDAY FRIDAY

2011-04-27 Reply Admin

The REAL reason for "validating" email addresses is to harvest them for SPAM. I own a domain, and I get several SPAM emails to "addresses" that are in headers. Usually these are dumb timestamps, but some email harvester thinks that they are valid email addresses.

As for validating an address, give up! If you are on a local machine with a couple of users, the domain name is implied, so you don't even need an '@', so IN THEORY you could on an email service (gmail) just send a message to "friend" and the implied address would be "[email protected]". Fortunately most email places do check for the '@', but if they are smart, that is about all they do. In fact browser plug-ins highlight what they think are email addresses (simple test!) so you can click on them (like Firefox just did).

Me, I frequently use a dash ('-') to make unique email addresses when requested. That way I can see who is spreading around email addresses (answer: quite a few people).

Face it: there is no simple answer!

Shishire · 2011-04-27 Reply Admin

Herby:
The REAL reason for "validating" email addresses is to harvest them for SPAM. I own a domain, and I get several SPAM emails to "addresses" that are in headers. Usually these are dumb timestamps, but some email harvester thinks that they are valid email addresses.
As for validating an address, give up! If you are on a local machine with a couple of users, the domain name is implied, so you don't even need an '@', so IN THEORY you could on an email service (gmail) just send a message to "friend" and the implied address would be "[email protected]". Fortunately most email places do check for the '@', but if they are smart, that is about all they do. In fact browser plug-ins highlight what they think are email addresses (simple test!) so you can click on them (like Firefox just did).

Me, I frequently use a dash ('-') to make unique email addresses when requested. That way I can see who is spreading around email addresses (answer: quite a few people).

Face it: there is no simple answer!

Incorrect. According to RFC 3696, section 3, an email address must contain both a local part and a remote part, separated by an "@" symbol, even if the mail is going to a mailbox on the same system. Many mail programs violate this, and allow addresses without a remote part.

operagost · 2011-04-27 Reply Admin

Dan:
Still better than just guessing the rules.
The most annoying thing is unsubscribe pages that use different validation rules than the form that got you on the list - they tell you you can't unsubscribe because you are not giving them a valid email address, for an address they are sending daily emails to.

That gets them submitted to black lists, here.

2011-04-27 Reply Admin

No no no.

The easiest and most correct way to validate an e-mail address field is to try sending it an e-mail.

E-mails do not need @ symbols, they do not need a . in the domain, they do not need to contain only letters or numbers.

Nagesh · 2011-04-27 Reply Admin

ÃƒÆ’Ã†â€™Ãƒâ€ Ã¢â‚¬â„¢ÃƒÆ’Ã¢â‚¬Â ÃƒÂ¢Ã¢â€šÂ¬Ã¢â€žÂ¢:
A string without a "@" is an invalid email address
A string without a "." is an invalid email address
Partyin' Partyin' Partyin' Partyin'
Fun Fun Fun FUN
Tomorrow is Thursday
And Friday comes after... wards
FRIDAY FRIDAY

Total false, you corrupted memory chip!

2011-04-27 Reply Admin

You would also need a confirmation of your own existence from a third party. A verification by your mother or father would suffice.

Nagesh · 2011-04-27 Reply Admin

Nagesh2.0:
You would also need a confirmation of your own existence from a third party. A verification by your mother or father would suffice.

Corrupted memory chip has come back to copy my style!!!

2011-04-27 Reply Admin

Nagesh2.0:
You would also need a confirmation of your own existence from a third party. A verification by your mother or father would suffice.

Nope, not even a birth certificate would suffice.

2011-04-27 Reply Admin

Nagesh:

Total false, you corrupted memory chip!

Corrupted memory chip? It was probably manufactured by one of your cousins. Tech support from one of your other cousins didn't help.

Nagesh · 2011-04-27 Reply Admin

ÃƒÆ’Ã†â€™Ãƒâ€ Ã¢â‚¬â„¢ÃƒÆ’Ã¢â‚¬Â ÃƒÂ¢Ã¢â€šÂ¬Ã¢â€:
Nagesh:

Total false, you corrupted memory chip!

Corrupted memory chip? It was probably manufactured by one of your cousins. Tech support from one of your other cousins didn't help.

Are you saying we are related, 8086 procesor?

2011-04-27 Reply Admin

article:
The easiest – and mostly correct – method is to use a regular expression

But the second dude did use a regex

2011-04-27 Reply Admin

Paul:
The easiest, and most correct way to validate an email address is to send an email to it.
The most incorrect way is to use some trivial little regex written by someone who hasn't even heard of RFC822, and just intuits what they think an email address might be.

I have never seen a regex in the wild, that correctly validates email addresses.

I would much rather read through the version in the post, isValidEmailAddress, to debug it than a regular expression complex enough to be comprehensive.

I seem to recall seeing a site that compared several different (user submitted, I believe) regular expressions that claimed to validate email.
For each one they managed to find addresses that failed.

I've never understood the obsession with email validation myself. It's easy enough to make up something like [email protected] - don't know whether it's valid or not, and neither will a validator...

Validating email addresses is similar to validating dates - only the brave attempt it, and more often than not it is not as critical as people think. Having had to work with all sorts of name matching and identity resolution functionality, I've found that entered data - correct or not - is most useful in its rawest form.

Why force users to make data appear more realistic? It only makes the job of verifying whether it's legitimate a lot more difficult...

2011-04-27 Reply Admin

What bothers me the most is the sites that think they need to verify that the email address is not too long. And who don't know how long an address is allowed to be.

The spec is a bit inconsistent, but it is pretty clear that a 256 character long address can be valid. Loads of systems won't even permit 100 characters in the email address.

As for the standard itself, I don't get how somebody, after specifying that the part before the @ could be 64 characters and the part after the @ could be 255, could then go and specify that the entire concatenated string must be at most 256 characters.

If you absolutely have to put a limit on how long an email address your system will support, you should make the limit 320 characters, as anything above that will be a clear violation of the spec.
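
For anyone who wants the limits from the comment above in code, here is a minimal Python sketch. The function name `length_problems` and the constant names are made up for illustration, and the 64/255/320 figures are simply the ones quoted in the comment, not independently checked against the RFCs.

```python
# Limits as quoted in the comment above (assumed, not verified against the RFCs).
MAX_LOCAL = 64     # part before the @
MAX_DOMAIN = 255   # part after the @
MAX_TOTAL = 320    # 64 + 1 + 255, the most generous reading


def length_problems(address: str) -> list[str]:
    """Return a list of length complaints; an empty list means none were found."""
    problems = []
    if len(address) > MAX_TOTAL:
        problems.append("address longer than 320 characters")
    # Split on the LAST @, since a quoted local part may itself contain one.
    local, at, domain = address.rpartition("@")
    if not at:
        problems.append("no @ found, cannot split into local part and domain")
        return problems
    if len(local) > MAX_LOCAL:
        problems.append("local part longer than 64 characters")
    if len(domain) > MAX_DOMAIN:
        problems.append("domain longer than 255 characters")
    return problems
```

Returning a list of complaints rather than a yes/no lets the caller decide whether to reject, warn, or just log.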

2011-04-27 Reply Admin

Paul:
Try and enumerate a few, please.
If you are actually writing the actual email-sending part of an actual email-sending application, then I agree, you can't send an email to validate the address.

If you are doing anything else, then the only reliable (and DRY) way to validate an email address is to try and use it as an email address.

Why reinvent the wheel by doing some other stupid "validation" beforehand? You make the wheel less and less round every time you reinvent it. What if (as is normally the case) your validation incorrectly discards good addresses that your server can cope with? What if your validation is correct, but your email server is wrong? What if your validation deliberately matches your email server's faulty validation, but someone else fixes the server, leaving you with a pointlessly faulty validation?

Even if you do get your syntactic validation correct, (which appears highly unlikely, given the prevalence of incorrect validation in the world); all you know is that it looks like an email address, you have no idea whether it is an email address (i.e. one that resolves to a mailbox)

Agree. Why validate at all? Simply send an email with further instructions. What? The email didn't reach you? Bad luck, there matey.... Of course, it simply means people create a new address for themselves, but many people would be too lazy to create a new email account each time they want to submit a form (and lots seem paranoid that somehow an actual email account {even with bodgey detail} will be more traceable than their {possibly spoofed} IP address).

Not even sure why email is needed in a lot of situations. We seem obsessed with arbitrarily collecting any information users are willing to give - often without thinking about why we collect it. Why, for example, does a forum want name, address, birth date, telephone number etc - They might claim that it is to ensure people behave online, but how thoroughly do they actually check that this 'very critical' data is actually even remotely correct? Other than validating that the email address (and sometimes phone number) looks vaguely valid, not a lot.

<off topic> For the overly paranoid reader, it's all because your personal information is worth a lot of money!! </off topic>

2011-04-27 Reply Admin

frits:
dude:
frits:
For those who don't already know, here's one way to verify whether an email address exists: <snip>. All this nonsense about input validation and sending actual messages isn't much better than the code in the article.

The only difference between sending an email and what was done in the article is actually sending data to be delivered. It's a lot more work than just sending an email, and you have no way of verifying that the user actually got the email because you didn't send one...

How are you verifying again?

Why, click on the link in the email of course...Kinda got to have received the email to click the link, right?

2011-04-27 Reply Admin

N. Tufnel:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

In what sense does that validate anything, if you don't have access to that address's mailbox?

If your mail server sends it, and it doesn't bounce, then it's valid enough to accept. This isn't rocket science.

2011-04-27 Reply Admin

David Martensson:
This one is actually better than most I have seen.
I have actually tried to build a test that was 100 % rfc compliant and the result was horrifyingly large, and we still found existing, working emails that were denied ;)

We finally went with a commercial component that so far has correctly handled every address we feed it.

For any lesser ambition, just check the very simplest format and if you need more, use a verification where you actually send an email to the address and require them to click a link.

With the new Asian letters coming I would not even try to build my own again =)

We haves new letters coming? That be complicate our language some.

2011-04-27 Reply Admin

socknet:
Validate This:
So how are you validating that they're entering the correct name and phone number? Or are you assuming that any (insert country specific format) sequence of digits is _the_ correct phone number for that person?

Not really relevant to this discussion on email validation, but perhaps there are common methods which people use for names and phone numbers, feel free to google it if you are interested.

Validate This:
In each of those cases the difference between entering a validly formatted but incorrect e-mail address and an invalidly formatted e-mail address is zilch. You still have garbage data either way.

Correct, but doesn't really add anything to the conversation. As mentioned, most of the time you don't have to prove with absolute certainty that an email address is correct, being 'reasonably sure' is usually close enough (asking people to input an email address twice seems to be common nowadays and probably catches a lot of entry mistakes). When you do need to be 100% sure on the email address, that is when it is a good time to do things such as send validation emails which require a response.

Actually I always find the 'enter it twice' amusing - usually you can copy paste the first one.

2011-04-27 Reply Admin

socknet:
jonnyq:
One would hope that the person providing the email address would have access to it for confirmation...
I mean seriously... two people this dumb?

In my opinion, if you're just trying to prevent typing mistakes: anything, followed by an @, followed by anything, followed by a ., followed by anything. That's all I ever use. If you're trying to validate that an email address is real - send a confirmation email. Anything in between is prone to mistakes and probably isn't helping anyone.

Please provide contact details for your legal representation:

Name: _____ Phone: _____ Email: ______

Please provide contact details for your IT Support team:

Phone: _____ Email: ______

Please provide your preferred email address to be created:

Email: ______

etc.

There are MANY cases where you might ask someone to provide an email address which they may not have access to.

Sending real emails to these addresses is pretty silly. 99.9% of the time, the simple validation will suffice. Whether or not you care about that missing 0.1%, and how you do so, would be completely dependent on context.

And what does the user do if they haven't a valid email address for the legal representative, or for their IT support?

Only the most basic validation is required on email addresses (that they contain the '@' and it's not first or last should be sufficient).
Knowing whether an email address exists is difficult. Knowing whether it is legitimate is near impossible. Why go to any significant effort to validate an address that may not actually work anyway?

I frequently use things like [email protected] or [email protected] when I don't think I should give an address. It seems to pass validation tests, but what use is it to the people who wanted it? At least if I wrote "Bugger off and leave me alone" they could easily work out that they haven't a valid address - of course this requires manual intervention, but why not use validation to flag addresses for someone to investigate, rather than to tell the user "You tell dirty great lies!!". I agree (for the most part) with an earlier comment that allowing silly addresses to get through (and perhaps having them flagged for manual intervention) allows a far better assessment of whether an email address is valid or not. It should be noted that quite often it seems the address won't necessarily be used ever (eg the legal representation - the email would most likely only be used if the phone number doesn't work) - so why bother validating it?
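
A minimal Python sketch of the approach in the comment above: the only check is that there is an '@' and that it is neither first nor last, and a failure flags the entry for a human instead of rejecting the user. The names `passes_basic_check` and `triage` are hypothetical, as are the two result strings.

```python
def passes_basic_check(address: str) -> bool:
    """True if the text contains an @ and neither starts nor ends with one."""
    return (
        "@" in address
        and not address.startswith("@")
        and not address.endswith("@")
    )


def triage(address: str) -> str:
    """Accept the input either way; only decide whether a human should look at it."""
    return "ok" if passes_basic_check(address) else "flag for review"
```

The raw input is kept whatever the result, so "Bugger off and leave me alone" ends up in the review queue rather than in an error message.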

2011-04-27 Reply Admin

trtrwtf:
jonnyq:
N. Tufnel:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

In what sense does that validate anything, if you don't have access to that address's mailbox?

One would hope that the person providing the email address would have access to it for confirmation...

I mean seriously... two people this dumb?

In my opinion, if you're just trying to prevent typing mistakes: anything, followed by an @, followed by anything, followed by a ., followed by anything. That's all I ever use. If you're trying to validate that an email address is real - send a confirmation email. Anything in between is prone to mistakes and probably isn't helping anyone.

I think there's some confusion about the meaning of "validate" here. Some people seem to think it means "make sure this email address is really the one that belongs to this person", others believe (correctly) that it means "make sure that this email address is a well-formed address".

The former case might be a common scenario, but when you "validate" something you're checking to see if it's valid - not whether it's correct.

Question: Which way at the light
Correct Answer: Left
Valid, but incorrect answer: Right
Invalid Answer: It's just a flesh wound
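
To make the valid-versus-correct distinction concrete, here is a rough Python sketch of the loose pattern quoted above (anything, then an @, then anything, then a ., then anything). The function name `is_well_formed` is made up for illustration, and the pattern is deliberately far looser than the RFC grammar. It only says whether the text has the shape of an address, never whether it is the right one.

```python
import re

# anything, @, anything, a literal dot, anything (each "anything" is one or more characters)
LOOSE_PATTERN = re.compile(r".+@.+\..+")


def is_well_formed(address: str) -> bool:
    """Shape check only: says nothing about whether the mailbox exists."""
    return LOOSE_PATTERN.fullmatch(address) is not None
```

A made-up address at a real-looking domain passes this just as easily as a genuine one, which is the "valid, but incorrect" case; "It's just a flesh wound" fails it.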

2011-04-27 Reply Admin

wow:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

No - it's not wrong, or dumb, but you most certainly are. There is no way to validate an email address with a regex. None. Go read some actual literature on the subject and come back here when you're feeling humble and embarrassed. The easiest, and most correct way to validate an email address is to send an email to it. That is undeniably, eternally true, and you do damage every time you deny it.

Matt Westwood · 2011-04-27 Reply Admin

Gerald:
socknet:
Validate This:
So how are you validating that they're entering the correct name and phone number? Or are you assuming that any (insert country specific format) sequence of digits is _the_ correct phone number for that person?

Not really relevant to this discussion on email validation, but perhaps there are common methods which people use for names and phone numbers, feel free to google it if you are interested.

Validate This:
In each of those cases the difference between entering a validly formatted but incorrect e-mail address and an invalidly formatted e-mail address is zilch. You still have garbage data either way.

Correct, but doesn't really add anything to the conversation. As mentioned, most of the time you don't have to prove with absolute certainty that an email address is correct, being 'reasonably sure' is usually close enough (asking people to input an email address twice seems to be common nowadays and probably catches a lot of entry mistakes). When you do need to be 100% sure on the email address, that is when it is a good time to do things such as send validation emails which require a response.

Actually I always find the 'enter it twice' amusing - usually you can copy paste the first one.

... except for those irritating websites where they don't allow cut and paste from box a to box b, and you have to type the damn thing in twice.

Whether the email is valid in format or not is utterly immaterial. The only thing that matters at the end of the day is whether or not it works. Anything in between is spoo.

Validate the email address by sending an email saying "Please press this link if you are so-and-so, and yadayada ..." which is what most sites do. If you've typed in the email address of a pirate who is about to rape your account from here to buggery then sorry but Darwin's laws apply: you've got to learn to be more careful about what you type.

If you are entering email addresses for functions you do not have direct access to (e.g. as above: "enter email address for your tech support department") then the same ought still to apply: you just send the email as "You have been identified as the tech support team for ..." blahblah.

Am I alone in thinking this is all being made too complicated? Don't overthink.

Jaime · 2011-04-27 Reply Admin

Eric TF Bat:
wow:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

No - it's not wrong, or dumb, but you most certainly are. There is no way to validate an email address with a regex. None. Go read some actual literature on the subject and come back here when you're feeling humble and embarrassed. The easiest, and most correct way to validate an email address is to send an email to it. That is undeniably, eternally true, and you do damage every time you deny it.

Look into mailinator. It's trivial to respond to a confirmation email without giving a single iota of useful information. If you're away from your own computer it's also easier than giving your real address. Anybody who thinks they can validate* an email address is fooling themselves. Mailinator will accept email for any address from anyone, so it will pass any SMTP checks. The only undeniable truth is that email addresses can never be fully validated*, ever.

* Assuming that by validate we mean to confirm that the address can be used in the future to contact this individual

2011-04-27 Reply Admin

jnewton:
RobY:
Not sure about before the @, but most school systems in the area I work all follow the common pattern of [email protected] .

<firstName>.<lastName>@<domain>.com is pretty popular as well

and, for gmail users, [email protected] is gaining momentum

2011-04-27 Reply Admin

...(asking people to input an email address twice seems to be common nowadays and probably catches a lot of entry mistakes).

This is a useless convention as well - it is such a pain in the ass that people simply cut and paste the email address into the duplicate field. Note that we don't do this for other data - why not type in everything twice to be sure?

2011-04-27 Reply Admin

By my understanding, the 100 character limit isn't validation per se, but defense against SQL injection (and other similar) attacks. As the overwhelming majority of email addresses (don't have a study for this, but c'mon) are under 100 characters, the odds that someone inputting something 500 characters long is doing something nefarious are fairly good. There was a study performed at MIT a few years back that you can effectively prevent virtually all injection attacks with a 100 character limit -- as it's hard to do much that's interesting in 100 characters.

2011-04-27 Reply Admin

You misunderstood my statement here. I am saying that 99.9% of the time, a very basic validation check will be sufficient to ensure an email address is valid and that it is only 0.1% of the time that you have oddball cases (such as '+' characters in the address).

Excluding a small percentage of valid data from being entered for no other reason than to have a dubious validation process that serves little or no purpose is total FAIL.

Reminds me of the story of the drivers licensing system that could not issue licenses to people whose surname was less than 3 characters.

2011-04-27 Reply Admin

Eric TF Bat:
No - it's not wrong, or dumb, but you most certainly are. There is no way to validate an email address with a regex. None. Go read some actual literature on the subject and come back here when you're feeling humble and embarrassed. The easiest, and most correct way to validate an email address is to send an email to it. That is undeniably, eternally true, and you do damage every time you deny it.

I'm sure the school will appreciate it when it tries to deal with Little Bobby Relay and gets banned for porn spam.

2011-04-27 Reply Admin

Honestly, anyone writing validation code that disallows subaddressing in the year 2011 should probably be taken out back and shot.

At minimum, their code should be nuked, possibly from orbit.

2011-04-27 Reply Admin

Testing to see if there is an MX value for the domain is a great way to determine if an email address is valid assuming a proper caching nameserver setup and some application side caching to boot.

2011-04-27 Reply Admin

Sysadmin:
Testing to see if there is an MX value for the domain is a great way to determine if an email address is valid assuming a proper caching nameserver setup and some application side caching to boot.

That is, if the DOMAIN for an email address is valid. Now that finger doesn't work anymore there's next to no chance to test the user aside from sending it as one other said.
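
A hedged Python sketch of the MX check with application-side caching, as described in the two comments above. The standard library has no MX lookup, so the resolver is passed in as a function; that `lookup` parameter, its signature, and the name `make_cached_mx_check` are all assumptions for illustration. A real resolver might wrap a DNS library.

```python
from functools import lru_cache
from typing import Callable, Sequence

# Hypothetical resolver signature: takes a domain, returns its MX host names
# (an empty sequence if there are none).
MxLookup = Callable[[str], Sequence[str]]


def make_cached_mx_check(lookup: MxLookup, cache_size: int = 1024) -> Callable[[str], bool]:
    """Wrap a resolver in an application-side cache keyed by lower-cased domain."""

    @lru_cache(maxsize=cache_size)
    def domain_has_mx(domain: str) -> bool:
        return len(lookup(domain)) > 0

    def address_has_mx(address: str) -> bool:
        # Take everything after the last @ as the domain.
        _, at, domain = address.rpartition("@")
        if not at or not domain:
            return False
        return domain_has_mx(domain.lower())

    return address_has_mx
```

As the reply points out, this only says something about the domain, not the user. It is also worth remembering that a domain with no MX record can still receive mail through its address record, so a miss here is a hint rather than proof.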

2011-04-27 Reply Admin

I'll take NullPointerException for 100, Alex.

SQLDave · 2011-04-27 Reply Admin

Question for those on this forum who are smarter in the ways of The Web (tm), which is probably 97% of you as I'm just a lowly DBA.

Like many here, I use the [email protected] format. Whenever I encounter a site which tells me that's an invalid email address, I take the time to send a note to "webmaster" or "contact us" or whatever. Usually I get no reply. Recently, however, I got this reply back from a site that I had an otherwise good experience with:

"Unfortuanalty [sic] hackers use the plus sign in code to hack websites so we strip the + sign out and throw it as an error. I cant change that."

My question for you web/email experts is: is he blowing smoke up my skirt, or is there something to what he said? (My guess: total smoke.)

Thanks!

2011-04-28 Reply Admin

This works for credit card numbers too. Why validate it when you can just run it?

lolwtf · 2011-04-28 Reply Admin

Jaime:
If you need to validate that the email address belongs to the user -- Send a one-off email that the user must respond to.
If you are trying to help a user enter data on a form -- do any half-assed validation and throw up a kindly worded warning if it doesn't match. Allow the user to continue on validation failure.

This. Exactly this.
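
For the first case in the quote (proving the address belongs to the user), a minimal Python sketch of the one-off confirmation flow might look like this. The class name `PendingConfirmations` and its methods are hypothetical, the store is in-memory only, and actually sending the email with the link is left out.

```python
import secrets


class PendingConfirmations:
    """In-memory map from one-off tokens to unconfirmed addresses (illustration only)."""

    def __init__(self) -> None:
        self._pending: dict[str, str] = {}
        self.confirmed: set[str] = set()

    def start(self, address: str) -> str:
        """Create an unguessable token to embed in the confirmation link."""
        token = secrets.token_urlsafe(32)
        self._pending[token] = address
        return token

    def confirm(self, token: str) -> bool:
        """Called when the link is clicked; each token works at most once."""
        address = self._pending.pop(token, None)
        if address is None:
            return False
        self.confirmed.add(address)
        return True
```

No format check is involved at all: if the link gets clicked, the address worked at least once, which is all this flow can show.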

2011-04-28 Reply Admin

EZMoney:
This works for credit card numbers too. Why validate it when you can just run it?

Yah, credit cards are different - their rules about what is and isn't valid are simpler - and we are a little more concerned about accuracy.

While I think of it, where does CC validation occur? On the website, or at some third-party site? (I'm guessing a bit of both, but the point is there would be little reason to do any more than a basic sanity check that we only have numbers - the validity can always be tested by the bank)

2011-04-28 Reply Admin

HellKarnassus:
wow:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.
Why? I think it is a useful idea, you ask the user to input an e-mail address and to confirm it, then you send an activation/validation mail to finish whatever process the app must do. It is less annoying than a script or method that tells you the address is incorrect even though it does exist.

Regex parsing to validate an email address does have a place in the web, but it is an advisory place only. A little piece of javascript or what have you which can indicate that the website thinks there might be a problem with the email address the user has supplied (after all, who doesn't make typos?) but allow the user to carry on anyway, so that the email server, which is the only way to be sure, can actually use the address. The same is true with postal addresses. The number/post code abbreviation (or zip code for the internationally challenged) is a shorthand which works in most but not all cases, as is any email address regex.

Now for a real life example: SWMBO recently applied for a permanent visa. The first clerk checked that all our papers were in order and (before we paid) told us that the application would probably fail because certain information was missing BUT ALLOWED US TO CARRY ON ANYWAY (this is the validation regex). When we did just that, the actual clerk accepted the application regardless of the missing data because he was closer to the actual decision making process (this is the email server itself).

tl;dr: Let me use a fucking + symbol in my email address you useless PHP ridden so-called websites.

2011-04-28 Reply Admin

SQLDave:
Question for those on this forum who are smarter in the ways of The Web (tm), which is probably 97% of you as I'm just a lowly DBA.
Like many here, I use the [email protected] format. Whenever I encounter a site which tells me that's an invalid email address, I take the time to send a note to "webmaster" or "contact us" or whatever. Usually I get no reply. Recently, however, I got this reply back from a site that I had an otherwise good experience with:

"Unfortuanalty [sic] hackers use the plus sign in code to hack websites so we strip the + sign out and throw it as an error. I cant change that."

My question for you web/email experts is: is he blowing smoke up my skirt, or is there something to what he said? (My guess: total smoke.)

Thanks!

Using the + sign is common in SQL injection attacks. Main use is for trying to get into areas you are not supposed to, such as an identification number. Send a 2+2 and if the web site did not do proper programming and use parameterized queries you are now bringing up ID number 4.

It is smoke and mirrors because they are not solving the problem correctly and are directing the problem somewhere else.

Or in other words don't trust that site with any important information.
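
The 2+2 trick described above can be reproduced in a few lines of Python with the standard `sqlite3` module. This is a sketch under assumptions: the `accounts` table, its columns, and the sample rows are invented for the demonstration, and other databases may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, owner TEXT)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(2, "alice"), (4, "bob")])

user_input = "2+2"

# Unsafe: the input is pasted into the SQL text, so the database evaluates 2+2.
unsafe_rows = conn.execute(
    "SELECT owner FROM accounts WHERE id = " + user_input
).fetchall()

# Parameterized: the input is bound as a value and is never parsed as SQL.
safe_rows = conn.execute(
    "SELECT owner FROM accounts WHERE id = ?", (user_input,)
).fetchall()
```

With the concatenated query the row for ID 4 comes back; with the bound parameter the text "2+2" matches nothing. Once queries are parameterized, a + in an email address is just another character, so stripping it protects nothing.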

2011-04-28 Reply Admin

Paul:
I do concede that if you are validating addresses from a list, rather than immediate user input, an email shouldn't actually be sent out to the recipient's mailbox. However, the DRY way to validate is still to use the same tool that will actually be sending the mail (but offline).

If you are typing emails from a list, the most correct you can possibly come is to verify that what is typed into the computer matches what is on the written list.

What if an address as written, real or not, does not pass the validation technique du jour? Do you 'fix' it? Delete it?

2011-04-28 Reply Admin

Jaime:
If you are trying to help a user enter data on a form -- do any half-assed validation and throw up a kindly worded warning if it doesn't match. Allow the user to continue on validation failure.
Notice in both cases that it is not necessary to have a robust validation procedure. I can't think of a reason, other than while writing mail router software, to strictly validate the format of an email address.

This times a million.

PS: If I screw up when typing my email address it's likely to be a misspelled domain or something which your dumbass email validator won't catch anyway.

The ONLY thing you should be doing with input data is checking it for SQL injection attacks.

Also: Don't ask me to "retype your email" in two different input boxes. It gets copied and pasted so any error will just be repeated.

2011-04-28 Reply Admin

Jaime:
* Assuming that by validate we mean to confirm that the address can be used in the future to contact this individual

No, you cannot ever confirm that the address can be used in the future, because that requires prescience. If, for example, someone uses their work email address, that is tied to their tenure at that organisation. The best you can do is check that it reaches them now. However, that is not what we mean by validate.

By validate, we mean confirm that it is formatted according to the RFC. We could go one step further and say we are confirming that it is formatted in such a way as to allow the mail-sending application to parse enough information out of it to be able to send to it.

By actually sending it, even to a service like mailinator, you also get to check whether the receiving server specified in the domain-part can accept mail destined for the target specified in the local-part. It is not relevant whether this is a private mailbox, or a persistent one or any other feature that, simply by convention, one normally associates with mailboxes.

2011-04-28 Reply Admin

Well, the correct regexp to validate an email address is mentioned here: http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html So to really validate an email address, using regexps is stupid. Better to write a proper parser, since the format is much more complicated than anyone can imagine, with all the special character stuff and inline comments.

Regexps on various websites are the reason why professional email users get their addresses rejected all the time; most do not even accept [email protected] as a valid address.

2011-04-28 Reply Admin

Duh... Character string processing WTFs are immortal :-P

2011-04-28 Reply Admin

I test e-mail addresses empirically. Connect to MX server and ask it. When it says the address is ok then you can politely close the connection.

I do run a simple regex to sanitise the input first.

Admittedly this assumes the mail server is up and that you're not validating a huge pile of addresses.
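
A rough sketch of that kind of probe, using Python's standard smtplib. It assumes the MX host has already been looked up some other way (the standard library has no MX resolver). The function name, the `probe_sender` address and the `smtp_factory` parameter are my own inventions for illustration.

```python
import smtplib


def mailbox_accepted(mx_host, address, probe_sender="probe@example.invalid",
                     smtp_factory=smtplib.SMTP, timeout=10):
    """Ask mx_host whether it would accept mail for address, without sending any.

    mx_host is assumed to be the already-resolved MX server for the domain.
    """
    smtp = smtp_factory(mx_host, 25, timeout=timeout)
    try:
        smtp.helo()
        smtp.mail(probe_sender)
        code, _message = smtp.rcpt(address)
        # 250 and 251 mean the server accepted the recipient.
        return code in (250, 251)
    finally:
        try:
            smtp.quit()  # politely close the connection
        except smtplib.SMTPException:
            pass
```

Keep in mind that a server may accept every recipient at this stage, or temporarily refuse unknown senders, so a yes or a no here is only a hint.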

2011-04-28 Reply Admin

The main problem with using the network to validate it is that email isn't instant - it can take a long time to find out whether the address is accepting mail or not.

Personally I agree with the "why ask for an email address in the first place?" crowd - it's done out of habit these days. If you need it for something (e.g. password retrieval) then you also need to verify the person who entered the address actually wants to use your service and has access to that address, so sending an email with a confirmation link is required.

But you can't actually run an SMTP session to do a very meaningful check in real-time, unless you just want to use your local SMTP server to tell you if it thinks the address is bogus or not.

You also need to do some very basic level of parsing, since you can't just dump whatever the user entered straight to your database/mail server. But I wouldn't do any more than:

check there's no whitespace (including newlines)
check there's an @
check there's stuff before and after the @
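
Those three checks could be sketched in Python roughly like this. The function name `looks_like_email` is just an illustrative choice, and this is deliberately not an RFC-level validator.

```python
def looks_like_email(address: str) -> bool:
    """Very basic sanity check: no whitespace, an @, and text on both sides."""
    # Check there's no whitespace (spaces, tabs, newlines).
    if any(ch.isspace() for ch in address):
        return False
    # Check there's an @, splitting on the last one.
    local, at, domain = address.rpartition("@")
    # Check there's stuff before and after the @.
    return bool(at) and bool(local) and bool(domain)


print(looks_like_email("someone+tag@example.org"))
print(looks_like_email("no-at-sign.example.org"))
```

Anything that passes would still need the confirmation email to prove the address works.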

Kiss me I'm Polish · 2011-04-28 Reply Admin

Mike:
The main problem with using the network to validate it is that email isn't instant - it can take a long time to find out whether the address is accepting mail or not.
Personally I agree with the "why ask for an email address in the first place?" crowd - it's done out of habit these days. If you need it for something (e.g. password retrieval) then you also need to verify the person who entered the address actually wants to use your service and has access to that address, so sending an email with a confirmation link is required.

But you can't actually run an SMTP session to do a very meaningful check in real-time, unless you just want to use your local SMTP server to tell you if it thinks the address is bogus or not.

You also need to do some very basic level of parsing, since you can't just dump whatever the user entered straight to your database/mail server. But I wouldn't do any more than:

check there's no whitespace (including newlines)

check there's an @

check there's stuff before and after the @

"Mike Clueless"@example.org is a valid email address. Oops.

2011-04-28 Reply Admin

Kiss me I'm Polish:
"Mike Clueless"@example.org is a valid email address. Oops.

You are of course technically correct (the best kind of correct), but nobody really uses email addresses with spaces in them, because very few systems support them, and even fewer email address "validators" will accept them. ;)

RFC 5321 even says: a host that expects to receive mail SHOULD avoid defining mailboxes where the Local-part requires (or uses) the Quoted-string form

I'm happy to reject the one guy who thinks it's fun to try to use such an address. If he actually wants to pass the validation, he'll just retry using the email address he uses for everything else that rejects that form.

2011-04-28 Reply Admin

The fact that anyone thinks this is something for a regex is the real WTF. Not one regex I've seen yet will check to see whether or not the email address is part of a valid TLD, which is pretty important I'd say!

Sure, it's possible to do it with a regex if you really tried hard, but would you really want a 3000 character regex? Imagine trying to debug it.
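
For what it's worth, a TLD check is easy to do outside a regex. A minimal sketch in Python, where `KNOWN_TLDS` is a tiny hypothetical stand-in for the real list of TLDs and `has_known_tld` is a made-up name:

```python
# Hypothetical subset for illustration; a real check would load the full list.
KNOWN_TLDS = {"com", "org", "net", "edu"}


def has_known_tld(address, known_tlds=KNOWN_TLDS):
    """Return True if the domain part ends in a TLD from known_tlds."""
    # Split on the last @ so an @ inside a quoted local part is ignored.
    _local, at, domain = address.rpartition("@")
    if not at or "." not in domain:
        return False
    tld = domain.rsplit(".", 1)[1].lower()
    return tld in known_tlds
```

A plain set lookup like this is a lot easier to debug than a 3000 character regex, though the set has to be kept up to date.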

"this is still a valid email address @!~?"@someplace.com

Email Validation Validity
