Admin
So you are saying that cats can understand you?
</pfft logic>
(I am still waiting for you to explain the logical steps which you took to take the statement "a simple validation will work most of the time" and conclude "no validation will be performed, but some valid entries will be rejected")
Admin
A string without a "@" is an invalid email address
A string without a "." is an invalid email address
Partyin' Partyin' Partyin' Partyin'
Fun Fun Fun FUN
Tomorrow is Thursday
And Friday comes after... wards
FRIDAY FRIDAY
Admin
The REAL reason for "validating" email addresses is to harvest them for SPAM. I own a domain, and I get several SPAM emails to "addresses" that are in headers. Usually these are dumb timestamps, but some email harvester thinks that they are valid email addresses.
As for validating an address, give up! If you are on a local machine with a couple of users, the domain name is implied, so you don't even need an '@', so IN THEORY you could on an email service (gmail) just send a message to "friend" and the implied address would be "[email protected]". Fortunately most email places do check for the '@', but if they are smart, that is about all they do. In fact browser plug-ins highlight what they think are email addresses (simple test!) so you can click on them (like Firefox just did).
Me, I frequently use a dash ('-') to make unique email addresses when requested. That way I can see who is spreading around email addresses (answer: quite a few people).
Face it: there is no simple answer!
Admin
Incorrect. According to RFC 3696, section 3, an email address must contain both a local part and a remote part, separated by an "@" symbol, even if the mail is going to a mailbox on the same system. Many mail programs violate this and allow addresses without a remote part.
Admin
No no no.
The easiest and most correct way to validate an e-mail address field is to try sending it an e-mail.
E-mails do not need @ symbols, they do not need a . in the domain, they do not need to contain only letters or numbers.
Admin
Total false, you corrupted memory chip!
Admin
You would also need a confirmation of your own existence from a third party. A verification by your mother or father would suffice.
Admin
Corrupted memory chip has come back to copy my style!!!
Admin
Corrupted memory chip? It was probably manufactured by one of your cousins. Tech support from one of your other cousins didn't help.
Admin
Are you saying we are related, 8086 processor?
Admin
But the second dude did use a regex
Admin
I seem to recall seeing a site that compared several different (user submitted, I believe) regular expressions that claimed to validate email.
For each one they managed to find addresses that failed.
I've never understood the obsession with email validation myself. It's easy enough to make up something like [email protected] - don't know whether it's valid or not, and neither will a validator...
Validating email addresses is similar to validating dates - only the brave attempt it, and more often than not it is not as critical as people think. Having had to work with all sorts of name matching and identity resolution functionality, I've found that entered data - correct or not - is most useful in its rawest form.
Why force users to make data appear more realistic? It only makes the job of verifying whether it's legitimate a lot more difficult...
Admin
What bothers me the most is the sites that think they need to verify that the email address is not too long. And who don't know how long an address is allowed to be.
The spec is a bit inconsistent, but it is pretty clear that a 256 character long address can be valid. Loads of systems won't even permit 100 characters in the email address.
As for the standard itself, I don't get how somebody, after specifying that the part before the @ could be 64 characters and the part after the @ could be 255, could then go and specify that the entire concatenated string must be at most 256 characters.
If you absolutely have to put a limit on the length of email addresses your system will support, you should make the limit 320 characters, as anything above that will be a clear violation of the spec.
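A rough sketch of the limits mentioned in this comment, for anyone who wants them in code. The 64/255 part limits and the 320 ceiling are the figures quoted above; the function name and the `max_total` default are made up for illustration, and which overall limit is "right" is exactly the inconsistency being complained about.

```python
def length_problems(address: str, max_total: int = 320) -> list[str]:
    """Return a list of length complaints; an empty list means none were found."""
    # Split on the last '@' so a quoted local part containing '@' stays intact.
    local, sep, domain = address.rpartition("@")
    if not sep:
        return ["no '@' separator"]

    problems = []
    # The limits are counted in octets, so measure the encoded form.
    if len(local.encode("utf-8")) > 64:
        problems.append("local part longer than 64 octets")
    if len(domain.encode("utf-8")) > 255:
        problems.append("domain longer than 255 octets")
    if len(address.encode("utf-8")) > max_total:
        problems.append(f"address longer than {max_total} octets")
    return problems
```

Note that this only reports on length; it says nothing about whether the address is otherwise well formed or deliverable.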
Admin
Not even sure why email is needed in a lot of situations. We seem obsessed with arbitrarily collecting any information users are willing to give - often without thinking about why we collect it. Why, for example, does a forum want name, address, birth date, telephone number etc.? They might claim that it is to ensure people behave online, but how thoroughly do they actually check that this 'very critical' data is even remotely correct? Other than validating that the email address (and sometimes phone number) looks vaguely valid, not a lot.
<off topic> For the overly paranoid reader, it's all because your personal information is worth a lot of money!! </off topic>
Admin
Why, click on the link in the email of course...Kinda got to have received the email to click the link, right?
Admin
If your mail server sends it, and it doesn't bounce, then it's valid enough to accept. This isn't rocket science.
Admin
We haves new letters coming? That be complicate our language some.
Admin
Actually I always find the 'enter it twice' amusing - usually you can copy paste the first one.
Admin
And what does the user do if they haven't a valid email address for the legal representative, or for their IT support?
Only the most basic validation is required on email addresses (that they contain the '@' and it's not first or last should be sufficient).
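The basic check described here fits in a couple of lines. This is only a sketch of that one rule (an '@' is present and is neither the first nor the last character); the function name is made up.

```python
def looks_like_email(text: str) -> bool:
    """True if text contains an '@' and neither starts nor ends with one."""
    return "@" in text and not text.startswith("@") and not text.endswith("@")
```

Anything that passes still has to be proven by actually delivering mail to it; this only catches the most obvious slips.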
Knowing whether an email address exists is difficult. Knowing whether it is legitimate is near impossible. Why go to any significant effort to validate an address that may not actually work anyway?
I frequently use things like [email protected] or [email protected] when I don't think I should give an address. It seems to pass validation tests, but what use is it to the people who wanted it? At least if I wrote "Bugger off and leave me alone" they could easily work out that they haven't a valid address - of course this requires manual intervention, but why not use validation to flag addresses for someone to investigate, rather than to tell the user "You tell dirty great lies!!". I agree (for the most part) with an earlier comment that allowing silly addresses to get through (and perhaps having them flagged for manual intervention) allows a far better assessment of whether an email address is valid or not. It should be noted that quite often it seems the address won't necessarily be used ever (eg the legal representation - the email would most likely only be used if the phone number doesn't work) - so why bother validating it?
Admin
Question: Which way at the light
Correct Answer: Left
Valid, but incorrect answer: Right
Invalid Answer: It's just a flesh wound
Admin
No - it's not wrong, or dumb, but you most certainly are. There is no way to validate an email address with a regex. None. Go read some actual literature on the subject and come back here when you're feeling humble and embarrassed. The easiest, and most correct way to validate an email address is to send an email to it. That is undeniably, eternally true, and you do damage every time you deny it.
Admin
... except for those irritating websites where they don't allow cut and paste from box a to box b, and you have to type the damn thing in twice.
Whether the email is valid in format or not is utterly immaterial. The only thing that matters at the end of the day is whether or not it works. Anything in between is spoo.
Validate the email address by sending an email saying "Please press this link if you are so-and-so, and yadayada ..." which is what most sites do. If you've typed in the email address of a pirate who is about to rape your account from here to buggery then sorry but Darwin's laws apply: you've got to learn to be more careful about what you type.
If you are entering email addresses for functions you do not have direct access to (e.g. as above: "enter email address for your tech support department") then the same ought still to apply: you just send the email as "You have been identified as the tech support team for ..." blahblah.
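For what it's worth, the "press this link" step above can be sketched with nothing but the standard library. Everything named here is an assumption for illustration: `SECRET_KEY`, the `example.invalid` URL, and the function names are hypothetical, and a real site would load a persistent key and probably expire tokens.

```python
import hashlib
import hmac
import secrets
from urllib.parse import urlencode

# Hypothetical key; regenerated on every start here, which a real site would not do.
SECRET_KEY = secrets.token_bytes(32)


def make_token(address: str) -> str:
    """Derive a token that only this server can produce for this address."""
    return hmac.new(SECRET_KEY, address.encode("utf-8"), hashlib.sha256).hexdigest()


def confirmation_link(address: str, base_url: str = "https://example.invalid/confirm") -> str:
    """Build the link that goes into the confirmation email."""
    query = urlencode({"email": address, "token": make_token(address)})
    return f"{base_url}?{query}"


def is_confirmed(address: str, token: str) -> bool:
    """Check a token that came back from a clicked link."""
    expected = make_token(address).encode("ascii")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, token.encode("utf-8"))
```

The address is only treated as working once someone with access to that mailbox clicks the link, which is the whole point of the comment.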
Am I alone in thinking this is all being made too complicated? Don't overthink.
Admin
and, for gmail users, [email protected] is gaining momentum
Admin
This is a useless convention as well - it is such a pain in the ass that people simply cut and paste the email address into the duplicate field. Note that we don't do this for other data - why not type in everything twice to be sure?
Admin
By my understanding, the 100 character limit isn't validation per se, but defense against SQL injection (and other similar) attacks. As the overwhelming majority of email addresses (don't have a study for this, but c'mon) are under 100 characters, the odds that someone inputting something 500 characters long is doing something nefarious are fairly good. There was a study performed at MIT a few years back showing that you can effectively prevent virtually all injection attacks with a 100 character limit -- as it's hard to do much that's interesting in 100 characters.
Admin
Excluding a small percentage of valid data from being entered for no other reason than to have a dubious validation process that serves little or no purpose is total FAIL.
Reminds me of the story of the drivers licensing system that could not issue licenses to people whose surname was less than 3 characters.
Admin
I'm sure the school will appreciate it when it tries to deal with Little Bobby Relay and gets banned for porn spam.
Admin
Honestly, anyone writing validation code that disallows subaddressing in the year 2011 should probably be taken out back and shot.
At minimum, their code should be nuked, possibly from orbit.
Admin
Testing to see if there is an MX value for the domain is a great way to determine if an email address is valid, assuming a proper caching nameserver setup and some application-side caching to boot.
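A sketch of the application-side caching part, under a big assumption: Python's standard library has no MX lookup, so `lookup_mx` below is a hypothetical callable (domain in, list of MX host names out) that you would have to supply from whatever DNS library you use. `lru_cache` also has no expiry, so this ignores DNS TTLs.

```python
from functools import lru_cache
from typing import Callable, Sequence


def make_mx_checker(lookup_mx: Callable[[str], Sequence[str]], cache_size: int = 1024):
    """Wrap a hypothetical MX lookup callable with a per-domain cache."""

    @lru_cache(maxsize=cache_size)
    def domain_has_mx(domain: str) -> bool:
        return len(lookup_mx(domain)) > 0

    def address_has_mx(address: str) -> bool:
        # Take everything after the last '@' as the domain.
        _, sep, domain = address.rpartition("@")
        if not sep or not domain:
            return False
        # Domain names are case-insensitive, so normalise before caching.
        return domain_has_mx(domain.lower())

    return address_has_mx
```

One caveat: a domain with no MX record can still receive mail through its address record, so a missing MX is a hint rather than proof.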
Admin
I'll take NullPointerException for 100, Alex.
Admin
Question for those on this forum who are smarter in the ways of The Web (tm), which is probably 97% of you as I'm just a lowly DBA.
Like many here, I use the [email protected] format. Whenever I encounter a site which tells me that's an invalid email address, I take the time to send a note to "webmaster" or "contact us" or whatever. Usually I get no reply. Recently, however, I got this reply back from a site that I had an otherwise good experience with:
"Unfortuanalty [sic] hackers use the plus sign in code to hack websites so we strip the + sign out and throw it as an error. I cant change that."
My question for you web/email experts is: is he blowing smoke up my skirt, or is there something to what he said? (My guess: total smoke.)
Thanks!
Admin
This works for credit card numbers too. Why validate it when you can just run it?
Admin
Yah, credit cards are different - the rules about what is and isn't valid are simpler - and we are a little more concerned about accuracy.
While I think of it, where does CC validation occur? On the website, or at some third-party site? (I'm guessing a bit of both, but the point is there is little reason to do any more than a basic sanity check that we only have numbers - the validity can always be tested by the bank.)
Admin
Regex parsing to validate an email address does have a place in the web, but it is an advisory place only. A little piece of javascript or what have you which can indicate that the website thinks there might be a problem with the email address the user has supplied (after all, who doesn't make typos?) but allow the user to carry on anyway, so that the email server, which is the only way to be sure, can actually use the address. The same is true with postal addresses. The number/post code abbreviation (or zip code for the internationally challenged) is a shorthand which works in most but not all cases, as is any email address regex.
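Roughly what "advisory only" could look like on the server side. The pattern and both function names are invented for this sketch; the pattern is deliberately loose and is not claimed to match the RFC grammar.

```python
import re
from typing import Optional

# Loose shape check: something, '@', something, '.', something, no whitespace.
ADVISORY_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def email_advice(address: str) -> Optional[str]:
    """Return a warning to show the user, or None if nothing looks odd."""
    if ADVISORY_PATTERN.fullmatch(address):
        return None
    return "This doesn't look like a typical email address. Is it correct?"


def accept_email(address: str) -> tuple[str, Optional[str]]:
    """Always accept the address; the warning is informational only."""
    return address, email_advice(address)
```

The design choice is that `accept_email` never rejects anything. An unusual address produces a nudge, and the mail server gets the final say.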
Now for a real life example: SWMBO recently applied for a permanent visa. The first clerk checked that all our papers were in order and (before we paid) told us that the application would probably fail because certain information was missing BUT ALLOWED US TO CARRY ON ANYWAY (this is the validation regex). When we did just that, the actual clerk accepted the application regardless of the missing data because he was closer to the actual decision making process (this is the email server itself).
tl;dr: Let me use a fucking + symbol in my email address you useless PHP ridden so-called websites.
Admin
Using the + sign is common in SQL injection attacks. The main use is for trying to get into areas you are not supposed to, such as an identification number. Send a 2+2, and if the web site did not do proper programming and use parameterized queries, you are now bringing up ID number 4.
It is smoke and mirrors because they are not solving the problem correctly and are directing the problem somewhere else.
Or in other words don't trust that site with any important information.
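To make the 2+2 point concrete, here is a small self-contained demonstration using an in-memory SQLite database. The table, its rows, and the function names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(2, "public"), (4, "private")],
)


def fetch_unsafe(raw_id: str):
    # String concatenation: the database evaluates "2+2" as arithmetic.
    return conn.execute("SELECT name FROM items WHERE id = " + raw_id).fetchall()


def fetch_safe(raw_id: str):
    # Placeholder: the input is bound as a value and never parsed as SQL.
    return conn.execute("SELECT name FROM items WHERE id = ?", (raw_id,)).fetchall()
```

With the input `"2+2"`, `fetch_unsafe` returns the row with ID 4, while `fetch_safe` returns nothing. The fix is the placeholder, not stripping `+` out of email addresses.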
Admin
If you are typing emails from a list, the closest to correct you can possibly get is to verify that what is typed into the computer matches what is on the written list.
What if an address as written, real or not, does not pass the validation technique du jour? Do you 'fix' it? Delete it?
Admin
This times a million.
PS: If I screw up when typing my email address, it's likely to be a misspelled domain or something which your dumbass email validator won't catch anyway.
The ONLY thing you should be doing with input data is checking it for SQL injection attacks.
Also: Don't ask me to "retype your email" in two different input boxes. It gets copied and pasted so any error will just be repeated.
Admin
No, you cannot ever confirm that the address can be used in the future, because that requires prescience. If, for example, someone uses their work email address, that is tied to their tenure at that organisation. The best you can do is check that it reaches them now. However, that is not what we mean by validate.
By Validate, we mean confirm that it is formatted according to the RFC. We could go one step further and say we are confirming that it is formatted in such a way as to allow the mail-sending application to parse enough information out of it to be able to send to it.
By actually sending it, even to a service like mailinator, you also get to check whether the receiving server specified in the domain-part can accept mail destined for the target specified in the local-part. It is not relevant whether this is a private mailbox, or a persistent one or any other feature that, simply by convention, one normally associates with mailboxes.
Admin
Well, the correct regexp to validate an email address is at http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html and, as it shows, using regexps to really validate an email address is stupid. Better to write a proper parser, since the format is much more complicated than anyone would imagine, with all the special character stuff and inline comments.
Regexps on various websites are the reason why professional email users get their emails rejected all the time; most do not even accept [email protected] as a valid address.
Admin
Duh... Character string processing WTFs are immortal :-P
Admin
I test e-mail addresses empirically. Connect to MX server and ask it. When it says the address is ok then you can politely close the connection.
I do run a simple regex to sanitise the input first.
Admittedly this assumes the mail server is up and that you're not validating a huge pile of addresses.
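A minimal sketch of the "connect to the MX server and ask it" approach using Python's `smtplib`. The function name, the `probe@example.invalid` sender, and the `smtp_factory` parameter (there so the function can be exercised without a network) are assumptions for illustration. Finding `mx_host` is left to the caller.

```python
import smtplib


def mailbox_accepted(address: str, mx_host: str,
                     sender: str = "probe@example.invalid",
                     smtp_factory=smtplib.SMTP, timeout: float = 10.0) -> bool:
    """Ask mx_host whether it would accept mail for address, without sending any."""
    server = smtp_factory(mx_host, 25, timeout=timeout)
    try:
        server.helo()
        server.mail(sender)
        # RCPT TO is where the server says whether it knows the mailbox.
        code, _ = server.rcpt(address)
        return 200 <= code < 300
    finally:
        try:
            # Close politely; no DATA command was issued, so nothing is sent.
            server.quit()
        except smtplib.SMTPException:
            pass
```

Connection failures propagate as exceptions here. Also, a positive answer is weaker than it looks: some servers accept every recipient at this stage and bounce later, and others defer or refuse probes outright.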
Admin
The main problem with using the network to validate it is that email isn't instant - it can take a long time to find out whether the address is accepting mail or not.
Personally I agree with the "why ask for an email address in the first place?" crowd - it's done out of habit these days. If you need it for something (e.g. password retrieval) then you also need to verify the person who entered the address actually wants to use your service and has access to that address, so sending an email with a confirmation link is required.
But you can't actually run an SMTP session to do a very meaningful check in real-time, unless you just want to use your local SMTP server to tell you if it thinks the address is bogus or not.
You also need to do some very basic level of parsing, since you can't just dump whatever the user entered straight to your database/mail server. But I wouldn't do any more than:
Admin
RFC 5321 even says: a host that expects to receive mail SHOULD avoid defining mailboxes where the Local-part requires (or uses) the Quoted-string form
I'm happy to reject the one guy who thinks it's fun to try to use such an address. If he actually wants to pass the validation, he'll just retry using the email address he uses for everything else that rejects that form.
Admin
The fact that anyone thinks this is something for a regex is the real WTF. Not one regex I've seen yet will check to see whether or not the email address is part of a valid TLD, which is pretty important I'd say!
Sure, it's possible to do it with a regex if you really tried hard, but would you really want a 3000 character regex? Imagine trying to debug it.
"this is still a valid email address @!~?"@someplace.com