The Daily WTF: Curious Perversions in Information Technology

2011-04-28

Irritated user:
If you are typing emails from a list, the most correct you can possibly come is to verify that what is typed into the computer matches what is on the written list.
What if an address as written, real or not, does not pass the validation technique du jour? Do you 'fix' it? Delete it?

I made no mention of typing things in from a written list, and neither did the post I was responding to. In fact, that would count as "immediate user input". I'm talking about iterating over addresses in a list that is already in the computer. Perhaps in an XML file or CSV, or maybe an array of addresses randomly generated in memory.

However, since you ask - when things from a hard copy don't validate when turned into soft copy, the correct thing to do is normally to report it to a human, who can then choose the correct action by using their brains.

ParkinT · 2011-04-28

We need to return to the days of Compuserve, where ALL usernames were comprised of only numerals.

2011-04-28

Don't forget the poor saps in the .museum domain that can never pass that 5 char TLD limit!

2011-04-28

Shishire:
http://code.google.com/p/isemail/source/browse/trunk/is_email.php?r=6
That is a link to what is quite possibly the only truly correct email validator out there.

It is not an email validator. At best, it would be an email ADDRESS validator. And I'm not so sure it would pass the previously given example "#"@"#"@[IPv6:::ffff:173.230.158.172], given the way it initially looks for a @.

2011-04-28

I use + all the time on one account that gets an enormous amount of spam.

I made a list of words to use after the plus and created a procmail filter to accept those combinations. For example, if the word list was maroon, acre, lightning, saturn, and piano, then the acceptable e-mail addresses for [email protected] would be [email protected], [email protected], [email protected], and [email protected]. I then would keep a copy of the list with me and if someone needed an address, I would give them the next one on the list.

Any e-mail coming in to those addresses was accepted, but if I started to get spammed at one combination, it would be a simple matter to reject any and all incoming e-mail to that address.
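
The accept/reject logic described above can be sketched in a few lines of Python. This is only an illustration, not the actual procmail recipe: the mailbox `user@example.com` and the names `ACCEPTED_TAGS`, `BLOCKED_TAGS`, `split_plus_address` and `accept_tagged` are made up for the example; only the word list comes from the post.

```python
# Words allowed after the plus sign (the list from the post).
ACCEPTED_TAGS = {"maroon", "acre", "lightning", "saturn", "piano"}
# Tags that started attracting spam and are now rejected (hypothetical).
BLOCKED_TAGS = set()


def split_plus_address(address):
    """Split 'base+tag@domain' into (base, tag, domain); tag is None if absent."""
    local, _, domain = address.rpartition("@")
    base, plus, tag = local.partition("+")
    return base, (tag if plus else None), domain


def accept_tagged(address):
    """Accept only mail sent to a known tag that has not been blocked."""
    _, tag, _ = split_plus_address(address)
    return tag is not None and tag in ACCEPTED_TAGS and tag not in BLOCKED_TAGS
```

Blocking a spammed combination is then just `BLOCKED_TAGS.add("maroon")`. Untagged mail returns False here because it would go through separate rules.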

For e-mail coming in without the +something to [email protected], if it was encrypted with my PGP key, signed by the sender's PGP key, from a specific whitelist of individual addresses, or originating from anyone on the local network, the e-mail was delivered okay.

All other e-mail is dumped into a trash folder. Originally I automatically responded back with a message telling the sender what it would take for the e-mail to be delivered, but that never seemed to do any good.

The number of spams on that address went from 50 or more a day to 0.

2011-04-28

Wow, checking the link in the story and looking at the RFC 2822 email validation regex, I've confirmed that regex is dumb as shit and Perl programmers are even dumber.

2011-04-28

will:
SQLDave:
Question for those on this forum who are smarter in the ways of The Web (tm), which is probably 97% of you as I'm just a lowly DBA.
Like many here, I use the [email protected] format. Whenever I encounter a site which tells me that's an invalid email address, I take the time to send a note to "webmaster" or "contact us" or whatever. Usually I get no reply. Recently, however, I got this reply back from a site that I had an otherwise good experience with:

"Unfortuanalty [sic] hackers use the plus sign in code to hack websites so we strip the + sign out and throw it as an error. I cant change that."

My question for you web/email experts is: is he blowing smoke up my skirt, or is there something to what he said? (My guess: total smoke.)

Thanks!

Using the + sign is common in SQL injection attacks. The main use is trying to get into areas you are not supposed to, such as a record behind an identification number. Send a 2+2, and if the web site was not programmed properly with parameterized queries, you are now bringing up ID number 4.

It is smoke and mirrors because they are not solving the problem correctly and are pushing the problem somewhere else.

Or in other words don't trust that site with any important information.

or.. (hypothetically speaking of course), people could do things like: select password + 4 from auth_table where user_id = 'admin'. Then you'd get an error saying something like: "Error: could not convert hunter2 to integer" or whatever.. the point being that you want to try and cause a type mismatch to get the error message printed to the screen.
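
To make the 2+2 example concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, its columns and the sample rows are invented for illustration, and other databases may coerce types differently. It shows the same input evaluated as arithmetic when spliced into the SQL text, and treated as plain data when bound as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT)")
conn.executemany(
    "INSERT INTO accounts (id, owner) VALUES (?, ?)",
    [(2, "alice"), (4, "bob")],
)

user_input = "2+2"

# Unsafe: the input becomes part of the SQL text, so the database
# evaluates 2+2 and looks up ID 4.
unsafe_rows = conn.execute(
    "SELECT owner FROM accounts WHERE id = " + user_input
).fetchall()

# Safe: the input is bound as a value. SQLite compares the id column
# with the string '2+2', which matches no row.
safe_rows = conn.execute(
    "SELECT owner FROM accounts WHERE id = ?", (user_input,)
).fetchall()

# A plus sign in an e-mail address is harmless when bound the same way.
conn.execute(
    "INSERT INTO accounts (id, owner) VALUES (?, ?)",
    (7, "user+tag@example.com"),
)
plus_rows = conn.execute(
    "SELECT owner FROM accounts WHERE id = ?", (7,)
).fetchall()
```

With parameters there is no reason to strip or reject the + character at all.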

2011-04-28

grzlbrmft:
wow:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

In case you can count that far, name three, please.

You can't confirm the email was received without access to the inbox.
You get your sender's domain flagged on an RBL.
Bandwidth waste.

You must be a regex and Perl fan.

pjt33 · 2011-04-28

Yazeran:
Splognosticus:
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

So easy a child could do it.

Fail (according to the perl module Mail::RFC822::Address: regexp-based address validation):

(?:(?:\r\n)?[ \t])(?:(?:(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t] )+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?: \r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:( ?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\0 31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)
](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+ (?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?: (?:\r\n)?[ \t])))|(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z |(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n) ?[ \t]))<(?:(?:\r\n)?[ \t])(?:@(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:
r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n) ?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t] )))(?:,@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])* )(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t] )+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))) :(?:(?:\r\n)?[ \t]))?(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+ |\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r \n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?: \r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t ]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031 ]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)]( ?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(? :(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(? :\r\n)?[ \t])))>(?:(?:\r\n)?[ \t]))|(?:[^()<>@,;:\".[] \000-\031]+(?:(? :(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)? [ \t]))"(?:(?:\r\n)?[ \t])):(?:(?:\r\n)?[ \t])(?:(?:(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]| \.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<> @,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|" (?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t] )(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\ ".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(? 
:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[ ]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))|(?:[^()<>@,;:\".[] \000- \031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|( ?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))<(?:(?:\r\n)?[ \t])(?:@(?:[^()<>@,; :\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([ ^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\" .[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[
]\r\]|\.)](?:(?:\r\n)?[ \t])))(?:,@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".
[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]
r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\] |\.)](?:(?:\r\n)?[ \t])))):(?:(?:\r\n)?[ \t]))?(?:[^()<>@,;:\".[] \0 00-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\ .|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@, ;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(? :[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t]) (?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\". []]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[ ^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[] ]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))>(?:(?:\r\n)?[ \t]))(?:,\s( ?:(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\ ".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:( ?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[ ["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t ])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t ])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(? :.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+| \Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))|(?: [^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[
]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))<(?:(?:\r\n) ?[ \t])(?:@(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[[" ()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n) ?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<> @,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))(?:,@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@, ;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t] )(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\ ".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))):(?:(?:\r\n)?[ \t]))? (?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\". []]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?: \r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[[ "()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]) ))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t]) +|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:
.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z |(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))>(?:( ?:\r\n)?[ \t]))))?;\s*)

And even that is only with comments removed......

I've never understood why that Perl regex is so long, and I've implemented validation myself using regexes and following the RFCs. If you can provide a test case which the shorter regex fails and the longer one doesn't then I'll be interested to check whether my validator passes or not.

dtobias · 2011-04-28

Christopher:
It's doubleplus ungood that you have to log into the site using your email address,

Why does every site seem to want to do this these days (use the email address as the username)? Just over the last few months I've had several sites that had perfectly good login systems where I had chosen short, memorable usernames which then suffered a redesign in which they insisted on changing the login system to use my email address instead. I dislike that for a number of reasons, including that I store my myriad usernames/passwords in an iPhone app, and typing all-letter usernames like 'dtobias' is much simpler than typing e-mail addresses, which requires switching to the symbols touchscreen keyboard with the at sign in it.

dohpaz42 · 2011-04-28

TRWTF are all of the commenters who are confusing the validity of an email address with the ability to send an email to said address. Those are two separate and distinct functions that complement each other but should be treated separately. In most real-world production systems it may be infeasible from a business perspective to waste the customer's valuable time trying to validate whether or not email can be sent to an email address; PHP, for example, does provide getmxrr() to test a domain for valid MX records. The problem is that, with enough load and traffic, this will block your website until the function returns. So it is generally acceptable to validate only the format of an email address and worry about bounces on whatever system actually sends out emails (e.g., newsletters). This is why a lot of sites have adopted the paradigm of forcing a user to validate their account via email.

2011-04-28

ted:
1. You can't confirm the email was received without access to the inbox.

Not relevant, you can't do that with a regular expression either. Whether it is received is a different matter to whether it was sent.

By sending the email you have proven that the address is parseable enough to be able to send email to it. If the sender fails to send, you know that it is not parseable enough.

Surely this is better than proving that the address only has alphabetic characters with an "@" in the middle and a "." 3-4 characters from the right-hand edge. Such proof has very little to do with whether it is a valid email address or not.

ted:
2. You get your sender's domain flagged on an RBL.

If you use this method to validate a big list, without alerting the recipients, then set up some mechanism whereby the emails don't actually reach the recipients. Perhaps configure DNS so that the mail server sends everything to itself.

That will still be easier and more correct than writing a regular expression that accepts all valid email addresses and rejects all invalid ones (considering that even the enormous one that people have already linked to above requires you to preprocess the address before testing it with the regex).

If you are actually in charge of your mail server, then it may even be easier than writing a proper parser for emails, I'm not sure, but it's certainly more DRY.

ted:
3. Bandwidth waste.

A) That depends on your definition of waste. I'm sure that for most applications, the bandwidth required to do it this way is much cheaper (in money) than the developer time involved in creating an effective email address validator.

B) If you actually only have a small, finite amount of bandwidth and you absolutely must do the validation without using any, then see my response to 2, above.

Kuba · 2011-04-28

Paul:
The most incorrect way is to use some trivial little regex written by someone who hasn't even heard of RFC822, and just intuits what they think an email address might be.
I have never seen a regex in the wild, that correctly validates email addresses.

Means you didn't look two posts before yours. FAIL.

dtobias · 2011-04-28

ASDG:
I frequently use things like [email protected] or [email protected] when I don't think I should give an address.

You ought to use an RFC-compliant fake address like [email protected] or [email protected], instead of things like you mentioned which might be somebody's actual address.

2011-04-28

dohpaz42:
TRWTF are all of the commenters who are confusing the validity of an email address with the ability to send an email to said address. Those are two separate and distinct functions that complement each other but should be treated separately. In most real-world production systems it may be infeasible from a business perspective to waste the customer's valuable time trying to validate whether or not email can be sent to an email address; PHP, for example, does provide getmxrr() to test a domain for valid MX records. The problem is that, with enough load and traffic, this will block your website until the function returns. So it is generally acceptable to validate only the format of an email address and worry about bounces on whatever system actually sends out emails (e.g., newsletters). This is why a lot of sites have adopted the paradigm of forcing a user to validate their account via email.

If you are collecting an e-mail address for the sake of having an e-mail address and do not intend on ever sending an e-mail to that address there's no reason to validate. It's a waste of resources. Hell, if you're never going to use it, why even ask for it?

2011-04-28

I used this code for a site I was managing: http://www.dominicsayers.com/isemail

It reduced the number of trouble calls due to invalid email addresses significantly once it was implemented. What I especially liked was the fact that it did an MX lookup on the domain as the final step to make sure it was valid. That catches all the hotmal.com, yhaoo.com, etc. that would pass normal validation.

Of course they can still misspell the first part of their address, since we can't validate that, BUT we have them enter the email address twice and raise an error if the two differ, to try to mitigate that as well :)

2011-04-28

E-mail validation, like any other validation, is meant to help the user, not to slap him on the wrist when he makes a mistake.

I think most errors in users' e-mail addresses are simple typos like [email protected] instead of [email protected], and the mighty 100-line regex doesn't help against that.

Also, if you need a valid e-mail address from a user, send a confirmation mail, this way you can be sure the user double checks what he enters.

If you don't want bogus data in your database, don't ask for data which is going to be bogus most of the time. Like if you make 'hobbies' a required field, you can bet it's going to be something like "adsf" most of the time.

Nagesh · 2011-04-28

Just for recording

Guy posting image saying "First Post" is now annoying me lots.

2011-04-28

I guarantee you that I've seen that CodeSOD on a couple of sites. I have the domain elite-systems.org registered, but when I came across that I ended up having to register the domain elitesystems.org as well and set up an alias.

2011-04-28

ted:
grzlbrmft:
wow:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

In case you can count that far, name three, please.

You can't confirm the email was received without access to the inbox.

You get your sender's domain flagged on an RBL.

Bandwidth waste.

You must be a regex and Perl fan.

There's something called "double opt-in" that's very useful if you're trying to maintain a mailing list.

How it works: the user subscribes (via website or email) to your mailing list. The server sends back a response, and the user simply hits reply and send. This validates the email between the server and client (if there's an RBL in progress, the user won't get the confirmation email and the server won't bother sending emails to a blackhole'd address). This also confirms that the user WANTS the email. Perhaps the user made a typo and the email is going to someone else's account. All they have to do is ... nothing. Or hit delete. And they won't be bothered again. Hence, double opt-in. The user opted in once, and confirmed that yes, they really really really do want the email.
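
The bookkeeping behind double opt-in can be sketched roughly like this. It is a toy in-memory model, not any particular mailing-list package: the class `DoubleOptIn` and its method names are hypothetical, a real list manager would persist the pending tokens, and actually sending the confirmation mail is left out.

```python
import secrets


class DoubleOptIn:
    """Track addresses awaiting confirmation and addresses that confirmed."""

    def __init__(self):
        self.pending = {}       # token -> address waiting for confirmation
        self.confirmed = set()  # addresses that completed the second opt-in

    def subscribe(self, address):
        """First opt-in: remember the address and return the token to mail out."""
        token = secrets.token_urlsafe(16)
        self.pending[token] = address
        return token

    def confirm(self, token):
        """Second opt-in: the recipient sent the token back."""
        address = self.pending.pop(token, None)
        if address is None:
            return False  # unknown or already used token
        self.confirmed.add(address)
        return True
```

An address whose owner never replies simply stays in `pending` and never receives list mail, which is the "do nothing or hit delete" case.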

Yes, I have been "subscribed" to many email lists by those who don't check (probably spammers and the like). I write a procmail recipe to filter out those emails and delete them, because their unsubscribes don't work. Hell, I keep getting emails from British Telecom about some phone services. Never could figure out how to get access to the account so I could get some phone cards or upgrade the guy's bill or something.

I also get the occasional joe-job with someone using a whitelist. I do whatever it takes to get that email accepted because they obviously don't know about backscatter spamming. (I've always wanted to use a public wifi and the like to send a pile of emails to those addresses with a fake header leading back to RBL honeypots to get those whitelist domains blocked...).

2011-04-28

Please make the bad code go away.

2011-04-29

Even proper validators sometimes forbid '+', which pisses me off. Haven't they heard of plus addressing?

2011-04-29

The ONLY thing you should be doing with input data is checking it for SQL injection attacks.

No you shouldn't. As soon as you start checking input for any kind of injection attacks, you are going to end up with a terrible system.

The way to code is to use proper escaping. (SQL can do this behind the scenes for you because the API has ways to pass statements and data as separate string arguments.)

The only place where you need to check for injection attacks is in your unit tests. If your unit tests use all the special characters that could be used to perform attacks, and verify that they are properly escaped and unescaped, then you are well protected against injection.
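
A round-trip test of that kind might look like the sketch below. It assumes Python's sqlite3 module and a throwaway in-memory table. The helper `round_trip` and the `TRICKY` sample list are made up for the example, and the list is nowhere near exhaustive.

```python
import sqlite3


def round_trip(values):
    """Store each value through a bound parameter and read everything back."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE addresses (addr TEXT)")
    conn.executemany(
        "INSERT INTO addresses (addr) VALUES (?)", [(v,) for v in values]
    )
    stored = [
        row[0]
        for row in conn.execute("SELECT addr FROM addresses ORDER BY rowid")
    ]
    conn.close()
    return stored


# Addresses containing characters that break naive string concatenation.
TRICKY = [
    "o'brien@example.com",
    "user+tag@example.com",
    '"quoted; DROP TABLE addresses;--"@example.com',
    "back\\slash@example.com",
    "percent%under_score@example.com",
]
```

The unit test is then a single assertion that `round_trip(TRICKY) == TRICKY`: every value comes back byte for byte and nothing needed to be forbidden.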

If OTOH you try to just forbid "invalid" characters at a higher level before saving the email address in your database, you will be rejecting perfectly valid email addresses.

It may come as a surprise to many people, but the local part of an email address permits many characters. In fact the original spec permitted every single 7-bit ASCII character. Yes, all 128 of them, including the NUL character, which didn't even have to be escaped. Only CR, LF, ", and \ had to be escaped by putting a \ in front of them.

The wording in the spec about valid characters was: "any one of the 128 ASCII characters (no exceptions)"

A later update to the spec says that you should not define email addresses that require quoting. But you still have to support it for interoperability.

2011-04-29

This is what I was referring to, and it's very task-specific.

If you want to confirm that the email address entered belongs to:

a living person with
some way to read mails sent to them and
a mailbox which can accept mails (not full, etc) and
isn't behind a spam filter which will eat your verification email and
has some way to successfully confirm that they received the email (so can contact your server, or send a reply email, or whatever) and
your outgoing mail server is working and
you have some sort of extra persistent database to store the keys you're using to validate emails so you know which response is which and
the time taken for mail to get from A to B and for the user to do their confirmation action is not important

then sure, use the standard call-and-response stuff. If you just want to validate whether "[email protected]" is okay for possible future use, then all you have is syntax checking.

VxJasonxV · 2011-04-29

The real wtf is that a + is perfectly ok in e-mail addresses.

dtobias · 2011-04-29

If you want to prove that the address belongs to a living person, should you demand a birth certificate? Is the short form OK?

2011-04-29

...and RFC 2822 was superseded in October 2008 by 5322.

Cheers

2011-04-29

Bloomer:
EZMoney:
This works for credit card numbers too. Why validate it when you can just run it?

Yeah, credit cards are different: the rules about what is and isn't valid are simpler, and we are a little more concerned about accuracy.

While I think of it, where does CC validation occur? On the website, or at some third-party site? (I'm guessing a bit of both, but the point is there would be little to stop you from doing no more than a basic sanity check that we only have numbers; the validity can always be tested by the bank.)

You could at least do a Luhn checksum. It works for most cards.
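
For reference, a Luhn check is only a few lines. The sketch below is a plain Python version. The function name `luhn_valid` is arbitrary, and ignoring spaces and hyphens is an assumption about how the number was typed in. Passing the check says nothing about whether the card exists; it only catches most single-digit typos and adjacent swaps.

```python
def luhn_valid(number):
    """Return True if the digit string passes the Luhn checksum."""
    cleaned = number.replace(" ", "").replace("-", "")
    if not cleaned.isdigit():
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0
```

For example, the widely used test number 4111 1111 1111 1111 passes, and changing its last digit to 2 makes it fail.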

2011-05-02

wow:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

It may seem dumb -- and it's certainly counterintuitive -- but it's true. RFC822 provides for a very wide range of valid syntax. The only correct e-mail address validation regexes are on the order of a page in length. Please see http://www.linuxjournal.com/article/9585 for an overview of some of the issues involved.

2011-05-03

Tears literally welled up in my eyes.

2011-05-03

Most SANE people would use THIS regex to validate emails:

http://www.ex-parrot.com/pdw/Mail-RFC822-Address.html

NOT SPAM AKISMET

2011-05-07

If you want to get technical about it, this regexp should catch them all: .+@.+ (in theory it would be possible to have a mail server at a TLD :-))
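
A quick way to see what that permissive pattern buys you is to run it next to a stricter one, sketched here with Python's re module. The `STRICT` pattern is a hypothetical stand-in for the "letters, an @, and a short TLD" style of check complained about in this thread, not any specific site's code. The sample addresses use reserved example domains, plus the quoted-local-part address mentioned earlier in the comments.

```python
import re

# "Something, an @, something": the pattern from the comment above.
PERMISSIVE = re.compile(r".+@.+")

# Hypothetical over-strict check: no '+', TLD limited to 2-5 letters.
STRICT = re.compile(r"[A-Za-z0-9._-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,5}")


def passes(pattern, address):
    """True if the whole address matches the pattern."""
    return pattern.fullmatch(address) is not None


SAMPLES = [
    "user@example.com",
    "user+tag@example.com",
    "curator@example.museum",
    '"#"@"#"@[IPv6:::ffff:173.230.158.172]',
]

permissive_results = [passes(PERMISSIVE, s) for s in SAMPLES]
strict_results = [passes(STRICT, s) for s in SAMPLES]
```

The permissive pattern accepts all four samples while still rejecting input with no @ or with nothing on one side of it; the strict one rejects everything except the first.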

2013-08-26

^A-Za-z0-9@([A-Za-z0-9]+)(([.-]?[a-zA-Z0-9]+)*).([A-Za-z]{2,})$

2013-08-26

Thank you so much, it's working.

Email Validation Validity
