The Daily WTF: Curious Perversions in Information Technology

2011-04-27

hmmm

guess I'm gonna change my email address to

hello?????@???.??

2011-04-27

So what you're saying is, that I should return "CHAIN"?

I suppose maintaining this code is better than getting stuck on a ^([l]{2,60})([@])([A-Za-z0-9.|-|_]{1,60})(.)([A-Za-z]{2,5})$ gang

derula · 2011-04-27

NYCNetworker:
hmmm
guess I'm gonna change my email address to

hello?????@???.??

Why not simply ".@ "?

2011-04-27

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

So easy a child could do it.

operagost · 2011-04-27

OK, maybe those of us who don't administer email systems haven't heard of plus-addressing. But not ever encountering a valid email address with a hyphen in it? Or more than two periods?

2011-04-27

Not sure about before the @, but most school systems in the area I work all follow the common pattern of [email protected] .

jnewton · 2011-04-27

RobY:
Not sure about before the @, but most school systems in the area I work all follow the common pattern of [email protected] .

<firstName>.<lastName>@<domain>.com is pretty popular as well

2011-04-27

The WTF is that he used REGEX, right?

2011-04-27

looks like somebody did not know what a regex is.

2011-04-27

This is the most correct way of proving if an e-mail address is valid or not:

'([^\\x00-\\x20\\x22\\x28\\x29\\x2c\\x2e\\x3a-\\x3c'.
'\\x3e\\x40\\x5b-\\x5d\\x7f-\\xff]+|\\x22([^\\x0d'.
'\\x22\\x5c\\x80-\\xff]|\\x5c[\\x00-\\x7f])*\\x22)'.
'(\\x2e([^\\x00-\\x20\\x22\\x28\\x29\\x2c\\x2e'.
'\\x3a-\\x3c\\x3e\\x40\\x5b-\\x5d\\x7f-\\xff]+|'.
'\\x22([^\\x0d\\x22\\x5c\\x80-\\xff]|\\x5c[\\x00'.
'-\\x7f])*\\x22))*\\x40([^\\x00-\\x20\\x22\\x28'.
'\\x29\\x2c\\x2e\\x3a-\\x3c\\x3e\\x40\\x5b-\\x5d'.
'\\x7f-\\xff]+|\\x5b([^\\x0d\\x5b-\\x5d\\x80-\\xff'.
']|\\x5c[\\x00-\\x7f])*\\x5d)(\\x2e([^\\x00-\\x20'.
'\\x22\\x28\\x29\\x2c\\x2e\\x3a-\\x3c\\x3e\\x40'.
'\\x5b-\\x5d\\x7f-\\xff]+|\\x5b([^\\x0d\\x5b-'.
'\\x5d\\x80-\\xff]|\\x5c[\\x00-\\x7f])*\\x5d))*'

(http://www.iamcal.com/publish/articles/php/parsing_email/)

2011-04-27

When I was in grad school I had an e-mail address with a + in it. I often found websites that would barf on accepting that as a valid e-mail address.

2011-04-27

The easiest, and most correct way to validate an email address is to send an email to it.

The most incorrect way is to use some trivial little regex written by someone who hasn't even heard of RFC822, and just intuits what they think an email address might be.

I have never seen a regex in the wild that correctly validates email addresses.

I would much rather read through the version in the post, isValidEmailAddress, to debug it than a regular expression complex enough to be comprehensive.

dtobias · 2011-04-27

I've run into sites that refuse to accept my perfectly valid .name and .info addresses because they think that TLDs shouldn't be more than three letters.

2011-04-27

Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

2011-04-27

Well, yes, sending a test email (or nuking from orbit) is the only way to be sure. But short of that, I think you could just send an XML message to a web service somewhere and let it all be Somebody Else's Problem.

2011-04-27

dtobias:
I've run into sites that refuse to accept my perfectly valid .name and .info addresses because they think that TLDs shouldn't be more than three letters.

Back when I invented the internet, 3 letters was enough for anybody.

2011-04-27

A surprising number of websites also barf on [email protected] style addresses. I mean, .name has only been around 10 years, so I can see how they haven't had time to adjust to non 3-character TLDs. Lord knows that if it doesn't conform to [a-z]+@[a-z]+.[a-z]{3} then it must be invalid!
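
For anyone who wants to see exactly what that pattern does, here is a minimal Python sketch. The sample addresses are made up for illustration, and anchoring the pattern with `fullmatch` is an assumption about how such a site would apply it.

```python
import re

# The pattern quoted above, applied to the whole string.
NAIVE = re.compile(r"[a-z]+@[a-z]+.[a-z]{3}")

def naive_accepts(address: str) -> bool:
    """Return True if the naive pattern covers the entire address."""
    return NAIVE.fullmatch(address) is not None

# Hypothetical sample addresses, chosen only to exercise the pattern.
samples = [
    "bob@example.com",       # accepted
    "bob@example.info",      # rejected: four-letter TLD
    "bob+news@example.com",  # rejected: plus sign in the local part
    "bob@my-site.com",       # rejected: hyphen in the domain
    "bob@examplexcom",       # accepted: the unescaped "." matches any character
]
for address in samples:
    print(address, naive_accepts(address))
```

So it turns away several perfectly ordinary addresses while letting through one with no dot in the domain at all.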

PedanticCurmudgeon · 2011-04-27

NYCNetworker:
hmmm
guess I'm gonna change my email address to

hello?????@???.??

A polite troll leaves a good hint for his target. With that in mind, this email would be a better choice:

u.fail@regex

Zolcos · 2011-04-27

Email validation by regex is practically a canonical example of some problems just not having a good solution that everyone can agree on. If you validate down to the letter of the RFCs, you will accept addresses that are technically correct but possibly unreachable, like those based on internal subdomains and not fully qualified. Even once you get past the hurdle of defining exactly what constitutes validity there are still many weird gotchas and it is hard to guarantee you've considered them all.

PHP to the rescue, I guess. Every standard library needs an equivalent to: filter_var($x, FILTER_VALIDATE_EMAIL)

2011-04-27

wow:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

Rare is the situation where you only care if an email address is theoretically valid, and not whether it actually exists and can be sent mail.

2011-04-27

So I guess I'm safe with my aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa@bbb.cc?

HellKarnassus · 2011-04-27

wow:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

Why? I think it is a useful idea, you ask the user to input an e-mail address and to confirm it, then you send an activation/validation mail to finish whatever process the app must do. It is less annoying than a script or method that tells you the address is incorrect even though it does exist.
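
A rough Python sketch of that activation flow, in case it helps. The class name `PendingConfirmations` and the in-memory dictionary are assumptions made for illustration; actually mailing the link is left out, and a real application would persist the tokens and expire them.

```python
import secrets

class PendingConfirmations:
    """Hypothetical in-memory store of unconfirmed addresses, keyed by one-time token."""

    def __init__(self):
        self._pending = {}

    def request(self, address: str) -> str:
        # Generate an unguessable token; the application would mail a link containing it.
        token = secrets.token_urlsafe(16)
        self._pending[token] = address
        return token

    def confirm(self, token: str):
        # Return the confirmed address, or None if the token is unknown or already used.
        return self._pending.pop(token, None)


store = PendingConfirmations()
token = store.request("user@example.com")
print(store.confirm(token))  # the address, now confirmed
print(store.confirm(token))  # None: tokens are single-use
```

Note that no syntax check appears anywhere: a mistyped address simply never gets confirmed.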

Power Troll · 2011-04-27

Ah, OK. TRWTF is that the first example isn't really appropriate for Java. Here's a more OO version.

import java.util.StringTokenizer;

public final class EmailAddress {
	private final String address;

	public EmailAddress (String address) throws IllegalEmailAddressException {
		
		// a null string is invalid
		if (address == null)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		// a string without a "@" is an invalid email address
		if (address.indexOf("@") < 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		// a string without a "."  is an invalid email address
		if (address.indexOf(".") < 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (lastEmailFieldTwoCharsOrMore(address) == false)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (address.indexOf("!") > 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (address.indexOf("#") > 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (address.indexOf("$") > 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (address.indexOf("%") > 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (address.indexOf("&") > 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (address.indexOf("*") > 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (address.indexOf("+") > 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (address.indexOf("-") > 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (address.indexOf("~") > 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (address.indexOf("ä") > 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (address.indexOf("ö") > 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (address.indexOf("å") > 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");

		if (address.indexOf(";") > 0)
		    throw new IllegalEmailAddressException("This is an invalid email address.");
		

		this.address = address;

	}

	private static final boolean lastEmailFieldTwoCharsOrMore(String emailAddress) {
		if (emailAddress == null)
			return false;
		StringTokenizer st = new StringTokenizer(emailAddress, ".");
		String lastToken = null;
		while (st.hasMoreTokens()) {
			lastToken = st.nextToken();
		}

		if (lastToken.length() >= 2) {
			return true;
		} else {
			return false;
		}
	}

	public String getAddress() {
		return this.address;
	}
}

2011-04-27

Actually, indexOf usually is a lot faster than regexes. This'd be my first stab at it, without regexes:

bool isValidMail(String s) {
    return ((sidx_t i = s.indexOf('@')) != -1 && 
        s.indexOf('.',i) != -1 && 
        s.lastIndexOf('.') inr(s.length-1,s.length-3) &&
        [&s](){ 
            for(uchar c : s) { 
                if(!(isalpha(c) || isnum(c) ||
                    c ina({'@','.','+'}))) return false;
            }
            return true; 
        }());
}

Of course, this one wouldn't prevent addresses such as [email protected], but hey, it'd make a great T-shirt!(*) :P

SeySayux

(*) Not that that regex wouldn't.

2011-04-27

Paul:
The easiest, and most correct way to validate an email address is to send an email to it.
The most incorrect way is to use some trivial little regex written by someone who hasn't even heard of RFC822, and just intuits what they think an email address might be.

I have never seen a regex in the wild that correctly validates email addresses.

I would much rather read through the version in the post, isValidEmailAddress, to debug it than a regular expression complex enough to be comprehensive.

I hope no one is still writing code against RFC 822. It was superseded 10 years ago by 2822.

2011-04-27

Try and enumerate a few, please.

If you are actually writing the actual email-sending part of an actual email-sending application, then I agree, you can't send an email to validate the address.

If you are doing anything else, then the only reliable (and DRY) way to validate an email address is to try and use it as an email address.

Why reinvent the wheel by doing some other stupid "validation" beforehand? You make the wheel less and less round every time you reinvent it. What if (as is normally the case) your validation incorrectly discards good addresses that your server can cope with? What if your validation is correct, but your email server is wrong? What if your validation deliberately matches your email server's faulty validation, but someone else fixes the server, leaving you with a pointlessly faulty validation?

Even if you do get your syntactic validation correct (which appears highly unlikely, given the prevalence of incorrect validation in the world), all you know is that it looks like an email address; you have no idea whether it is an email address (i.e. one that resolves to a mailbox).

frits · 2011-04-27

For those who don't already know, here's one way to verify whether an email address exists: http://www.labnol.org/software/verify-email-address/18220/. All this nonsense about input validation and sending actual messages isn't much better than the code in the article.

2011-04-27

XXXXX:
I hope no one is still writing code against RFC 822. It was superseded 10 years ago by 2822.

Whoops, Good point.

2011-04-27

Justin:
So I guess I'm safe with my spaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaace@spa.ce?

FTFY

2011-04-27

dtobias:
I've run into sites that refuse to accept my perfectly valid .name and .info addresses because they think that TLDs shouldn't be more than three letters.

I think you need to try with the .museum and .travel TLDs then :D

2011-04-27

wow:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

In case you can count that far, name three, please.

2011-04-27

frits:
For those who don't already know, here's one way to verify whether an email address exists: <snip>. All this nonsense about input validation and sending actual messages isn't much better than the code in the article.

The only difference between sending an email and what was done in the article is actually sending data to be delivered. It's a lot more work than just sending an email, and you have no way of verifying that the user actually got the email because you didn't send one...

2011-04-27

Hmm, in the past I've had email addresses with ! and % in them.

I've got domains with - in them. I use + all the time (and still see breakage).

However, I was once <user>@<tld> . That one was fun. :)

2011-04-27

else {
    return "C-C-C-COMBO BREAKER!";
}

2011-04-27

Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

In what sense does that validate anything, if you don't have access to that address's mailbox?

Zylon · 2011-04-27

wow:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

Wow. Just, well, wow. This is possibly the most amazingly dumb thing I have seen posted in a very long time, for more reasons than I can count.

I was going to say this, but about Alex's regex comment. As others have pointed out, checking all possible valid email addresses using a regex is borderline impossible.

http://www.regular-expressions.info/email.html

Also fuck you Akismet, this isn't spam.

frits · 2011-04-27

dude:
frits:
For those who don't already know, here's one way to verify whether an email address exists: <snip>. All this nonsense about input validation and sending actual messages isn't much better than the code in the article.

The only difference between sending an email and what was done in the article is actually sending data to be delivered. It's a lot more work than just sending an email, and you have no way of verifying that the user actually got the email because you didn't send one...

How are you verifying again?

2011-04-27

N. Tufnel:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

In what sense does that validate anything, if you don't have access to that address's mailbox?

It validates that whatever you are using to actually send mail can parse the address.

It validates that the address can be used to reach a mail server.

Depending on the setup of the receiving server, it validates that mail can be sent to the specified mailbox (rather than returning an error). If there is a catch-all, then this doesn't work.

As others have mentioned, sending a validation message ensures that the address entered reaches a mailbox that the intended recipient can read.

2011-04-27

It seems that 90% of sites restrict valid emails. I usually use something like this:

.[^ ]+.@[^ ]+.[^ ]

jonnyq · 2011-04-27

N. Tufnel:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

In what sense does that validate anything, if you don't have access to that address's mailbox?

One would hope that the person providing the email address would have access to it for confirmation...

I mean seriously... two people this dumb?

In my opinion, if you're just trying to prevent typing mistakes: anything, followed by an @, followed by anything, followed by a ., followed by anything. That's all I ever use. If you're trying to validate that an email address is real - send a confirmation email. Anything in between is prone to mistakes and probably isn't helping anyone.
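
That "anything @ anything . anything" rule can be written without a regex at all. Below is a minimal Python sketch; the function name `looks_like_email` is made up, and it assumes "anything" means at least one character.

```python
def looks_like_email(address: str) -> bool:
    """Typo check only: something, "@", something, ".", something."""
    # Split on the last "@"; there must be text before it.
    local, at, domain = address.rpartition("@")
    if not at or not local:
        return False
    # The domain needs a "." with something on both sides of it.
    left, dot, right = domain.partition(".")
    return bool(dot and left and right)


print(looks_like_email("bob+tag@mail.example.museum"))
print(looks_like_email("bob.example.com"))
```

Plus signs, hyphens, and long TLDs all pass, because the check never looks at which characters are used.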

Jaime · 2011-04-27

Paul:
N. Tufnel:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

In what sense does that validate anything, if you don't have access to that address's mailbox?

It validates that whatever you are using to actually send mail can parse the address.

It validates that the address can be used to reach a mail server.

Depending on the setup of the receiving server, it validates that mail can be sent to the specified mailbox (rather than returning an error). If there is a catch-all, then this doesn't work.

As others have mentioned, sending a validation message ensures that the address entered reaches a mailbox that the intended recipient can read.

[email protected]

Mailinator's whole purpose in life is to accept mail and junk it after a very short time. EMail address validation is stupid as a general idea, but makes sense if you have a specific purpose for doing so. Examples:

If you need to validate that the email address belongs to the user -- Send a one-off email that the user must respond to.

If you are trying to help a user enter data on a form -- do any half-assed validation and throw up a kindly worded warning if it doesn't match. Allow the user to continue on validation failure.

Notice in both cases that it is not necessary to have a robust validation procedure. I can't think of a reason, other than while writing mail router software, to strictly validate the format of an email address.
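
The form-helper case might look roughly like this in Python. The names `email_warning` and `submit_form`, the warning text, and the dictionary result are all assumptions for illustration; the point is only that a failed check produces a warning and never blocks the submission.

```python
from typing import Optional

def email_warning(address: str) -> Optional[str]:
    """Return a gentle warning for a suspicious address, or None if it looks fine."""
    local, at, domain = address.strip().rpartition("@")
    if not at or not local or "." not in domain:
        return "This doesn't look like an email address. Please double-check it."
    return None

def submit_form(address: str) -> dict:
    # The warning is advisory: the submission is accepted either way.
    return {"accepted": True, "email": address, "warning": email_warning(address)}


print(submit_form("bob@example.com"))
print(submit_form("bob"))
```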

2011-04-27

This one is actually better than most I have seen.

I have actually tried to build a test that was 100% RFC-compliant, and the result was horrifyingly large, and we still found existing, working emails that were denied ;)

We finally went with a commercial component that so far has correctly handled every address we feed it.

For any lesser ambition, just check the very simplest format and if you need more, use a verification where you actually send an email to the address and require them to click a link.

With the new Asian letters coming I would not even try to build my own again =)

hoodaticus · 2011-04-27

Paul:
The easiest, and most correct way to validate an email address is to send an email to it.
The most incorrect way is to use some trivial little regex written by someone who hasn't even heard of RFC822, and just intuits what they think an email address might be.

I have never seen a regex in the wild that correctly validates email addresses.

I would much rather read through the version in the post, isValidEmailAddress, to debug it than a regular expression complex enough to be comprehensive.

I agree 100%. The best way is to contact the purported email server and see if the account is on there.

Of course, you still have to parse out the server from the email address.
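
The parsing step is the small part. Here is a minimal Python sketch of it, with a made-up function name `mail_domain`; looking up the domain's MX records and talking SMTP to the server are deliberately left out.

```python
def mail_domain(address: str) -> str:
    """Return the domain part of an address, lowercased."""
    # Split on the LAST "@": a quoted local part may itself contain "@".
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        raise ValueError(f"no domain found in {address!r}")
    return domain.lower()


print(mail_domain("Bob@Example.COM"))
```

Splitting on the last "@" is the one design choice here; everything to its left is the receiving server's business.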

evilspoons · 2011-04-27

(I had to add a bunch of spaces to this to make the damn spam filter let me post, argh.)

Gmail lets you arbitrarily use periods in your email address:

bob dot frank at gmail dot com is the same inbox as bobfrank at gmail.com and b dot o dot b dot f dot r dot a dot n dot k at gmail dot com.

Gmail also lets you use plus signs to assign labels to incoming messages - bob dot frank + noodles at gmail dot com will assign the message the label 'noodles' in your gmail inbox.

I use it all the time for spam filtering (myaddress + nameofoffendingsite at gmail dot com) but I have run into many, many instances where it thinks my plus sign invalidates the address. Sigh.
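
For sites that want to treat those variants as one account, a sketch in Python of collapsing them follows. The function name `gmail_inbox_key` is made up, and it assumes the dot and plus behaviour described above applies only to gmail.com addresses; the result is meant as a comparison key, not as the address to send to.

```python
def gmail_inbox_key(address: str) -> str:
    """Collapse Gmail dot and plus variants of an address to a single key."""
    local, at, domain = address.rpartition("@")
    if not at or domain.lower() != "gmail.com":
        return address  # other providers: leave the address untouched
    local = local.split("+", 1)[0]  # drop the "+label" suffix
    local = local.replace(".", "")  # dots in the local part are ignored
    return f"{local}@gmail.com"


for variant in ("bob.frank@gmail.com",
                "b.o.b.f.r.a.n.k@gmail.com",
                "bob.frank+noodles@gmail.com"):
    print(variant, "->", gmail_inbox_key(variant))
```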

In the mid 90s I had an email address from a free webmail service that allowed you to choose your domain name. I think the site was "My Own Email" or something.

I signed up with my.name at imatrekkie dot com (yeah yeah) and used it for a few months, signed up for a lot of newsletters. Then I had it forward all my messages to my new-fangled pre-Microsoft Hotmail account.

Then My Own Email had a major overhaul. Their new software didn't accept email addresses with periods in them. My address was still valid and receiving newsletters and forwarding them to my Hotmail account... but I couldn't log in.

My old email then got on a spam list and my Hotmail account was absolutely overwhelmed with forwarded crap... and I had no way to log in to delete my account, turn off forwarding, or anything. It was pretty special.

hoodaticus · 2011-04-27

N. Tufnel:
Paul:
The easiest, and most correct way to validate an email address is to send an email to it.

In what sense does that validate anything, if you don't have access to that address's mailbox?

You don't know much about SMTP, do you?

2011-04-27

This assumes the e-mail address is accessible on the internet, or is at least on the same network. Consider the case of networks that are completely disconnected from each other...

2011-04-27

jonnyq:
One would hope that the person providing the email address would have access to it for confirmation...
I mean seriously... two people this dumb?

In my opinion, if you're just trying to prevent typing mistakes: anything, followed by an @, followed by anything, followed by a ., followed by anything. That's all I ever use. If you're trying to validate that an email address is real - send a confirmation email. Anything in between is prone to mistakes and probably isn't helping anyone.

Please provide contact details for your legal representation:

Name: _____ Phone: _____ Email: ______

Please provide contact details for your IT Support team:

Phone: _____ Email: ______

Please provide your preferred email address to be created:

Email: ______

etc.

There are MANY cases where you might ask someone to provide an email address which they may not have access to.

Sending real emails to these addresses is pretty silly. 99.9% of the time, the simple validation will suffice. Whether or not you care about that missing 0.1%, and how you do so, would be completely dependent on context.

Email Validation Validity
