Admin
First?
Admin
regex anyone?
that whole nasty thing can be shrunk to a nice little regex
\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
I think that would do it, maybe not... but still.
Admin
Wow!
He must be paid per line b/c I tend to do something like this:
Function isValidEmail(myEmail)
    Dim isValidE
    Dim regEx
    isValidE = True
    Set regEx = New RegExp
    regEx.IgnoreCase = False
    regEx.Pattern = "^[a-zA-Z][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$"
    isValidE = regEx.Test(myEmail)
    isValidEmail = isValidE
End Function
Admin
I've met a bunch of "programmers" who don't even know what a Regular Expression is. Functions like this one don't surprise me in the least.
Admin
Admin
I'm not going to try to do an exhaustive list. But there is no reason to nest each of those validations. And if I were writing something like this, I would keep the sense of all my functions the same, so I wouldn't have to put the "NOT" keyword in front of some but not others. Fairly ugly.
Admin
At the very least, couldn't he have simplified this function to this, and gotten rid of all those nested if statements?
Admin
Another case of not using regular expressions..... Unfortunately this is much too common...
Admin
Word to the wise is that regular expressions and email addresses don't mix all that well.
http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html
Admin
Admin
This is certainly no shining example of good code but these past two WTFs have really just been bad code IMHO. I guess there really are a finite number of WTF patterns out there and we're getting close to having seen them all.
Admin
So you set IgnoreCase = false, then use [a-zA-Z] in every location. That is a nice WTF.
And let's look at that pattern. Start of string, followed by one alpha character, followed by zero or more word characters, dots, or dashes, followed by a single alphanumeric character, followed by the strudel, followed by exactly one alphanumeric character, followed by zero or more word characters, dots, or dashes, followed by one alphanumeric, followed by a dot, followed by exactly one alpha, followed by zero or more alphas or dots, followed by exactly one alpha, and the end of string.
So [email protected] is not a valid email?
You might want to look at this for details on using a regex to validate an RFC822 email address.
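The walk-through above can be checked mechanically. This is only a sketch in JavaScript: the pattern is written out the way the description reads it (the quantifiers are an assumption taken from that description), and the sample addresses are made up for illustration.

```javascript
// Pattern as read out in the post above. The "zero or more" quantifiers
// are an assumption based on that description.
const describedPattern =
  /^[a-zA-Z][\w.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z.]*[a-zA-Z]$/;

// Hypothetical sample addresses, one per rule in the description.
const samples = [
  "joe@example.com",     // ordinary address
  "a@example.com",       // one-character local part
  "joe@x.com",           // one-character domain label
  "9lives@example.com",  // local part starts with a digit
  "joe+tag@example.com", // plus sign in the local part
];

// Pair each address with the pattern's verdict.
const results = samples.map((address) => [address, describedPattern.test(address)]);

for (const [address, accepted] of results) {
  console.log(address, accepted ? "accepted" : "rejected");
}
```

Only the first sample gets through. The other four are refused because the pattern demands at least two characters on each side of the "@", a leading letter, and nothing outside word characters, dots, and dashes.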
Admin
forgot the link
Admin
It would be really nice if the only problem was it is not using regular expressions. :)
Admin
http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html
Software is acting funny for me. :(
Admin
Stop doing that. You are not accepting these valid addresses:
Use a real RFC822 compliant validator like Mail::RFC822::Address
Admin
The nesting of the if statements is actually for performance... VB is nice enough not to short circuit like that, which is a WTF itself.
brad
Admin
Here's a very simple email validation: ^.+?@.+\..+?$. It doesn't get much simpler.
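For anyone curious what that pattern does and doesn't let through, here is a quick sketch in JavaScript. The function name `looksLikeEmail` and the sample strings are made up for illustration.

```javascript
// The permissive pattern from the post above: something, an "@",
// something, a dot, something.
const simplePattern = /^.+?@.+\..+?$/;

// Hypothetical helper name; returns true when the pattern matches.
function looksLikeEmail(candidate) {
  return simplePattern.test(candidate);
}

console.log(looksLikeEmail("a@b.c"));   // minimal shape that passes
console.log(looksLikeEmail("a@b"));     // no dot after the "@"
console.log(looksLikeEmail("@b.c"));    // nothing before the "@"
console.log(looksLikeEmail("a@@b.c"));  // second "@" is swallowed by ".+"
console.log(looksLikeEmail("a b@c.d")); // spaces are fine as far as "." cares
```

It checks the general shape and nothing else, which is arguably the point: it will not reject a real address, but it also accepts plenty of junk.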
Admin
If I remember correctly, VB can short-circuit; it just does not do this by default. I believe the operators are AndAlso and OrElse.
Admin
I'm beginning to think that Regular Expressions ought to be the very next thing taught to computer science students after the prerequisite Hello World app.
Admin
AndAlso and OrElse weren't around until VB.NET, and that appears to be old-school VB/ASP (prolly ASP)
brad.
Admin
I like how none of his variables are typed.
(AndAlso and OrElse are only available in VB.NET; my guess is that this is VB6.)
Admin
I use something like this on my own website (I am reconstructing this from memory.. it's something like this though..)-
/^[a-z0-9_]+(-[a-z0-9_]+)*@[a-z0-9]+(-[a-z0-9]+)*(\.[a-z0-9]+(-[a-z0-9]+)*)*\.[a-z]{2,3}$/i
Maybe everyone can tell me how wtf'ed it is... it's a personal site, so it's not like I have to worry about losing money if valid e-mail addresses are rejected..
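One way to answer that question is to throw a few addresses at the pattern. A sketch in JavaScript, using the pattern exactly as posted. The probe addresses are hypothetical and picked to poke at the local part and the top-level domain.

```javascript
// The pattern as posted above (reconstructed from memory by its author).
const sitePattern =
  /^[a-z0-9_]+(-[a-z0-9_]+)*@[a-z0-9]+(-[a-z0-9]+)*(\.[a-z0-9]+(-[a-z0-9]+)*)*\.[a-z]{2,3}$/i;

// Hypothetical probe addresses.
const probes = [
  "user@example.com",          // plain address
  "user-name@mail.example.co", // hyphen in local part, subdomain
  "first.last@example.com",    // dot in the local part
  "user+tag@example.com",      // plus sign in the local part
  "user@example.info",         // four-letter top-level domain
];

// Record the verdict for each probe.
const verdicts = probes.map((address) => ({
  address,
  accepted: sitePattern.test(address),
}));

for (const { address, accepted } of verdicts) {
  console.log(address, accepted ? "accepted" : "rejected");
}
```

The first two pass. The last three are rejected: the local part allows only letters, digits, underscores, and single hyphens between runs, and the final label must be two or three letters.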
Admin
Because I'm sure performance was the first thing on this guy's mind ;)
Admin
Function isUserNameOk(strInput)
Dim intIn
intIn = InStr(strInput,"@")
If intIn = 0 Then
isUserNameOk = False
Else
isUserNameOk = True
End If
End Function
Hmmmm ... so, according to the above function, @jeff.com is a valid email address? (Instr(..) returns 1, which is not zero)
The whole thing is pretty horrible; even if you don't know reg expressions, don't have them available, or don't want to use them, there is no excuse for writing sloppy string checking code like this.
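To make the point concrete, here is a rough JavaScript port of the function above next to a slightly stricter check. `hasLocalPart` is a hypothetical name, not something from the original code, and it is only a sketch of one missing test, not a full validator.

```javascript
// Port of the posted check: it only asks whether an "@" exists at all.
// (InStr returns 0 for "not found"; indexOf returns -1.)
function isUserNameOk(input) {
  return input.indexOf("@") !== -1;
}

// Hypothetical stricter variant: require at least one character
// before the first "@".
function hasLocalPart(input) {
  return input.indexOf("@") > 0;
}

console.log(isUserNameOk("@jeff.com")); // the posted logic accepts this
console.log(hasLocalPart("@jeff.com")); // the stricter check does not
```

Even without regular expressions, checking the position of the "@" rather than its mere presence costs nothing extra.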
Admin
I know, he forgot to write:
Admin
That'd be fine with me. I was at least 3 or month out of college before I ever even heard the term "RegEx." Granted, I minored rather than majored in computer science.
Admin
Wow. Lack of regular expressions is only the tip of the iceberg. Even if there was no such thing as RegEx there's plenty of WTFs to be found here. For example, witness detectDoubleDotInARow, detectDoubleAtSimbol [sic]: 20 lines of code to do
If InStr(strInput, "..") > 0 Or InStr(strInput, "@@") > 0 Then Return True
Admin
Since there seems to be a large number of people that write email validators in this thread let me say this really loud:
The plus sign ('+') is a valid character in email addresses!
In fact just about every ASCII character is valid.
Read the RFC. It's important! +, %, $, !, ., etc
Gmail has this really neat feature where you can go [email protected] and it goes to "myname" and you can filter it for "dailywtf" and automatically give it a tag or trash it (good for killing spam).
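As a rough illustration of that convention, here is a JavaScript sketch that splits the tag off the local part. The function name `splitPlusTag` and the example address are made up, and whether a given mail server actually treats "+" this way is up to that server.

```javascript
// Hypothetical helper: split "mailbox+tag@domain" into its pieces.
// Returns null when there is no "@" or nothing before it.
function splitPlusTag(address) {
  const at = address.lastIndexOf("@");
  if (at < 1) {
    return null;
  }
  const local = address.slice(0, at);
  const domain = address.slice(at + 1);
  const plus = local.indexOf("+");
  if (plus === -1) {
    // No tag present: the whole local part is the mailbox.
    return { mailbox: local, tag: null, domain };
  }
  return {
    mailbox: local.slice(0, plus),
    tag: local.slice(plus + 1),
    domain,
  };
}

// Made-up address for illustration.
const parsed = splitPlusTag("myname+dailywtf@example.com");
console.log(parsed.mailbox, parsed.tag, parsed.domain);
```

A validator that rejects "+" outright makes this kind of filtering impossible for the people who rely on it.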
Admin
Fantastic. After all that, there are still countless ways to break the damn thing......when you start going down this path writing code, shouldn't you start wondering "Hmm, is this how everyone has to do it?!? This really sucks."
Admin
This actually isn't unique to gmail. It's an old trick for mailer daemons.
But yes, it's surprising how many web forms reject perfectly valid email addresses. Especially when code written specifically to validate email addresses, and to do it quickly, already exists for almost every web language.
Rolling your own email validation is a WTF all by itself.
Admin
I want to thank thedailywtf for providing us with clever techniques, tricks and code samples over the years. I am going to make sure that this advanced email validation algorithm will be implemented in our production system. Thank you thedailyWTF!!!!
Admin
Admin
FWIW, here's what I use:
// is_valid_email
//   argument: email
//     this is a string that may or may not be
//     a syntactically valid email address
//   return value
//     returns true if the email is syntactically valid
//     returns false if the email is not syntactically valid
function is_valid_email(email)
{   var syntax_ok;
    var at_position = email.indexOf("@");
    if (at_position == -1)
    {   syntax_ok = false;
    }
    else
    {   var local_part = email.substring(0, at_position);
        var domain_part = email.substring(at_position + 1, email.length);
        // dot-separated runs of the allowed atom characters
        var local_part_pattern =
            /^[a-zA-Z0-9!#$%&'*+\-\/=?^_`{|}~]+(\.[a-zA-Z0-9!#$%&'*+\-\/=?^_`{|}~]+)*$/;
        // two or more labels; each starts and ends with a letter or digit,
        // with hyphens allowed in between
        var domain_part_pattern =
            /^[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?(\.[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?)+$/;
        if (   local_part.match(local_part_pattern)
            && domain_part.match(domain_part_pattern)
           )
        {   syntax_ok = true;
        } else
        {   syntax_ok = false;
        }
    }
    return syntax_ok;
}
Admin
Some douche wrote:
"I know, he forgot to write: If (Not(IsTrue(detectedEmailBadCharacters(valueStr)))) Then"
ALRIGHT. WE GET IT. STOP USING THIS 'JOKE' FOR EVERY WTF.
Admin
Actually, detectDoubleAtSimbol just looks for two at sImbols in the string, not two in a row. At least the guy got the name of this unnecessary function correct. Well, not counting spelling.
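The difference between the two checks is small but real. A JavaScript sketch, with hypothetical function names that are not from the original code:

```javascript
// Two "@" characters directly next to each other.
function hasAdjacentAts(input) {
  return input.includes("@@");
}

// Two or more "@" characters anywhere in the string: the first and
// last occurrence are at different positions.
function hasMultipleAts(input) {
  return input.indexOf("@") !== input.lastIndexOf("@");
}

console.log(hasAdjacentAts("a@b@c")); // no "@@" substring here
console.log(hasMultipleAts("a@b@c")); // but there are two "@" characters
```

Either one is a single line, which is the original complaint about spending 20 lines on it.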
Admin
He should have had each of the functions he calls from the if statements throw an exception. That way he could catch them instead of letting them return a possibly troublesome "false" value.
Admin
Or so you think. There is a saying that goes "As soon as you think you've made something idiot-proof someone comes along and makes a better idiot"
Admin
The nesting is awful. Ignoring all those WTF'd functions, the best way would be to invert some and AND them together...
Admin
Maybe the obscure reference to a music professor at the University of Southern North Dakota at Hoople (google if necessary) is what makes RFC822 so difficult for folks to understand.
But I would expect any code to acknowledge the requirement(s) - either RFC822 or exceptions (or acceptable email forms).
As for "validating" email addresses, why bother? "[email protected]" is perfectly valid, syntactically. But it's no more effective at delivering mail than anything these "validators" might reject.
Admin
I'd say leave it to the user to put a valid e-mail in and just check it with a tiny regex like
+@+.+
Admin
And how, brother. Several years ago, my wife was working on an ASP project with a team that was... let's say, somewhat out of their league. After her workday got to be twelve to fifteen hours long, I got sick and tired of being without my wife and informed them that I'd be joining their team until the project was under control. (Note: this was the first ASP project I had ever worked on.) Every morning, I'd go to my day job, knock off around 5, and head over to my wife's office to work for them for an additional four or five hours.
After replacing two convoluted functions similar to these with a pair of regular expressions (one for e-mail addresses, one for phone numbers), I got a phone call from my wife, who told me the project manager decided that he wanted to change the format of the phone number, and promptly broke my validation code. When I checked, it appeared that he'd attempted to convert the regular expression back into some kind of awful hybrid of regexp and an "iterate over the string one character at a time" beast that... I don't know, I blocked it out. But I fixed it, and left strict instructions that the PM was not to touch my code again.
As a side note: it can be pretty liberating to work for a guy who can't fire you.
Admin
I'm assuming that by "+" you meant something representing one or more characters, rather than the more standard ".+".
a@b is a perfectly valid email address in certain contexts. Granted, on the Internet at large it won't work, but if you craft an application that rejects it when you know that your SMTP server will be able to handle it, you've tied the hands of one of your users for no reason.
Admin
As a side note: it was acceptable in this situation to not accept the whole range of addresses described by RFC 822, so the regular expression wasn't the type of thing likely to make your eyeballs fall out.
Admin
To the guys AND'ing...(and myself)
I actually thought this at first:
Then I thought... well shit, which is faster? So I ran 10-second iterations of each method and found the nested if statements were actually 10% faster, +/- 0.1% error margin. I had completely ignored fast code for a clean look. I forgot that VB evaluates EVERY And condition before making up its mind. (You can test this by changing it to:
All-in-all, it's pretty bad the way it is. I did like the spelling of symbol.
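The effect is easy to see by counting how many checks actually run. This is only an illustration in JavaScript, not a VB benchmark: `&&` short-circuits, so `eagerAnd` is a hypothetical stand-in for an operator that evaluates every operand first, and `check` is a made-up placeholder for the validation functions.

```javascript
let calls = 0;

// Hypothetical stand-in for one validation function; counts its calls.
function check(result) {
  calls += 1;
  return result;
}

// Stand-in for an eager And: JavaScript evaluates every argument
// before the function body runs, so all operands get computed.
function eagerAnd(...operands) {
  return operands.every(Boolean);
}

// Short-circuit: evaluation stops at the first false operand,
// which is what the nested Ifs achieve by hand.
calls = 0;
const shortCircuitResult = check(false) && check(true) && check(true);
const shortCircuitCalls = calls;

// Eager: every check runs even though the first one already failed.
calls = 0;
const eagerResult = eagerAnd(check(false), check(true), check(true));
const eagerCalls = calls;

console.log(shortCircuitResult, shortCircuitCalls);
console.log(eagerResult, eagerCalls);
```

Both forms give the same answer. The eager one simply does more work to get there, which fits the timing difference described above.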
Admin
My favorite line in this whole mess is in detectedEmailBadCharacters.
counter = 5
Admin
I once got this RegExp for email validation from some semi-official XSchema resource, in an attempt to define an email schema type.
^[A-Za-z0-9!#-'\*\+\-/=\?\^_`\{-~]+(\.[A-Za-z0-9!#-'\*\+\-/=\?\^_`\{-~]+)*@[A-Za-z0-9!#-'\*\+\-/=\?\^_`\{-~]+(\.[A-Za-z0-9!#-'\*\+\-/=\?\^_`\{-~]+)*$
Not having read the email RFC, I don't know exactly how compliant this is, but I believe it handles every e-mail address that does not contain outlandish features such as comments.
To the one who asked, why bother to validate?
As a convenience. You do not validate e-mail addresses to prevent fraudulent addresses; that's what confirmation mails are for. You validate them to catch typos. You validate them not to prevent some H4X0R from abusing your system, but to offer some assistance to the poor sod who has no clue about computers and can't type, either.
Admin
A brillant observation!