Admin
I use the + quite a bit to separate the wheat from the chaff. If I'm forced to sign up for something on a website and have to unlock it through a valid email address, I just slap the name of the website into the address I give them ([email protected]) and watch as they either ignore or honor their privacy policy. I can't tell you how many times I see mail arrive at an address I gave only to a website that supposedly never sells email addresses to third parties.
captcha: burned (indeed, indeed)
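For anyone who wants to automate the trick above, here is a minimal sketch. The helper names `tag_address` and `tag_of` are made up for illustration, and it assumes your mail provider actually delivers `local+tag@domain` to `local@domain`, which not every provider does.

```python
from typing import Optional


def tag_address(address: str, tag: str) -> str:
    """Insert +tag before the @ so the address records where it was given out."""
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        raise ValueError(f"not an address: {address!r}")
    return f"{local}+{tag}@{domain}"


def tag_of(address: str) -> Optional[str]:
    """Recover the tag from a tagged address, or None if there is none."""
    local = address.rpartition("@")[0]
    _, plus, tag = local.partition("+")
    return tag if plus else None
```

So `tag_address("me@example.com", "shadysite")` gives `me+shadysite@example.com`, and running `tag_of` on the recipient of incoming spam tells you who leaked it.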
Admin
So, you don't like it that valid addresses are invalidated, so you present a regex that follows the spec... except you then admit it doesn't do everything because... well, who uses that anyway. Somewhere there is a person gnashing their teeth because their perfectly valid address isn't being validated by your code.
BTW, I don't know the exact spec, and haven't tried to figure out if your regexp follows it. I don't think my boss would appreciate the time spent ;}
Admin
Whitespace is allowed in email addresses, as are constructs like:
which both would fall over on.
Admin
use Mail::RFC822::Address qw(valid);
Admin
Ooooh, ooh! I got it!
The WTF (apart from this stupid comment box I'm typing in being only 20x2 characters) is that he thinks the 'at' sign is called an ampersand.
Can't see anything else wrong with it. It validates my email address just fine: "@/.
Steve
PS. That RegEx would fail on email addresses that use an IP address instead of a FQ name.
Admin
Your regexp doesn't support valid addresses such as billg@[131.107.115.212]
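If you do want to let addresses like that through, one rough way is to treat a bracketed domain as a special case and hand the inside to an IP parser. This is only a sketch: the function names are hypothetical, it covers IPv4 literals only, and it does not attempt the rest of the address grammar.

```python
import ipaddress


def domain_of(address: str) -> str:
    """Everything after the last @ (empty string if there is no @)."""
    at_index = address.rfind("@")
    return address[at_index + 1:] if at_index != -1 else ""


def is_ipv4_domain_literal(domain: str) -> bool:
    """True if domain has the form [a.b.c.d] with a valid IPv4 address inside."""
    if not (domain.startswith("[") and domain.endswith("]")):
        return False
    try:
        ipaddress.IPv4Address(domain[1:-1])
    except ValueError:  # AddressValueError is a subclass of ValueError
        return False
    return True
```

A validator could call `is_ipv4_domain_literal(domain_of(addr))` first and only fall back to hostname checks when it returns False.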
Admin
The regexp for validating all compliant email addresses is too large to fit in this margin.
(Seriously, it's pretty big.)
Admin
Don't tell me that you think the ugly regex is more readable than the JavaScript version. The purpose of email validation is just to check for common errors; it makes no sense to try to validate perfectly, because that won't save you from a valid but nonexistent address (you just have to send the mail there and wait for the response).
Admin
From Regexlib.com
No?
Admin
This one is much more fun:
(?:(?:\r\n)?[ \t])(?:(?:(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))|(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))<(?:(?:\r\n)?[ \t])(?:@(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))(?:,@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))):(?:(?:\r\n)?[ \t]))?(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))>(?:(?:\r\n)?[ \t]))|(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ 
\t])):(?:(?:\r\n)?[ \t])(?:(?:(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))|(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))<(?:(?:\r\n)?[ \t])(?:@(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))(?:,@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))):(?:(?:\r\n)?[ \t]))?(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))>(?:(?:\r\n)?[ \t]))(?:,\s(?:(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ 
\t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))|(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))<(?:(?:\r\n)?[ \t])(?:@(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))(?:,@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))):(?:(?:\r\n)?[ \t]))?(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \x00-\x1F]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))>(?:(?:\r\n)?[ \t]))))?;\s)
captcha: tastey (yum yum- if only it was spelt correctly!)
Admin
The email RFC is full of crazy ideas... Look at the RFC-compliant regex to validate an email address:
http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html
Admin
And check this one
Admin
Drat. I mean this:
And check [url=http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html]this one[url].
Admin
The regexp doesn't work with domain names starting with numbers. I think that might comply with RFC822, but try telling 3com that.
Admin
The regular expression for Mail::RFC822::Address is horrendous - and it doesn't even handle comments. If you squint at the bottom half of it, you can see a picture of a donkey.
Admin
Crap, I give up - either copy/paste, or simply believe me that this regex is on that page:
(?:(?:\r\n)?[ \t])(?:(?:(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t] )+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?: \r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:( ?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\0 31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)
](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+ (?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?: (?:\r\n)?[ \t])))|(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z |(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n) ?[ \t]))<(?:(?:\r\n)?[ \t])(?:@(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:
r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n) ?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t] )))(?:,@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])* )(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t] )+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))) :(?:(?:\r\n)?[ \t]))?(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+ |\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r \n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?: \r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t ]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031 ]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)]( ?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(? :(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(? :\r\n)?[ \t])))>(?:(?:\r\n)?[ \t]))|(?:[^()<>@,;:\".[] \000-\031]+(?:(? :(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)? [ \t]))"(?:(?:\r\n)?[ \t])):(?:(?:\r\n)?[ \t])(?:(?:(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]| \.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<> @,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|" (?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t] )(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\ ".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(? 
:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[ ]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))|(?:[^()<>@,;:\".[] \000- \031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|( ?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))<(?:(?:\r\n)?[ \t])(?:@(?:[^()<>@,; :\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([ ^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\" .[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[
]\r\]|\.)](?:(?:\r\n)?[ \t])))(?:,@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".
[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]
r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\] |\.)](?:(?:\r\n)?[ \t])))):(?:(?:\r\n)?[ \t]))?(?:[^()<>@,;:\".[] \0 00-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(?:[^"\r\]|\ .|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[^()<>@, ;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[]]))|"(? :[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t])))@(?:(?:\r\n)?[ \t]) (?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\". []]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t])(?:[ ^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[] ]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))>(?:(?:\r\n)?[ \t]))(?:,\s( ?:(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\ ".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:( ?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[ ["()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t ])))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t ])+|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(? :.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+| \Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))|(?: [^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\".[
]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))<(?:(?:\r\n) ?[ \t])(?:@(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[[" ()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n) ?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<> @,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))(?:,@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@, ;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:.(?:(?:\r\n)?[ \t] )(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\ ".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))):(?:(?:\r\n)?[ \t]))? (?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[["()<>@,;:\". []]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]))(?:.(?:(?: \r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[[ "()<>@,;:\".[]]))|"(?:[^"\r\]|\.|(?:(?:\r\n)?[ \t]))"(?:(?:\r\n)?[ \t]) ))@(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t]) +|\Z|(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t]))(?:
.(?:(?:\r\n)?[ \t])(?:[^()<>@,;:\".[] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z |(?=[["()<>@,;:\".[]]))|[([^[]\r\]|\.)](?:(?:\r\n)?[ \t])))>(?:( ?:\r\n)?[ \t]))))?;\s*)
Admin
regex looks like garbage. is there a framework somewhere that can validate an email address?
Don't FTP servers have constructs that allow them to verify email addresses as being valid without physically checking them on the internet?
Admin
my ISP doesn't support de local part!
Admin
The Regexes are utterly unreadable and therefore unmaintainable. I'd hate to have to fix one of those monsters.
Recently I overheard a colleague's phone conversation. He was babbling on about the email validation not being tight enough. "After the period, it should check for exactly 3 characters. You know: .com, .org, .net. But it shouldn't just check for those three; we don't want to limit ourselves if they come up with more TLDs. I'll raise a low-priority defect for that after the call."
So I shot him off an email (not wanting to interrupt his call) giving him some examples of .info and .co.uk email addresses. I didn't have the heart to show him the RFC.
Admin
For some reason, this was hilarious to me. Maybe I need more coffee?
Admin
I believe it's much better to not do any validation, really. Even better, let e-mail be an optional field, and only if it's filled in, complain about a missing @ if it's not there - and only if you really need that @ to be there (you might want to accept local accounts/aliases, ugly Exchange/Notes addresses, X.500, whatnot).
Other than that, the syntax is simply too complex. Even if you catch someone forgetting a bit of the FQDN or whatever, you still can't catch them making "valid" typos ([email protected]).
If you're going to be using the e-mail address for, well, sending e-mail, you still need to send a confirmation e-mail just so you won't be called a spammer anyway. So let their mail server do the checking for you.
The one exception I can think of is if your e-mail system itself has some limitation (that isn't in the specs). For example, if your system simply can't handle IP addresses and quoted local parts, validate against those.
Admin
Hmm, email address validation is a nasty one. I remember trying to validate by doing a lookup on the hostname portion, only to get scuppered by mail servers that don't resolve but are valid. I forget the details as this was many aeons ago; however, a more experienced colleague pointed me at some RFCs (and would probably have submitted my code as a WTF if this site had been around).
Admin
Actually, most validation algorithms disapprove of this perfectly valid address: me@se Why is that??
Admin
I read a quote somewhere that described regex as a "write once, read never" syntax. It's the poster child for the difference between "Clever" and "Wise".
Captcha: tesla - good scientist, BAAAD band...
Admin
Er... I use a much simpler check than most of you do; perhaps it doesn't cover everything, but this is an internal thing, so it doesn't need to.
/^([a-zA-Z0-9_.-])+@(([a-zA-Z0-9-])+\.)+([a-zA-Z0-9]{2,4})+$/
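To see what "doesn't cover everything" means in practice, here is a quick probe of that pattern in Python. It assumes the dot after each domain label was meant to be escaped as a literal dot; the names `SIMPLE` and `accepted` are just for this sketch.

```python
import re

# The pattern from the comment above, with the dot after each domain label
# escaped (an assumption about what the poster intended).
SIMPLE = re.compile(
    r"^([a-zA-Z0-9_.-])+@(([a-zA-Z0-9-])+\.)+([a-zA-Z0-9]{2,4})+$"
)


def accepted(address: str) -> bool:
    """True if the simple internal-use pattern matches the whole address."""
    return SIMPLE.match(address) is not None


if __name__ == "__main__":
    for sample in (
        "i@example.com",            # one-letter local part
        "user@mail.example.co.uk",  # several domain labels
        "user+site@example.com",    # plus-tagged local part
        "billg@[131.107.115.212]",  # IP address literal
    ):
        print(f"{sample!r}: {accepted(sample)}")
```

The first two pass; the plus-tagged address and the IP literal are rejected, because neither `+` nor `[` appears in the character classes. Fine for an internal tool, as the poster says, but worth knowing before reusing it elsewhere.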
Admin
I just like how in the code they have the var "ampisthere" for ampersand (&) and they think that's what you call @
Admin
The correct regexp for emails can be found there: http://examples.oreilly.com/regex/
Admin
I don't think that's it. I think he's using "amp" as a short form for "ampersat", which is indeed a more or less valid reference to "@". The real WTF is that no one besides this particular coder knows what an "ampersat" is.
The other real WTF is that you can also refer to "@" as an "asperand". WTF?
In a Google battle between "asperand" and "ampersat", "ampersat" comes out slightly ahead - but both are practically undefined, by Google standards, at just under 3k references each. So using either one in code that is going to be maintained by someone else is definitely a WTF.
No, the real real WTF here is shortening "address" to "add"... that's clarity right there.
Admin
Yeah, I use something much simpler even than that in internal code.
!/^$/
If they don't get their email, it is highly likely that they didn't put in the right address :)
Admin
From http://www.eskimo.com/~hottub/software/programming_quotes.html
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Admin
To add insult to injury, there may in future be e-mail addresses on IDNs (Internationalized Domain Names) with umlauts, ogoneks, cedillas and the like. They will be transformed by nameprep and punycode into RFC-style addresses before use, but that won't help with validating them.
As for regexps: they violate the good old KISS principle ("Writing Solid Code"). They are hard to read (both visually and mentally), they cannot be commented properly (if you have qualms about spreading both comment and regexp over the page), AND they are fragile (you know what I mean if you have ever accidentally typed one more char than necessary): sometimes it breaks, sometimes not. I think some pride is involved in being able to set up a mighty "all-cases-in-one" regexp, but for maintenance the long monsters are garbage. It's not so much fun, but keep the style boring; write so that the reader knows five pages beforehand what is going to happen. In this case, break the mail address into parts and verify them individually (with short regexps, yes) and comment what you are doing. You will be pleased when you are forced, under time pressure, to rewrite old routines you haven't seen in a year.
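A rough sketch of that "boring" style, for illustration only. The names `LOCAL_ATOM`, `DOMAIN_LABEL` and `check_address` are invented here, and the rules are deliberately conservative: quoted local parts, comments and bracketed IP domains are all rejected, so this is nowhere near the full RFC grammar.

```python
import re
from typing import List

# One dot-separated chunk of the local part (unquoted characters only).
LOCAL_ATOM = re.compile(r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+")
# One domain label: letters/digits, hyphens allowed only in the middle.
DOMAIN_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def check_address(address: str) -> List[str]:
    """Return a list of problems; an empty list means the address looks plausible."""
    # Step 1: split at the last @.
    local, at, domain = address.rpartition("@")
    if not at:
        return ["missing @"]

    problems = []

    # Step 2: the local part is one or more atoms separated by single dots.
    if not local:
        problems.append("empty local part")
    elif not all(LOCAL_ATOM.fullmatch(atom) for atom in local.split(".")):
        problems.append("bad local part")

    # Step 3: the domain is at least two well-formed labels.
    labels = domain.split(".")
    if len(labels) < 2:
        problems.append("domain has no dot")
    if not all(DOMAIN_LABEL.fullmatch(label) for label in labels):
        problems.append("bad domain label")

    return problems
```

Each step can be read, tested and changed on its own, and the returned messages tell the user what is wrong instead of just saying "invalid".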
Admin
After reading this I had a look at what the jakarta commons validator did. And found a nice mini WTF. The EmailValidator is a singleton despite it not having any state. God I wish they had never invented that pattern.
http://svn.apache.org/viewvc/jakarta/commons/proper/validator/trunk/src/main/java/org/apache/commons/validator/EmailValidator.java?view=markup
Admin
Please spam my valid address: .@._ Thank you.
captcha: tastey (t .. to the a .. to the s-t-e-y girl, you tastey)
Admin
Unfortunately, your regex isn't any better at actually validating email addresses, and here is why:
http://www.regular-expressions.info/email.html
Admin
Some people, when confronted with regular expressions, like to quote jwz. Now they are fools who cannot cope with regular expressions.
Admin
The only validation of e-mail addresses I do is to check that it matches .+@.+\..+
i.e. there's an @, there are characters on both sides of the @, there's at least one . to the right of the @, there are characters on both sides of the .
Remember, no syntactic check is going to determine whether an e-mail address is actually valid and working. All you're checking for is obvious brokenness like putting in localhost-specific user IDs, or putting their name in the field instead of their e-mail address.
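For what it's worth, that check is a one-liner in most languages. A Python sketch, reading the pattern as `.+@.+\..+` with a literal dot (the function name `looks_like_email` is made up):

```python
import re


def looks_like_email(text: str) -> bool:
    """Sanity check only: something, an @, something, a dot, something."""
    return re.fullmatch(r".+@.+\..+", text) is not None


if __name__ == "__main__":
    for sample in ("i@example.com", "John Smith", "root", "user@localhost"):
        print(f"{sample!r}: {looks_like_email(sample)}")
```

It catches a name typed into the wrong field or a bare local user ID, and it makes no claim that the address exists.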
Admin
regex sucks. they are a real solo trip, to be trotted out after 15 cups of coffee and a similar number of cigarettes. maintainability? you're joking... just start again!
Admin
Perl has allowed comments and non-meaningful white space in regexes since 1998. That you can write regexes that look like line noise doesn't mean you have to.
That big monster of a regex that validates RFC822 email addresses exists because, if you're going to validate email addresses, you may as well validate them properly - by dint of building up a regex bit by bit, and then, once you're happy it works, compiling it down to one big long humungous lump of code for performance, so other people can just say "use Email::Valid" or whatever other Perl modules use it. In the same way that ages ago people used to shorten variables and eliminate white space to fit more code into 32K, or however much RAM their machine had at the time. It doesn't mean you develop that way.
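The same idea carries over to other languages. Python's `re.VERBOSE` flag, for instance, ignores whitespace and allows `#` comments inside a pattern, much like Perl's extended syntax. Below is a small sketch of building a pattern from named pieces; the pieces are simplified on purpose (no quoted strings, comments or IP literals) and the names `ATOM`, `LABEL` and `ADDRESS` are invented here, so this is not the Mail::RFC822::Address grammar.

```python
import re

# Small named pieces, each simple enough to check on its own.
ATOM = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+"            # run of unquoted local-part chars
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"  # one domain label

# The pieces are combined once; whitespace and comments in the template
# are ignored because of re.VERBOSE.
ADDRESS = re.compile(
    rf"""
    ^
    {ATOM} (?: \. {ATOM} )*     # local part: atoms separated by single dots
    @
    {LABEL} (?: \. {LABEL} )+   # domain: two or more labels
    $
    """,
    re.VERBOSE,
)
```

Developers edit the commented source; the compiled `ADDRESS` object is the "one big lump" that everyone else just calls, e.g. `ADDRESS.match("i@example.com")`.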
Admin
With the help of the wonderful RegExBuddy the splendid regex above can be translated into an approximation of English.
^-!#$%∓'*+/0-9=?A-Z^_a-z{|}~ @a-zA-Z(.a-zA-Z)+$ Assert position at the start of the string «^» Match a single character present in the list below «[-!#$%∓'+/0-9=?A-Z^az{|}~] » One of the characters "-!#$%∓'*+/" «-!#$%∓'*+/» A character in the range between "0" and "9" «0-9» One of the characters "=?" «=?» A character in the range between "A" and "Z" «A-Z» One of the characters "^" «^» A character in the range between "a" and "z" «a-z» One of the characters "{|}~" «{|}~» Match the regular expression below and capture its match into backreference number 1 «(.?[-!#$%∓'+/0-9=?A-Z^_a-z{|}~])» Match the character "." literally «.?» Between zero and one times, as many times as possible, giving back as needed (greedy) «?» Match a single character present in the list below «[-!#$%∓'+/0-9=?AZ^ a-z{|}~]» One of the characters "-!#$%∓'+/" «-!#$%∓'+/» A character in the range between "0" and "9" «0-9» One of the characters "=?" «=?» A character in the range between "A" and "Z" «A-Z» One of the characters "^_" «^_» A character in the range between "a" and "z" «a-z» One of the characters "{|}~" «{|}~» Match the character " " literally « » Match the character " " literally « » Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «» Match the character "@" literally «@» Match a single character present in the list below «[a-zA-Z]» A character in the range between "a" and "z" «a-z» A character in the range between "A" and "Z" «A-Z» Match the regular expression below and capture its match into backreference number 2 «(-?[a-zA-Z0-9])» Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «» Note: You repeated the backreference itself. The backreference will capture only the last iteration. Put the backreference inside a group and repeat that group to capture all iterations. 
«» Match the character "-" literally «-?» Between zero and one times, as many times as possible, giving back as needed (greedy) «?» Match a single character present in the list below «[a-zA-Z0-9]» A character in the range between "a" and "z" «a-z» A character in the range between "A" and "Z" «A-Z» A character in the range between "0" and "9" «0-9» Match the regular expression below and capture its match into backreference number 3 «(.a-zA-Z)+» Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+» Note: You repeated the backreference itself. The backreference will capture only the last iteration. Put the backreference inside a group and repeat that group to capture all iterations. «+» Match the character "." literally «.» Match a single character present in the list below «[a-zA-Z]» A character in the range between "a" and "z" «a-z» A character in the range between "A" and "Z" «A-Z» Match the regular expression below and capture its match into backreference number 4 «(-?[a-zA-Z0-9])» Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «» Note: You repeated the backreference itself. The backreference will capture only the last iteration. Put the backreference inside a group and repeat that group to capture all iterations. «*» Match the character "-" literally «-?» Between zero and one times, as many times as possible, giving back as needed (greedy) «?» Match a single character present in the list below «[a-zA-Z0-9]» A character in the range between "a" and "z" «a-z» A character in the range between "A" and "Z" «A-Z» A character in the range between "0" and "9" «0-9» Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Admin
Some people, when defending regular expressions, like to show their arrogance because they know how to write obscure and usually unmaintainable code.
Admin
I own domain ölbaum.ch. Isn't there an RFC that allows it in an e-mail address? Then most of these regexps would have to be rewritten. Lucky no e-mail client (that I know of) supports IDNs.
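If a validator only understands ASCII, one option is to convert the domain to its ASCII-compatible form before checking it. A small sketch using Python's built-in `idna` codec, which implements the older IDNA 2003 rules; the function names are hypothetical, and the local part is left alone.

```python
def to_ascii_domain(domain: str) -> str:
    """Encode a Unicode domain name into its ASCII (punycode) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode_domain(domain: str) -> str:
    """Decode an ASCII (punycode) domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    encoded = to_ascii_domain("ölbaum.ch")
    print(encoded)                      # an xn-- label followed by .ch
    print(to_unicode_domain(encoded))   # back to the original spelling
```

An ASCII-only regexp can then be run against the encoded form instead of being rewritten.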
Admin
Just a little note to add: I literally go maniacal when a webpage refuses my e-mail address that starts with "i@". One letter local part still makes an e-mail, for the love of Pete.
Regards.
Admin
When I've had to use fairly hairy regexes, they are always iteratively designed. I start simple, and add to it until it does what I need.
The trick is I have each iteration as a comment beforehand.
This lets me see what I've done, documents the limits of the regex, and lets me dig in and make changes without having to completely start over.
Sure, that chunk of the code will have a large amount of comments relative to other places... but when you've got a tough chunk of code, isn't that a good thing?
Admin
Possibly, but matters not. The fact remains that it's unmaintainable as-is. Just because the metadata that "Documents" it might be maintained elsewhere, such as a tool, doesn't mitigate the fact that no one reading the source can be sure of what it does. Also, if the tool were worth a damn, it would also give you comments to imbed along with the regex.
Hopefully this WAS simply the output of a builder class, where the method calls used to build it provide adequate documentation. But based on the OP, I doubt it.
Captcha: tacos - with that suggestion, I'm off to lunch
Admin
I don't see the point of validating email addresses at all... even if it vaguely resembles what an email address looks like, there is no guarantee that it is correct. It will be validated soon enough when you try to send an email to it, and at least you won't be alienating those users with non-standard-but-legal addresses.
(And even the argument that it is helping the user by telling them they've typoed... they are probably just as likely to typo the text part of the address as the @ and . parts!)
Admin
The real WTF is your comments section not word wrapping. :P
Maybe you should consider a HTML class?
Admin
The easiest and most likely to succeed way to validate an address is to establish an SMTP session to the primary MX of the domain and do an RCPT. If the address is invalid, either you cannot establish a connection or the SMTP server returns an error. Easy :)
[And yes, I do know that the Internet mail doesn't work like that any more, more is the pity.]
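For the curious, the probe described above looks roughly like this with Python's `smtplib`. Treat it as a sketch under several assumptions: the caller has already looked up the MX host (the standard library has no MX lookup), the sender address is a placeholder, and the function name and the `smtp_factory` parameter are invented here so the code can be exercised without a network. As noted, plenty of real servers accept every RCPT or refuse probes outright, so a True or False here is only a hint.

```python
import smtplib
from typing import Optional


def rcpt_accepted(
    address: str,
    mx_host: str,
    sender: str = "postmaster@example.org",  # placeholder envelope sender
    smtp_factory=smtplib.SMTP,
) -> Optional[bool]:
    """Ask mx_host whether it would take mail for address, without sending any.

    Returns True or False from the RCPT reply, or None if the server
    could not be reached or the session failed.
    """
    try:
        server = smtp_factory(mx_host, 25, timeout=10)
    except OSError:  # connection refused, DNS failure, SMTP connect errors
        return None
    try:
        server.helo()
        server.mail(sender)
        code, _message = server.rcpt(address)
        return 200 <= code < 300
    except smtplib.SMTPException:
        return None
    finally:
        try:
            server.quit()
        except smtplib.SMTPException:
            pass
```

Passing a different `smtp_factory` makes it possible to test the logic against a stub instead of someone's live mail server.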