Admin
Frist? Sure
Admin
It is as sure as XKCD 221 - https://xkcd.com/221/ - Are you sure? Ja, ich hab' Schuhe.
Admin
As someone who has written a regexp to validate the syntax of email addresses, I object to the statement that they cannot be parsed with a regex. Perl regular expressions have allowed for recursion for more than a decade. Anything for which there is a BNF grammar, like email addresses, can be turned into a regular expression almost mechanically.
Admin
Sad part is JS has a perfectly good URL class that includes validation on the constructor. You could just do:
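A minimal sketch of that idea, assuming the goal is only "does this string parse as an absolute URL". The helper name isValidUrl is borrowed from the thread; its body here is an assumption, not the code the commenter posted:

```javascript
// Hypothetical helper: returns true when the string parses as an absolute URL.
// The URL constructor throws a TypeError for input it cannot parse.
function isValidUrl(candidate) {
  try {
    new URL(candidate);
    return true;
  } catch {
    return false;
  }
}

console.log(isValidUrl("https://thedailywtf.com/"));
console.log(isValidUrl("Sure"));
```

Note that this accepts any scheme the parser understands, so a caller who only wants http(s) links would still need to check the protocol property.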
Admin
In practice, one can validate email addresses using a regular expression: https://stackoverflow.com/questions/201323/how-can-i-validate-an-email-address-using-a-regular-expression
Some people might say "but my Punycode!", which is halfway valid: IDN works (and people expect it to), but must be preprocessed before it becomes an SMTP-type email address. Less bright people might say "but RFC2822 comments!", which is not valid because they're definitionally not part of the email address.
There was a fad some decades ago to assert, as this article does, that email addresses are not a regular language -- but I thought that myth had been dispelled.
Admin
Are you ${isValidUrl()}?
Admin
Clearly, a brillant Paula bean solution.
Admin
The problem here is the uncalled-for decisiveness. A developer with a philosophical mindset would do something like this instead:
Admin
Sure is a great way to validate URLs!
Admin
I dunno about using a REGEX for email -- the simplest thing is just if the thing includes an "@". Every time I crack open an RFC and then compare it to what's actually sent, I always see "bad" data sent over.
Fun EMAIL fact: email addresses in RFC 561 are things like "fred AT place"
Admin
"Are those URLs? Sure! Is this a terrible approach? Sure! Does the fact that it's been like this for years and nobody actually complained imply that they didn't need URL validation in the first place? Sure!"
Now that's just funny. I'm still chuckling. Thumbs up!
Admin
Checking for just one @ and not in either the first or last place covers the really silly problems (you aren't going to be supporting non-SMTP naming schemes in this day and age). More than that... need to see if you can actually send an email to the mailbox and have someone pick it up and respond sensibly to it. You simply aren't going to validate all possible current user mailboxes on all email systems without doing something completely beyond regexps.
Admin
https://github.com/Perl/perl5/blob/blead/t/re/reg_email.t
Admin
I know I shouldn't, but I kinda like this for some reason
Admin
There's even a URL.canParse(url, [base]) static method for this purpose. Although it's not supported by older JS engines.
Admin
I agree - I regularly use comments to distinguish between [email protected] and [email protected] to make sure I can see who sold my address. And most regexes are too stupid to accept this, so I am a big proponent for the "send them a mail on the given address. If they answer, it was valid. Otherwise, ignore it."
Admin
Surely the correct implementation is:
Admin
Maybe there really was validation code there once and it kept failing, so somebody trollishly replaced it with that one day?
Admin
In the end, for most(1) purposes, this is the right answer. The website operator shouldn't care about some abstract concept of what's valid, e.g. I remember seeing someone on a forum somewhere who commented on having worked for the UK's overall domain name system operator, where his email address was (I may have his given name wrong) "john@uk" - good luck getting that past any website-hosted email address "validator"...
Instead, concentrate on what's important: can the site use the address to send an email to the person? Yes, then it's valid, otherwise complain in some way.
Admin
I would even argue that if you are looking for a complicated, exact validator you are most likely doing something wrong. Either you are looking for people to enter any possible valid value, in which case you are sure to be wrong at some point and restrict a valid input, like when the first person tries to post a URL attachment with an emoji in your 2005 forum software. Or, more likely, you have actual system requirements that mean supporting the entire standard would be nonsense. Your SSO redirect URL doesn't need to accept hex-IPs served over gopher with Arabic diacritics in the query string, that is nothing more than a security vulnerability waiting to happen. Azure AD allows anything that starts with https, uses the 443 port and a domain with a dot in it or localhost - that's perfectly fine for their use case and security profile. @.* will allow any practical address that can be served over the web and will catch most typos. Keep it simple and secure.
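As a rough sketch of what such a deliberately narrow check could look like, here is the "https, port 443, and a dotted domain or localhost" rule as paraphrased above. The function name isAllowedRedirect is hypothetical, and this approximates the rule as the comment describes it, not Azure AD's actual implementation:

```javascript
// Hypothetical helper implementing the rule as described in the comment:
// https only, default port 443, hostname is "localhost" or contains a dot.
function isAllowedRedirect(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return false; // not parseable as an absolute URL at all
  }
  // The URL parser normalizes the scheme's default port to "",
  // so an explicit ":443" on https also ends up as an empty string.
  const onDefaultPort = url.port === "";
  const host = url.hostname;
  return (
    url.protocol === "https:" &&
    onDefaultPort &&
    (host === "localhost" || host.includes("."))
  );
}

console.log(isAllowedRedirect("https://example.com/callback"));
console.log(isAllowedRedirect("gopher://example.com/"));
```

The point of the design is that the rule is short enough to audit by eye, which is the property the comment is arguing for.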
Admin
You must be new here.
Admin
RFC 561 had to be some kind of April joke. RFC 524 clearly uses @.
Admin
@Steve The Cynic ref:
Back in about 1993 I got my own domain for my company: MyCompanyName.US. Being based in the United States that seemed a good choice at the time. So my email address became [email protected]. No special characters: just some letters, one @ and one dot. Little did I know the trouble I unleashed upon myself.
"US" became a valid ccTLD in 1985, about 8 years before I got my domain. I was not an early adopter. But in 1993 my shiny new email address choked (30+%) of website email validators. Most choked if your TLD wasn't 3 letters, or worse yet wasn't "com", "net" or "org". Ouch. Even now in 2023 I occasionally encounter websites whose email validators don't like [email protected].
The idjits are many among us.
Admin
I use panix.com, which has a workaround: I can use any number of email addresses of the form (eg) "[email protected]", since so many places don't understand the "+" format. They all end up at [email protected], but I can still filter on "thedailywtf" in the address. I've found at least one email database leak that way.
Disclaimer: Just a happy customer.
Admin
You also send emails for verification when someone opens an account. With a big warning “don’t do anything if you didn’t try to open an account”.
So you have a really good chance to not only figure out that the address is valid, and that it exists, but also that it is the correct one.
Admin
"valid e-mail" - Has nothing to do with RFCs or otherwise.... simply "Does a server exist which can receive an e-mail addressed in this way, and if not, could such a server be created and put on the network?" [note, I did not specifically say internet]
Admin
I do a little more than that - I look for [email protected] where each "z" represents any string of at least one character. It's about the minimum requirement for something that we're actually going to be able to send an email to.
Looking in our database the main benefit over just looking for an @ seems to be that we catch the ones who omit the domain part altogether (myemail@), or think the domain part is just "yahoo", "gmail", "hotmail", etc., or typo it in some way (usually xxcom or xx,com instead of xx.com). Though I have to give props to "lexus of brighton" as a domain, and the very special people who put the first part of the domain at the beginning (or the middle) of the username part instead of after the @.
Addendum 2023-12-06 05:19: No, I'm not too worried about the user@validTLD edge case. If we ever get one in our system and someone asks why that person isn't getting marketing emails, I'll happily add in an exception.
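For illustration, the check described in this comment could be sketched roughly as follows. The function name and the exact regular expression are assumptions: the pattern simply requires at least one character, an @, at least one character, a dot, and at least one more character:

```javascript
// Hypothetical helper: "something @ something . something", nothing stricter.
// This is a plausibility filter for typos, not RFC validation.
function looksMailable(address) {
  return /^.+@.+\..+$/.test(address);
}

// Cases taken from the kinds of mistakes listed in the comment above.
const samples = ["someone@xx.com", "myemail@", "someone@yahoo", "someone@xx,com", "john@uk"];
for (const s of samples) {
  console.log(s, looksMailable(s));
}
```

As the addendum acknowledges, a dotless address such as john@uk is rejected by this kind of check, so it trades a rare edge case for catching common typos. Actual deliverability still has to be confirmed by sending a message.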