The format for e-mail addresses is specified in a number of RFCs, and it's a pet peeve of mine when people "validate" away perfectly valid addresses: websites that think all domains end in .com, .net, .edu, or .org, for instance, or mail agents that refuse to transfer mail with a + in the local-part. To that end, I wrote my own regular expression that (I believe) follows the specification, which I'll share below.
First, I'd like to share some code that Igor found, which he considers a masterpiece.
if(document.forms[0].c_email.value) {
    var add = document.forms[0].c_email.value;
    var ampisthere = false;
    var spacesthere = false;
    var textbeforeamp = false;
    var textafteramp = false;
    var dotafteramp = false;
    var othererror = false;
    for(var i = 0; i < add.length; ++i) {
        if(add.charAt(i) == '@') {
            if(ampisthere)
                othererror = true;
            ampisthere = true;
        } else if(!ampisthere)
            textbeforeamp = true;
        else if(add.charAt(i) == '.')
            dotafteramp = true;
        else
            textafteramp = true;
        if(add.charAt(i) == ' ' || add.charAt(i) == ',')
            spacesthere = true;
    }
    if(spacesthere || !ampisthere || !textafteramp || !textbeforeamp || !dotafteramp || othererror) {
        error += "\tEmail addresses must be valid working";
        error += " addresses with no commas or spaces\n";
    }
}
Obviously, it's JavaScript. Unfortunately, there was no equivalent check being done server-side.
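To see how little that check actually demands, here is a rough sketch that restates the same flag logic as a standalone function. The function name looksValid and its return value are my own additions for illustration; the original reads the form field directly and appends to an error string instead, and it skips the check entirely when the field is empty.

```javascript
// Hypothetical wrapper: looksValid is not in the original code.
// The flags and the loop mirror the original check one for one.
function looksValid(add) {
  let ampisthere = false;
  let spacesthere = false;
  let textbeforeamp = false;
  let textafteramp = false;
  let dotafteramp = false;
  let othererror = false;

  for (let i = 0; i < add.length; ++i) {
    const c = add.charAt(i);
    if (c === '@') {
      if (ampisthere) othererror = true; // a second '@' is the only "other" error
      ampisthere = true;
    } else if (!ampisthere) {
      textbeforeamp = true;              // any character at all before the '@'
    } else if (c === '.') {
      dotafteramp = true;                // a dot anywhere after the '@'
    } else {
      textafteramp = true;               // any non-dot character after the '@'
    }
    if (c === ' ' || c === ',') spacesthere = true;
  }

  return !(spacesthere || !ampisthere || !textafteramp ||
           !textbeforeamp || !dotafteramp || othererror);
}

console.log(looksValid("user@example.com")); // true
console.log(looksValid("user@example"));     // false (no dot after the '@')
console.log(looksValid("x@y."));             // true
console.log(looksValid(".@.x"));             // true
console.log(looksValid(";@;."));             // true
```

In other words, anything with one @, no spaces or commas, at least one character before the @, and both a dot and a non-dot somewhere after it gets through.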
And as I promised, here's my own RegExp for you to tear apart. (Yes, I know it doesn't handle a quoted local-part. No, I don't mind. Seriously, who does that?)
^[-!#$%&'*+/0-9=?A-Z^_a-z{|}~](\.?[-!#$%&'*+/0-9=?A-Z^_a-z{|}~])*
@[a-zA-Z](-?[a-zA-Z0-9])*(\.[a-zA-Z](-?[a-zA-Z0-9])*)+$
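For anyone who wants to poke at it, here is a minimal sketch of the expression in use, assuming the two lines above are meant to be joined into a single pattern with no whitespace between them. The names EMAIL_RE and isValidEmail are hypothetical, chosen only for this example.

```javascript
// The two halves of the pattern, copied as-is and concatenated.
// String.raw keeps the backslashes literal so no extra escaping is needed.
const EMAIL_RE = new RegExp(
  String.raw`^[-!#$%&'*+/0-9=?A-Z^_a-z{|}~](\.?[-!#$%&'*+/0-9=?A-Z^_a-z{|}~])*` +
  String.raw`@[a-zA-Z](-?[a-zA-Z0-9])*(\.[a-zA-Z](-?[a-zA-Z0-9])*)+$`
);

// Hypothetical helper: returns true when the whole string matches.
function isValidEmail(address) {
  return EMAIL_RE.test(address);
}

console.log(isValidEmail("user+tag@example.museum"));       // true
console.log(isValidEmail("first.last@mail.example.co.uk")); // true
console.log(isValidEmail("a..b@example.com"));              // false (consecutive dots)
console.log(isValidEmail(".a@example.com"));                // false (leading dot)
console.log(isValidEmail("user@localhost"));                // false (no dot in domain)
console.log(isValidEmail("user@bad-.example"));             // false (label ends in '-')
console.log(isValidEmail('"quoted"@example.com'));          // false (quoted local-part)
```

The same pattern string should work unchanged in a server-side check, since it relies only on character classes, groups, and anchors.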
Update: Fixed an HTML encoding problem in the RegExp.