• C-Derb (unregistered)

    Why waste so much effort telling the user exactly what is wrong with their email address? Just use a generic "Please enter a valid email address" message. I mean, 99.9% of the time you are expecting users to enter valid data, right?

  • (cs) in reply to Abigo
    Abigo:
    Grzechooo:
    Good that he didn't use a regular expression.
    I think I see it. It's a boat, right?
    It's not a boat. It's a schooner.

    On the plus side, at least it didn't try to parse HTML.

  • Hannes (unregistered) in reply to jkupski
    jkupski:
    Hannes:
    DNS Resolvers DO care about the dot. If they didn't, they couldn't resolve a URL like thedailywtf(dot)com(dot). But -surprise surprise- they do resolve it.
    Actually, they do not, given that the above is a URL (as you yourself note) and not a domain name. The above is really a lot like misusing they're/their/there while being a grammar nazi.

    Well, what do URLs mostly contain? Maybe Domain Names?

    Also, a FQDN does indeed have a dot at the end: (dot)com(dot), so the DNS can resolve the root server responsible for "com".

    Or, you could just try and read the wiki article about FQDN. ;)

  • ebinezer dude (unregistered)

    He was given this project to gain PM experience. Thus, the right action is to not review the code, allow the dev to check it in, compile and if no errors, launch. Then report to your boss that this project was successfully completed on time, within budget and the clients are incredibly happy.

    Next up- Project Management Lesson 2: The Perfect Project - How to Keep Client Complaints from Your Boss and Make Bug Reports Disappear

    Project Management Lesson 3: Ethics Schmethics - It's only wrong if you're caught and How to Say Sorry Like a Boss.

  • ANON (unregistered) in reply to faoileag
    faoileag:
    pjt33:
    faoileag:
    think that the code ... does not represent a wtf per se.
    Regardless of the spec, any code which could be compressed by 90% with a loop or two is a WTF unless it's explicitly commented that the loop was unrolled with a significant impact on performance.
    For a peer review, I would agree with you completely. However, this is code delivered by an offshore team. In an ideal world, you run your pre-written unit-tests against it and tell the offshore team which have failed if any fail. You do not look at the codebase itself, unless somewhere in your contract with the overseas company you have a clause that explicitly states that the code itself must also meet certain standards. Which is normally not the case. So who cares if they do the loop unrolling themselves? Let them. Perhaps they get paid by lines of code.

    I don't agree on this. It's in the interest of the customer that the code meets the coding standards and not in the interest of the offshore company. The customer will have to pay the technical debt later. So you should take a look in the code to enforce the interests of your customer.

    Furthermore I read in some studies that white box tests find more errors in the same time than black box tests.
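
    For anyone wondering what pjt33's "loop or two" might look like, here is a rough sketch in Python rather than the original VB. The FORBIDDEN list only holds the two characters this thread mentions the validator rejecting (the double quote and '+'), and first_problem is a made-up name; whether those characters should be rejected at all is a separate argument.

```python
# Characters the validator is known (from this thread) to reject.
# The list is illustrative, not the article's full rule set.
FORBIDDEN = ['"', "+"]


def first_problem(address):
    """Return an error message for the first forbidden character found,
    or an empty string if none is present."""
    for ch in FORBIDDEN:
        if ch in address:
            return "Email address cannot contain " + ch
    return ""
```

    One table plus one loop replaces one If block per character, and adding a rule becomes a one-line change to the table.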

  • fein (unregistered) in reply to Citron

    All you should check for is "x@y"

    This is the minimum spec for an email address.

  • Poen (unregistered)

    The RWTF is they outsourced such a simple task...

  • (cs)

    TRWTF is that not only the dev team, but every comment in this thread chose the wrong meaning of "valid." A character string can pass RFC_whatevernumber but still not be valid because no such email address exists.

    Thus, the validation code needs to go like this: (pseudo-pseudocode)

    pingit := pine $address_to_be_validated
    if (pingit != NULL ) {
        # must have got some bounceback nastygram
        Table_reject($address_to_be_validated)
    }

  • (cs)

    Couple of things:

    1. Quality of code: Today's lesson will be loops. This is a basic computer construct that everyone should know. Study it well, it will be used in a test later.

    2. On validating email addresses: Email addresses can range from the simple to the very complex. There are many standards. Read them. For the most part if you want to have an address that will be globally useful, it most likely contains two things: It contains an '@', and it has at least one '.' after the '@'. Sure there are other constraints, but for the most part this is enough. Ask the person who typed it in to do it twice, and they will usually get it right (or it will be intentionally wrong). If you want to validate further, send a confirming email and await a response. If you get one, it is probably good to go. Anything else is getting close to a waste of time.
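
    A minimal sketch of the check described in point 2, in Python, assuming nothing beyond "an '@', then at least one '.' somewhere after it". The names looks_like_email and entries_match are made up for illustration. It deliberately accepts a lot and makes no attempt at RFC compliance; the confirming email does the real work.

```python
def looks_like_email(address):
    """Lenient sanity check: something@something.something, nothing more."""
    # Split on the LAST '@' so quoted local parts containing '@' still pass.
    local, at, domain = address.rpartition("@")
    if not at or not local:
        return False
    # Require at least one dot after the '@', with text on both sides of it.
    head, dot, tail = domain.rpartition(".")
    return bool(dot and head and tail)


def entries_match(first, second):
    """The 'type it twice' check: both entries must agree."""
    return first.strip() == second.strip()
```

    A phone number typed into the email field fails, while plus-addressing and odd TLDs sail through, which is about the right level of strictness for catching honest mistakes.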

  • ¯\(°_o)/¯ I DUNNO LOL (unregistered) in reply to cellocgw
    cellocgw:
    pingit := pine $address_to_be_validated
    An e-mail address domain is not necessarily ping-able. They use MX records, and an IP address is only a fallback.

    Or was that not a typo and you really meant pine the e-mail client?

  • meinders1337 (unregistered)

    The real WTF is writing code to validate an email address. Use existing libraries.

  • (cs) in reply to radarbob
    radarbob:
    Warren:
    OK, so they should have had a return type of boolean and used exceptions for the errors...

    Aaaarrrrggghhhhhh...

    Yeah, people like Warren that abuse exceptions need to be thrown in a deep, dark dungeon somewhere.

  • (cs)

    "Kerbleckistan Considered Harmful"

  • Doozerboy (unregistered) in reply to faoileag
    faoileag:
    pjt33:
    faoileag:
    think that the code ... does not represent a wtf per se.
    Regardless of the spec, any code which could be compressed by 90% with a loop or two is a WTF unless it's explicitly commented that the loop was unrolled with a significant impact on performance.
    For a peer review, I would agree with you completely. However, this is code delivered by an offshore team. In an ideal world, you run your pre-written unit-tests against it and tell the offshore team which have failed if any fail. You do not look at the codebase itself, unless somewhere in your contract with the overseas company you have a clause that explicitly states that the code itself must also meet certain standards. Which is normally not the case. So who cares if they do the loop unrolling themselves? Let them. Perhaps they get paid by lines of code.

    Riiiiiiight. So are your unit tests going to catch that the thing runs like a bag of spanners?

  • Popeye (unregistered)

    Surely even an idiotic offshore developer in Pakistan would know how to use Google to look up a regex for email validations?

    One line of code PEOPLE.

  • Kasper (unregistered) in reply to meinders1337
    meinders1337:
    The real WTF is writing code to validate an email address. Use existing libraries.
    I recently came across some code, which did that. It didn't work. The library rejected any TLD longer than 6 characters.
  • (cs) in reply to ¯\(°_o)/¯ I DUNNO LOL
    ¯\(°_o)/¯ I DUNNO LOL:
    cellocgw:
    pingit := pine $address_to_be_validated
    An e-mail address domain is not necessarily ping-able. They use MX records, and an IP address is only a fallback.

    Or was that not a typo and you really meant pine the e-mail client?

    Yes, yes I did. You can use elm if you prefer :-).

    And daggonit, this is thedailywtf. Every post is assumed to have <sarcasm> and <satire> tagged

  • foo (unregistered) in reply to C-Derb
    C-Derb:
    Why waste so much effort telling the user exactly what is wrong with their email address? Just use a generic "Please enter a valid email address" message. I mean, 99.9% of the time you are expecting users to enter valid data, right?
    But you're not asking the user to enter a valid email address, but to enter an email address that conforms to your arbitrary rules. How is the user supposed to know those rules? It's like saying "I don't like what you entered, but I won't tell you why, keep guessing." Really that makes the WTF twice as bad.
  • gizmore (unregistered)

    Looks quite solid and is way faster than a regular expression. no wtf ;)

  • Jay (unregistered)

    Rejecting things like plus signs and two dots in a row comes from the philosophy of "I've never seen one like that, it's probably invalid", rather than actually checking the specs.

    I have a personal email address that ends in dot-us. One website I went to rejected that. I tried changing it to dot-com -- of course not my correct email address then, but just to see -- and it accepted it. I guess they don't want a lot of grubby foreigners on their system, but you'd think they'd allow in Americans who use the us TLD.

    I would think that for something like validating the user's email address, if for whatever reason you can't get exactly the right rules, you would want to err on the side of accepting too much rather than too little. The main point of a validation like that is to catch user brain freezes, like he accidentally types his phone number in the e-mail field. So just checking for "includes an @ and at least one period after the @" is probably a not-bad validation.

  • Jay (unregistered)

    Hey, this brings to mind an actual serious thought: How tight should a validation be?

    If you are validating an email address, you COULD keep a list of all the valid TLDs and validate against that. Then if a user trying to type in "com" accidentally typed "cim", you'd catch it. But I've never done that and I doubt I ever will, because it would require keeping that list up to date, which would mean constantly monitoring for the creation of new TLDs. That sounds like way too much trouble.

    Of course on the flip side, some validations must be 100% tight. Like if I'm validating a user's password, I'm not going to say, "oh, okay, the last character was wrong, but that's probably just a typo, we'll let you in."
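
    To make the "cim" idea concrete, here is a rough Python sketch that warns instead of rejecting. KNOWN_TLDS is a deliberately tiny, hypothetical list (keeping a real one current is exactly the maintenance burden described above), and suggest_tld is a made-up name. The fuzzy matching comes from the standard library's difflib.

```python
import difflib

# Hypothetical, deliberately incomplete list; a real one would need
# regular updates as new TLDs are created.
KNOWN_TLDS = ["com", "net", "org", "edu", "us", "uk", "de"]


def suggest_tld(address):
    """Return a likely intended TLD if the one given is not in the list,
    or None if the TLD is known or nothing similar is found."""
    domain = address.rpartition("@")[2]
    tld = domain.rpartition(".")[2].lower()
    if tld in KNOWN_TLDS:
        return None
    matches = difflib.get_close_matches(tld, KNOWN_TLDS, n=1, cutoff=0.6)
    return matches[0] if matches else None
```

    Used as a "did you mean .com?" hint, an out-of-date list costs the user one extra click rather than locking them out.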

  • Jay (unregistered) in reply to Kasper
    Kasper:
    meinders1337:
    The real WTF is writing code to validate an email address. Use existing libraries.
    I recently came across some code, which did that. It didn't work. The library rejected any TLD longer than 6 characters.

    Exactly. I don't know how many times I've heard, "What?! You're going to write that function yourself?! That's crazy. Just search for something on the Internet, then you don't have to debug it yourself."

    The assumption there, of course, is that anything downloaded from the Internet is guaranteed to not only be 100% correct but also to 100% meet my requirements. There's absolutely no reason to believe that's true, and plenty of reason to believe it's wildly false.

  • foo (unregistered) in reply to Jay
    Jay:
    Hey, this brings to mind an actual serious thought: How tight should a validation be?

    If you are validating an email address, you COULD keep a list of all the valid TLDs and validate against that. Then if a user trying to type in "com" accidentally typed "cim", you'd catch it. But I've never done that and I doubt I ever will, because it would require keeping that list up to date, which would mean constantly monitoring for the creation of new TLDs. That sounds like way too much trouble.

    Of course on the flip side, some validations must be 100% tight. Like if I'm validating a user's password, I'm not going to say, "oh, okay, the last character was wrong, but that's probably just a typo, we'll let you in."

    The latter is not validation, but rather checking. Something like validation is done when a user chooses a password, e.g. is it long enough, does it contain enough different characters, does it not contain some characters we don't like etc.
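
    The distinction can be sketched in Python roughly as follows. The policy thresholds (10 characters, 5 distinct characters) and all function names are invented for illustration; the hashing uses the standard library's hashlib.pbkdf2_hmac and the comparison uses hmac.compare_digest.

```python
import hashlib
import hmac


def validate_new_password(candidate):
    """Validation: does a NEW password satisfy the (hypothetical) policy?
    Returns a list of problems, empty if the password is acceptable."""
    problems = []
    if len(candidate) < 10:
        problems.append("too short")
    if len(set(candidate)) < 5:
        problems.append("too few different characters")
    return problems


def hash_password(password, salt):
    """Derive a hash to store instead of the password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def check_password(attempt, salt, stored_hash):
    """Checking: exact match or nothing; there is no 'close enough'."""
    return hmac.compare_digest(hash_password(attempt, salt), stored_hash)
```

    Validation can afford to be lenient and explain itself; checking is a plain yes or no.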

  • (cs)
    [a-zA-z]*[@]gmail[.]com

    Pssht, anyone who knows what they're doing should have a gmail account with normal characters, and type it all lowercase. At least anyone we want associated with our site.

    Addendum (2013-08-12 15:18): *

    [a-zA-Z]

    Muphry's Law

  • AN AWESOME CODER (unregistered) in reply to ratchet freak
    ratchet freak:
    u:
    Citron:
    The real WTF is "alphanumeric characters only". With all these possible e-mail-addresses out there, the only useful thing to do for e-mail validation is to check if the user may have mistyped his e-mail-address, by checking for '@' and '.'. Use an opt-in to check if the user has access to the address.

    It states "In following with agreed upon standards" - it is nowhere said that they are following RFC standards.

    isn't TRWTF that Andrew didn't specify that the RFC standard should be followed

    though I'm afraid of the code that would come out of that specification

    Not necessarily. You're assuming this email address field would be accepting all email addresses in the world.

    It's quite possible that the field is on an internal screen, and that the format of email address is known ahead of time. It still might be sub-optimal to have custom validation considering the plethora of times validating email address has been solved, but it's reasonable that a company knows all of their employee email addresses are in the same format.

    Similarly, validating extensions on phone numbers would be dumb, but not if the phone number is the number to my desk, and every employee accessing the program should have one.

  • Rcx (unregistered) in reply to Damien

    Actually, by the email address RFC(s), it cannot, see

    http://www.faqs.org/rfcs/rfc1123.html section 5.2.18

    "Some systems over-qualify domain names by adding a trailing dot to some or all domain names in addresses or message-ids. This violates RFC-822 syntax."

  • (cs)

    Shall we count the ways this violates RFC 2822? (Could take a while.)

    Let's just leave it at, "Very exhaustively wrong."

  • (cs) in reply to Grzechooo
    Grzechooo:
    Good that he didn't use a regular expression.

    I wouldn't put my hands on fire for VB support of RegExp.

    OTOH, I've met my share of "developers" that don't even know what RegExp is. Ask around, it might surprise you.

  • (cs) in reply to chubertdev
    chubertdev:
    [a-zA-z]*[@]gmail[.]com

    Pssht, anyone who knows what they're doing should have a gmail account with normal characters, and type it all lowercase. At least anyone we want associated with our site.

    [a-zA-z]

    What the FUCK is that?

  • foo (unregistered) in reply to ubersoldat
    ubersoldat:
    Grzechooo:
    Good that he didn't use a regular expression.

    I wouldn't put my hands on fire for VB support of RegExp.

    OTOH, I've met my share of "developers" that don't even know what RegExp is. Ask around, it might surprise you.

    Probably nearly as many as those who do know but refuse to use them due to a misinterpreted quote.

  • (cs) in reply to faoileag
    faoileag:
    strReturn = "Email address cannot contain " & Chr(34)
    Don't tell me Visual Basic has no other means to include a quote in string?
    It does, but it is not any prettier
    strReturn = "Email address cannot contain """
    That's right, two quotes together make a single quote, with the third to close the string. Similar syntax to T-SQL:
    DECLARE @SingleQuoteChar char(1) = ''''
  • (cs) in reply to ubersoldat
    ubersoldat:
    Grzechooo:
    Good that he didn't use a regular expression.

    I wouldn't put my hands on fire for VB support of RegExp.

    OTOH, I've met my share of "developers" that don't even know what RegExp is. Ask around, it might surprise you.

    VB.Net has "full" support of RegEx through the same .Net framework classes as C#, though, you are right, many "developers" don't even know what a RegEx is.

    Then again, a lot of developers I work with have been going at this for so long, technologies such as RegEx were just never introduced to them.

  • Arne Nonymous (unregistered)

    Meh, yet another email validator that doesn't accept "@ @"@example.com as valid (or the light version "@_@"@example.com ).

  • (cs) in reply to Matt Westwood
    Matt Westwood:
    chubertdev:
    [a-zA-z]*[@]gmail[.]com

    Pssht, anyone who knows what they're doing should have a gmail account with normal characters, and type it all lowercase. At least anyone we want associated with our site.

    [a-zA-z]

    What the FUCK is that?

    Muphry's Law

  • AN AWESOME CODER (unregistered) in reply to Hannes
    Hannes:
    jkupski:
    Hannes:
    DNS Resolvers DO care about the dot. If they didn't, they couldn't resolve a URL like thedailywtf(dot)com(dot). But -surprise surprise- they do resolve it.
    Actually, they do not, given that the above is a URL (as you yourself note) and not a domain name. The above is really a lot like misusing they're/their/there while being a grammar nazi.

    Well, what do URLs mostly contain? Maybe Domain Names?

    Also, a FQDN does indeed have a dot at the end: (dot)com(dot), so the DNS can resolve the root server responsible for "com".

    Or, you could just try and read the wiki article about FQDN. ;)

    And it has absolutely nothing to do with email.

    "myhomenetwork" is a valid domain.

    "an_awesome_coder@myhomenetwork" is a valid email address.

    Sending email to "an_awesome_coder@myhomenetwork" will succeed as long as the machine sending the email can resolve MX records for "myhomenetwork."

    I will concede, though, that any email address used in the real world that's input by a user in some non-geeky and non-ops related software will have at least one dot following the @ character.

  • Roger (unregistered) in reply to Mike

    You have to admit they didn't go nearly far enough with that method, though. Consider:

    if strEmail == "!@!.!" Or strEmail == "!@!.#" Or strEmail == "a@!.$" Or

    ... a modest amount of code left out
    strEmail == "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz@zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz.zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz.zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz" Then return True

  • XXI (unregistered) in reply to Roger
    Roger:
    You have to admit they didn't go nearly far enough with that method, though. Consider:

    if strEmail == "!@!.!" Or strEmail == "!@!.#" Or strEmail == "a@!.$" Or

    ... a modest amount of code left out
    strEmail == "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz@zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz.zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz.zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz" Then return True

    ^ This

    It is always much easier to make a whitelist of what you support than trying to blacklist all possible unsupported cases. This is the way it should be done

    Captcha aptent: I would not aptent it any other way

  • erazerazerazerazerazer (unregistered) in reply to Matt Westwood
    Matt Westwood:

    [a-zA-z]

    What the FUCK is that?

    That's all characters between A and z meaning:

    • "a", "b", "c", ... "z"
    • "A", "B", "C", ... "Z"
    • "a", "b", "c", ... "z" (again)
    • "["
    • "\"
    • "]"
    • "^"
    • "_"
    • "`"

    Or put another way: all characters between ASCII 65 and ASCII 122.
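
    This is easy to confirm with a few lines of Python. The helper name non_letters_matched is made up; it just tries every ASCII character against a character class and reports the non-letters that slip through.

```python
import re


def non_letters_matched(char_class):
    """List the ASCII characters that match char_class but are not letters."""
    rx = re.compile(char_class)
    return [
        chr(code)
        for code in range(128)
        if rx.fullmatch(chr(code)) and not chr(code).isalpha()
    ]


sloppy = non_letters_matched(r"[a-zA-z]")  # the typo'd class
fixed = non_letters_matched(r"[a-zA-Z]")   # the intended class
```

    Here sloppy holds the six punctuation characters that sit between "Z" and "a" in ASCII, and fixed is empty.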

  • (cs) in reply to Hannes
    Hannes:
    Well, what do URLs mostly contain? Maybe Domain Names?

    Also, a FQDN does indeed have a dot at the end: (dot)com(dot), so the DNS can resolve the root server responsible for "com".

    Or, you could just try and read the wiki article about FQDN. ;)

    I did not correct your statement about FQDNs, merely your statement that a DNS resolver "resolves" URLs. It doesn't. Being snarky and directing me to wikipedia on another subject does not change your inaccuracy.

    To be even more pedantic, even "URLs mostly contain domain names" is also not necessarily an accurate statement. It contains protocol information (http), a separator (://), host information (www.example.com), an optional port number (:80), and the full path to a resource on that host (/fake/path/example.htm).

    Hell, the URL to reply to your comment, http://thedailywtf.com/Comments/AddComment.aspx?ArticleId=7636&ReplyTo=414871&Quote=Y, is a great example of your inaccuracy. My DNS resolver can't make heads nor tails of it. :)
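
    The split is easy to see with Python's standard urllib.parse. This is only a sketch; host_of is a made-up helper name. The point is that the host is one component among several, and it is the only piece a DNS resolver would ever be handed.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return just the host component of a URL, the only part DNS cares about."""
    return urlsplit(url).hostname


reply_url = (
    "http://thedailywtf.com/Comments/AddComment.aspx"
    "?ArticleId=7636&ReplyTo=414871&Quote=Y"
)
parts = urlsplit(reply_url)
# parts.scheme is the protocol, parts.hostname the host,
# parts.path the resource path, parts.query the query string.

example = urlsplit("http://www.example.com:80/fake/path/example.htm")
```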

  • :o) (unregistered) in reply to faoileag
    faoileag:
    csrster:
    The real WTF is surely not using the Composite pattern to aggregate multiple validation rules in a single rule. Then each individual rule can be ruthlessly and independently unit-tested. Plus you're able to instantiate these generalised validation rules using an Abstract Factory Pattern and an appropriate dependency-injection framework. Here, let me show you some UML ...
    I'm missing the XML in your design. Without XML in it, it's definitely not enterprisey enough!
    troll!
  • Someone (unregistered)

    Hey everyone, I'm having a problem related to this story. I'm trying to make a list of email addresses that I can validate entries against, but typing it all out is really slow. Can some people help me out?

    Here is what I have so far:

    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
  • foxyshadis (unregistered) in reply to C-Derb
    C-Derb:
    Why waste so much effort telling the user exactly what is wrong with their email address? Just use a generic "Please enter a valid email address" message. I mean, 99.9% of the time you are expecting users to enter valid data, right?
    Because when they enter their perfectly valid email address, they're justified in at least knowing why you petulantly reject it.
  • Gaurav (unregistered)

    It's better than the boring old regex approach; it brings spice to your life.

  • Friedrice The Great (unregistered) in reply to Mike
    Mike:
    This is why VB coders get a bad wrap. If your if statement doesn't get you all the way there just throw in another 100 or so for each possibility and you should be fine.
    Would a good wrap be in something like plastic? Or is canvas preferred when disposing of weighted bodies at sea?
  • (cs) in reply to Grzechooo
    Grzechooo:
    Good that he didn't use a regular expression.

    Especially since that regex implements RFC 822, which is incredibly obsolete...having been superseded by RFC 2822 and then by RFC 5322.

  • grisle (unregistered) in reply to Christian
    Christian:
    Hi,

    and this is my all time favourite ....

    If InStr(1, strEmail, "+") > 0 Then
        strReturn = "Email address cannot contain '+'"
        GoTo ExitHandler
    End If

    Why the hell shouldn't an email address contain a +. I use that all the time.

    Greetings Christian

    oh, everyone is up in arms about it....

    The + is often used so you can filter SPAM.
    Why do people want your email address? Why do you think they don't want the '+' there?

    HINT: People who want your email address probably don't care much about the standards. They want to know they can send you email you will READ (or at least be forced to notice). That is, until the sender is blocked, or your Bayesian SPAM filter sees it for what it is.

    People always assume that "nobody knows that + is valid in email addresses" - I think it's more that "nobody CARES that + is valid in email addresses"

  • asd (unregistered)

    insert rant about how data validation is over used....

    It's easier to spot a bodgey address if it's: No Way You gunna get my emial MF

    than if it's: [email protected]

    One of the questions that needs to be considered is "Why do they need an email address?". Normally it's so they can send you stuff (in which case send something with a link and see if it works). Sometimes it's under the pretext of verifying your identity (or at least somehow holding you to account for how you use their site - and again, if you NEED the address, verify it via an emailed link). Sometimes it's because they want to send you SPAM (actually in all cases this could be the case - and again SEND A LINK).

    The only time when you don't need to send a link to verify, is in cases where you don't intend to use the email. But if you don't intend using it, why ask for it?

    Of course everyone does, simply because they like having your information, but the only certain way to see that the address exists (even if for only a fleeting moment) is to have an email sent there and somehow verified.

    Who gives a shit about the actual format? This is a classic case of programmers overthinking the problem and reengineering the wheel.

    There's a difference between a valid email address and a real email address. Many people ask for valid when they mean real - at the end of the day, a valid email address that doesn't exist is about as useful as an invalid one. And if you demand a valid one, I'll either use my enemy's (and hope you don't verify it) or make up some stupid one that doesn't exist anyway....
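
    For what it's worth, the "send a link" part is not much code either. A bare-bones Python sketch, with everything hypothetical: the pending dict stands in for a real database table, the confirm URL is made up, and actually sending the mail is left out.

```python
import secrets

# Hypothetical in-memory store of unconfirmed addresses: token -> address.
pending = {}


def start_verification(address):
    """Create a one-time token for the address and return the link to email."""
    token = secrets.token_urlsafe(32)
    pending[token] = address
    # Hypothetical confirmation URL; a real site would use its own.
    return "https://example.com/confirm?token=" + token


def confirm(token):
    """Return the confirmed address for a valid token, or None.
    Each token works exactly once."""
    return pending.pop(token, None)
```

    No format rules at all: if the click comes back, the address is real, and that is the only property anyone actually cared about.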

  • E (unregistered) in reply to faoileag

    It does, but Chr(34) is more readable than """".

  • foo (unregistered) in reply to Someone
    Someone:
    Hey everyone, I'm having a problem related to this story. I'm trying to make a list of email addresses that I can validate entries against, but typing it all out is really slow. Can some people help me out?

    Here is what I have so far:

    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    You don't have to type it yourself. Just ask at N&N (Nigerian Scammers & NSA). Both have very good, up to date, lists of all email addresses in the world.
  • Norman Diamond (unregistered) in reply to Someone
    Someone:
    Hey everyone, I'm having a problem related to this story. I'm trying to make a list of email addresses that I can validate entries against, but typing it all out is really slow. Can some people help me out?

    Here is what I have so far:

    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    [email protected]
    At least you'll be immune from this prank: http://www.youtube.com/watch?v=gJuGKJaSyVU

    Akismet says I should sell you stolen credit cards instead.

Leave a comment on “Email Hyper-Validation”
