• fnord (unregistered)

    string InvalidComments = "FRIST";

    if (!InvalidComments.Contains(comment.ToString().Substring(comment.ToString().LastIndexOf(".")))) { // The extension is OK. Proceed with the rest of the comment } else { // Incorrect comment. Show error message. }

  • Neville Flynn (unregistered)

    Hey, at least his code will allow *.tsv files.

  • Vilhelm (unregistered)

    if (comment.toLower() !in ('first', 'frist', thrid')) { // Display comment } else { // Give user a virus }

  • A. Friend (unregistered)

    Duh! He should have used a code generator to generate the 17574 invalid strings (all three-letter strings except for txt and tsv).

  • tB (unregistered) in reply to A. Friend

    So what if someone uploads a docx or xlsx file?

  • (cs)

    Stupify!

  • RBoy (unregistered)

    Last!

    (See, it's backwards logic, just like the code. And wrong. Get it?)

    distineo: Italian for destiny.

  • Engival (unregistered)
    if (System.IO.Path.GetExtension(fileName).ToLower() == "txt")

    and I thought THAT was the WTF at first. Who cares what the extension is. Who names csv files .txt? I can just see the training of the users now. "Okay, export your file, you can pick csv, or tsv. okay, now go rename the file because the web site needs it as txt)

    At least the guy doing "the real WTF" was excluding some common types that idiot users might click on by mistake. It's not supposed to be an exhaustive list. I'm not sure I agree with either approach. One is a pretty poor defense against invalid types, and the other is restrictive.

    Also, at this point in time, you can bet that the file has already been uploaded to the web server. So you won't even bother inspecting the first 1k of the file, just because your preconceptions of filename aren't met?

  • Frits (unregistered)

    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

  • (cs) in reply to tB
    tB:
    So what if someone uploads a docx or xlsx file?

    Better brush the dust off those regex skillz.

  • HeebyJeeby (unregistered) in reply to Frits
    Frits:
    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

    That's retarded...

  • (cs) in reply to Engival
    Engival:
    if (System.IO.Path.GetExtension(fileName).ToLower() == "txt")
    and I thought THAT was the WTF at first. Who cares what the extension is. Who names csv files .txt? I can just see the training of the users now. "Okay, export your file, you can pick csv, or tsv. okay, now go rename the file because the web site needs it as txt)

    At least the guy doing "the real WTF" was excluding some common types that idiot users might click on by mistake. It's not supposed to be an exhaustive list. I'm not sure I agree with either approach. One is a pretty poor defense against invalid types, and the other is restrictive.

    Also, at this point in time, you can bet that the file has already been uploaded to the web server. So you won't even bother inspecting the first 1k of the file, just because your preconceptions of filename aren't met?

    Agreed, my WTF alarm bells went off the moment I saw any kind of filename extension validation. My blackberry does the same frigging thing, won't open a text file if its extension is not .txt even though whatever I am trying to open is a frigging text file. I've cursed about that more than once.

  • Rodnas (unregistered) in reply to Engival
    Engival:
    if (System.IO.Path.GetExtension(fileName).ToLower() == "txt")
    and I thought THAT was the WTF at first. Who cares what the extension is. Who names csv files .txt? I can just see the training of the users now. "Okay, export your file, you can pick csv, or tsv. okay, now go rename the file because the web site needs it as txt)

    At least the guy doing "the real WTF" was excluding some common types that idiot users might click on by mistake. It's not supposed to be an exhaustive list. I'm not sure I agree with either approach. One is a pretty poor defense against invalid types, and the other is restrictive.

    Also, at this point in time, you can bet that the file has already been uploaded to the web server. So you won't even bother inspecting the first 1k of the file, just because your preconceptions of filename aren't met?

    "The idea behind the feature was that the administrators could upload a tab-delimited text file containing a list of products, and the application would insert or update the products in the database"

    CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.

    CAPTCHA acsi (ASCII for dyslectic people?)

  • Yazeran (unregistered)

    I'd say looking at the extension at all is pointless, what matters is the content of the file.

    You have to do some data validation on the text content anyways, so if the content of the file is not text, then print the error at that point (or continue with tabulation validation etc) if it is.

    Yours Yazeran

    Plan: To go to Mars one day with a hammer

  • (cs)
    MikeS:
    Thrid
    Eric Pode of Croydon - is that you ?
  • (cs) in reply to Frits
    Frits:
    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

    Why? This is a perfectly good first step. Now, obviously, someone could upload a valid text file that did not have a ".txt" extension, so if this were intended to be a robust, general purpose validation routine, it would indeed be a poor approach. But this was meant for a very specific business scenario--it's not unreasonable to require that any input file have a specific extension.

  • Dave (unregistered)

    This whole requirement is stupid. Why should you care what the extension is if the data is valid? I personally might want to call my csv file com.exe.ni.ni..exe and as long as the format of the data is fine I would expect it to work.

  • onu (unregistered) in reply to tB
    tB:
    So what if someone uploads a docx or xlsx file?

    those users will be shot

    also: -4 internets

  • Marquess von Hinten (unregistered) in reply to Rodnas
    Rodnas:
    CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.
    Gee, and I always thought the C stood for comma ...
  • (cs) in reply to Frits
    Frits:
    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.
    Exactly. Why care whether they name their files *.txt, *.text, *.tsv or even *.exe if you validate the contents anyway?
  • Steve Urkel (unregistered) in reply to HeebyJeeby

    Not really. It IS pretty retarded to check the file extension of a file the user has explicitly selected.

    It would be better to just put some magic signature on the first line of the files, and check for that. In any case, character-delimited files rarely use the .txt extension. More often .?sv (where ? is the first letter of whatever character is used as a delimiter).

    If the magic signature is not an option, why not just check that it contains the correct number of columns and/or that the type of data in them matches the expected type (ie. that a numeric column doesn't contain letters or other funky stuff).

  • Jeekers (unregistered) in reply to Rodnas

    [quote user="Rodnas"][quote user="Engival"]

    CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo. [/quote]

    Err.. That'll be COMMA separated values. A character separated values file would surely include the TAB character.

  • Yalpe Nismou (unregistered)
    Kristof:
    I was flabbergasted.
    FTFY

    CAPTCHA : Veniam

  • Me (unregistered)

    Aside from all the worthwhile and humerous objections above, System.IO.Path.GetExtension(fileName) actually returns the extension with the dot in front (e.g. ".txt" not "txt")...

  • highphilosopher (unregistered) in reply to toth
    toth:
    Frits:
    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

    Why? This is a perfectly good first step. Now, obviously, someone could upload a valid text file that did not have a ".txt" extension, so if this were intended to be a robust, general purpose validation routine, it would indeed be a poor approach. But this was meant for a very specific business scenario--it's not unreasonable to require that any input file have a specific extension.

    Too often people code backwards logic. It's not that they can't code, it's just that they've been notting things too long.

    The reason backwards logic is called backwards logic is because it doesn't satisfy all situations. It's a limiting factor.

    1 2 FIZZ 4 BANG FIZZ 7 8 FIZZ BANG 11 FIZZ 13 14 FIZZBANG

    See, I can do it!!!!

  • Steve Urkel (unregistered) in reply to Rodnas

    You DO know that Tab is also a character?

    Anyway, CSV is usually meant to stand for Comma-Separated Values, even though the .csv extension is just as often used for semicolon or tab-delimited files.

    I'd say using .txt for any file intended to be parsed automatically is somewhat dubious.

  • Anon (unregistered) in reply to Engival
    Engival:
    if (System.IO.Path.GetExtension(fileName).ToLower() == "txt")
    and I thought THAT was the WTF at first. Who cares what the extension is. Who names csv files .txt? I can just see the training of the users now. "Okay, export your file, you can pick csv, or tsv. okay, now go rename the file because the web site needs it as txt)

    Actually, Excel defaults to the .txt extension if you save a file as tab delimited and that is probably exactly the scenario that is being envisioned here. You tell the user to create their product "database" in Excel (yeah, I know Excel isn't a database, but try telling that to most non-technical users) and save it as tab delimited (which gives it the .txt extension in Excel). Upload to database and you're done. The real WTF is why he choose to include extensions such as csproj and vbproj in his list of excluded extensions. These are extension that no non-programmer would be likely to ever encounter. The other WTF is that don't most frameworks have the ability to display a standard file selection dialog with a filter for valid file types in the first place?

  • Ax (unregistered) in reply to Rodnas
    Rodnas:
    CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.

    Not quite. A tsv would make sense if you're creating the file from some automatic means. But I have a suspicion that these input files are hand crafted.

  • (cs)

    My web browser won't view the page http://thedailywtf.com/Comments/AddComment.aspx because it expects HTML pages to have a .html extension.

  • Anon (unregistered) in reply to Steve Urkel
    Steve Urkel:
    In any case, character-delimited files rarely use the .txt extension. More often .?sv (where ? is the first letter of whatever character is used as a delimiter).
    1. Open Excel
    2. Select File -> Save As
    3. In the "Save as type" drop down, select Text(Tab delimited)
    4. What's the extension that Excel uses?
  • Martin (unregistered)

    if (!InvalidExtensions.Contains(fileName.ToString().Substring(fileName.ToString().LastIndexOf("."))))

    Also - doesn't that give me an ArgumentOutOfRangeException if my file doesn't have an extension?

  • Johnny Awkward (unregistered)

    Surely the RWTF is all the comments here. What hope is there of good software ever getting written if this is a cross section of the coding community?

  • Anonymous Cow-Herd (unregistered)

    I wonder if anyone's tried dropping the hint by uploading a text file from World of Warcraft? (extension .wtf)

  • My Name? (unregistered) in reply to Frits
    Frits:
    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

    No, the real WTF is using tabs as delimiters. Also relying on actual files is the worng way to do it. By using streams the mime-type table becomes obsolete.

  • Frits (unregistered) in reply to HeebyJeeby
    HeebyJeeby:
    Frits:
    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

    That's retarded...

    Would you care to elaborate on why it's retarded?
    A file extension does not determine the contents of the file. It's not a good idea to needlessly exclude file extensions based on preconceived notions.

    It's also good practice to validate the actual data in a file, in-house use or not. Therefore, I'm assuming a validation of the data will be coded. Why would you add an extra validation step that adds little value but may cause potential headaches?

  • Carl (unregistered)

    File "extensions" are TRWTF.

    A file name is just a name. You can name a file anything you want, as long as you use valid characters. It can have zero or 100 dots and the computer shouldn't care. The name shouldn't be expected to contain any metadata that the computer cares about. It is for use by humans, and the computer should let the human use whatever name is meaningful to the human.

    Metadata should be tracked by the computer in its data structures, along with owner, date created, date last accessed, permissions and all that. Or should we maybe jam those into the filename too?

    The profound retardedness of certain systems, that haven't been repaired yet after all these decades, and the blind unqestioning acceptance by the masses, seriously leads me to question the worthiness of the species.

  • Anonymous Cow-Herd (unregistered) in reply to My Name?
    My Name?:
    Frits:
    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

    No, the real WTF is using tabs as delimiters. Also relying on actual files is the worng way to do it. By using streams the mime-type table becomes obsolete.

    Surely should be using XML. That's more "enterprise" than using tabs or commas or semicolons or rectums as delimiters.

  • Vollhorst (unregistered)

    Where is the problem with the code?

    He was told to block other kind of extensions. So he did. He wasn't told to only allow txt-files. His boss should have been a bit more specific with his requirements. And since his lousy requirements even missed the names of files he should block he was forced to rely on what he hat on his own desktop. That guy is a genius.

  • Pepster (unregistered)

    Good lord, you all want to reinvent the wheel. Why?

    Yes, IF you were bulding a robust, multiple-use DB loader that could be deployed across multiple business processes, MAYBE you'd want to include other extensions, or simply look at the file contents..... but that wasn't the job!

    The business process called for a .txt file, so that's what was being verified. if a .tsv, or a .csv, or a .wtf file showed up, then the user was doing something wrong! It doesn't matter that they COULD cram the data into any extension, it matters that they're not supposed to.

    You're so preoccupied with presenting "clever" solutions that can handle "just in case" conditions, you forgot to consider the original requirements.

  • Anonymous (unregistered)

    Well there's nothing to discuss about this one really but I'm going to comment anyway just to get involved. Hey guys!

  • Jasper (unregistered)

    What if someone would try to upload an .a file? 'Oh yeah, you're right... I should add those to the list as well!'

    What if someone would try to upload a .b file? 'Oh yeah, you're right... I should add those to the list as well!'

    What if someone would try to upload a .c file? 'Oh yeah, you're right... I should add those to the list as well!'

    What if someone would try to upload a .d file? 'Oh yeah, you're right... I should add those to the list as well!'

    ... (5 minutes later)

    What if someone would try to upload a .z file? 'Oh yeah, you're right... I should add those to the list as well!'

    What if someone would try to upload an .aa file? 'Oh yeah, you're right... I should add those to the list as well!'

    What if someone would try to upload an .ab file? 'Oh yeah, you're right... I should add those to the list as well!'

    ... (2 hours later)

    What if someone would try to upload an .aaa file? 'Oh yeah, you're right... I should add those to the list as well!'

    What if someone would try to upload an .aab file? 'Oh yeah, you're right... I should add those to the list as well!'

    ... (10 days later)

    etc.

    And also: Some extensions such as .asp and .aspx are in the list twice. Probably those are files you really don't want people to upload. Just to be extra sure you check twice.

  • Anonymous Cow-Herd (unregistered) in reply to Jasper
    Jasper:

    And also: Some extensions such as .asp and .aspx are in the list twice. Probably those are files you really don't want people to upload. Just to be extra sure you check twice.

    When you really want to be sure of something, you're supposed to do it three times.

  • iToad (unregistered)

    I'm sitting here looking at a tab-delimited text file from one of the company systems with an extension of .xls

  • HeebyJeeby (unregistered) in reply to Frits
    Frits:
    HeebyJeeby:
    Frits:
    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

    That's retarded...

    Would you care to elaborate on why it's retarded?
    A file extension does not determine the contents of the file. It's not a good idea to needlessly exclude file extensions based on preconceived notions.

    It's also good practice to validate the actual data in a file, in-house use or not. Therefore, I'm assuming a validation of the data will be coded. Why would you add an extra validation step that adds little value but may cause potential headaches?

    It's retarded because you're retarded

    CAPTCHA: fellatio

  • BSDGuy (unregistered) in reply to Frits

    ding, ding!

    Why check the extension at all? If the file meets the delimiting parameters then it's all good,....if it doesn't return BAD_FILE_FORMAT (some number..probably negative...or use an exception if it's Java which I think it is based on system.IO....).

    Two WTFs don't make a right.

  • DJ Maze (unregistered)

    Is magic mime not available anymore? How about "filename.exe\x00.txt"

  • CoderHero (unregistered) in reply to Rodnas
    Rodnas:
    Engival:
    if (System.IO.Path.GetExtension(fileName).ToLower() == "txt")
    and I thought THAT was the WTF at first. Who cares what the extension is. Who names csv files .txt? I can just see the training of the users now. "Okay, export your file, you can pick csv, or tsv. okay, now go rename the file because the web site needs it as txt)

    At least the guy doing "the real WTF" was excluding some common types that idiot users might click on by mistake. It's not supposed to be an exhaustive list. I'm not sure I agree with either approach. One is a pretty poor defense against invalid types, and the other is restrictive.

    Also, at this point in time, you can bet that the file has already been uploaded to the web server. So you won't even bother inspecting the first 1k of the file, just because your preconceptions of filename aren't met?

    "The idea behind the feature was that the administrators could upload a tab-delimited text file containing a list of products, and the application would insert or update the products in the database"

    CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.

    CAPTCHA acsi (ASCII for dyslectic people?)

    No CSV is COMMA separated value.

  • Snake (unregistered) in reply to GalacticCowboy
    GalacticCowboy:
    tB:
    So what if someone uploads a docx or xlsx file?

    Better brush the dust off those regex skillz.

    Now that's a very big WTF. Parsing XML with Regex.

    You need a stack based parser for that!

  • Anon (unregistered) in reply to CoderHero
    CoderHero:
    No CSV is COMMA separated value.

    Not if you are using regional settings that use comma as the decimal separator. I've been stung by that before.

  • Swedish tard (unregistered) in reply to CoderHero
    CoderHero:
    Rodnas:
    Engival:
    if (System.IO.Path.GetExtension(fileName).ToLower() == "txt")
    and I thought THAT was the WTF at first. Who cares what the extension is. Who names csv files .txt? I can just see the training of the users now. "Okay, export your file, you can pick csv, or tsv. okay, now go rename the file because the web site needs it as txt)

    At least the guy doing "the real WTF" was excluding some common types that idiot users might click on by mistake. It's not supposed to be an exhaustive list. I'm not sure I agree with either approach. One is a pretty poor defense against invalid types, and the other is restrictive.

    Also, at this point in time, you can bet that the file has already been uploaded to the web server. So you won't even bother inspecting the first 1k of the file, just because your preconceptions of filename aren't met?

    "The idea behind the feature was that the administrators could upload a tab-delimited text file containing a list of products, and the application would insert or update the products in the database"

    CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.

    CAPTCHA acsi (ASCII for dyslectic people?)

    No CSV is COMMA separated value.

    CSV has, about 1/3rd of the times I've seen it used referred to character separated values. So I'd venture as far as to say that CSV can actually mean both, whereas character probably is a later construct due to the fact that comma is a bad separator for reasons already mentioned here. ;)

Leave a comment on “Pretty Basic Validation”

Log In or post as a guest

Replying to comment #:

« Return to Article