• Pepster (unregistered) in reply to BSDGuy
    BSDGuy:
    Why check the extension at all?

    Because the process calls for a user to upload a .txt file! Yes, they COULD cram the data into another extension, but they'd be doing something wrong (not covered by the procedure they should be following).

    How many major errors have we all seen, heck, how many are on this SITE, that follow this pattern:

    1. User X does something inherently wrong, like submitting the entire parts .xls "database" to the upload routine.

    2. The upload routine was written to be "clever", and parses the .xls file, even though it shouldn't.

    3. The resulting FUBAR costs $$$ and a WTF is submitted.

    Moral: The hardware is dumb, and so are the users. Just because YOU can conceive of a bizarre scenario where you would want the program to handle any extension type doesn't mean that implementing it would make the program "Better" or "More robust". Why not just create a general-purpose loader program (call it RUNPROG, to keep things simple) to export the .xls to a .txt, call the uploader, write the FORTRAN compiler to tape backup....

  • fw (unregistered) in reply to Marquess von Hinten
    Marquess von Hinten:
    Rodnas:
    CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.
    Gee, and I always thought the C stood for comma ...

    +1

  • Anonymous (unregistered) in reply to Anon
    Anon:
    CoderHero:
    No CSV is COMMA separated value.

    Not if you are using regional settings that use comma as the decimal separator. I've been stung by that before.

    Actually, CSV stands for "comma separated values" irrespective of your regional settings. Just because your system uses something different, it doesn't magically change the meaning of the abbreviation!

  • (cs) in reply to Anon
    Anon:
    CoderHero:
    No CSV is COMMA separated value.

    Not if you are using regional settings that use comma as the decimal separator. I've been stung by that before.

    That's Excel's idiosyncrasy.
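
For anyone who hasn't been stung by this yet: in locales where the comma is the decimal separator, Excel typically writes "CSV" files with semicolons between fields. A minimal Python sketch of what goes wrong when the reader assumes commas; the sample data and the `read_rows` helper are made up for illustration.

```python
import csv
import io

# Hypothetical export from a locale where the comma is the decimal separator:
# fields are separated by semicolons so that "0,10" stays a single value.
export = "sku;name;price\n1;Bolt;0,10\n"


def read_rows(text: str, delimiter: str) -> list[list[str]]:
    """Parse delimited text using an explicitly chosen delimiter."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


semicolon_rows = read_rows(export, ";")  # fields split as intended
comma_rows = read_rows(export, ",")      # the price is torn in two
```

The point is only that the delimiter has to be agreed on up front (or detected); the name "CSV" does not settle it.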

  • (cs) in reply to Frits
    Frits:
    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

    mv Free_Bird.mp3 importdata.txt

    Now run that app, and watch how it does a rematch of the "Free Bird Database" article we had some time ago!

  • Todd Lewis (unregistered) in reply to A. Friend
    A. Friend:
    Duh! He should have used a code generator to generate the 17574 invalid strings (all three-letter strings except for txt and tsv).

    Duh! Who cares about the extension?

    He should have used a data generator to generate all 293,847,637,468,794,376 terabytes of valid inputs and checked to make sure the content was okay. Noobs.

  • (cs) in reply to Anonymous Cow-Herd
    Anonymous Cow-Herd:
    ...rectums as delimiters.

    Comment of the month?

  • (cs) in reply to Frits
    Frits:
    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

    Agreed. I'd hazard a guess that even in Kristof's specific case, the "WTF" code would lead to a much better user experience. In general, I'm sure it would. Neither approach is needed though, or really even useful. The correct approach obviously would have been proper data validation.

    Kristof:
    To solve this, my colleague figured the best way was to verify that the uploaded file's name had the correct extension of .txt.

    The real WTF is that that is what Kristof thought the "WTF" code was there for. He failed to glean the actual purpose of the code: not to allow only .txt, as he incorrectly assumed, but rather to prevent a few obviously wrong extensions.

  • OldCoder (unregistered) in reply to Pepster
    Pepster:
    BSDGuy:
    Why check the extension at all?

    Because the process calls for a user to upload a .txt file!

    Wrong! The process calls for the upload of a text file. No-one actually mentioned file extensions at all.

    captcha: iusto. iusto be a programmer a long while ago.

  • (cs) in reply to Pepster
    Pepster:
    Good lord, you all want to reinvent the wheel. Why?

    Yes, IF you were building a robust, multiple-use DB loader that could be deployed across multiple business processes, MAYBE you'd want to include other extensions, or simply look at the file contents..... but that wasn't the job!

    You make it sound like accepting other extensions is an active thing. It's not. It's passive. It's the result of not putting any extension validation into the code. And it's more correct. When less work results in more correct and flexible code, you'd have to be a moron to do anything more.

    You're so preoccupied with presenting "clever" solutions that can handle "just in case" conditions, you forgot to consider the original requirements.
    You've seen the original requirements? Or are you just going on Kristof's relating of them? Besides, requirements can also be wrong, or pointless.
  • (cs) in reply to Pepster
    Pepster:
    BSDGuy:
    Why check the extension at all?
    Yes, they COULD cram the data into another extension, but they'd be doing something wrong.
    But they could also cram data from another extension into a .txt, so what is the check actually gaining you? You're going to need to perform the actual data validation either way.

    If you think that having a .txt extension says anything about the data itself, then the hardware is dumb, the users are dumb, and so are you.

  • Patrick (unregistered)

    fileName.ToString();

    ... why is he converting a string to a string?

    and Kristof's solution could be better written as

    if (new[] { ".txt", ".tsv", ".csv" }.Contains(Path.GetExtension(fileName).ToLower())) {...}

    thereby allowing for several different extensions and excluding all others
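
An exact-match allow-list along these lines could look roughly like the Python sketch below. The names `ALLOWED_EXTENSIONS` and `has_allowed_extension` are made up for illustration. Note that `os.path.splitext`, like .NET's `Path.GetExtension`, returns the extension with its leading dot, so the list entries need the dot too.

```python
import os

# Hypothetical allow-list; the extensions mirror the ones named in the comment.
ALLOWED_EXTENSIONS = {".txt", ".tsv", ".csv"}


def has_allowed_extension(file_name: str) -> bool:
    """Return True only when the extension exactly matches an allowed entry."""
    # os.path.splitext keeps the leading dot, e.g. ("parts", ".TXT").
    _, ext = os.path.splitext(file_name)
    return ext.lower() in ALLOWED_EXTENSIONS
```

Set membership compares whole extensions, so a name with no extension or a partial one such as `parts.t` is rejected rather than slipping through a substring match.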

  • Patrick (unregistered) in reply to Markp
    Markp:
    Pepster:
    BSDGuy:
    Why check the extension at all?
    Yes, they COULD cram the data into another extension, but they'd be doing something wrong.
    But they could also cram data from another extension into a .txt, so what is the check actually gaining you? You're going to need to perform the actual data validation either way.

    If you think that having a .txt extension says anything about the data itself, then the hardware is dumb, the users are dumb, and so are you.

    This is why:

    FTA:
    // The extension is OK. Proceed with the rest of the validation
    because if the extension is wrong, chances are the rest of the data is too, so don't bother trying.
  • (cs)

    Ah, the old whitelist vs. blacklist philosophy. Kristof's colleague must have worked in the antivirus industry prior to this.

  • corbeau (unregistered)

    Stupidification!

  • Carl (unregistered)
    upload a tab-delimited text file containing a list of products, and the application would insert or update the products in the database."

    "Of course, the import feature required some pretty basic validation," Kristof continued. "Is it actually a text file? Is it tab delimited? Are the columns correct? And so on."

    There's your requirements, or as close as we are going to get to seeing them. Nowhere does it even suggest that the file name will have an "extension" at all -- much less that the extension should be .txt. The .txt extension was introduced by the WTF coder, who misinterpreted the requirements through his own tunnel vision and sloppy habits, just like most of you are doing!
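
The three checks Kristof lists (is it text, is it tab-delimited, are the columns correct) can be sketched without ever looking at the file name. This is a rough Python sketch, not the article's C# code: the column names in `EXPECTED_COLUMNS` and the function `validate_product_upload` are invented for illustration, and UTF-8 is assumed as the encoding.

```python
import csv
import io

# Hypothetical column layout; the article does not list the real columns.
EXPECTED_COLUMNS = ["sku", "name", "price"]


def validate_product_upload(raw: bytes) -> list[str]:
    """Return a list of problems; an empty list means the upload looks valid."""
    # Is it actually a text file? NUL bytes are a strong hint of binary data.
    if b"\x00" in raw:
        return ["file contains NUL bytes, so it is not plain text"]
    try:
        text = raw.decode("utf-8")  # assumption: uploads are UTF-8
    except UnicodeDecodeError:
        return ["file is not valid UTF-8 text"]

    # Is it tab delimited? Parse with an explicit tab delimiter.
    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter="\t"))
    if not rows:
        return ["file is empty"]

    # Are the columns correct? A comma-separated header fails here too,
    # because it parses as a single field.
    if rows[0] != EXPECTED_COLUMNS:
        return [f"unexpected header: {rows[0]}"]

    problems = []
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(EXPECTED_COLUMNS):
            problems.append(
                f"line {line_no}: expected {len(EXPECTED_COLUMNS)} fields, got {len(row)}"
            )
    return problems
```

A renamed binary such as the `Free_Bird.mp3` example earlier in the thread would most likely be rejected by the first check, whatever its extension says.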

  • justsomedude (unregistered) in reply to Rodnas
    Rodnas:

    CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.

    Wow...just, wow. Sure, it's true there's a common misnomer with CSVs, frequently they are actually CSSVs instead of true CSVs, but Character-Spaced-Value?

  • (cs) in reply to HeebyJeeby
    HeebyJeeby:
    Frits:
    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

    That's retarded...

    Care to explain your logic? I agree with Frits here.

  • sadwings (unregistered)

    This is why coding jobs are easy to outsource, too many "big brains" solving for stupid nonsense that nobody cares about (except them).

    The business partner agrees to upload a .txt file, so of course you check the extension before you process it.

    After it passes that check, you will verify the layout of the file in whichever way makes the most sense.

    There's more wtf worthy answers in this thread than there was wtf'ness in the article.

  • Edward Royce (unregistered)

    Hmmmm.

    The Real WTF is not using XML!

    ...

    Sorry I had to do that. Couldn't resist.

  • (cs) in reply to sadwings
    sadwings:
    This is why coding jobs are easy to outsource, too many "big brains" solving for stupid nonsense that nobody cares about (except them).

    The business partner agrees to upload a .txt file, so of course you check the extension before you process it.

    If your contract with the business partner said that .txt and nothing else should be accepted, of course that validation is required. The article certainly doesn't state that; it states that a text file is required. Maybe it's just Windows people not getting it, but I absolutely never name text files with .txt on Unix-like systems. I just leave out the extension altogether.

    But if it is a requirement, then it's the requirements that are the WTF for needlessly requiring a certain filename pattern. Either way, it's a requirement/implementation detail that gains nothing and is more work. You make it sound like you need to be clever or overthinking the problem to simply NOT do something.

  • junkpile (unregistered) in reply to Markp
    Markp:
    sadwings:
    This is why coding jobs are easy to outsource, too many "big brains" solving for stupid nonsense that nobody cares about (except them).

    The business partner agrees to upload a .txt file, so of course you check the extension before you process it.

    If your contract with the business partner said that .txt and nothing else should be accepted, of course that validation is required. The article certainly doesn't state that; it states that a text file is required. Maybe it's just Windows people not getting it, but I absolutely never name text files with .txt on Unix-like systems. I just leave out the extension altogether.

    But if it is a requirement, then it's the requirements that are the WTF for needlessly requiring a certain filename pattern. Either way, it's a requirement/implementation detail that gains nothing and is more work. You make it sound like you need to be clever or overthinking the problem to simply NOT do something.

    Hmmm... You mean like overthinking by responding 20 times trying to argue or prove a point that you're correct and everyone else is somehow absolutely wrong? You know that old saying about how everyone has an opinion...

  • blargraptor (unregistered) in reply to junkpile
    junkpile:
    Markp:
    sadwings:
    This is why coding jobs are easy to outsource, too many "big brains" solving for stupid nonsense that nobody cares about (except them).

    The business partner agrees to upload a .txt file, so of course you check the extension before you process it.

    If your contract with the business partner said that .txt and nothing else should be accepted, of course that validation is required. The article certainly doesn't state that; it states that a text file is required. Maybe it's just Windows people not getting it, but I absolutely never name text files with .txt on Unix-like systems. I just leave out the extension altogether.

    But if it is a requirement, then it's the requirements that are the WTF for needlessly requiring a certain filename pattern. Either way, it's a requirement/implementation detail that gains nothing and is more work. You make it sound like you need to be clever or overthinking the problem to simply NOT do something.

    Hmmm... You mean like overthinking by responding 20 times trying to argue or prove a point that you're correct and everyone else is somehow absolutely wrong? You know that old saying about how everyone has an opinion...

    I think it is cute how, instead of refuting his very good point, you feel the need to resort to personal attacks. YOU are the one that is simply spewing worthless drivel.

  • (cs) in reply to Mason Wheeler
    Mason Wheeler:
    HeebyJeeby:
    Frits:
    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

    That's retarded...

    Care to explain your logic? I agree with Frits here.

    The logic is explained here. Anything left unclear?

  • (cs) in reply to blargraptor
    blargraptor:
    junkpile:

    YOU are the one that is simply spewing worthless drivel.

    Does that surprise you?

  • Montoya (unregistered) in reply to onu
    onu:
    tB:
    So what if someone uploads a docx or xlsx file?

    those users will be shot

    also: -4 internets

    +1

  • Montoya (unregistered) in reply to sadwings
    sadwings:
    This is why coding jobs are easy to outsource, too many "big brains" solving for stupid nonsense that nobody cares about (except them).

    Don't forget to submit all your outsourced code to TDWTF when you start having problems :)

  • C# (unregistered)

    When you see something like s.ToString() (where s is a string!) - it's a lost cause (the programmer, not the program).

  • junkpile (unregistered) in reply to blargraptor
    blargraptor:
    junkpile:
    Markp:
    sadwings:
    This is why coding jobs are easy to outsource, too many "big brains" solving for stupid nonsense that nobody cares about (except them).

    The business partner agrees to upload a .txt file, so of course you check the extension before you process it.

    If your contract with the business partner said that .txt and nothing else should be accepted, of course that validation is required. The article certainly doesn't state that; it states that a text file is required. Maybe it's just Windows people not getting it, but I absolutely never name text files with .txt on Unix-like systems. I just leave out the extension altogether.

    But if it is a requirement, then it's the requirements that are the WTF for needlessly requiring a certain filename pattern. Either way, it's a requirement/implementation detail that gains nothing and is more work. You make it sound like you need to be clever or overthinking the problem to simply NOT do something.

    Hmmm... You mean like overthinking by responding 20 times trying to argue or prove a point that you're correct and everyone else is somehow absolutely wrong? You know that old saying about how everyone has an opinion...

    I think it is cute how, instead of refuting his very good point, you feel the need to resort to personal attacks. YOU are the one that is simply spewing worthless drivel.

    Whoosh... Strike one!

  • J (unregistered) in reply to Marquess von Hinten
    Marquess von Hinten:
    Rodnas:
    CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.
    Gee, and I always thought the C stood for comma ...
    Y e a h , I ' d l i k e t o s e e a l l t h e c h a r a c t e r s s e p a r a t e d i n t h e f i l e .
  • (cs) in reply to highphilosopher
    highphilosopher:
    toth:
    Frits:
    The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

    Why? This is a perfectly good first step. Now, obviously, someone could upload a valid text file that did not have a ".txt" extension, so if this were intended to be a robust, general purpose validation routine, it would indeed be a poor approach. But this was meant for a very specific business scenario--it's not unreasonable to require that any input file have a specific extension.

    Too often people code backwards logic. It's not that they can't code, it's just that they've been notting things too long.

    The reason backwards logic is called backwards logic is because it doesn't satisfy all situations. It's a limiting factor.

    1 2 FIZZ 4 BANG FIZZ 7 8 FIZZ BANG 11 FIZZ 13 14 FIZZBANG

    See, I can do it!!!!

    Did you have a point, or do you just like seeing your words magically appear on the shiny glass box?

  • somedude (unregistered)

    Assuming that it was stated in the requirements that the files had to have a .txt extension, the whole file extension debate is moot.

  • (cs) in reply to Markp
    Markp:
    Maybe it's just Windows people not getting it, but I absolutely never name text files with .txt on Unix-like systems. I just leave out the extension altogether.
    I'd like to point out that I make sure to label all text files on UNIX systems *.txt. Why? Because that way, I (not the computer, but the person) know that it's supposed to be text. That being said, the day my *nix box starts checking for file extensions to determine what type of a file it is is the day I throw it out of a 5th story window.
  • (cs) in reply to somedude
    somedude:
    Assuming that it was stated in the requirements that the files had to have a .txt extension, the whole file extension debate is moot.
    Assuming that all code and specs are well written and correct, this whole website is moot.
  • Bim Job (unregistered) in reply to somedude
    somedude:
    Assuming that it was stated in the requirements that the files had to have a .txt extension, the whole file extension debate is moot.
    And so we come back to the OP, which, IIRC, implements a file extension check in the worst possible way.
    • Crap requirements
    • Requirements unquestioned
    • Ludicrous implementation of initial filter
    • Second filter (gawrsh, Mickey, is it really CSV?) not mentioned and therefore outside the scope of what little rational discussion we have at this point.

    Nothing to see here. Move on. Next WTF, please.

  • some other dude (unregistered) in reply to somedude
    somedude:
    Assuming that it was stated in the requirements that the files had to have a .txt extension, the whole file extension debate is moot.

    And assuming everyone involved was a narwhale then the entire debate is also moot, because who really gives a damn what narwhales do?

    The fact is that both of these assumptions are stupid to make, being completely baseless.

  • Anonymous (unregistered)

    Does the OP work for a law firm? It's a basic tenet of justice that every csv file is innocent until proven guilty..

    CAPTCHA: incassum - 'in vain', which is kind of appropriate to the solution I guess.

  • caper (unregistered)

    But if the original coder did that kind of a job with the file name, how bad is the content validation ?

  • Bim Job (unregistered) in reply to Shishire
    Shishire:
    Markp:
    Maybe it's just Windows people not getting it, but I absolutely never name text files with .txt on Unix-like systems. I just leave out the extension altogether.
    I'd like to point out that I make sure to label all text files on UNIX systems *.txt. Why? Because that way, I (not the computer, but the person) know that it's supposed to be text. That being said, the day my *nix box starts checking for file extensions to determine what type of a file it is is the day I throw it out of a 5th story window.
    MIME much?
  • Joe (unregistered)

    #!/usr/bin/perl -w
    # Convert Excel spreadsheet to PDF
    # works only in DOS or Windows
    # FOSS by Joe

    my $old = $ARGV[0];
    my $new = $old;
    $new =~ s/\.xlsx?$/.pdf/;
    rename $old, $new;

  • duh (unregistered) in reply to Marquess von Hinten

    ... and I thought that tab was a character, stupid me.

  • javabeats (unregistered) in reply to duh

    TRWTF is using upload to feed data to the system. Certainly this data is available somewhere else, and the application could have read from there.

  • (cs) in reply to GalacticCowboy
    GalacticCowboy:
    tB:
    So what if someone uploads a docx or xlsx file?

    Better brush the dust off those regex skillz.

    .regex isn't even a real extension type. Geez.

  • Anonymous (unregistered) in reply to Frits

    If you want a box containing square things, it's a good idea to make the hole in the lid square so that people know what to put in it. Any preschooler can tell you that. ;)

    But agreed that checking the file extension shouldn't be any more than a userland filter, otherwise you'll undoubtedly run afoul of those users who worked out how to change file extensions and really really want to get their excel sheet into your app.

  • Callin (unregistered) in reply to privatejoker86

    I hope you're joking. If so, it wasn't a very good joke.

  • (cs)

    Kristof should have told him that adding .doc and .pdf wasn't quite good enough, and that he needed to add all the extensions listed on FileInfo.com.

  • (cs) in reply to Anonymous
    Anonymous:
    If you want a box containing square things, it's a good idea to make the hole in the lid square so that people know what to put in it. Any preschooler can tell you that. ;)
    Good analogy. Unfortunately, you failed to use it yourself :-). A square hole actually validates (avoiding a full geometry analysis here) something about the object. That's like validating the file contents. A filename validation is closer to looking at a label that somebody stuck on the object, and only permitting the object if the label reads "square"--even though a valid square's label might read "rectangle" and a valid circle's label might read "square".
    But agreed that checking the file extension shouldn't be any more than a userland filter, otherwise you'll undoubtedly run afoul of those users who worked out how to change file extensions and really really want to get their excel sheet into your app.
    I don't mind that idea since a properly used/implemented file chooser will allow you to clear the suggested extension and choose any file. This is all happening server-side though.
  • History Teacher (unregistered) in reply to Carl
    Carl:
    Metadata should be tracked by the computer in its data structures, along with owner, date created, date last accessed, permissions and all that. Or should we maybe jam those into the filename too?

    The profound retardedness of certain systems, that haven't been repaired yet after all these decades, and the blind unquestioning acceptance by the masses, seriously leads me to question the worthiness of the species.

    Being evolved from a system developed on old and currently ridiculously obsolete technology is not retarded.

    In this particular case, the extension originally was metadata. In a directory listing, it was listed in its own column, just like, for example, modification time.

    What perhaps is retarded is, that at some point this metadata got merged into file name. So we have both software and users that like to pretend it's still metadata, and then we have retards who call this retarded without suggesting any working alternative.

    The fact is, there won't be an alternative in our lifetime, because there just are too many legacy files, and too many legacy users.

Leave a comment on “Pretty Basic Validation”
