Comment On Pretty Basic Validation

"For reasons beyond my comprehension," Kristof writes, "one of my coworkers has managed to keep his job after more than eighteen months of messing about. His latest project was to build an import feature in the admin module of our web application. The idea behind the feature was that the administrators could upload a tab-delimited text file containing a list of products, and the application would insert or update the products in the database." [expand full text]
« PrevPage 1 | Page 2 | Page 3 | Page 4Next »

Re: Pretty Basic Validation

2009-11-30 09:04 • by fnord (unregistered)
string InvalidComments = "FRIST";

if (!InvalidComments.Contains(comment.ToString().Substring(comment.ToString().LastIndexOf("."))))
{
// The extension is OK. Proceed with the rest of the comment
}
else
{
// Incorrect comment. Show error message.
}

Re: Pretty Basic Validation

2009-11-30 09:05 • by Neville Flynn (unregistered)
Hey, at least his code will allow *.tsv files.

Re: Pretty Basic Validation

2009-11-30 09:06 • by Vilhelm (unregistered)
if (comment.toLower() !in ('first', 'frist', thrid')) {
// Display comment
} else {
// Give user a virus
}

Re: Pretty Basic Validation

2009-11-30 09:10 • by A. Friend (unregistered)
Duh! He should have used a code generator to generate the 17574 invalid strings (all three-letter strings except for txt and tsv).

Re: Pretty Basic Validation

2009-11-30 09:16 • by tB (unregistered)
291738 in reply to 291737
So what if someone uploads a docx or xlsx file?

Re: Pretty Basic Validation

2009-11-30 09:20 • by CaptainOblivious
Stupify!

Re: Pretty Basic Validation

2009-11-30 09:22 • by RBoy (unregistered)
Last!

(See, it's backwards logic, just like the code. And wrong. Get it?)

distineo: Italian for destiny.

Re: Pretty Basic Validation

2009-11-30 09:22 • by Engival (unregistered)
if (System.IO.Path.GetExtension(fileName).ToLower() == "txt")

and I thought THAT was the WTF at first. Who cares what the extension is. Who names csv files .txt? I can just see the training of the users now. "Okay, export your file, you can pick csv, or tsv. okay, now go rename the file because the web site needs it as txt)

At least the guy doing "the real WTF" was excluding some common types that idiot users might click on by mistake. It's not supposed to be an exhaustive list. I'm not sure I agree with either approach. One is a pretty poor defense against invalid types, and the other is restrictive.

Also, at this point in time, you can bet that the file has already been uploaded to the web server. So you won't even bother inspecting the first 1k of the file, just because your preconceptions of filename aren't met?

Re: Pretty Basic Validation

2009-11-30 09:24 • by Frits (unregistered)
The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

Re: Pretty Basic Validation

2009-11-30 09:29 • by GalacticCowboy
291743 in reply to 291738
tB:
So what if someone uploads a docx or xlsx file?


Better brush the dust off those regex skillz.

Re: Pretty Basic Validation

2009-11-30 09:31 • by HeebyJeeby (unregistered)
291744 in reply to 291742
Frits:
The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.


That's retarded...

Re: Pretty Basic Validation

2009-11-30 09:35 • by Kermos
291745 in reply to 291741
Engival:
if (System.IO.Path.GetExtension(fileName).ToLower() == "txt")

and I thought THAT was the WTF at first. Who cares what the extension is. Who names csv files .txt? I can just see the training of the users now. "Okay, export your file, you can pick csv, or tsv. okay, now go rename the file because the web site needs it as txt)

At least the guy doing "the real WTF" was excluding some common types that idiot users might click on by mistake. It's not supposed to be an exhaustive list. I'm not sure I agree with either approach. One is a pretty poor defense against invalid types, and the other is restrictive.

Also, at this point in time, you can bet that the file has already been uploaded to the web server. So you won't even bother inspecting the first 1k of the file, just because your preconceptions of filename aren't met?


Agreed, my WTF alarm bells went off the moment I saw any kind of filename extension validation. My blackberry does the same frigging thing, won't open a text file if its extension is not .txt even though whatever I am trying to open is a frigging text file. I've cursed about that more than once.

Re: Pretty Basic Validation

2009-11-30 09:37 • by Rodnas (unregistered)
291747 in reply to 291741
Engival:
if (System.IO.Path.GetExtension(fileName).ToLower() == "txt")

and I thought THAT was the WTF at first. Who cares what the extension is. Who names csv files .txt? I can just see the training of the users now. "Okay, export your file, you can pick csv, or tsv. okay, now go rename the file because the web site needs it as txt)

At least the guy doing "the real WTF" was excluding some common types that idiot users might click on by mistake. It's not supposed to be an exhaustive list. I'm not sure I agree with either approach. One is a pretty poor defense against invalid types, and the other is restrictive.

Also, at this point in time, you can bet that the file has already been uploaded to the web server. So you won't even bother inspecting the first 1k of the file, just because your preconceptions of filename aren't met?


"The idea behind the feature was that the administrators could upload a tab-delimited text file containing a list of products, and the application would insert or update the products in the database"

CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.

CAPTCHA acsi (ASCII for dyslectic people?)

Re: Pretty Basic Validation

2009-11-30 09:38 • by Yazeran (unregistered)
I'd say looking at the extension at all is pointless, what matters is the content of the file.

You have to do some data validation on the text content anyways, so if the content of the file is not text, then print the error at that point (or continue with tabulation validation etc) if it is.

Yours Yazeran

Plan: To go to Mars one day with a hammer

Re: Pretty Basic Validation

2009-11-30 09:38 • by SenTree
MikeS:
Thrid
Eric Pode of Croydon - is that you ?

Re: Pretty Basic Validation

2009-11-30 09:40 • by toth
291750 in reply to 291742
Frits:
The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.


Why? This is a perfectly good first step. Now, obviously, someone could upload a valid text file that did not have a ".txt" extension, so if this were intended to be a robust, general purpose validation routine, it would indeed be a poor approach. But this was meant for a very specific business scenario--it's not unreasonable to require that any input file have a specific extension.

Re: Pretty Basic Validation

2009-11-30 09:41 • by Dave (unregistered)
This whole requirement is stupid. Why should you care what the extension is if the data is valid? I personally might want to call my csv file com.exe.ni.ni..exe and as long as the format of the data is fine I would expect it to work.

Re: Pretty Basic Validation

2009-11-30 09:43 • by onu (unregistered)
291752 in reply to 291738
tB:
So what if someone uploads a docx or xlsx file?


those users will be shot

also: -4 internets

Re: Pretty Basic Validation

2009-11-30 09:43 • by Marquess von Hinten (unregistered)
291753 in reply to 291747
Rodnas:
CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.
Gee, and I always thought the C stood for comma ...

Re: Pretty Basic Validation

2009-11-30 09:43 • by Ilya Ehrenburg
291754 in reply to 291742
Frits:
The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.

Exactly. Why care whether they name their files *.txt, *.text, *.tsv or even *.exe if you validate the contents anyway?

Re: Pretty Basic Validation

2009-11-30 09:43 • by Steve Urkel (unregistered)
291755 in reply to 291744
Not really. It IS pretty retarded to check the file extension of a file the user has explicitly selected.

It would be better to just put some magic signature on the first line of the files, and check for that. In any case, character-delimited files rarely use the .txt extension. More often .?sv (where ? is the first letter of whatever character is used as a delimiter).

If the magic signature is not an option, why not just check that it contains the correct number of columns and/or that the type of data in them matches the expected type (ie. that a numeric column doesn't contain letters or other funky stuff).

Re: Pretty Basic Validation

2009-11-30 09:44 • by Jeekers (unregistered)
291756 in reply to 291747
[quote user="Rodnas"][quote user="Engival"]

CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.
[/quote]

Err.. That'll be COMMA separated values. A character separated values file would surely include the TAB character.

Re: Pretty Basic Validation

2009-11-30 09:45 • by Yalpe Nismou (unregistered)
Kristof:
I was flabbergasted.

FTFY

CAPTCHA : Veniam

Re: Pretty Basic Validation

2009-11-30 09:48 • by Me (unregistered)

Aside from all the worthwhile and humerous objections above, System.IO.Path.GetExtension(fileName) actually returns the extension with the dot in front (e.g. ".txt" not "txt")...

Re: Pretty Basic Validation

2009-11-30 09:48 • by highphilosopher (unregistered)
291759 in reply to 291750
toth:
Frits:
The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.


Why? This is a perfectly good first step. Now, obviously, someone could upload a valid text file that did not have a ".txt" extension, so if this were intended to be a robust, general purpose validation routine, it would indeed be a poor approach. But this was meant for a very specific business scenario--it's not unreasonable to require that any input file have a specific extension.


Too often people code backwards logic. It's not that they can't code, it's just that they've been notting things too long.

The reason backwards logic is called backwards logic is because it doesn't satisfy all situations. It's a limiting factor.

1
2
FIZZ
4
BANG
FIZZ
7
8
FIZZ
BANG
11
FIZZ
13
14
FIZZBANG

See, I can do it!!!!

Re: Pretty Basic Validation

2009-11-30 09:48 • by Steve Urkel (unregistered)
291760 in reply to 291747
You DO know that Tab is also a character?

Anyway, CSV is usually meant to stand for Comma-Separated Values, even though the .csv extension is just as often used for semicolon or tab-delimited files.

I'd say using .txt for any file intended to be parsed automatically is somewhat dubious.

Re: Pretty Basic Validation

2009-11-30 09:50 • by Anon (unregistered)
291761 in reply to 291741
Engival:
if (System.IO.Path.GetExtension(fileName).ToLower() == "txt")

and I thought THAT was the WTF at first. Who cares what the extension is. Who names csv files .txt? I can just see the training of the users now. "Okay, export your file, you can pick csv, or tsv. okay, now go rename the file because the web site needs it as txt)


Actually, Excel defaults to the .txt extension if you save a file as tab delimited and that is probably exactly the scenario that is being envisioned here. You tell the user to create their product "database" in Excel (yeah, I know Excel isn't a database, but try telling that to most non-technical users) and save it as tab delimited (which gives it the .txt extension in Excel). Upload to database and you're done.
The real WTF is why he choose to include extensions such as csproj and vbproj in his list of excluded extensions. These are extension that no non-programmer would be likely to ever encounter.
The other WTF is that don't most frameworks have the ability to display a standard file selection dialog with a filter for valid file types in the first place?

Re: Pretty Basic Validation

2009-11-30 09:50 • by Ax (unregistered)
291762 in reply to 291747
Rodnas:
CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.


Not quite. A tsv would make sense if you're creating the file from some automatic means. But I have a suspicion that these input files are hand crafted.

Re: Pretty Basic Validation

2009-11-30 09:50 • by m0ffx
My web browser won't view the page http://thedailywtf.com/Comments/AddComment.aspx because it expects HTML pages to have a .html extension.

Re: Pretty Basic Validation

2009-11-30 09:53 • by Anon (unregistered)
291764 in reply to 291755
Steve Urkel:
In any case, character-delimited files rarely use the .txt extension. More often .?sv (where ? is the first letter of whatever character is used as a delimiter).


1) Open Excel
2) Select File -> Save As
3) In the "Save as type" drop down, select Text(Tab delimited)
4) What's the extension that Excel uses?

Re: Pretty Basic Validation

2009-11-30 10:00 • by Martin (unregistered)
if (!InvalidExtensions.Contains(fileName.ToString().Substring(fileName.ToString().LastIndexOf("."))))

Also - doesn't that give me an ArgumentOutOfRangeException if my file doesn't have an extension?

Re: Pretty Basic Validation

2009-11-30 10:03 • by Johnny Awkward (unregistered)
Surely the RWTF is all the comments here. What hope is there of good software ever getting written if this is a cross section of the coding community?

Re: Pretty Basic Validation

2009-11-30 10:12 • by Anonymous Cow-Herd (unregistered)
I wonder if anyone's tried dropping the hint by uploading a text file from World of Warcraft? (extension .wtf)

Re: Pretty Basic Validation

2009-11-30 10:14 • by My Name? (unregistered)
291768 in reply to 291742
Frits:
The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.


No, the real WTF is using tabs as delimiters. Also relying on actual files is the worng way to do it. By using streams the mime-type table becomes obsolete.

Re: Pretty Basic Validation

2009-11-30 10:20 • by Frits (unregistered)
291769 in reply to 291744
HeebyJeeby:
Frits:
The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.


That's retarded...


Would you care to elaborate on why it's retarded?
A file extension does not determine the contents of the file. It's not a good idea to needlessly exclude file extensions based on preconceived notions.

It's also good practice to validate the actual data in a file, in-house use or not. Therefore, I'm assuming a validation of the data will be coded. Why would you add an extra validation step that adds little value but may cause potential headaches?

Re: Pretty Basic Validation

2009-11-30 10:24 • by Carl (unregistered)
File "extensions" are TRWTF.

A file name is just a name. You can name a file anything you want, as long as you use valid characters. It can have zero or 100 dots and the computer shouldn't care. The name shouldn't be expected to contain any metadata that the computer cares about. It is for use by humans, and the computer should let the human use whatever name is meaningful to the human.

Metadata should be tracked by the computer in its data structures, along with owner, date created, date last accessed, permissions and all that. Or should we maybe jam those into the filename too?

The profound retardedness of certain systems, that haven't been repaired yet after all these decades, and the blind unqestioning acceptance by the masses, seriously leads me to question the worthiness of the species.

Re: Pretty Basic Validation

2009-11-30 10:32 • by Anonymous Cow-Herd (unregistered)
291771 in reply to 291768
My Name?:
Frits:
The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.


No, the real WTF is using tabs as delimiters. Also relying on actual files is the worng way to do it. By using streams the mime-type table becomes obsolete.


Surely should be using XML. That's more "enterprise" than using tabs or commas or semicolons or rectums as delimiters.

Re: Pretty Basic Validation

2009-11-30 10:35 • by Vollhorst (unregistered)
Where is the problem with the code?

He was told to block other kind of extensions. So he did. He wasn't told to only allow txt-files. His boss should have been a bit more specific with his requirements. And since his lousy requirements even missed the names of files he should block he was forced to rely on what he hat on his own desktop. That guy is a genius.

Re: Pretty Basic Validation

2009-11-30 10:37 • by Pepster (unregistered)
Good lord, you all want to reinvent the wheel. Why?

Yes, IF you were bulding a robust, multiple-use DB loader that could be deployed across multiple business processes, MAYBE you'd want to include other extensions, or simply look at the file contents..... but that wasn't the job!

The business process called for a .txt file, so that's what was being verified. if a .tsv, or a .csv, or a .wtf file showed up, then the user was doing something wrong! It doesn't matter that they COULD cram the data into any extension, it matters that they're not supposed to.

You're so preoccupied with presenting "clever" solutions that can handle "just in case" conditions, you forgot to consider the original requirements.


Re: Pretty Basic Validation

2009-11-30 10:38 • by Anonymous (unregistered)
Well there's nothing to discuss about this one really but I'm going to comment anyway just to get involved. Hey guys!

Re: Pretty Basic Validation

2009-11-30 10:38 • by Jasper (unregistered)
What if someone would try to upload an .a file?
'Oh yeah, you're right... I should add those to the list as well!'

What if someone would try to upload a .b file?
'Oh yeah, you're right... I should add those to the list as well!'

What if someone would try to upload a .c file?
'Oh yeah, you're right... I should add those to the list as well!'

What if someone would try to upload a .d file?
'Oh yeah, you're right... I should add those to the list as well!'

... (5 minutes later)

What if someone would try to upload a .z file?
'Oh yeah, you're right... I should add those to the list as well!'

What if someone would try to upload an .aa file?
'Oh yeah, you're right... I should add those to the list as well!'

What if someone would try to upload an .ab file?
'Oh yeah, you're right... I should add those to the list as well!'

... (2 hours later)

What if someone would try to upload an .aaa file?
'Oh yeah, you're right... I should add those to the list as well!'

What if someone would try to upload an .aab file?
'Oh yeah, you're right... I should add those to the list as well!'

... (10 days later)

etc.

And also: Some extensions such as .asp and .aspx are in the list twice. Probably those are files you really don't want people to upload. Just to be extra sure you check twice.

Re: Pretty Basic Validation

2009-11-30 10:49 • by Anonymous Cow-Herd (unregistered)
291777 in reply to 291775
Jasper:


And also: Some extensions such as .asp and .aspx are in the list twice. Probably those are files you really don't want people to upload. Just to be extra sure you check twice.


When you really want to be sure of something, you're supposed to do it three times.

Re: Pretty Basic Validation

2009-11-30 10:52 • by iToad (unregistered)
I'm sitting here looking at a tab-delimited text file from one of the company systems with an extension of .xls

Re: Pretty Basic Validation

2009-11-30 10:55 • by HeebyJeeby (unregistered)
291779 in reply to 291769
Frits:
HeebyJeeby:
Frits:
The real WTF is validating the extension at all. Let the data validation fail if the file is the incorrect type.


That's retarded...


Would you care to elaborate on why it's retarded?
A file extension does not determine the contents of the file. It's not a good idea to needlessly exclude file extensions based on preconceived notions.

It's also good practice to validate the actual data in a file, in-house use or not. Therefore, I'm assuming a validation of the data will be coded. Why would you add an extra validation step that adds little value but may cause potential headaches?


It's retarded because you're retarded

CAPTCHA: fellatio

Re: Pretty Basic Validation

2009-11-30 10:58 • by BSDGuy (unregistered)
291780 in reply to 291742
ding, ding!

Why check the extension at all? If the file meets the delimiting parameters then it's all good,....if it doesn't return BAD_FILE_FORMAT (some number..probably negative...or use an exception if it's Java which I think it is based on system.IO....).

Two WTFs don't make a right.

Re: Pretty Basic Validation

2009-11-30 11:00 • by DJ Maze (unregistered)
Is magic mime not available anymore?
How about "filename.exe\x00.txt"

Re: Pretty Basic Validation

2009-11-30 11:04 • by CoderHero (unregistered)
291782 in reply to 291747
Rodnas:
Engival:
if (System.IO.Path.GetExtension(fileName).ToLower() == "txt")

and I thought THAT was the WTF at first. Who cares what the extension is. Who names csv files .txt? I can just see the training of the users now. "Okay, export your file, you can pick csv, or tsv. okay, now go rename the file because the web site needs it as txt)

At least the guy doing "the real WTF" was excluding some common types that idiot users might click on by mistake. It's not supposed to be an exhaustive list. I'm not sure I agree with either approach. One is a pretty poor defense against invalid types, and the other is restrictive.

Also, at this point in time, you can bet that the file has already been uploaded to the web server. So you won't even bother inspecting the first 1k of the file, just because your preconceptions of filename aren't met?


"The idea behind the feature was that the administrators could upload a tab-delimited text file containing a list of products, and the application would insert or update the products in the database"

CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.

CAPTCHA acsi (ASCII for dyslectic people?)


No CSV is COMMA separated value.

Re: Pretty Basic Validation

2009-11-30 11:08 • by Snake (unregistered)
291783 in reply to 291743
GalacticCowboy:
tB:
So what if someone uploads a docx or xlsx file?


Better brush the dust off those regex skillz.


Now that's a very big WTF. Parsing XML with Regex.

You need a stack based parser for that!

Re: Pretty Basic Validation

2009-11-30 11:13 • by Anon (unregistered)
291784 in reply to 291782
CoderHero:

No CSV is COMMA separated value.


Not if you are using regional settings that use comma as the decimal separator. I've been stung by that before.

Re: Pretty Basic Validation

2009-11-30 11:16 • by Swedish tard (unregistered)
291785 in reply to 291782
CoderHero:
Rodnas:
Engival:
if (System.IO.Path.GetExtension(fileName).ToLower() == "txt")

and I thought THAT was the WTF at first. Who cares what the extension is. Who names csv files .txt? I can just see the training of the users now. "Okay, export your file, you can pick csv, or tsv. okay, now go rename the file because the web site needs it as txt)

At least the guy doing "the real WTF" was excluding some common types that idiot users might click on by mistake. It's not supposed to be an exhaustive list. I'm not sure I agree with either approach. One is a pretty poor defense against invalid types, and the other is restrictive.

Also, at this point in time, you can bet that the file has already been uploaded to the web server. So you won't even bother inspecting the first 1k of the file, just because your preconceptions of filename aren't met?


"The idea behind the feature was that the administrators could upload a tab-delimited text file containing a list of products, and the application would insert or update the products in the database"

CSV is a CHARACTER separated values file. They did state they needed a tab-delimited file, thus a text file is correctamundo.

CAPTCHA acsi (ASCII for dyslectic people?)


No CSV is COMMA separated value.

CSV has, about 1/3rd of the times I've seen it used referred to character separated values. So I'd venture as far as to say that CSV can actually mean both, whereas character probably is a later construct due to the fact that comma is a bad separator for reasons already mentioned here. ;)
« PrevPage 1 | Page 2 | Page 3 | Page 4Next »

Add Comment