Admin
Should have been
return "FileNotFound";
Seriously though, LastIndexOf is probably sad right now because it's been forgotten :((
Admin
Wouldn't this just always return 'Error'? The While loop will only break out when intDotIndex is -1, and the If that follows will push it down the Error path.
Admin
You mix intDotCount with intDotIndex.
Admin
Damn. You beat me to it.
But even if that is corrected, the final step won't do what's wanted. Imagine if the filename was somelongfilename.jpg. intDotIndex will be 16, and the final step will create a substring of three characters starting at character 13 of jpg, which seems ... difficult.
Admin
strImageName.Substring(intDotIndex - 3, 3) : strImageName, the file name of the image, contains a user ID in the last 3 characters before the file extension.
Admin
I'd just like to point out, as I have done before here on this site, that TRWTF is programmers.
Admin
Actually the opposite. It can never return error. The index defaults to 0. It only ever gets updated to the same as count if the count is not -1. Not sure why they needed two variables for it, or why they have two ways of not being -1 (!(var == -1) and var != -1), but it does mostly do something vaguely resembling correct behaviour.
Admin
I don't think this is getting the extension. I think the name and comment are right: it's getting the last 3 chars of the non-extension part. So with a strImageName of "myuserid.jpg" this will find the position of the '.' and return the 3 chars before it: "rid".
Despite the loop to handle multiple dots it does not actually do so: "my.file.png" sets intDotIndex from "file.png" and then indexes the full string, getting "y.f".
Contrary to Darren, who I think has (as I did at first) confused intDotIndex with intDotCount, the "Error" case will never be hit. I think it was meant to handle strings with no dot, but those leave intDotIndex at 0, not -1. I do not know what string.Substring does when passed (-3, 3) because I'm not a C# guy. Speaking of, the intLength calculation has to be redundant, right? Surely there is a way to tell Substring "from x position to the end."
Admin
Given a filename of "67a86c53d05643be989ada8bfabc37e9.jpg" (assuming "user id" + extension as image file name), this will return "7e9", which is the last three chars of the ID instead of the extension.
Admin
Per the documentation, string.Substring() throws an ArgumentOutOfRangeException if the startIndex parameter is negative. It's not like Python or Ruby where a negative index means to index in from the end of the string.
This code also has a Schlemiel the Painter algorithm in it: if the string contains N dots, it will take O(N^2) time to perform all of the splits & substring operations.
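The two Substring points raised here can be checked with a small sketch. The class and method names (SubstringDemo, AfterFirstDot, NegativeStartThrows) are made up for illustration and do not come from the article's code; the sketch only shows that the one-argument overload Substring(startIndex) runs to the end of the string, and that a negative start index throws rather than counting from the end.

```csharp
using System;

public static class SubstringDemo
{
    // Everything after the first dot, using the one-argument overload
    // Substring(startIndex), which runs to the end of the string.
    // No separate length calculation is needed.
    public static string AfterFirstDot(string s)
    {
        int dot = s.IndexOf('.');
        return dot < 0 ? s : s.Substring(dot + 1);
    }

    // Returns true if Substring rejects a negative start index.
    public static bool NegativeStartThrows(string s)
    {
        try
        {
            _ = s.Substring(-3, 3);
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }
}
```

With this, SubstringDemo.AfterFirstDot("whatever.frob") gives "frob" without computing a length first.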
Admin
The lack of LastIndexOf use is the second-most WTF part of this code; what they use instead of it is the most WTF.
Admin
It really is an odd function.
Here are some results of the function.
filename.ext: ame
extended.filename.ext: ded
silly.file.name.ext: ill
silly.very.long.extended.file.name.longerext: ill
silly.very.long.extended.file.1.longerext: Exception: Invalid startIndex (-2)
silly.very.long.extended.file.12.longerext: Exception: Invalid startIndex (-1)
silly.very.long.extended.file.123.longerext: sil
silly.very.long.extended.file.1234.longerext: ill
silly.very.long.extended.file.12345.longerext: lly
silly.very.long.extended.file.123456.longerext: ly.
Note that the starting index of the selected 3 characters depends on the length of the second-to-last name segment, as divided by '.' characters.
Admin
Assuming the filename has exactly one ".", this code should return the three chars before the ".". So, depending on their file naming conventions (putting a user id at the end of the filename), the name of the function might be accurate.
Even in that case, the logic and variable naming are super confusing.
Admin
I think this was (once) used to match user profile pictures to the user. Let's say we take a picture of each employee of a medium-sized company and store them as Dlareg.realname123.jpg. But another secretary stored them as d.f.t.realname123.png.
And now we want to import all those into our shiny new HRM software. That is where this function would make sense.
Admin
System.IO.Path.GetExtension(string: full path to file)
Admin
determining file type based on the three letter extension in its filename - "not a good practice"
I'd argue that determining based on content is worse. Windows used to sniff content in certain cases, so some old malware practices were to distribute malware with .mp4 or .jpg extensions. Windows would see that it was an executable and run it rather than passing it to a file type handler.
Admin
Either someone used AI to write the function comment or a second dev who doesn't know what UID stands for did. Still not a proper use for UID.
Whoever wrote this function failed to realize two things.
First, Path.GetExtension exists.
Second, string.LastIndexOf exists.
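For reference, a minimal sketch of both options when the goal really is the extension. The class and method names (ExtensionDemo, ViaPath, ViaLastIndexOf) are hypothetical; Path.GetExtension and string.LastIndexOf are the real framework calls.

```csharp
using System;
using System.IO;

public static class ExtensionDemo
{
    // Framework helper: returns the extension including the leading dot,
    // or an empty string when the name has no extension.
    public static string ViaPath(string fileName) => Path.GetExtension(fileName);

    // Manual version: find the last dot and take everything from there on.
    // This ignores directory separators, so it is only a sketch for bare
    // file names, not full paths.
    public static string ViaLastIndexOf(string fileName)
    {
        int dot = fileName.LastIndexOf('.');
        return dot < 0 ? string.Empty : fileName.Substring(dot);
    }
}
```

Both return ".png" for "my.file.png". The Path version is the safer choice for full paths, since a dot in a directory name would confuse the manual one.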
Admin
Looks like something that, along with the code which invokes it, went through an iterative/evolutionary process to solve some trivial but awkward business problem, where the developer tried to come at the problem a few different ways. So, the method retains the legacy of an earlier unsuccessful attempt in its name even after its implementation likely retains little to nothing of its original concept.
Admin
File extensions long predate DOS. They were on DEC operating systems TOPS-10 and TSS/8 in the 70's. Multics and DTSS had them. I think even IBM file systems in the 60's had something analogous.
Admin
Yep, that was my first thought too - there's useful metadata embedded in the file name, prior to the extension, and what we have here is just a really bad way to get it out...
Admin
File extensions have three letters, for example index.html or program.cs
Admin
Those methods of Path are platform/protocol independent - you should always use them instead of doing it yourself ;-)
Admin
This definitely doesn't work: https://godbolt.org/z/rG9dYqda9
Admin
This makes mime types seem elegant.
Admin
More fun: it throws an exception when the basename is under 3 chars: https://godbolt.org/z/Y514zjvjb
Admin
I think you have it backwards. For one thing, most "users" of Windows leave "show extension" turned off because it's neater or something, so they get a file named youarepwned.jpg.exe, and it shows in the browser or desktop as "youarepwned.jpg" and they think everything is cool.
Heck, I used to bypass moronic IT-dept blocking of email attachments by renaming goodstuff.zip to goodstuff.txt.
Apple had it right way back in the 80s and 90s, when Macintosh computers had that separate ".name" file which controlled what apps would open the file. Sadly, they gave in to the "majority rule" imposed by Microsoft.
Admin
The double extension is a different security risk where user security was sacrificed in the name of pretty file name displays.
For file content sniffing, look up Internet Explorer and Mime Sniffing for more detail.
Admin
Imagine that we have blah-di-blah.whatever.frob as the name.
I'll ignore strSegment because it's never used, but it accumulates the parts of the original string up to and including the last dot found.
First pass:
- intDotCount is 12. Executes the body of the inner if().
- intDotIndex is also 12 (because it is based on the same calculation as intDotCount).
- intLength is 13 because it's 26 (total length of strSplit at this point) minus (12+1).
- strSplit is whatever.frob
- intDotCount is not -1, so the tortured logic in the while() continues (it's "not-the-case-that intDotCount is -1").
Second pass:
- intDotCount is 8 (position of the "." in whatever.frob). Executes the body of the inner if().
- intDotIndex is also 8 (because it is based on the same calculation as intDotCount).
- intLength is 4 because it's 13 (total length of strSplit at this point) minus (8+1).
- strSplit is frob
- intDotCount is not -1, so the tortured logic in the while() continues (it's "not-the-case-that intDotCount is -1").
Third pass:
- intDotCount is -1 (there is no "." in frob). Does not execute the body of the inner if().
- Reminder: intDotIndex is still 8.
- intDotCount is -1, so the tortured logic in the while() ends (it's no longer "not-the-case-that intDotCount is -1").
intDotIndex is 8, so the function returns "di-", the three characters starting at index 5 of "blah-di-blah.whatever.frob".
So the filenames had absolutely for sure better be set up correctly as "someuseridthingwithoutdots.extension" or the function returns gibberish.
Addendum 2024-06-12 14:35: and "returns gibberish" might include "throws an ArgumentOutOfRangeException".
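The passes above can be replayed with a sketch. To be clear, this is a hypothetical reconstruction pieced together from the variable names and behaviour described in these comments, not the article's actual code; the class and method names (Reconstruction, LastThreeBeforeDot) and the starting value of intDotCount are assumptions.

```csharp
using System;

public static class Reconstruction
{
    // HYPOTHETICAL reconstruction based on the comments; not the original code.
    public static string LastThreeBeforeDot(string strImageName)
    {
        string strSplit = strImageName;
        string strSegment = "";   // accumulated but never read, as described
        int intDotCount = 0;      // assumed starting value (anything but -1)
        int intDotIndex = 0;      // defaults to 0, per the discussion
        int intLength;

        while (!(intDotCount == -1))
        {
            intDotCount = strSplit.IndexOf('.');
            if (intDotCount != -1)
            {
                // Same calculation as intDotCount, relative to the current strSplit.
                intDotIndex = strSplit.IndexOf('.');
                intLength = strSplit.Length - (intDotIndex + 1);
                strSegment += strSplit.Substring(0, intDotIndex + 1);
                strSplit = strSplit.Substring(intDotIndex + 1, intLength);
            }
        }

        if (intDotIndex != -1)
        {
            // intDotIndex is relative to the last segment that had a dot,
            // but it is applied to the full original string.
            return strImageName.Substring(intDotIndex - 3, 3);
        }
        return "Error"; // never reached: intDotIndex is never set to -1
    }
}
```

Under these assumptions, "blah-di-blah.whatever.frob" yields "di-" and "filename.ext" yields "ame", matching the results reported in this thread, while a name whose second-to-last segment is shorter than three characters throws an ArgumentOutOfRangeException.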
Admin
The return Error (which seems to be the usual result) is the icing on the cake. I think I will use that as a file extension in the first project where I need one.
Admin
I think most of these comments are off target. As I see it, the inputs are <anything><userid>.<extension>, it's supposed to return <userid>. It's building up the <anything> string but this is probably legacy and no longer used. And it was supposed to correctly handle periods in the <anything> part, but this routine barfs on it.
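If that reading is right, the intended behaviour could be sketched roughly as follows. This assumes the user ID really is the last three characters before the final dot; the class and method names (UserIdSuffix, LastThreeBeforeExtension) are made up for illustration, and returning fewer than three characters for short names is a design choice of this sketch, not something the thread specifies.

```csharp
using System;

public static class UserIdSuffix
{
    // Hypothetical helper: up to three characters immediately before the
    // final dot, or before the end of the name when there is no dot.
    public static string LastThreeBeforeExtension(string fileName)
    {
        int dot = fileName.LastIndexOf('.');
        int end = dot < 0 ? fileName.Length : dot;
        int start = Math.Max(0, end - 3);   // guard against short names
        return fileName.Substring(start, end - start);
    }
}
```

Because it uses the last dot and clamps the start index, periods in the <anything> part and base names shorter than three characters no longer cause gibberish or exceptions.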
Admin
Well, I'm not an expert in "#", but in C (both with and without "++") it's quite trivial to scan a string and find the last '.' that is after the last '\' or ':'...
Admin
It only needs two lines.
string fileName = Path.GetFileNameWithoutExtension(strImageName);
return fileName.Substring(Math.Max(0, fileName.Length - 3));
Admin
Can you expand on that? Where would that .name file be, and would it be a system-wide setting or a per-file setting? If it is per-file, then that's actually pretty awful, because it means that the user can't change default programs for file types.
I honestly really don't see the issue with file extensions. It is an easy way to decide what program to open something with. It also makes it easier to search for file types than having to parse the contents of a file. For an average user it is also likely easier to understand that a file ending with ".docx" will be opened by Word than to start talking about mime types.
It also makes operations like "list all files in this directory that are images" cheap. Without the file extensions, you'd need programs to maintain caches of detected file types.
Streamed file systems (like supported by Google Drive, Onedrive, Dropbox) would suddenly be faced with programs fetching and reading files just in order to list them, instead of having to access only metadata. If those services would provide an API to get the file type, the programs would then have to explicitly support that.
Admin
In C++17 (or later), the supposed job of this function is even simpler. Stuff the input string into a std::filesystem::path and call the stem() method, then convert the return to a std::string and keep the last three characters, in much the same way that @prueg described for C#, but using C++ objects.
Admin
This code is a thing of beauty, it is poetry, it is a display of needless and verbose virtuosity.
Admin
I have a suspicion given the rampant Hungarian-esque warts that this is intended to be a straight port of VB code. Expectation of -1 for not finding a dot would be consistent with InStr.
Admin
Oops, forget that, wrong, InStr also returns zero for not found.