The Daily WTF: Curious Perversions in Information Technology

Mr. TA · 2024-06-12

Should have been return "FileNotFound";

Seriously though, LastIndexOf is probably sad right now because it's been forgotten :((

2024-06-12

Wouldn't this just always return 'Error'? The While loop will only break out when intDotIndex is -1, and the If that follows will push it down the Error path.

2024-06-12

You're mixing up intDotCount and intDotIndex.

Steve_The_Cynic · 2024-06-12

Wouldn't this just always return 'Error'?

Damn. You beat me to it.

But even if that is corrected, the final step won't do what's wanted - imagine if the filename was somelongfilename.jpg. intDotIndex will be 16, and the final step will create a substring of three characters starting at character 13 of jpg, which seems ... difficult.

2024-06-12

strImageName.Substring(intDotIndex - 3, 3) : strImageName, the file name of the image, contains a user ID in the last 3 characters before the file extension.

Steve_The_Cynic · 2024-06-12

I'd just like to point out, as I have done before here on this site, that TRWTF is programmers.

2024-06-12

Actually the opposite. It can never return "Error". The index defaults to 0. It only ever gets updated, to the same value as the count, when the count is not -1. Not sure why they needed two variables for it, or why they have two ways of checking for not being -1 ( !(var == -1) and var != -1 ), but it does mostly do something vaguely resembling correct behaviour.

2024-06-12

I don't think this is getting the extension. I think the name and comment are right: it's getting the last 3 chars of the non-extension part. So with a strImageName of "myuserid.jpg" this will find the position of the '.' and return the 3 chars before it: "rid".

Despite the loop to handle multiple dots it does not actually do so: "my.file.png" sets intDotIndex from "file.png" and then indexes the full string, getting "y.f".

Contrary to Darren, who I think has (as I did at first) confused intDotIndex with intDotCount, the "Error" case will never be hit. I think it was meant to handle strings with no dot, but those leave intDotIndex at 0, not -1. I do not know what string.Substring does when passed (-3, 3) because I'm not a C# guy. Speaking of, the intLength calculation has to be redundant, right? Surely there is a way to tell Substring "from x position to the end."

2024-06-12

Given a filename of "67a86c53d05643be989ada8bfabc37e9.jpg" (assuming "user id" + extension as image file name), this will return "7e9", which is the last three chars of the ID instead of the extension.

adamantoise · 2024-06-12

Per the documentation, string.Substring() throws an ArgumentOutOfRangeException if the startIndex parameter is negative. It's not like Python or Ruby where a negative index means to index in from the end of the string.
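To make that contrast concrete, here is a small Python sketch. The helper dotnet_substring is a made-up name for illustration, not a real library call; it only imitates the range checks that string.Substring(startIndex, length) is documented to perform, with ValueError standing in for ArgumentOutOfRangeException.

```python
def dotnet_substring(s: str, start: int, length: int) -> str:
    """Imitate the bounds checks of .NET's string.Substring(start, length).

    Hypothetical helper for illustration only; ValueError stands in for
    ArgumentOutOfRangeException.
    """
    if start < 0 or length < 0 or start + length > len(s):
        raise ValueError(f"Invalid startIndex ({start}) or length ({length})")
    return s[start:start + length]


name = "ab.jpg"
dot = name.find(".")  # 2, so dot - 3 is -1

# Plain Python slicing accepts the negative start and counts from the end,
# so nothing fails; the slice just comes back empty here.
python_result = name[dot - 3:dot - 3 + 3]

# The .NET-style check rejects the same arguments outright.
try:
    dotnet_substring(name, dot - 3, 3)
    dotnet_raised = False
except ValueError:
    dotnet_raised = True
```

With a base name shorter than three characters, the Python slice quietly yields an empty string, while the .NET-style helper refuses the call.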

This code also has a Schlemiel the Painter algorithm in it: if the string contains N dots, it will take O(N^2) time to perform all of the splits & substring operations.

Medinoc · 2024-06-12

The lack of LastIndexOf use is the second-most WTF part of this code; what they use instead of it is the most WTF.

2024-06-12

It really is an odd function.

Here are some results of the function.

filename.ext: ame
extended.filename.ext: ded
silly.file.name.ext: ill
silly.very.long.extended.file.name.longerext: ill
silly.very.long.extended.file.1.longerext: Exception: Invalid startIndex (-2)
silly.very.long.extended.file.12.longerext: Exception: Invalid startIndex (-1)
silly.very.long.extended.file.123.longerext: sil
silly.very.long.extended.file.1234.longerext: ill
silly.very.long.extended.file.12345.longerext: lly
silly.very.long.extended.file.123456.longerext: ly.

Note that the starting index of the selected 3 characters depends on the length of the second-to-last name segment, as divided by '.' characters.

2024-06-12

Assuming the filename has exactly one ".", this code should return the three chars before the ".". So, depending on their file naming conventions (putting a user id at the end of the filename), the name of the function might be accurate.

Even in that case, the logic and variable naming are super confusing.

Dlareg · 2024-06-12

I think this was (once) used to match user profile pictures to the user. Let's say we take a picture of each employee of a medium-sized company and store them as Dlareg.realname123.jpg. But another secretary stored them as d.f.t.realname123.png.

And now we want to import all those into our shiny new HRM software. That is where this function would make sense.

2024-06-12

System.IO.Path.GetExtension(string: full path to file)

2024-06-12

determining file type based on the three letter extension in its filename - "not a good practice"

I'd argue that determining based on content is worse. Windows used to sniff content in certain cases, so some old malware practices were to distribute malware with .mp4 or .jpg extensions. Windows would see that it was an executable and run it rather than passing it to a file type handler.

2024-06-12

Either someone used AI to write the function comment or a second dev who doesn't know what UID stands for did. Still not a proper use for UID.

Whoever wrote this function failed to realize two things.

First, Path.GetExtension exists.

Second, string.LastIndexOf exists.
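For comparison, the same two ideas sketched in Python rather than C#: os.path.splitext plays the role of Path.GetExtension and str.rfind plays the role of LastIndexOf. The function names are invented for this sketch, and returning None when there is no dot is my own choice, not something taken from the article's code.

```python
import os.path


def extension_of(name: str) -> str:
    """Extension including the leading dot, or "" when there is none."""
    return os.path.splitext(name)[1]


def three_before_last_dot(name: str):
    """Up to three characters immediately before the last dot.

    Returns None when the name has no dot (an assumed convention).
    """
    dot = name.rfind(".")  # like LastIndexOf: -1 when not found
    if dot == -1:
        return None
    # Clamp the start so short base names do not wrap around.
    return name[max(0, dot - 3):dot]
```

Both Dlareg.realname123.jpg and d.f.t.realname123.png give "123" this way, because only the last dot matters.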

2024-06-12

Looks like something that, along with the code which invokes it, went through an iterative/evolutionary process to solve some trivial but awkward business problem, where the developer tried to come at the problem a few different ways. So, the method retains the legacy of an earlier unsuccessful attempt in its name even after its implementation likely retains little to nothing of its original concept.

Barry Margolin · 2024-06-12

File extensions long predate DOS. They were on DEC operating systems TOPS-10 and TSS/8 in the 70's. Multics and DTSS had them. I think even IBM file systems in the 60's had something analogous.

stoborrobots · 2024-06-12

I think this was (once) used to match user profile pictures to the user. Let's say we take a picture of each employee of a medium-sized company and store them as Dlareg.realname123.jpg. But another secretary stored them as d.f.t.realname123.png.
And now we want to import all those into our shiny new HRM software. That is where this function would make sense.

Yep, that was my first thought too - there's useful metadata embedded in the file name, prior to the extension, and what we have here is just a really bad way to get it out...

2024-06-12

File extensions have three letters, for example index.html or program.cs

MaxiTB · 2024-06-12

Those methods of Path are platform/protocol independent - you should always use them instead of doing it yourself ;-)

2024-06-12

Try again, and hopefully the formatting will be preserved.

It really is an odd function. Here are some results of the function.

filename.ext: ame
extended.filename.ext: ded
silly.file.name.ext: ill
silly.very.long.extended.file.name.longerext: ill
silly.very.long.extended.file.123.longerext: sil
silly.very.long.extended.file.1234.longerext: ill
silly.very.long.extended.file.12345.longerext: lly
silly.very.long.extended.file.123456.longerext: ly.

Note that the starting index of the selected 3 characters depends on the length of the second-to-last name segment, as divided by '.' characters.

2024-06-12

This definitely doesn't work: https://godbolt.org/z/rG9dYqda9

2024-06-12

This makes mime types seem elegant.

2024-06-12

More fun: it throws an exception when the basename is under 3 chars: https://godbolt.org/z/Y514zjvjb

2024-06-12

I think you have it backwards. For one thing, most "users" of Windows leave "show extension" turned off because it's neater or something, so they get a file named youarepwned.jpg.exe, and it shows in the browser or desktop as "youarepwned.jpg" and they think everything is cool.

Heck, I used to bypass moronic IT-dept blocking of email attachments by renaming goodstuff.zip to goodstuff.txt.

Apple had it right way back in the 80s and 90s, when Macintosh computers had that separate ".name" file which controlled what apps would open the file. Sadly, they gave in to the "majority rule" imposed by Microsoft.

2024-06-12

The double extension is a different security risk where user security was sacrificed in the name of pretty file name displays.

For file content sniffing, look up Internet Explorer and Mime Sniffing for more detail.

Steve_The_Cynic · 2024-06-12

Imagine that we have blah-di-blah.whatever.frob as the name:

I'll ignore strSegment because it's never used, but it accumulates the parts of the original string up to and including the last dot found.

First pass: intDotCount is 12. Executes the body of the inner if().

intDotIndex is also 12 (because it is based on the same calculation as intDotCount).

intLength is 13 because it's 26 (total length of strSplit at this point) minus (12+1).

strSplit is whatever.frob

intDotCount is not -1, so the tortured logic in the while() continues (it's "not-the-case-that intDotCount is -1").

Second pass: intDotCount is 8 (position of the "." in whatever.frob). Executes the body of the inner if().

intDotIndex is also 8 (because it is based on the same calculation as intDotCount).

intLength is 4 because it's 13 (total length of strSplit at this point) minus (8+1).

strSplit is frob

intDotCount is not -1, so the tortured logic in the while() continues (it's "not-the-case-that intDotCount is -1").

Third pass: intDotCount is -1 (there is no "." in frob). Does not execute the body of the inner if().

Reminder: intDotIndex is still 8.

intDotCount is -1, so the tortured logic in the while() ends (it's "not-the-case-that intDotCount is -1").

intDotIndex is 8, so the function returns "di-", the three characters starting at index 5 of "blah-di-blah.whatever.frob".

So the filenames had absolutely for sure better be set up correctly as "someuseridthingwithoutdots.extension" or the function returns gibberish.

Addendum 2024-06-12 14:35: and "returns gibberish" might include "throws an ArgumentOutOfRangeException".
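For anyone who wants to replay the walkthrough above, here is a rough Python reconstruction. It is an assumption pieced together from the variable names and steps described in these comments, not the original C# source, and it leaves out strSegment since that value is never used. The substring helper is hypothetical and only mimics Substring's refusal of out-of-range arguments, raising ValueError instead of ArgumentOutOfRangeException.

```python
def substring(s, start, length):
    """Stand-in for .NET Substring: raise instead of wrapping around."""
    if start < 0 or length < 0 or start + length > len(s):
        raise ValueError(f"Invalid startIndex ({start})")
    return s[start:start + length]


def get_three_chars(image_name):
    """Reconstructed behavior (assumed, not the original source)."""
    dot_index = 0        # intDotIndex: starts at 0 and is never set to -1
    dot_count = 0        # intDotCount
    split = image_name   # strSplit
    while not (dot_count == -1):
        dot_count = split.find(".")  # like IndexOf: -1 when absent
        if dot_count != -1:
            dot_index = split.find(".")
            length = len(split) - (dot_index + 1)  # intLength
            split = substring(split, dot_index + 1, length)
    if dot_index != -1:
        # dot_index is relative to the last piece that still had a dot,
        # but it is applied to the full name here.
        return substring(image_name, dot_index - 3, 3)
    return "Error"  # unreachable, because dot_index is never -1


for name in ("blah-di-blah.whatever.frob", "filename.ext", "my.file.png"):
    print(name, "->", get_three_chars(name))
```

Under these assumptions the sketch gives "di-" for the example traced above, and a name whose second-to-last segment is shorter than three characters raises instead of returning anything.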

molleafauss · 2024-06-12

The return Error (which seems to be the usual result) is the icing on the cake. I think I will use that as a file extension in the first project where I need one.

LorenPechtel · 2024-06-12

I think most of these comments are off target. As I see it, the inputs are <anything><userid>.<extension>, it's supposed to return <userid>. It's building up the <anything> string but this is probably legacy and no longer used. And it was supposed to correctly handle periods in the <anything> part, but this routine barfs on it.

2024-06-12

Well, I'm not an expert in "#", but in C (both with and without "++") it's quite trivial to scan a string and find the last '.' that is after the last '\' or ':'...

prueg · 2024-06-13

It only needs two lines.
string fileName = Path.GetFileNameWithoutExtension(strImageName);
return fileName.Substring(Math.Max(0, fileName.Length - 3));
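The same idea carries over to other languages. As a rough Python analogue (my own sketch, with an invented function name), pathlib's stem drops the directory and the final extension, and a negative slice already behaves like the Math.Max(0, ...) clamp when the stem is shorter than three characters.

```python
from pathlib import PurePath


def last_three_of_stem(image_name: str) -> str:
    """Last three characters of the file name without its final extension."""
    stem = PurePath(image_name).stem  # "photos/jane_042.png" -> "jane_042"
    # stem[-3:] returns the whole stem when it has fewer than 3 characters.
    return stem[-3:]
```

Note that this deliberately differs from the article's function on names with several dots: only the final extension is stripped.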

R3D3 · 2024-06-13

Apple had it right way back in the 80s and 90s, when Macintosh computers had that separate ".name" file which controlled what apps would open the file.

Can you expand on that? Where would that .name file be, and would it be a system-wide setting or a per-file setting? If it is per-file, then that's actually pretty awful, because it means that the user can't change default programs for file types.

I honestly really don't see the issue with file extensions. It is an easy way to decide what program to open something with. It also makes it easier to search for file types than having to parse the contents of a file. For an average user it is also likely easier to understand that a file ending with ".docx" will be opened by Word than to start talking about MIME types.

It also makes operations like "list all files in this directory that are images" cheap. Without the file extensions, you'd need programs to maintain caches of detected file types.

Streamed file systems (like those supported by Google Drive, OneDrive, Dropbox) would suddenly be faced with programs fetching and reading files just in order to list them, instead of having to access only metadata. If those services provided an API to get the file type, the programs would then have to explicitly support that.

Steve_The_Cynic · 2024-06-13

Well, I'm not an expert in "#", but in C (both with and without "++") it's quite trivial to scan a string and find the last '.' that is after the last '\' or ':'...

In C++17 (or later), the supposed job of this function is even simpler. Stuff the input string into a std::filesystem::path and call the stem() method, then convert the return to a std::string and keep the last three characters, in much the same way that @prueg described for C#, but using C++ objects.

2024-06-13

This code is a thing of beauty, it is poetry, it is a display of needless and verbose virtuosity.

2024-06-19

I have a suspicion given the rampant Hungarian-esque warts that this is intended to be a straight port of VB code. Expectation of -1 for not finding a dot would be consistent with InStr.

2024-06-19

Oops, forget that, wrong, InStr also returns zero for not found.

Gonna Need an Extension
