• (nodebb)

    Should have been return "FileNotFound";

    Seriously though, LastIndexOf is probably sad right now because it's been forgotten :((

  • Darren (unregistered)

    Wouldn't this just always return 'Error'? The While loop will only break out when intDotIndex is -1, and the If that follows will push it down the Error path.

  • Dot (unregistered) in reply to Darren

    You're mixing up intDotCount with intDotIndex.

  • (nodebb) in reply to Darren

    Wouldn't this just always return 'Error'?

    Damn. You beat me to it.

    But even if that is corrected, the final step won't do what's wanted - imagine if the filename was somelongfilename.jpg. intDotIndex will be 16, and the final step will create a substring of three characters starting at character 13 of jpg, which seems ... difficult.

  • Dot (unregistered)

    strImageName.Substring(intDotIndex - 3, 3) : strImageName, the file name of the image, contains a user ID in the last 3 characters before the file extension.

  • (nodebb)

    I'd just like to point out, as I have done before here on this site, that TRWTF is programmers.

  • Scragar (unregistered) in reply to Darren

    Actually the opposite. It can never return "Error". The index defaults to 0 and only ever gets updated to the same value as the count when the count is not -1. Not sure why they needed two variables for it, or why they have two ways of testing for not being -1 (!(var == -1) and var != -1), but it does mostly do something vaguely resembling correct behaviour.

  • Smithers (unregistered)

    I don't think this is getting the extension. I think the name and comment are right: it's getting the last 3 chars of the non-extension part. So with a strImageName of "myuserid.jpg" this will find the position of the '.' and return the 3 chars before it: "rid".

    Despite the loop to handle multiple dots it does not actually do so: "my.file.png" sets intDotIndex from "file.png" and then indexes the full string, getting "y.f".

    Contrary to Darren, who I think has (as I did at first) confused intDotIndex with intDotCount, the "Error" case will never be hit. I think it was meant to handle strings with no dot, but those leave intDotIndex at 0, not -1. I do not know what string.Substring does when passed (-3, 3) because I'm not a C# guy. Speaking of, the intLength calculation has to be redundant, right? Surely there is a way to tell Substring "from x position to the end."

  • Hans (unregistered)

    Given a filename of "67a86c53d05643be989ada8bfabc37e9.jpg" (assuming "user id" + extension as image file name), this will return "7e9", which is the last three chars of the ID instead of the extension.

  • (nodebb) in reply to Smithers

    Per the documentation, string.Substring() throws an ArgumentOutOfRangeException if the startIndex parameter is negative. It's not like Python or Ruby where a negative index means to index in from the end of the string.

    This code also has a Schlemiel the Painter algorithm in it: if the string contains N dots, it will take O(N^2) time to perform all of the splits & substring operations.

  • (nodebb)

    The lack of LastIndexOf use is the second-most WTF part of this code; what they use instead of it is the most WTF.

  • AddAName (unregistered)

    It really is an odd function.

    Here are some results of the function.

    filename.ext: ame
    extended.filename.ext: ded
    silly.file.name.ext: ill
    silly.very.long.extended.file.name.longerext: ill
    silly.very.long.extended.file.1.longerext: Exception: Invalid startIndex (-2)
    silly.very.long.extended.file.12.longerext: Exception: Invalid startIndex (-1)
    silly.very.long.extended.file.123.longerext: sil
    silly.very.long.extended.file.1234.longerext: ill
    silly.very.long.extended.file.12345.longerext: lly
    silly.very.long.extended.file.123456.longerext: ly.

    Note that the starting index of the selected 3 characters depends on the length of the second-to-last name segment, as divided by '.' characters.

  • Weretaco (unregistered)

    Assuming the filename has exactly one ".", this code should return the three chars before the ".". So, depending on their file naming conventions (putting a user id at the end of the filename), the name of the function might be accurate.

    Even in that case the logic and variable naming are super confusing.

  • (nodebb) in reply to Smithers

    I think this was (once) used to match user profile pictures to the user. Let's say we take a picture of each employee of a medium-sized company and store them as Dlareg.realname123.jpg. But another secretary stored them as d.f.t.realname123.png.

    And now we want to import all those into our shiny new HRM software. That is where this function would make sense.

  • LZ79LRU (unregistered)

    System.IO.Path.GetExtension(string: full path to file)

  • Registered (unregistered)

    determining file type based on the three letter extension in its filename - "not a good practice"

    I'd argue that determining based on content is worse. Windows used to sniff content in certain cases, so some old malware practices were to distribute malware with .mp4 or .jpg extensions. Windows would see that it was an executable and run it rather than passing it to a file type handler.

  • The MAZZTer (github)

    Either someone used AI to write the function comment or a second dev who doesn't know what UID stands for did. Still not a proper use for UID.

    Whoever wrote this function failed to realize two things.

    First, Path.GetExtension exists.

    Second, string.LastIndexOf exists.

  • see sharp (unregistered)

    Looks like something that, along with the code which invokes it, went through an iterative/evolutionary process to solve some trivial but awkward business problem, where the developer tried to come at the problem a few different ways. So, the method retains the legacy of an earlier unsuccessful attempt in its name even after its implementation likely retains little to nothing of its original concept.

  • (nodebb)

    File extensions long predate DOS. They were on DEC operating systems TOPS-10 and TSS/8 in the 70's. Multics and DTSS had them. I think even IBM file systems in the 60's had something analogous.

  • (nodebb) in reply to Dlareg

    I think this was (once) used to match user profile pictures to the user. Let's say we take a picture of each employee of a medium-sized company and store them as Dlareg.realname123.jpg. But another secretary stored them as d.f.t.realname123.png.

    And now we want to import all those into our shiny new HRM software. That is where this function would make sense.

    Yep, that was my first thought too - there's useful metadata embedded in the file name, prior to the extension, and what we have here is just a really bad way to get it out...

  • Álvaro González (github)

    File extensions have three letters, for example index.html or program.cs

  • (nodebb) in reply to The MAZZTer

    Those methods of Path are platform/protocol independent - you should always use them instead of doing it yourself ;-)

  • AddAName (unregistered)

    Try again, and hopefully the formatting will be preserved.

    It really is an odd function. Here are some results of the function.

    filename.ext: ame
    extended.filename.ext: ded
    silly.file.name.ext: ill
    silly.very.long.extended.file.name.longerext: ill
    silly.very.long.extended.file.123.longerext: sil
    silly.very.long.extended.file.1234.longerext: ill
    silly.very.long.extended.file.12345.longerext: lly
    silly.very.long.extended.file.123456.longerext: ly.
    

    Note that the starting index of the selected 3 characters depends on the length of the second-to-last name segment, as divided by '.' characters.

  • R Samuel Klatchko (unregistered)

    This definitely doesn't work: https://godbolt.org/z/rG9dYqda9

  • Alan Scrivener (unregistered)

    This makes mime types seem elegant.

  • R Samuel Klatchko (unregistered)

    More fun: it throws an exception when the basename is under 3 chars: https://godbolt.org/z/Y514zjvjb

  • cellocgw and I am logged... or not (unregistered) in reply to Registered

    I think you have it backwards. For one thing, most "users" of Windows leave "show extension" turned off because it's neater or something, so they get a file named youarepwned.jpg.exe, and it shows in the browser or desktop as "youarepwned.jpg" and they think everything is cool.

    Heck, I used to bypass moronic IT-dept blocking of email attachments by renaming goodstuff.zip to goodstuff.txt.

    Apple had it right way back in the 80s and 90s, when Macintosh computers had that separate ".name" file which controlled what apps would open the file. Sadly, they gave in to the "majority rule" imposed by Microsoft.

  • Registered (unregistered) in reply to cellocgw and I am logged... or not

    The double extension is a different security risk where user security was sacrificed in the name of pretty file name displays.

    For file content sniffing, look up Internet Explorer and Mime Sniffing for more detail.

  • (nodebb)

    Imagine that we have blah-di-blah.whatever.frob as the name:

    I'll ignore strSegment because it's never used, but it accumulates the parts of the original string up to and including the last dot found.

    First pass: intDotCount is 12. Executes the body of the inner if().

    intDotIndex is also 12 (because it is based on the same calculation as intDotCount).

    intLength is 13 because it's 26 (total length of strSplit at this point) minus (12+1).

    strSplit is whatever.frob

    intDotCount is not -1, so the tortured logic in the while() continues (it's "not-the-case-that intDotCount is -1").

    Second pass: intDotCount is 8 (position of the "." in whatever.frob). Executes the body of the inner if().

    intDotIndex is also 8 (because it is based on the same calculation as intDotCount).

    intLength is 4 because it's 13 (total length of strSplit at this point) minus (8+1).

    strSplit is frob

    intDotCount is not -1, so the tortured logic in the while() continues (it's "not-the-case-that intDotCount is -1").

    Third pass: intDotCount is -1 (no "." left in frob). Does not execute the body of the inner if().

    Reminder: intDotIndex is still 8.

    intDotCount is -1, so the tortured logic in the while() ends (it's no longer "not-the-case-that intDotCount is -1").

    intDotIndex is 8, so the function returns "di-", the three characters starting at index 5 of "blah-di-blah.whatever.frob".

    So the filenames had absolutely for sure better be set up correctly as "someuseridthingwithoutdots.extension" or the function returns gibberish.

    Addendum 2024-06-12 14:35: and "returns gibberish" might include "throws an ArgumentOutOfRangeException".
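
    The walkthrough above can be replayed with a small sketch. This is a hypothetical reconstruction in Python, pieced together only from what the comments in this thread describe; it is not the article's actual C# code. The names get_user_id and substring are made up for illustration, and substring is a helper that mimics C#'s string.Substring by raising an error instead of wrapping negative indexes the way Python slicing would.

```python
def substring(text, start, length):
    """Mimic C# string.Substring(start, length): out-of-range arguments raise
    instead of being wrapped or clamped as Python slicing would do."""
    if start < 0 or length < 0 or start + length > len(text):
        raise ValueError(
            f"start={start}, length={length} out of range for a string of length {len(text)}"
        )
    return text[start:start + length]


def get_user_id(str_image_name):
    # Hypothetical reconstruction of the loop described in the comments.
    int_dot_index = 0  # starts at 0, never -1, so the "Error" branch is unreachable
    int_dot_count = 0
    str_split = str_image_name
    while not (int_dot_count == -1):
        int_dot_count = str_split.find(".")  # -1 once no dot is left
        if int_dot_count != -1:
            int_dot_index = int_dot_count
            int_length = len(str_split) - (int_dot_count + 1)
            # Keep only what follows the dot and search again.
            str_split = substring(str_split, int_dot_count + 1, int_length)
    if int_dot_index != -1:
        # The index was measured in the last remainder that still held a dot,
        # but it is applied to the full, original name.
        return substring(str_image_name, int_dot_index - 3, 3)
    return "Error"
```

    Under these assumptions, get_user_id("blah-di-blah.whatever.frob") returns "di-", get_user_id("myuserid.jpg") returns "rid", and get_user_id("my.file.png") returns "y.f". A name with no dot at all, or with fewer than three characters in its second-to-last segment, raises instead of returning "Error".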

  • (nodebb)

    The return "Error" (which seems to be the usual result) is the icing on the cake. I think I will use that as a file extension in the first project where I need one.

  • (nodebb)

    I think most of these comments are off target. As I see it, the inputs are <anything><userid>.<extension>, it's supposed to return <userid>. It's building up the <anything> string but this is probably legacy and no longer used. And it was supposed to correctly handle periods in the <anything> part, but this routine barfs on it.

  • the cow (not the robot) (unregistered)

    Well, I'm not an expert in "#", but in C (both with and without "++") it's quite trivial to scan a string and find the last '.' that is after the last '\' or ':'...

  • (nodebb)

    It only needs two lines:

    string fileName = Path.GetFileNameWithoutExtension(strImageName);
    return fileName.Substring(Math.Max(0, fileName.Length - 3));
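
    For comparison, the same idea can be sketched in Python. This is only an illustration: last_three_of_stem is a made-up name, and PureWindowsPath is chosen on the assumption that the file names follow Windows path conventions. It drops any directory part and the final extension, then keeps at most the last three characters of what remains.

```python
from pathlib import PureWindowsPath


def last_three_of_stem(image_name):
    # stem is the final path component with its last extension removed.
    stem = PureWindowsPath(image_name).stem
    # Slicing clamps to the start of the string, so a stem shorter than
    # three characters is returned whole rather than raising an error.
    return stem[-3:]
```

    With this sketch, "myuserid.jpg" gives "rid" and "my.file.png" gives "ile", so extra dots earlier in the name no longer shift the result.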

  • (nodebb) in reply to cellocgw and I am logged... or not

    Apple had it right way back in the 80s and 90s, when Macintosh computers had that separate ".name" file which controlled what apps would open the file.

    Can you expand on that? Where would that .name file be, and would it be a system-wide setting or a per-file setting? If it is per-file, then that's actually pretty awful, because it means that the user can't change default programs for file types.

    I honestly really don't see the issue with file extensions. It is an easy way to decide what program to open something with. It also makes it easier to search for file types than having to parse the contents of a file. For an average user it is also likely easier to understand that a file ending with ".docx" will be opened by Word than to start talking about MIME types.

    It also makes operations like "list all files in this directory that are images" cheap. Without the file extensions, you'd need programs to maintain caches of detected file types.

    Streamed file systems (like supported by Google Drive, Onedrive, Dropbox) would suddenly be faced with programs fetching and reading files just in order to list them, instead of having to access only metadata. If those services would provide an API to get the file type, the programs would then have to explicitly support that.

  • (nodebb) in reply to the cow (not the robot)

    Well, I'm not an expert in "#", but in C (both with and without "++") it's quite trivial to scan a string and find the last '.' that is after the last '\' or ':'...

    In C++17 (or later), the supposed job of this function is even simpler. Stuff the input string into a std::filesystem::path and call the stem() method, then convert the return to a std::string and keep the last three characters, in much the same way that @prueg described for C#, but using C++ objects.

  • Gavin (unregistered)

    This code is a thing of beauty, it is poetry, it is a display of needless and verbose virtuosity.

  • Craig (unregistered)

    I have a suspicion given the rampant Hungarian-esque warts that this is intended to be a straight port of VB code. Expectation of -1 for not finding a dot would be consistent with InStr.

  • Craig (unregistered) in reply to Craig

    Oops, forget that, wrong, InStr also returns zero for not found.
