• 🤷 (unregistered)

    I too am happy that pneumonoultramicroscopicsilicovolcanoconiosis is on the _uninflectiveWordList, which, despite its variable name, is an array and not a list. Oh well.

  • Ruts (unregistered) in reply to MiserableOldGit

    Why and who says so? I'd always say mice and I never heard "mouses" anywhere in my travels.

    Is it not 'meeces', in Tom & Jerry? ;-)

  • Richard (unregistered)

    This code should never have been written, clearly.

    It certainly wasn't.

  • Loko8765 (unregistered)

    Everything is fixed in _knownSingluarWords but no one uses it because they cannot spell it.

  • (nodebb) in reply to Scott

    "Attorney General" is already a plural, of "Attorney Specific".

  • Wyrm (unregistered) in reply to Sole Purpose of VIsit

    Going by their _irregularPluralsDictionary list, it's "octopuses". :D I'm more curious about "pneumonoultramicroscopicsilicovolcanoconiosis" and "----".

  • nasch (unregistered) in reply to Sole Purpose of VIsit

    The Audubon Society considers the plural of titmouse to be titmice: "in recent decades, Tufted Titmice have been steadily pushing north."

  • 🤷 (unregistered) in reply to Wyrm

    The "----" has a link behind it: https://thedailywtf.com/articles/The-Clbuttic-Mistake-

    So, I buttume it has something to do with donkeys.

  • (nodebb)

    I've actually used this dotnet service to sweeten up end-user diagnostic messages like "your CSV file contains 1 person without a last name." My users appreciate the readable messages. It's obviously implemented in a way that incurs scorn from linguists. But so what? It does a useful little job.
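    For anyone who wants the same effect without the .NET service, here is a minimal sketch in Python. The function names are hypothetical and the irregular plural is supplied by the caller, so this is only the "1 person / 3 people" idea, not a port of Microsoft's code.

```python
def count_phrase(count, singular, plural=None):
    """Return a count followed by the matching noun form, e.g. '1 person'.

    `plural` is supplied by the caller for irregular nouns. Without it this
    sketch just appends 's', which is wrong for many English words.
    """
    if plural is None:
        plural = singular + "s"
    noun = singular if count == 1 else plural
    return f"{count} {noun}"


def missing_last_name_message(count):
    # Hypothetical diagnostic text, modelled on the message quoted above.
    people = count_phrase(count, "person", "people")
    return f"your CSV file contains {people} without a last name."
```

    Passing the plural explicitly sidesteps the whole argument about rules, at the cost of the caller having to know the word.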

  • (nodebb) in reply to Steve_The_Cynic
    I think I'll reserve final judgement on this atrocity until I see what it does with the proper noun Lego, then it may be time for the big red button.
    According to the company itself, the plural should be "Lego bricks", but quite frankly that's a long-lost cause.

    You're both wrong: "Lego" is an adjective. It does not have a "plural" form. "Lego Corporation," or "Lego brick(s)", etc.

  • WTFguy (unregistered)

    Naomi & skington found the same info by the same author; one's the explanatory paper, the other is the actual code. Which actually seems like a pretty smart 99%-ish solution to the problem. It's certainly more thought-out and researched than the typical harried WTF dev's roll-your-own solution would be.

    It looks to me like this .Net implementation is a simple port/copy of that work. Without attribution. Oops. Microsoft Legal might want to know about this.

    Of course it was done by somebody who's not a native English speaker who has little intuitive sense of English spelling and hence can't tell an unusual case from a typo. Why would anyone assign a highly idiosyncratic domain-knowledge-heavy task like this to one of thousands of readily available natural SMEs when they can save a buck or two by choosing someone manifestly underqualified for the job?

    The original article and Perl code is by an Australian CS prof. As such, some of his choices reflect high-register Commonwealth/British English over the more colloquial 21st-century US English that most WTFers speak.

    Other than blatant plagiarism and typical ESL typos and grammar faults there's not much to complain about here. Oh yeah, and the Capitalize() method that's missing a ! and is hence a no-op.

  • medievalist (unregistered) in reply to Sole Purpose of VIsit

    "octopus" is in there, it pluralizes to "octopuses". But of course, octopus is one of a great many English words that has multiple acceptable plurals. Cow is probably the champion, what with "cattle" and "kine" and "neats" and so forth....

  • Some Ed (unregistered)

    I only figured out how hard pluralization is to program after I tried it. It's not an intractable problem on its face. You need to really get into the bowels of it before you find the real issue.

    Languages have a lot of special rules. Plurals like Attorneys General can be difficult for people, but depending on the level at which the program is determining what a word is, it could be trivial, and in fact the hard plurals turn out to be things like expert guards, which is very different semantically from experts guard. But that having been said, it's just additional cases to handle.

    This certainly isn't going to be concise code, because it's not a concise problem. A correct algorithm is going to go on and on. Even if you handle each special case in its own subroutine, the code for each special case is going to go on for pages.

    That's not the problem. It doesn't make it impossible to code.

    This is why pluralization is impossible to code: https://thedailywtf.com/articles/comments/microsoft-s-english-pluralization-service#comment-511272

    Note, it doesn't matter whether Just Me was right or wrong. What matters is there's disagreement. If we can't nail down the requirements of the program, we cannot make the program work in a way that everyone will agree with. Full stop, end of story.

    I personally do not take any issue with this having been written. It's been written many times; I have, as I mentioned, done something like it myself. The issue I have with Microsoft's solution is that they assume that theirs is right, and they're so convinced of it that they have provided no ability to fix it.

    It's kind of like their grammar checker, which feels it needs to point out every time I use either its or it's and assert that I could be confused. Number one, I'm not. Number two, pointing out every time and always underlining it would not be helpful even if I was confused. This is not my only issue with their grammar checker, mind you, it's just a poster child that readily comes to mind.
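    On the point above about there being no ability to fix the answers: a minimal sketch, in Python, of what a correctable pluralizer could look like. The class, its method names, and the seed entries are assumptions made for illustration; none of it is taken from Microsoft's implementation, and the three suffix rules are nowhere near complete.

```python
import re


class Pluralizer:
    """Tiny rule-based pluralizer whose irregular table callers can edit."""

    def __init__(self, irregulars=None):
        # Seed entries are illustrative only, not Microsoft's actual tables.
        self.irregulars = {"mouse": "mice", "octopus": "octopuses"}
        if irregulars:
            self.irregulars.update(irregulars)

    def override(self, singular, plural):
        # Lets a caller who disagrees with an answer replace it.
        self.irregulars[singular.lower()] = plural

    def pluralize(self, word):
        key = word.lower()
        if key in self.irregulars:
            return self.irregulars[key]
        if re.search(r"(s|x|z|ch|sh)$", key):
            return word + "es"          # box -> boxes
        if re.search(r"[^aeiou]y$", key):
            return word[:-1] + "ies"    # city -> cities
        return word + "s"               # attorney -> attorneys
```

    With something like `override("octopus", "octopodes")` the disagreement stops being a bug report and becomes the caller's configuration, which is about as close to "requirements nailed down" as this problem gets.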

  • Some Ed (unregistered) in reply to Shadow

    You know it's impossible. Everybody here knows it's impossible. Including Alex.

    That doesn't matter. Saying that their code is WTF free is fundamentally part of one of the terms in his job description as CEO.

  • MiserableOldGit (unregistered) in reply to cellocgw
    You're both wrong: "Lego" is an adjective. It does not have a "plural" form. "Lego Corporation," or "Lego brick(s)", etc.

    If Lego is not a proper noun then such a thing does not exist. It is also a mass noun when referring to a pile of the stuff, and that does not stop it being an adjective when used to refer to bricks or wheels or sets or fans etc. The Lego Group never, ever called itself the "Lego Corporation", although over the years it has often just referred to itself as "Lego".

    If you are attempting to troll you need to take a side on whether sticking an s on the word is a punishable crime or not. And then we can cover the academic subject involving numeracy and whether or not that has an s on the end.

  • PageTurner (unregistered)

    There's a DSL for word stemming, Snowball: https://snowballstem.org/

    Stemming algorithms for several languages in Snowball are here: https://snowballstem.org/algorithms/

    It's used by digital library software (like Logos: https://www.logos.com/).

  • (nodebb) in reply to Ruts

    Is it not 'meeces', in Tom & Jerry? ;-)

    In Pixie and Dixie, surely? As spoken by Mr. Jinks: "I hate those meeces to pieces!"

  • Jeff (unregistered) in reply to Shadow

    Did you miss the "that our that our" in the sentence? I was trying to figure out if it was intentional or not.

  • Sigako (unregistered)

    The rumour is Microsoft employs lots of Chinese and Indian migrants that write code for food, and most of them know English barely enough to kiss some higher-up's butts. This explains both "textbook as a source" and mistakes in actual English.

  • <string> admirerer (unregistered)

    I know this is old, but you have unescaped html, like many other articles, that hides the <string> generics, since my browser thinks it's an html tag. Many other articles with generics have the same problem.
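    For illustration, a minimal sketch of the escaping step in Python, using the standard library's `html.escape`. The helper name is hypothetical, and how the site actually renders its articles is not something this thread tells us.

```python
import html


def escape_snippet(code):
    # quote=False leaves quotation marks alone; only &, < and > are replaced,
    # so the browser shows the generic instead of parsing it as a tag.
    return html.escape(code, quote=False)
```

    Run over a snippet such as `Dictionary<string, string>`, this yields `Dictionary&lt;string, string&gt;`, which the browser displays as text.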
