Admin
Koremachiko !
Admin
HA HA HA duis is tsrif
Admin
He shoved the paper at Brian, who took it apprehensively.
"fukushita", "kakashite", "fukumihado", "diefatsu", "tokaduki", and "fukusuka", he read, collapsing inwardly and visibly shaking as he read down the list.
"They dropped our contract!" Brian shrieked, "Half our revenue is gone! You've killed our company!"
Brian protested that dropping the bad-word filter and using Japanese were both Barry's idea...
TRWTF is that poor old Barry can't get a word in edgeways
Admin
Actually, I think I remember where I saw this one before.
The sidebar.
Admin
Many prior grammatical indiscretions were forgiven after noticing the phrase "pored over" in this WTF rather than the swiftly-becoming-standard "poured over". Throw in a "bated breath" and I'll stop bitching about missing words.
Of course it's classic! Can't you see it's set in 1999?
(third attempt)
Admin
TRWTF is the client that can't handle pseudo-obscene IDs. I suppose their workforce consists entirely of 6-year-old children who must be shielded from such naughty language.
Admin
Not one of his cited words sounds remotely rude if you pronounce them correctly. Not only was the boss a prick, he was one of the ignorant variety.
Captcha: inhibeo
Admin
TRWTF is that in any real system with > 100 IDs, the limitations of what character strings you could generate would mean that each 'id' would be about a paragraph long. IOW, this story was made up.
Admin
Yeah, but the boss was a VP of Marketing--what the heck do you expect?
Admin
TRWTF is that Brian typed in all the pages instead of just scanning and OCR.
Admin
They're made up words. There is no "correctly".
Admin
The main point here is not that any of the words is rude, nor that any of them sounds rude when said by a native speaker, but that letting non-speakers of a language loose with its words is a hazard.
Making the IDs more like account numbers might have been a good idea, perhaps dropping the hex encoding and shoving a few hyphens in to break them up.
Admin
As soon as I read this in the Sidebar I knew it would be front-page material. Some people think it has appeared here before but are you guys sure you didn't just read it when it was posted in the forums? Anyway, thanks to the submitter, it's a beauty!
Admin
How do you figure that?
Even if you were to limit to the five short vowels and seven least irregular Japanese consonants, that's 35 distinct values that can be represented by a single syllable.
Two syllables = 35^2 = 1,225 values
Three syllables = 35^3 = 42,875 values
If we assume a person could remember an ID of eight syllables, that allows for more than 2.2 trillion distinct values (35^8 is about 2.25 × 10^12).
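The counting argument above can be checked with a short sketch. The specific vowels and consonants here are assumptions for illustration (the comment does not name them), and `capacity`, `encode`, and `decode` are hypothetical helper names. The idea is simply to treat an ID as a base-35 number with one syllable per digit.

```python
# Hypothetical phoneme choices; the comment does not say which consonants to use.
VOWELS = "aiueo"
CONSONANTS = "kstnhmr"
SYLLABLES = [c + v for c in CONSONANTS for v in VOWELS]  # 7 * 5 = 35 syllables


def capacity(n_syllables: int) -> int:
    """Number of distinct IDs expressible with exactly n_syllables syllables."""
    return len(SYLLABLES) ** n_syllables


def encode(value: int, n_syllables: int) -> str:
    """Write value in base 35, one syllable per digit, most significant first."""
    if not 0 <= value < capacity(n_syllables):
        raise ValueError("value does not fit in the requested number of syllables")
    digits = []
    for _ in range(n_syllables):
        value, digit = divmod(value, len(SYLLABLES))
        digits.append(SYLLABLES[digit])
    return "".join(reversed(digits))


def decode(word: str) -> int:
    """Inverse of encode: read two-letter syllables back into an integer."""
    value = 0
    for i in range(0, len(word), 2):
        value = value * len(SYLLABLES) + SYLLABLES.index(word[i:i + 2])
    return value
```

Because the mapping is a plain positional encoding, every ID is reversible, but nothing here screens out unfortunate letter combinations.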
Admin
Did anyone else notice how Barry and Brian switched places in the story a few times?
Admin
You can't divine "phonemic combinations" from textual analysis. Least of all in English, that would clearly be ridiculous. Noobs.
Admin
It's based on Japanese phonetics. There is exactly one correct way to pronounce each word generated. It's only ambiguous when you try to pronounce it using an English phonetic system.
Admin
"In a language or dialect, a phoneme is the smallest segmental unit of sound employed to form meaningful contrasts between utterances." - http://en.wikipedia.org/wiki/Phoneme.
Now... do YOU want to try that again, or should you maybe stay quiet and let the grownups talk?
Admin
A much better idea would've been to just randomly string together plain English words from a sanitized list. Easier to remember, easier to pronounce, no risk of business-destroying garglepussy. Also, two days to write a Markov chain random text generation program? It shouldn't even take two hours. And while I'm nitpicking, Japanese learning books that use romaji are a blight upon humanity, and even losing his job is not too great a punishment for one who has sinned so.
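A minimal sketch of the word-list idea, assuming a tiny hand-vetted vocabulary. `SAFE_WORDS` and `word_id` are hypothetical names, and a real list would need to be much larger and reviewed by someone who knows the customer's languages.

```python
import secrets

# Hypothetical pre-screened vocabulary, for illustration only.
SAFE_WORDS = ["apple", "bridge", "candle", "garden", "harbor", "lantern", "meadow", "river"]


def word_id(n_words: int = 3, words=SAFE_WORDS) -> str:
    """Pick n_words at random from the vetted list and join them with hyphens."""
    return "-".join(secrets.choice(words) for _ in range(n_words))
```

Random picks can repeat, so a caller would still have to check each new ID against the ones already issued.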
Admin
Is there any particular reason they didn't just simplify their existing IDs? Switch from hex to decimal, throw in a few dashes, and you should wind up with something fairly easy to pronounce:
1027-4002-9530-3064
TEN TWENTY SEVEN! FOUR-THOUSAND TWO! NINETY-FIVE THIRTY! THIRTY SIXTY-FOUR!
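Roughly, assuming the existing IDs are plain hex strings (the function name `spoken_id` and the group size of four are illustrative choices, not anything from the story):

```python
def spoken_id(hex_id: str, group: int = 4) -> str:
    """Convert a hex ID to decimal and break it into dash-separated groups."""
    digits = str(int(hex_id, 16))
    # Left-pad with zeros so the length is a multiple of the group size.
    padded = digits.zfill(-(-len(digits) // group) * group)
    return "-".join(padded[i:i + group] for i in range(0, len(padded), group))
```

Stripping the dashes and parsing the result as a decimal integer gives back the original value, so nothing about the underlying keys has to change.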
OR, a real salesperson would offer to build this customer a brand spanking new system (for a mere million dollars) where they wouldn't need to speak these IDs aloud anymore.
Admin
Agreed. While reading the words in Japanese (no spaces injected, and they look Japanese), I found nothing offensive. Translating, of course, I found them humorously offensive.
Still I wonder why they are using a Markovian process to transform numbers into words.
TRWTF is, of course, the yelling out of IDs all day.
TRRWTF is developing to the requirement of people that use the "shouting" method of communicating CS data. If only there were some sort of instant computer-to-computer method of communication.
Admin
can anyone explain how "tokaduki" is rude ????
Admin
It's not very rude, but it reminds me of the word "toke", and you could have the client's employees thinking about drugs.
Admin
Toke A Dookie. Dookie being a slang term for that which occasionally drops out of your butt.
Admin
Clbuttic!
Right up there with "Never, ever leave the singer in charge of the mix" has got to be "Never let the marketers make technical decisions".
One time I was poring over Apache logs, idly wondering what it would be like to generate pseudorandom text from people's IP addresses, just so I'd have something phonetic to call them. Now I know!
Admin
that makes sense, although if their first thought is 'toking the duki' you probably want to get rid of the employee rather than the software.
anyone had a rude captcha yet?
Admin
So you americans are so "political correct" that a string like "fukumashita" is an insult?
PS: Ironically, I wrote "political correct" to be political correct and not writing stupid... then I'm stupid... well, at least I'm not 250 million stupid.
Admin
I suppose none of them are Cubs fans. Fukudome? What? Is that the F-You Dome? Why are they so offensive?
Come to think of it ... couldn't some of these wtf captchas be considered questionable?
Admin
The solution is even easier than that. Consider:
So combine them.
CAMPAIGN-RANDOM_CRAP
In their accounting, tell them to drop everything after the dash. In their software, use the full string.
Admin
There's no need to make fun of other people's comments like that. I'm sure they're doing the best they can.
Admin
'di' and 'du' are not very japanesey phonemes.
He shouldn't have agreed to the idea in the first place -- why would you take a simple problem about synthetic keys and introduce natural languages and other extreme complexities?
Admin
Your math sucks dude.
Admin
Software engineering rule #1: The customer doesn't know what they want.
Classic example of a perceived problem causing more artificial ones. Also goalposts were moved (probably contrived for effect, just to make giggle words in Japanese).
Customer did not have a real problem (too many to do manually, not enough to automate). If they were going to pull out because of it, I would have to think there were other problems.
Definitely solution fixation. Why multiple-syllable words? Why couldn't the dictionary be updated to remove the giggle words (Gutenberg is PD, right?)? Where's the sales guy to explain how such words could improve morale? What happens when the client starts putting context on the nonsense words (you could use some pusdiction, or most words + knob)?
Change the solution, and those artificial problems go away.
Admin
I don't buy it.
The first thing most coders would think of is to drop inappropriate words.
I can't imagine a customer being happy having to pronounce nonsense words.
I can't imagine a customer dropping you because of something so minor and easily fixable.
Sorry, too many improbabilities. Not buying it.
Admin
TRWTF is these guys were still making code changes days before the release.
Are you telling me a 9 month project can be regression tested in a day or two?
Admin
Well, either you are very lucky or you are living under a rock. You wouldn't believe how stupid and overreactive people can be... Yes, there are even people who would drop a contract because of some insignificant stuff...
Admin
And miss the chance of playing with Markov Chains??? C'mon, how could anyone pass that!!!??!!??!!
Admin
Oh man, I thought we voted in Pres. Obama so that everyone would like America again.
Admin
Why not just give the client a link to an IM client? Problem solved without any changes at all
Admin
Wait... what?
Admin
私は分かりません。 ("I don't understand.")
Admin
It is in Nihon-shiki romanization, where ぢ and づ are romanized just like that.
Hepburn is a bit more common, though (at least in my experience), presumably because the romanization matches the pronunciation better. In that case, you'd have romanized those characters as ji and zu. However, that means there's no one-to-one mapping between romaji and characters, since ji and zu are also used for (the more common) じ and ず.
The real question is perhaps how we have both "shi" and "di" in there, since Nihon-shiki doesn't use shi, but si. (You'd once again need Hepburn for shi.) That suggests a mostly ad-hoc scheme intending to match pronunciation fairly closely, while still maintaining a one-to-one mapping (but I believe dzi and dzu are more common romanizations in that case).
Admin
As usual, the simplest solution would be abhorred by most engineers
Admin
Is it possible to replace the present captcha with this generator?
Admin
Oh bugger - I remember when I first wrote an automated username generator - no forbidden characters like 0O1lI, and only pronounceable syllables..
Luckily, the customer discovered the first "interesting" words during testing, before we went live.
We quickly added a patch, where we generated a bunch of random names in a database table and manually filtered the offensive ones - the server picked the top one, and only generated a new one if there were no screened ones in the database table.. :D
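For the curious, the patch amounted to something like this sketch. It is a reconstruction under assumptions, not the original code: an in-memory list stands in for the database table, and the names `candidate` and `ScreenedPool` and the letter sets are made up for illustration.

```python
import random

# Characters that are easy to confuse on screen are filtered out.
AMBIGUOUS = set("0O1lI")
CONSONANTS = [c for c in "bdfgklmnprstvz" if c not in AMBIGUOUS]
VOWELS = "aeiou"


def candidate(rng: random.Random, syllables: int = 3) -> str:
    """Generate one pronounceable candidate from consonant-vowel syllables."""
    return "".join(rng.choice(CONSONANTS) + rng.choice(VOWELS) for _ in range(syllables))


class ScreenedPool:
    """Hands out human-approved names first; generates only when none are left."""

    def __init__(self, rng: random.Random):
        self.rng = rng
        self.pending = []   # generated, awaiting manual review
        self.approved = []  # reviewed and judged inoffensive

    def refill(self, count: int) -> None:
        """Generate a batch of names for a human to look over."""
        self.pending.extend(candidate(self.rng) for _ in range(count))

    def review(self, is_acceptable) -> None:
        """Keep each pending name the reviewer accepts and discard the rest."""
        self.approved.extend(name for name in self.pending if is_acceptable(name))
        self.pending.clear()

    def take(self) -> str:
        """Return the oldest approved name, or an unscreened one if the pool is empty."""
        if self.approved:
            return self.approved.pop(0)
        return candidate(self.rng)
```

The fallback in `take` still hands out an unscreened name when the pool runs dry, so keeping the pool topped up is what actually provides the protection.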
Admin