Admin
I hate anything but Nihonsiki. For the purpose of spelling and grammar, anything else makes no sense. The words bear only a rudimentary resemblance to what they should be. E.g. the conjunctive form in Hepburn/Kunrei transforms
ku -> ki
su -> shi
tsu -> chi
nu -> ni
fu -> hi
(etc.)... There's no pattern! The same conversion in Nihonsiki:
ku -> ki
su -> si
tu -> ti
nu -> ni
hu -> hi
That's a lot easier to work with because it all follows the same pattern.
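To make the point concrete, here is a minimal Python sketch of the two conversions, using only the syllable pairs listed above. The function names and the exception table are illustrative assumptions, not part of any romanization library.

```python
# Syllable pairs taken from the comment above; everything else is hypothetical.
HEPBURN_EXCEPTIONS = {"su": "shi", "tsu": "chi", "fu": "hi"}


def i_row_nihonsiki(syllable: str) -> str:
    """One rule covers every case: swap the final 'u' for 'i'."""
    return syllable[:-1] + "i"


def i_row_hepburn(syllable: str) -> str:
    """Same rule, but some syllables need a lookup table first."""
    return HEPBURN_EXCEPTIONS.get(syllable, syllable[:-1] + "i")


for s in ("ku", "su", "tu", "nu", "hu"):
    print(s, "->", i_row_nihonsiki(s))
for s in ("ku", "su", "tsu", "nu", "fu"):
    print(s, "->", i_row_hepburn(s))
```

The Nihonsiki version needs no table at all, which is the regularity being argued for.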
Admin
Come on, man. I have a hard time believing that this story is real.
Yeah, really. Half of the company's business depended on this silly stuff, and the biggest client swings from happy to dropping the contract overnight because of some trivial thing like this. Doesn't sound very realistic to me.
Admin
So, this is the genesis of the TDWTF captcha generator?
Admin
It can also be that there is more than one vendor in the competition and they're all about the same, give or take a few things, so the decision might be difficult (and include, again, company politics)... and all they need is one excuse to make the decision easy.
Admin
These are often expressed as "taboos"...or, if you wish, being "Politically Correct".
Racially homogeneous societies may not have PC(racial), but may have PC(class) or PC(caste) or PC(heredity) or PC(dialect) or PC(location) or PC(religion), etc.
Admin
It's a true story; there are some more details in the sidebar thread. The story has obviously been simplified; reality is always more complicated.
Answers to some questions which have popped up more than once:
- Markov generation was one of many options presented to the client. Others were:
  - Numeric (999-999-999)
  - CIA "codewords" with prefix+word a la HAVE BLUE or PAVE HAVOK
  - UK post code style AnAnAnAn "H4j3d2l5"
  - manually entered
  - import of their internal account IDs
  Each of these was first rejected for various reasons.
- Client was old-school: no Internet access for those workers involved.
- Nobody had to type in the IDs. My understanding was that someone on their staff maintained a dictionary file mapping these to their internal identifiers, portions of which were printed and distributed to those who needed to know. Not sure about all the details on their side.
- We used the romanization of the book used as a source for the text.
- It took two days because in addition to the corpus parser and word generator, there were design docs, DB changes to hold the mapping IDs, URL parameter format parsing, internal mapping between these IDs and actual entities, test cases, and lots of time searching for good text sources. We went through a lot of test runs to see what would come out.
- Much of the testing staff being foreign and time pressure being high, nobody noticed the fairly rare oddities.
- A bad-word filter was in fact implemented, and a few hundred curse words were checked in to version control. Once the target language was changed away from English, everybody thought it was unnecessary. There was actually a group of people scribbling away on a whiteboard, but nobody could agree how bad or how marginally offensive a word needed to be to be included.
- Internally, the client wasn't fully supportive of the project. In their company politics there were some who thought it was a mistake to spend much effort on the Internet until it was more well established, and were ready to pounce on any excuse to shut down this particular VP's project.
Captcha: damnum. Oh how I wish iPhone could take a safari screenshot.
Admin
I think this is a relatively small advantage. Yes, Nihon-shiki does illustrate patterns more easily, but you still have to abandon those patterns when you pronounce them aloud anyway. In other words, the irregularity doesn't go away, it just gets moved over into pronunciation instead of spelling. So now you still have a counterintuitive system, and one that people can't even pronounce correctly without a bit of mental training.
For the record, I learned using kana anyway, sidestepping the whole issue. If you know kana, then which romanization system you use when you do have to use romaji is surely a trivial matter, as any mind capable of learning the kana is capable of handling multiple romanization systems.
Admin
psst... that isn't the UK post code format.
United Kingdom post code format is: A(A)N(A/N)NAA (though most leave a space before the last 3 characters).
And if by UK you meant Ukraine, well their post code format is: NNNNN
I don't think your suggested format "AnAnAnAn" actually matches the post code of any nation, but a shorter version - "AnAnAn" - is the format for post codes in Canada.
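For anyone who wants to check codes against the three shapes mentioned here, a rough Python sketch follows. The patterns only encode the letter/digit shapes as written in this comment (A = letter, N = digit); they are simplifications and do not validate real postal areas. The names `UK_SHAPE`, `CANADA_SHAPE`, `PROPOSED_SHAPE`, and `classify` are made up for illustration.

```python
import re

# Shapes as described in the comment above, not full postal validation.
UK_SHAPE = re.compile(r"[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}")   # A(A)N(A/N) NAA
CANADA_SHAPE = re.compile(r"[A-Z][0-9][A-Z] ?[0-9][A-Z][0-9]")      # AnA nAn
PROPOSED_SHAPE = re.compile(r"(?:[A-Z][0-9]){4}")                   # AnAnAnAn


def classify(code: str) -> list[str]:
    """Return the names of every shape the code fully matches (case-insensitive)."""
    text = code.strip().upper()
    shapes = (("uk", UK_SHAPE), ("canada", CANADA_SHAPE), ("proposed", PROPOSED_SHAPE))
    return [name for name, pattern in shapes if pattern.fullmatch(text)]


for sample in ("SW1A 1AA", "K1A 0B1", "H4j3d2l5"):
    print(sample, classify(sample))
```

Under these patterns the sample ID "H4j3d2l5" matches only the proposed AnAnAnAn shape, not the UK or Canadian one.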
Admin
Fukumi. Very common girl's name. The second "u" is voiced, though, since the following consonant is voiced. (But it's often not voiced very much ... ;-) )
Fukuyu would not be a common name, but would still be entirely possible. I would not be surprised to be introduced to someone named Fukuyu.
Shiho is another matter altogether ...
Admin
Political correctness is substituting words with a euphemism or jargon. Like using 'physically challenged' instead of 'disabled'. So 'you are politically correct', instead of 'you are stupid'. Which is recursively challenged.
Admin
Ok.
"diefatsu"
I'd buy a car with that logo.
Admin
well... I stand corrected :)
(maybe I should just get back to... The Project...)
Admin
Can you at least grasp the concept that offensive words can only be offensive when Person A says them in order to insult Person B?
Here we have A reading computer-generated words to B in order to transfer information.
Or do you feel that Free Speech should bow to courtesy? How about a public reading of a novel with so-called strong language? Should the reader just skip the iffy words?
Admin
The problem of reading binary strings aloud was partially solved by 1995 with the 11-bit to English word S/KEY encoding specified in RFC1760 (http://tools.ietf.org/html/rfc1760).
It would remain to introduce some kind of checksum to ensure validity and possibly a length encoding for variable binary strings (cf. also Dan Bernstein's "netstrings" proposal).
It is still possible to produce offensive statements in RFC1760 encoding (e.g. "EVIL" "DAVE" "ATE" "GAY" "KNOB" is a valid string) but perhaps less likely, and the list could easily be sanitised further.
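As a rough illustration of the idea, here is a Python sketch that packs a 64-bit value plus a 2-bit checksum into six 11-bit indices and maps each index to a word. Everything here is an assumption for demonstration: the 2048-entry word list is a generated placeholder (it is not the RFC 1760 dictionary), and the bit ordering and checksum are a simple choice of mine rather than a claim about what the RFC specifies.

```python
from itertools import product

# Placeholder list of 16 * 4 * 8 * 4 = 2048 pronounceable tokens.
# NOT the RFC 1760 dictionary; a real deployment would use a curated list.
WORDS = ["".join(p) for p in product("bdfghjklmnprstvz", "aeio", "bdgklmnr", "aeio")]
INDEX = {word: i for i, word in enumerate(WORDS)}


def checksum2(value: int) -> int:
    """Sum of all 2-bit pairs of the value, reduced to 2 bits."""
    total = 0
    while value:
        total += value & 0b11
        value >>= 2
    return total & 0b11


def encode(value: int) -> list[str]:
    """Encode a 64-bit integer as six words (64 data bits + 2 checksum bits)."""
    if not 0 <= value < 2**64:
        raise ValueError("value must fit in 64 bits")
    packed = (value << 2) | checksum2(value)
    return [WORDS[(packed >> shift) & 0x7FF] for shift in range(55, -1, -11)]


def decode(words: list[str]) -> int:
    """Reverse encode(); raises ValueError on a bad word count or checksum."""
    if len(words) != 6:
        raise ValueError("expected six words")
    packed = 0
    for word in words:
        packed = (packed << 11) | INDEX[word.lower()]
    value = packed >> 2
    if checksum2(value) != packed & 0b11:
        raise ValueError("checksum mismatch")
    return value


print(" ".join(encode(0x0123456789ABCDEF)))
```

A two-bit checksum only catches some transcription errors; a longer check value, or the length prefix suggested above, would need more words per ID.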
Admin
Actually, there is a specific and set way to pronounce words in Japanese. Even though the words shown are "made up", any Japanese speaker would pronounce them exactly the same way. (Presumably this is one of the reasons why they settled on it as a source language for the generator: Japanese pronunciation is both regular and easy for western speakers.)
For example, one of the words given, "fukushita", would be pronounced "foo-koo-she-tah" (and actually the word "fukushita" is the past tense form of the verb "fuku suru" meaning "to return to normal" and would be written 復した in Japanese). Another example would be the cited "kakashite", which would be pronounced "kah-kah-she-teh" (and now that I think of it, that is actually a conjugated version of a real verb in Japanese, 欠かす, meaning "to miss"). Anyway, I think that's what Incourced was getting at when s/he was talking about "pronouncing them correctly."
Admin
Fukushita ??????
FUKUSHIMA !!!!!!
Admin
I call bull-ファッキングシット.
You can't spell a single past tense sentence in Roman characters in Japanese without the substring "shit". Which means there's no way this happened.