It was 1999, and Brian's company's new online marketing venture was finally off the ground and making a profit using an off-the-shelf conglomeration of bits and pieces of various content management, affiliate program, and ad servers. Brian's team had hit all of the goals for the first funding tranche, and the next step was to use those millions of dollars to grow the staff from twelve to fifty, half of whom would be software developers working directly for Brian.
The project was an $8 million, nine-month development effort to build, from the ground up, the best 21st-century marketing/e-commerce/community/ad network/reporting system mousetrap possible. Leading a team of twenty people was a big step up for Brian, so he buckled down, read management theory books, re-read The Mythical Man-Month, learned the ins and outs of project management software, invested in UML and process training, and carefully pored over resumes to find the best candidates.
Once Brian had assembled, trained, and indoctrinated his team in best practices, they went to work. They held stakeholder interviews, pored over requirements, developed use-case models, charted process flow, designed domain entity models, and built their development plan. They developed cleanly separated business logic, persistence, and user experience tiers. They followed formal test-driven development. They held weekly group code reviews. They did just about everything right, so much so that Brian had never before been on a project that ran so smoothly.
Until the last week before release.
One of the key components was the custom-built ad server. It used a single unique ID for each ad placement and handled the switching of creatives on the backend. At the time, that was still uncommon, with many affiliate programs and ad servers hardcoding the creative images directly into the HTML ad serving code. Being able to manage updates of creatives, optimize banner rotation, and take down unwanted ads on the back end was one of the major advantages as seen by the business users.
"Hey Brian, quick question." It was Barry, VP of Marketing. The entire company was his brainchild, and he was CEO in all but name. "Some clients are complaining that these IDs are kind of ugly, always 'F0DB57A3C10EE7D28277' or some other unpronounceable jumble."
Now, the IDs were uniquely generated each time a new ad placement was created. With dozens of clients and hundreds of affiliates signed up, representing thousands of web sites and tens of thousands of pages, they needed to generate them automatically. Also, to avoid possible fraud, the IDs needed to be non-sequential but unique. Plus, they were used purely on the backend; nobody needed to pronounce them.
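For the curious, the requirements above (automatic, unique, non-sequential) can be met in a few lines. This is only a minimal sketch in modern Python, not the team's 1999 code: the function name `new_placement_id`, the in-memory set of existing IDs, and the 10-byte length (chosen to match the 20 hex characters in Barry's example) are all assumptions made for illustration.

```python
import secrets


def new_placement_id(existing_ids, num_bytes=10):
    """Return a random hex ID that is not already in existing_ids.

    Hypothetical helper: random bytes keep the IDs non-sequential, and
    the membership check against the set keeps them unique.
    """
    while True:
        # 10 random bytes -> 20 uppercase hex characters.
        candidate = secrets.token_hex(num_bytes).upper()
        if candidate not in existing_ids:
            existing_ids.add(candidate)
            return candidate


issued = set()
print(new_placement_id(issued))
```

A real system would presumably enforce uniqueness in the database rather than in memory, but the idea is the same.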
"These are only shown to users as URL parameters, no different than the session ID," Brian protested. More diplomatically, he asked, "What's the reason for making them pronounceable?"
"Well," Barry admitted, "one of our biggest advertisers has this legacy accounting system which their IS department won't integrate with our online reports. When talking by phone with people in different offices, they have to read the IDs to each other to be able to identify which accounts they are talking about."
After thinking a moment, Brian realized that this was the perfect place to apply an algorithm he had learned about recently. "Markov chains!" he blurted. "We can use statistical textual analysis to generate random words built up from natural phonemic combinations. They won't be real words, but they will match expected English patterns, and people will be able to pronounce and read them completely naturally."
Intrigued and satisfied, Barry assented, reminding him that the release at the end of the week still had to be met.
The deadline wasn't an issue. Brian was already off, thinking through the design in his head. Over the next two days he developed a corpus analyzer to build the necessary statistical models, and created a generator to randomly string them together and output the "pronounceable IDs". It was a great success. Everyone crowded around to see the server spitting out fake words like "enspattle", "flargleblum", "unclorifical", and "macrodestic".
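The story doesn't show Brian's code, but the technique he describes is easy to sketch. The following is a minimal letter-level Markov chain in Python, offered as an illustration under stated assumptions rather than a reconstruction: the function names, the two-letter context (`order=2`), the `^` and `$` boundary markers, and the length limits are all hypothetical choices. `build_model` plays the role of the corpus analyzer, and `generate_word` plays the role of the generator.

```python
import random

START, END = "^", "$"  # assumed markers for word start and word end


def build_model(corpus, order=2):
    """Map each `order`-letter context to the letters seen right after it."""
    model = {}
    for raw in corpus.lower().split():
        word = "".join(ch for ch in raw if ch.isalpha())
        if not word:
            continue
        padded = START * order + word + END
        for i in range(len(padded) - order):
            context = padded[i:i + order]
            model.setdefault(context, []).append(padded[i + order])
    return model


def generate_word(model, order=2, min_len=6, max_len=12,
                  rng=random, max_tries=1000):
    """Random-walk the model; retry until a word of acceptable length ends.

    `order` must match the value used in build_model.
    """
    if START * order not in model:
        raise ValueError("model is empty; analyze a corpus first")
    for _ in range(max_tries):
        context, letters, ended = START * order, [], False
        while len(letters) <= max_len:
            # Repeated letters in the list make common followers likelier.
            nxt = rng.choice(model[context])
            if nxt == END:
                ended = True
                break
            letters.append(nxt)
            context = context[1:] + nxt
        if ended and min_len <= len(letters) <= max_len:
            return "".join(letters)
    raise ValueError("no word of the requested length was generated")


sample = ("marketing placement generator advertising affiliate "
          "creative rotation reporting commerce community")
print(generate_word(build_model(sample)))
```

Nothing in the model knows what any letter sequence means; it only knows which letters tend to follow which. Swapping in a different corpus changes the flavor of the output and nothing else.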
Barry was ecstatic, too. "This is great! That client has been threatening to drop us because reading off those codes is slowing down their operations. They're our most well-known anchor client, so if they go, others will drop with them. This is exactly what we need to keep them. Let's demo it right away."
The two of them drove up to the client's offices in the city, and Barry proudly told them that he had brought along the genius who came up with a way to make readable codes and increase their workers' productivity. Barry opened up a browser to the demo word generator page and clicked "New random word".
"garglepussy" immediately popped up on the screen.
After a silent five seconds while the client stared in horror, Barry said, "Well, it is random after all. Brian, you can filter that, right?"
"Sure, I'll put a bad-word list together," Brian said, groaning that he hadn't thought of it before. They ran through it a few more times, getting nice normal-sounding words like "blutterful", "trimbolid", and "anavastic", and finally left the client happy and contented in their offices.
During the drive back, Barry said, "I've been thinking about it and it's too dangerous to just have a bad-word filter. We'll never be able to think up every possible offensive-sounding combination. Can you make them sound like a foreign language instead of English?"
It was a good idea — he already had a corpus analyzer and could plug in just about any text he felt like. "Sure, I'll show you some samples this afternoon," Brian told him.
Brian spent the next few hours running the corpus analyzer on just about any foreign text he could find. He plugged in "Lorem ipsum dolor" to get some fake Latin, the German libretto of Die Zauberflöte, some Balzac novels cribbed off of the French Gutenberg Project, the text from some Italian airplane manufacturer's web site, and Don Quijote.
Barry stopped by and they tried the samples out, one by one.
Latin: "Everybody pronounces it wrong, differently."
Hindi: "Too many weird vowels, and it makes me want to slip into an Apu accent."
German: "All those consonants and throaty sounds are too hard."
French: "Are you kidding? Most of the letters at the end of words are silent."
Italian: "Better, but it took me two years to learn to say 'gnocchi' right."
Spanish: "Easy vowels, simple sounds, best yet! But some of their staff are Hispanic."
They sat there for a few minutes trying to think of something else when Barry cried out, "I've got it!" and ran out of the room. He came back with a Japanese study book. "I'm planning to expand overseas after this release, and bought this to study since everything is written in the English alphabet. It's perfect: there are simple vowels, only a few consonants, and no funny sounds to trip you up. Even if people pronounce it a little differently, it's still easy to figure out. And nobody knows what it means, so it can't be offensive!"
The next day, one day before release, Brian had finished typing in page after page of meaningless Japanese and they were off once more to demo to the client. Barry carefully clicked "New random word."
"koremachiko", "sabashimasu", "tobetokaga", "mitsukaremo". The client carefully read each example, and after a few minutes leaned back, chuckling, "That's perfect, Barry. Anyone can read these, and no chance of offense. Thanks for doing this for us, we're onboard for the launch."
Barry was ecstatic.
By the next morning, everyone in the office was in a great mood, too. The launch had gone smoothly, and their realtime reporting showed website activity, ad serving, and commerce transactions all ramping up. Everything was working perfectly.
Everyone was in the middle of a celebratory all-hands wine & cheese party in the conference room when Barry got a call and stepped out of the room. Several minutes later he came storming back into the room, waving an email he had printed out and yelling, "Brian, Brian, you useless screwup! What the hell is wrong with you? How do you explain this? Read it, OUT LOUD!" He shoved the paper at Brian, who took it apprehensively.
"fukushita", "kakashite", "fukumihado", "diefatsu", "tokaduki", and "fukusuka", he read, collapsing inwardly and visibly shaking as he read down the list.
"They dropped our contract!" Barry shrieked. "Half our revenue is gone! You've killed our company!"
Brian protested that dropping the bad-word filter and using Japanese were both Barry's ideas, but he could see that this would do nothing to abate Barry's fury. The next day he asked for, and was happily granted, a two-week vacation, which he used to start looking for a new job.
Five years later at an industry mixer, Brian exchanged cards with a developer and saw that the man worked in Barry's department at Brian's old company. Brian told him that he had worked there years before and was glad to see that the company had recovered and was going strong.
"You worked there too?" the developer asked, looking at Brian's name tag. He suddenly got a strange look on his face. "Wait, are you that Brian, who first developed the business platform?"
Cautiously, Brian replied in the affirmative.
He broke out into a huge smile. "You're famous. You know, we're still using it. We call it The Automated Curse Generator."