Emma W. was hired by BerkTech’s QA department in preparation for a major code rewrite. A Russian company had purchased a thousand copies of BerkTech’s eponymous software package, but because the software supported only English, adding Russian would require a substantial localization project.

After Emma started, it didn’t take long for her to notice some common patterns in her unit tests.

“Why is the script scrubber.awk telling me a file is missing a semicolon?” she asked Danny, her supervisor. “Shouldn’t this error come from the compiler?”

“No, we scrub everything before it gets to the compiler,” Danny explained. “Nothing ever reaches it without passing our best practices. That’s what scrubber.awk is for.”

Curious as to what kind of pre-processing that file was performing, Emma cajoled a developer into letting her take a peek at the source code. There wasn’t just one .awk script pre-processing the C++ code.

There were 107.

If All You Have Is A Hammer…

“Danny,” Emma asked over lunch one day, “don’t you think we rely too much on AWK for our build process?”

“AWK is our build process,” he replied. “It’s like mortar, joining our bricks of C++ code. At least that’s how Rupert describes it.” Rupert was the chief code architect, and the one who first wrote BerkTech’s code decades ago.

“But you don’t need AWK scripts for almost anything. Compilers can give you much more nuanced syntax errors and lexical analysis than ad-hoc scripts can.”

“Yeah, but it’s Rupert’s baby. Rupert wrote the app in AWK first, using a branch of one of the old Unix-based interpreters. Later he rewrote parts of it in C++ for better performance. He’s never wanted to let go of AWK. He won’t even run the compiler from the command line. When I say everything has to be done through AWK, I mean everything.”

Time Capsule

Soon after, Emma met Rupert for the first time. His office was a perfectly preserved time capsule from 1983. Books older than Emma sat on a shelf, arranged by subject, and surprisingly dust-free. Rupert seemed like he had been preserved in the time capsule, too, with a tight collared polo and polyester slacks.

“Danny sent me,” Emma said. “We’ve noticed a lot of multi-byte encoding issues come up lately.”

“Multi-byte?” Rupert said. “Our developers should just use regular ASCII.”

“But that won’t work for localization. Our Russian translators give us our localization files in Unicode, which uses multi-byte characters for the Cyrillic alphabet. The problem is our version of AWK. It wasn’t designed for multi-byte encodings.”

“AWK can handle Unicode,” Rupert said, dismissing Emma with a wave. “It can handle anything.”


Soon, development stalled on the localization project because of the multi-byte encoding issue, and deadlines were missed. With the top brass breathing down his neck, Rupert called an all-staff meeting at a fast food joint down the street to discuss the issue.

“I know what you’re thinking,” Rupert started, “but we’re not ditching AWK.”

Danny spoke first. “There’s no other way,” he said. “There are dozens, hundreds of scripting tools we can use. We can just hand off the localized strings to another tool, and AWK won’t even have to touch it.”

“Not going to happen.” Rupert crossed his arms.

“The installation process is taking too long!” someone else added. “We have to install our own version of AWK on every computer the application runs on!”

“AWK is not a resource hog,” Rupert said, adamant. “It’s no big deal.”

The complaints raged for an hour. Finally, Rupert said, “This is a waste of time. We’re not ditching AWK, and we’re not bringing in another toolset. I’ll fix the encoding issue myself.”

…Everything Looks Like A Nail

Rupert’s localization code arrived so late that Emma was forced to work nights unit-testing all the new code. By and large, everything worked as Rupert had promised. The project shipped, and soon a thousand PCs in a corporate office in Moscow had Rupert’s obscure version of AWK installed on them.

Danny was uneasy, as he told Emma over lunch following the release. “You know those code obfuscation contests? I tried looking at Rupert’s localization code. It’s more impenetrable than the samples I’ve seen for those contests. If you find anything wrong, he’s the only one who can fix it.”

Emma remembered Rupert’s perfectly preserved office. “If he can keep his code as clean as his office, maybe all of his AWK scripts will keep working for another ten years.”