Keeping hundreds of millions of sheets of paper on file isn't easy, so the IRS had an application built to computerize their records. It'd scan paper tax returns into a WORM (Write Once, Read Many) drive system and record lookup data in a database. That way they could filter by any fields they recorded in the database and access a scanned image of the tax return for any further information using a simple app, which sure beat the old method of data retrieval — digging through boxes, incurring huge wait times.
The nice thing about the old method, though, was that it generally worked. The new system was full of bugs, in addition to several other irritating issues. On Bobby's first day he was put in front of the application, and right off the bat it looked amateurish. Form elements not lined up properly, buttons not always the same size, inconsistent menus — not broken, but certainly not professional looking. His boss, Boris, explained some of the finer features in a dry, humorless, low monotone.
"Now, as you can see heeere," his boss mumbled, "this is pretty exciiiting." Boris would linger on certain syllables presumably in an effort to sound more boring. His tone made Ben Stein sound like Freddie Mercury. "Allll we have to do is click a feeew buttons annnnd..." Meanwhile he was typing some text into the fields to perform a lookup. "And affffter a few short minnnnnutes..."
Bobby was briefly impressed — Boris had made a joke! Not a particularly funny one, granted, but certainly worth a polite chuckle... except- wait a minute. This is actually taking several minutes, Bobby realized. And in Boris's company, each passing minute felt like ten. Boris whiled away the minutes describing some of his hobbies — toy piano tuning, model taxidermy, palindromatic haiku composition — each somehow more boring than the last. After an eternity (nine minutes) of struggling to keep his eyes open and subtle closed-mouth yawning, an error message popped up.
"Ohhh, fidddddlesticks," mumbled Boris in lifeless anger. As Bobby would eventually learn, the database's average response time for tax form searches was in the realm of 8-10 minutes. That was if you were fortunate, though — it'd often just time out.
Although Bobby wanted to help, he wasn't allowed to lay a finger on the database. He was only authorized to edit the frontend code. So to start off he gathered whatever information he could — the hardware (Pyramid Minicomputer running four 150Mhz processors) and the software (AT&T Unix and Informix DB server). He looked through the code and made a few UI improvements, and it didn't take long for him to locate the biggest bottleneck — the database.
After getting in touch with the DBAs, Bobby was granted read-only ODBC access. Bobby kept making tweaks and little optimizations, constantly running SELECT queries, but couldn't get them to run any faster than six minutes. He dug through ODBC settings, hoping to find something... anything that he could change to improve performance, but judging by the crazy, arcane settings already set up, many had tried the same thing and failed.
Bobby had spent several hours and not gotten far from where he started. Wanting something to show for the time spent, he threw together a quick "Please Wait" dialog box with an animated hourglass to give the bored operators something to watch while they waited for the data to come back.
Several days later, Bobby had a stroke of luck. One of the higher-ups had caught wind of his UI changes and wanted a demonstration. When the topic of speed came up, Bobby mentioned that he'd found the bottleneck and suggested he get full access. It shouldn't be a problem since he worked under constant supervision anyway.
"I don't think that'll be an issue," he said with a smile. "Can you do it right now?"
Bobby hesitated, then gave a weak "...sure." He flushed a little red, concerned that he was wrong and he'd embarrass himself in front of everyone. A DBA was called to the room, who opened up a console window and turned Bobby loose.
Of course, it's highly unorthodox for someone to be given absolute access to one of the IRS's production databases, but no one wanted to question the big cheese. The DBA watched Bobby like a hawk.
Bobby swallowed the lump in his throat, tried to ignore the seven sets of eyes on him, and typed some simple queries in an attempt to isolate the problem. In short:
- Indexes: none
- Primary keys: none
- Rows: ~2M per table, four or more joins per query
- Processor speed: 150Mhz
Bobby fired off a few commands to create indexes on fields that he knew were frequently queried. The DBA's stare intensified. Bobby felt more nervous, hoping that it would all work ok. And after verifying that queries were still being profiled, he ran a SELECT query.
It took 0.22 seconds.
Crap, he thought. There's no way it could've run that fast. I must've totally destroyed the production database. Wiping sweat off his forehead, he ran the query again, this time to verify the results. And the results were correct.
With each query he ran his smile grew larger, as did the rest of the room's collective disbelief. The "Please Wait" window would appear and hide so quickly that no one could even see it.
They tried to chart the difference, but what's the point? 8-12 minutes versus 0.35 seconds average? Management did all but break out the pom-poms and do cartwheels throughout the office. The operators were so ecstatic that they actually threw him a party. As for us US taxpayers, we benefited as well since we're getting taxed faster!