The Sohcnum Family Convenience store chain knew two things about their fax-based price distribution process: it was extremely time consuming and completely un-cool. Managers at their twenty-eight locations absolutely hated having to write out, by hand and with a big marker, the hundreds of new price signs that came through every morning. It all seemed so unnecessary, especially considering that it was 1997 and "hi tech" was officially in. Plus, with their aspirations to grow the chain by impressing and attracting big investors, a whiz-bang price distribution system was a must-have.
Enter the Automated Sign System. With a shiny new UI on the frontend and a hulking Oracle server running on the backend, it would send data across the Internet to the individual stores for automated, overnight printing. In the morning, the managers would take the large stack of signs and hang them up around the store. All told, the corporate developers had created a system that generated consistently formatted signage for all stores and saved a ton on labor costs. The managers also appreciated not having to arrive at 4:30 a.m. to write out a day's worth of signs.
So what was one thing that could improve upon this modern marvel? Application performance would have been an added plus.
Since the system had been thrown together in something of a rush so that they could tell investors, "Hey investors, we print our signs over the Internet and save a bundle doing it," performance was a bit slow. So slow, in fact, that it took a good two hours to consolidate the information for a day's worth of signs for a single store. They could eventually work on this, but in the meantime, once the dough from investors started rolling in, they planned to just throw some additional hardware at the problem and run the print jobs in parallel.
"We'll get around to rewriting the thing," the IT manager would say, "one of these days."
Twelve Years Later...
Skip ahead twelve years, and the once-venerable Sohcnum Family Convenience Stores had grown to over 2,000 stores and gained the more hip moniker of Flying S Convenience Stores. And recently, Scott, who worked in the corporate headquarters, had been given a task of truly great magnitude: improve the performance of the Automated Sign System.
By this point, the application's performance wasn't just hurting, it was bleeding. As the number of stores grew, so did the data set. In fact, it had grown so large that, even with processing split across twenty-four physical servers, it took nearly fifteen hours to create a day's worth of signs.
It turned out that, all the while, the corporation had found it simpler (read: cheaper) to just keep adding hardware whenever they needed to scale up their printing needs. Sure, there was some maintenance here and there for Y2K and the occasional bug or upgrade, but for the most part, the now-legacy VB application was still puttering away on 21st-century hardware. Hardware that was due to go off-lease in three months.
Migrating the software to a new server, surprisingly, wasn't that big of a deal. But what WAS a big deal was that the Flying S bigwigs were planning on acquiring and integrating a smaller, 400-store competitor in the next year. Hoping that they could continue to use the application with the additional stores, management turned to Scott for help.
Peeling Back the Wallpaper
During his investigation of how the system worked, Scott was surprised to find that the process of creating the actual signs was pretty simple stuff: roll through a list of signs generated from a SQL query, render a TIFF of each sign, then make a batch for the printer and print them.
Scott also found that the vast majority of the signs in stores were the same across different markets. For example, something like a box of tissues for $1.50 or a bottle of antifreeze for $7.00 costs the same everywhere. Unless, of course, the store is in Canada, where a third-party module would do the US-to-Canadian dollar conversion.
However, even though there were so many common items, the software would still scroll through every sign record for every store within the company. Immediately spotting a point to improve, Scott got to thinking: "Why hadn't the original developers used a unique identifier for each generated sign? That way, the system wouldn't have to regenerate the same TIFF over and over, and things would have been more efficient from the start."
Funny thing was that the original developers did include a unique identifier...mostly.
Chicken or the Egg?
Scott noticed that, after a TIFF was created, it passed through a fairly complex hashing function to generate a 64-byte-long "ImageHash", which was then used as the TIFF's unique identifier. Oddly enough, the hashing function seemed to be tied to the server's MAC address: given the exact same data, one server would consistently produce the same ImageHash while another would consistently produce a different ImageHash.
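To see why that behavior points at the MAC address, here is a minimal Python sketch of what such a function might look like. It is purely illustrative: the original was VB code whose actual algorithm isn't known, and the `image_hash` name, the sample MAC addresses, and the choice of SHA-512 (picked only because its digest happens to be 64 bytes) are all assumptions.

```python
import hashlib


def image_hash(tiff_bytes: bytes, mac_address: str) -> bytes:
    """Hypothetical stand-in for the ImageHash routine.

    Mixing a machine-specific value (here, the MAC address) into the
    hash input makes the result stable on one server but different
    on every other server, even for identical TIFF data.
    """
    hasher = hashlib.sha512()  # 64-byte digest; an assumption, not the original algorithm
    hasher.update(mac_address.encode("ascii"))
    hasher.update(tiff_bytes)
    return hasher.digest()


# Made-up inputs for illustration only.
tiff = b"II*\x00 pretend TIFF data for one sign"
server_a = "00:1a:2b:3c:4d:5e"
server_b = "00:1a:2b:3c:4d:5f"

same_server_twice = image_hash(tiff, server_a) == image_hash(tiff, server_a)
across_servers = image_hash(tiff, server_a) == image_hash(tiff, server_b)
```

Here `same_server_twice` comes out true and `across_servers` comes out false, which matches what Scott observed: consistent per server, useless between servers.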
After realizing this, Scott knew what he had to do. Surely he could improve performance by tweaking the hashing function to produce consistent results, which would allow the servers to share which TIFFs had already been created, so that they wouldn't have to keep creating the same TIFF over and over again.
"Waitaminute," he thought, "You don't know what the ImageHash of a sign is until AFTER the TIFF has been created." Realizing that the Hash Generator paradoxically created the very condition that it was helping to solve did nothing but give Scott an even greater headache than he had already received after poring over enterprise-level VB 5 code.
In the end, Scott weighed the level of effort it would take for a complete rewrite against just throwing more hardware at the application. Since management had said time and time again that the application DID work, Scott ended up suggesting top-of-the-line systems to replace the off-lease ones. His only hope was that, one day, a server would be released that would be incapable of running the old VB 5 code, and would thereby end the vicious cycle.
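As a footnote to the chicken-and-egg problem: one common way around it is to derive the identifier from the sign's input data rather than from the rendered image, so the key is known before any TIFF exists. The Python sketch below shows that idea under stated assumptions. It is not what the Automated Sign System did or what Scott proposed; the `sign_key`, `SignCache`, and `fake_render` names, the `item` and `price` fields, and the use of SHA-256 are all hypothetical.

```python
import hashlib
import json


def sign_key(sign: dict) -> str:
    """Hash the sign's content (not the rendered image) into a stable key.

    Serializing with sorted keys gives the same key on every server
    for the same sign data.
    """
    canonical = json.dumps(sign, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SignCache:
    """Renders each distinct sign once and reuses the result afterwards."""

    def __init__(self, render):
        self._render = render
        self._images = {}
        self.render_calls = 0

    def get(self, sign: dict) -> bytes:
        key = sign_key(sign)  # known BEFORE rendering
        if key not in self._images:
            self._images[key] = self._render(sign)
            self.render_calls += 1
        return self._images[key]


def fake_render(sign: dict) -> bytes:
    """Placeholder for the expensive TIFF rendering step."""
    return f"{sign['item']} ${sign['price']:.2f}".encode("utf-8")


cache = SignCache(fake_render)
tissues = {"item": "Tissues", "price": 1.50}

# Three stores ask for the same sign; only the first request renders it.
for _store in range(3):
    image = cache.get(dict(tissues))
```

After the loop, `cache.render_calls` is 1. In a multi-server setup the dictionary would have to be replaced by some shared store, which is a design question this sketch does not try to answer.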