Steven worked for Integrated Machinations, a company that made huge machines to sell to manufacturers so they could actually manufacture stuff. He didn't build the machines; that would require hard physical labor. Instead, he wrote computer programs that interfaced with the machines from the comfort of the air-conditioned office. One such program was a diagnostic app used to log the performance of Integrated Machinations products. The machines didn't break down often, but when they did, logging was very important. Customers wouldn't be in a mood to hear that IM didn't know why the equipment they dropped fat stacks of cash on failed.

Steven also had a subordinate named Thomas, who was foisted upon Steven in an effort to expand the small development team. Steven could have easily handled everything himself, but Thomas needed something to do, so he was given the simplest part of the diagnostic app - the downloader. Steven's code handled the statistical compiling, number-crunching, and fancy chart-making aspects of the application. All Thomas had to do was make the piece that downloaded the raw files from the machines to pass back.

Thomas spent two months on something that would have taken Steven a week, tops. It worked in their test environment, but Steven wanted to code review it before it went to production. Before he could, the higher-ups informed him there was no time. The logging and downloading system was installed and began to do its thing.

Much to Steven's pleasant surprise, the downloader piece worked in the real world. Thomas had it set up to run every minute from Crontab on every machine their pilot client had. It passed back what the compiler needed in XML format and they had neatly-displayed diagnostic stats to show. This went on for a week, until it didn't.

Steven came in that Monday to find that nothing had been downloaded over the weekend. As soon as Thomas meandered in, unshaven and bleary-eyed, Steven instructed him to check on the downloader. "Sure, if I can fight off this hangover long enough. Are you sure your stuff isn't broken?" Thomas replied, half joking, half trying not to pass out.

Two hours passed, half of which Thomas spent in the bathroom. He finally came back to Steven's office to report, "Everything is back to normal! We lost all the logs from the weekend, but who works on the weekend anyway?" He quickly disappeared without further explanation.

So began a repeating cycle of the downloader crashing, Thomas coming to work hung over, then fixing it without explanation. The Thomas problem got resolved before the downloader problem. He was relieved from his employment at Integrated Machinations after his sixth "no-call, no-show". This left Steven to support the downloader the next time it went down. It was completely undocumented, so he had to dig in.

He found the problem was with the log file itself, which had bad XML for some reason. Since XML has a rigorously specified "Parse or Die!" standard, and Thomas wasn't much for writing exception handlers, the next time the downloader ran, it would read in the XML file, get a parse error, and die. It was at this point Thomas would have to delete the XML file, restart the downloader, and things would get back to normal.

Digging in further, he found every time the downloader ran, it read and parsed the entire log file, then manipulated the parse tree and added a new <download> element after each record. Finally, it wrote the whole thing back to disk.


<logfile>
  <download timestamp="2016-09-04 16:23:00">
    <file name="foo_1234.data"/>
    <file name="foo_1235.data"/>
    <file name="foo_1236.data"/>
  </download>
  <download timestamp="2016-09-04 16:24:00">
    <file name="foo_1237.data"/>
    <file name="foo_1238.data"/>
    <file name="foo_1239.data"/>
  </download>
  ...
</logfile>
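The read-parse-rewrite cycle described above can be sketched roughly as follows, in Python for brevity. This is a hypothetical reconstruction, not Thomas's code: the names `record_download` and `demo` are made up for illustration, and cutting the file in half is only a stand-in for whatever actually corrupted it, which the story never identifies. It shows why one damaged file stops every later run: the parse step fails before anything new can be logged.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def record_download(log_path, timestamp, filenames):
    """Parse the whole log, add one <download> element, rewrite the whole file."""
    tree = ET.parse(log_path)  # raises ET.ParseError if the file is malformed
    root = tree.getroot()
    download = ET.SubElement(root, "download", timestamp=timestamp)
    for name in filenames:
        ET.SubElement(download, "file", name=name)
    tree.write(log_path)  # the entire file is rewritten on every run


def demo():
    """Log once, damage the file, then try to log again."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "download_log.xml")
        with open(log_path, "w") as f:
            f.write("<logfile></logfile>")

        record_download(log_path, "2016-09-04 16:23:00", ["foo_1234.data"])
        entries = len(ET.parse(log_path).getroot().findall("download"))

        # Assumption for illustration: simulate corruption by truncating the file.
        with open(log_path) as f:
            text = f.read()
        with open(log_path, "w") as f:
            f.write(text[: len(text) // 2])

        try:
            record_download(log_path, "2016-09-04 16:24:00", ["foo_1237.data"])
            died = False
        except ET.ParseError:
            died = True
        return entries, died


if __name__ == "__main__":
    print(demo())
```

Without an exception handler around the parse, every subsequent run fails the same way until someone deletes or repairs the file, which matches the manual fix Thomas kept applying.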

There was quite a bit of code dedicated to this rather complex and intricate process, but there were no obvious problems with the code itself. When it ran, it worked. But if you just left the thing running from cron, then sooner or later, something would go wrong and the XML file on disk would get corrupted. Steven never could figure out what was causing the file corruption, and it wasn't worth investigating. He tore out all the logging code and replaced it with three lines:


 # Name the file after today's date
 # >> opens the file for append. Linux *always* gets this right.
 open my $log, '>>', "$log_dir/$date.log" or die "Can't append to $log_dir/$date.log: $!";
 # Look Ma! No XML!
 print {$log} "$date $time $downloaded_filename\n";
 # Delete any files that are more than 3 days old
 unlink grep { -M $_ > 3 } glob("$log_dir/*.log");
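For comparison with the earlier approach, the same append-and-prune idea can be sketched in Python. This is an illustrative equivalent, not code from Integrated Machinations: the names `log_download`, `demo`, and `keep_days` are assumptions, and file age is judged by modification time relative to the moment of the call.

```python
import os
import tempfile
import time
from pathlib import Path


def log_download(log_dir, downloaded_filename, now=None, keep_days=3):
    """Append one line to today's log file, then delete old log files."""
    now = time.time() if now is None else now
    stamp = time.localtime(now)

    # Name the file after today's date and open it in append mode.
    log_path = Path(log_dir) / (time.strftime("%Y-%m-%d", stamp) + ".log")
    with open(log_path, "a") as log:
        log.write(f"{time.strftime('%Y-%m-%d %H:%M:%S', stamp)} {downloaded_filename}\n")

    # Delete any log files last modified more than keep_days ago.
    cutoff = now - keep_days * 86400
    for candidate in Path(log_dir).glob("*.log"):
        if candidate.stat().st_mtime < cutoff:
            candidate.unlink()
    return log_path


def demo():
    """Write two entries next to a five-day-old log and report what is left."""
    with tempfile.TemporaryDirectory() as log_dir:
        now = time.time()
        stale = Path(log_dir) / "2016-08-01.log"
        stale.write_text("old entry\n")
        os.utime(stale, (now - 5 * 86400, now - 5 * 86400))

        log_download(log_dir, "foo_1234.data", now=now)
        path = log_download(log_dir, "foo_1235.data", now=now)
        return len(path.read_text().splitlines()), stale.exists()


if __name__ == "__main__":
    print(demo())
```

The design point is that nothing already in the log is ever read or parsed. A run only adds a line at the end, so a damaged earlier entry cannot stop a later run from logging.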

From there on, the downloader never failed again and the scourge of Thomas had been put to rest.
