We’ve all inherited legacy systems. You know the sort: 20 years old, more than 50,000 lines of code, poorly designed even for its time, with no external documentation and useless comments within, mangled beyond recognition by countless developers making myriad ad-hoc changes upon changes, and so on. Now imagine such a system written in a tool that’s been around for nearly half a century, but is rarely used for the kind of job this application does.
Reg worked for a firm that built space-rocket-related applications; specifically, an Ada compiler, written in SNOBOL, for the legacy processor used in the rocket, a chip that had been obsolete for more than 15 years. The system itself consisted of more than 100 SPITBOL (a speedier, compiled implementation of SNOBOL) programs, most of which were written nearly four decades ago by one guy, Barry. Barry was a sixties hippie turned coder. Though long since retired, he had been called back to active duty to help decipher what this thing did.
The code was full of comments explaining what each block did, but not why. Nor were the comments up to date with what the code actually did, which made for one set of “bugs” on top of the more normal set of errors. Of course, in those days, nobody wrote unit tests (was it even possible to write test suites for SNOBOL?). Some of the more interesting phenomena included mangled memory addresses, incorrect hex/decimal conversions, offsets disappearing, seemingly random mangling/unmangling/remangling of variable names, etc.
Reg’s ongoing project was to replace this mess with a shiny new Ada compiler written in Python.
Along the way, Reg had to deal with all the control flow of SNOBOL (e.g., gotos), on-the-fly execution of strings containing arbitrary SNOBOL code, the pattern-assignment operators (conditional ‘.’ and immediate ‘$’), and pattern matching that would reduce a regex wizard to a quivering mass of Jello.
Even Barry, the tie-dyed, retired hippie, could no longer decipher what the internals were doing. Maybe he’d just fried too many neurons. Reg couldn’t get any further, either; maybe he just wasn’t smoking enough marijuana to understand what the hippie had done. So Reg decided to simply try to replicate the output of the legacy system. This was accomplished by running both systems on the same input and diffing the results.
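For the curious, that compare-the-outputs approach can be sketched in a few lines of Python using the standard library's `difflib`. This is only an illustration, not Reg's actual harness: the function name `count_output_diffs` and the sample instruction lines are made up, and a real run would read each compiler's output from files rather than from hard-coded lists.

```python
import difflib


def count_output_diffs(legacy_lines, replacement_lines):
    """Compare two outputs line by line.

    Returns (number of differing regions, unified diff as a list of lines).
    """
    matcher = difflib.SequenceMatcher(
        a=legacy_lines, b=replacement_lines, autojunk=False
    )
    # Every opcode that is not "equal" marks one region where the outputs disagree.
    changed = [op for op in matcher.get_opcodes() if op[0] != "equal"]
    diff = list(
        difflib.unified_diff(
            legacy_lines,
            replacement_lines,
            fromfile="legacy",
            tofile="replacement",
            lineterm="",
        )
    )
    return len(changed), diff


# Hypothetical sample outputs; the real compilers' output format is unknown.
legacy_output = ["LOAD R1, 0x10", "ADD R1, R2", "STORE R1, 0x20"]
replacement_output = ["LOAD R1, 0x10", "ADD R1, R2", "STORE R1, 0x14"]

region_count, diff_lines = count_output_diffs(legacy_output, replacement_output)
print(f"{region_count} differing region(s)")
for line in diff_lines:
    print(line)
```

Counting differing regions rather than individual lines keeps the tally stable when one bug changes a whole run of adjacent lines. Either way, someone still has to read each diff and decide which system is the one that's wrong.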
This project started long before Reg joined the firm, and will probably be going strong long after he’s gone.
Reg got the number of diffs in the output down to fewer than 1,000. That might not sound great, but almost all of them were caused by bugs in the legacy code.
Now his toughest job begins: explaining to management why success must be defined as about 1,000 differences in the output between the legacy and replacement systems, and, more importantly, determining whether correcting the output of the previous system will cause the rocket to act in an undesirable manner. Like exploding.