At the time, it seemed like a good idea. Whenever a fatal error occurred in a batch program, a message would be sent to the operator’s console that notified him of the error and prompted him to terminate the program. From there, the operator would call the program’s support contact (generally a programmer) and ask him what to do. Management figured that someone with some knowledge of the program should be the one who decides, not just some operator.
Of course, as the years passed, management learned three things about this policy. First, the support contact’s answer was always “it’s a fatal error! What can I possibly do from home at 2:00 AM? Just terminate it!” Second, after years and years of developing batch programs for internal clients, there were a whole lot of batch programs (about 20,000 in all) and a whole lot of programmers who were called in the middle of the night. And third, because the operator notification would block until an answer was received, subsequent programs would be delayed while the operator tried to get hold of the responsible programmer, who would occasionally disconnect his phone at night to avoid the inevitable call. Obviously, something had to be done.
Bo Christensen and his team of COBOL programmers were assigned to review the error handling code and recommend a way to automatically terminate the job and write the fatal error message in the program’s logfile. The standard error handling code looked something like this...
    IF (some error)
        DISPLAY 'A FATAL ERROR HAS OCCURRED IN PROGRAM GLV2030C, TERMINATE JOB'
            UPON CONSOLE
    END-IF.
All that needed to be done was to change the code so that it’d perform a brute termination (an ABEND, for ABnormal END). The replacement code looked something like this...
    IF (some error)
        DISPLAY 'A FATAL ERROR HAS OCCURRED IN PROGRAM GLV2030C, JOB TERMINATED'
        CALL 'ABENDMOD'
    END-IF.
Of course, going back to the second lesson from above – 20,000 or so batch programs that had code like this – there was a lot of error handling code that needed to be changed. However, with more than enough “real work” to go around, the in-house programmers simply did not have the time to dedicate to this project. So, they did the next best thing: they found a contractor.
In some dark corner of their office, they found a contractor who had been hired to help write a systems spec for another project. The contractor – I’ll call him Reggie X. Preston – had done a fantastic job on the specifications. In fact, he did such a good job that he finished well ahead of time and had two months left with nothing to do. Reggie seemed like the perfect person to help change all of the error handling.
Reggie was not an experienced programmer, nor did he claim to be. He was, however, technically competent, great at following directions, and eager to do the most tedious of tasks. After a programmer walked him through a handful of examples, Reggie felt confident enough to take on everything else. As the months passed, Reggie went through program after program and changed each and every instance of the error handling code. Through the tens of thousands of changes, he didn’t complain a single time, nor did he make a single typo. He was like a machine.
Once the changed code started rolling into production, programmers throughout the company were relieved that they were no longer getting calls. The calm lasted about a week: in the middle of the night, a batch run failed. It was a simple problem to fix – the program ran out of space on a work file – so the operator resized the work file and restarted the job. It was standard operating procedure.
And then the program failed. It was restarted. Failed. Restarted. Failed. Restarted. Failed. Eventually, a programmer was woken up at 2:00 AM and tasked with coming in immediately to research and fix the problem. Within minutes, he figured out the problem...
    IF (restart condition)
        DISPLAY 'THIS IS A RESTART OF PROGRAM BAR, ACKNOWLEDGE RESTART OK'
        CALL 'ABENDMOD'
    END-IF.
That code had recently been changed by none other than Reggie. The programmer quickly restored it to the original...
    IF (restart condition)
        DISPLAY 'THIS IS A RESTART OF PROGRAM BAR, ACKNOWLEDGE RESTART OK'
            UPON CONSOLE
    END-IF.
At an emergency meeting the next morning, they figured out why Reggie changed that particular section of code. The programmer who had trained him simply forgot to tell him to ignore restart acknowledgements. Now, they’d need to go back through 20,000 or so programs and fix the restart acknowledgements. Fortunately, Reggie was available for the job again...