Credit: Marcin Wichary@Flickr; Data processing center, pt. 5

About a decade ago, Coyne's employer at the time hosted various application systems for multiple clients running on their IBM host.

The clients ranged from a high-profile Fortune 100 client all the way down to companies so small that they would probably rank in the Fortune 10,000,000s...and it was up to Coyne to put together a Disaster Recovery Plan (DRP) for the clients' application systems. Specifically, the applications were responsible for payroll, direct deposit, etc.

For many, this would be a pain, but not for Coyne. He was in his element. As part of the DR test, Coyne decided to prove that the recovery plan could not only recover their large, important client, but also one of the smaller, less "important" clients.

The tricky part of the process was that in that day and age, the DRP involved pulling off-site tapes from the company that handled off-site tape storage and shipping them to a data center 6 states away. Since humans were a large part of this process, and Murphy's Law being what it is, there were a lot of wrinkles that needed to be ironed out. However, in the end, the stars aligned and Coyne was ready to begin his testing.

The test for one of the larger clients went well: all 6 tapes were restored onto one of the dev servers, which then booted successfully. Next came testing for one of the smaller clients. Coyne called the operator on duty and, while he was on the phone, submitted the job to load the necessary files from the client's off-site tape. The system posted the request, and soon after Coyne heard the computer "Ding" over the phone, the operator acknowledged, "Yep, got it - I'll go get the tape for Blavington Township."

After a fairly long delay, the operator came back on the phone and said that he didn't have that tape.

"Are you sure it's not on some different shelf?" asked Coyne, trying to stifle a what-kind-of-idiot-are-you tone.

"Nah man, I'm positive that I didn't see it. The tape guys are always REALLY good about putting in the right tapes. You sure you picked the right one?" The client's tape should have been there. The clients all ran on two bi-weekly schedules; ok, maybe he had goofed and picked the wrong cycle.

Coyne picked another from his small client list. "Ok, what about Leaf Hut Ironwares?" he asked as he quickly changed over the job and submitted it. "Ding!" announced the operator's computer, which prompted the operator to respond, "Another one? No prob! I'll be right back with the tape."

After an even longer delay, the operator came back on the phone. "Ummm...So, you're not going to believe this one, man, but we don't have that tape either. I'm real sorry."

Volumes could be written describing the excited investigation that followed. Various managers and supporting staff traveled to the different sites so they could personally point their finger of blame, but it all came down to one detail: None of the off-site tapes for the 28 "unimportant" clients were actually being shipped off-site.

The tape management software wasn't selecting them for transport, and hadn't been for at least half a decade. The clients were being backed up all right; it was just that the tapes never left their drives. The best part is that no one had noticed, because in years of disaster recovery tests, nobody had ever bothered to test one of the unimportant clients.
