"None of our customers' web servers are online!" was not the kind of thing Ryan wanted to hear in the morning.  Nor was it the kind of question Ryan wanted to hear from the 15 different department heads and administrators all shouting on the conference call that morning.  Luckily (for everyone but Ryan), Ted from Net Operations was on the call. Ted was one of those hands-off system administrators who found that it was far easier to delegate work to someone else and leave early for a bar.

After 10 minutes of bickering, Ted announced that he had found the root cause of the problem: Ryan. He told the group that Ryan, in his ignorance and naivety, had deleted all of the customers' web servers from production. Shocked by the realization that today might be his last day on the job, Ryan was unable to speak at first, but one detail gnawed at him - throughout it all, he had followed the steps that Ted had given him to the letter. Ryan needed answers - if Ted claimed that Ryan had wrecked the production environment, then Ted would have to show him how.

The Training Process

It all started about four weeks before that fateful morning.  Ryan had been working at Initech for three months as a junior developer and was eager to learn all facets of the development process, including setting up a web server. Ryan's boss asked him to work with Ted to set up a new test server as part of the training process. Ted sent Ryan an e-mail with a link to a virtual clone of the production server he had created, along with a copy of a Visio diagram showing the configuration of all the servers across the domain.

Ryan's primary role as a developer left him with an hour or two each day to work on the test server. Ryan struggled with the server configuration and, despite trying a dozen different options, the test server would simply not talk to the other computers on the network properly. He sent numerous e-mails to Ted asking for assistance but rarely received a response. On the rare occasions when Ted did respond, he often just re-sent the first e-mail that he had sent to Ryan, with the same instructions and Visio diagram. Ryan also tried calling Ted and stopping by his desk, but Ted always seemed to be 'elsewhere'.

Finally, Ryan cornered Ted by the water cooler one day and begged for his help in setting up the test server. Ted instructed Ryan to clear out the old directories on the test server, since they probably contained old data from the production clone, and then reinstall IIS.

Assigning Blame

"I'm looking at the logs right now.", explained Ted while tapping on his monitor, obviously annoyed, "You were the last one on production and it was working just fine before you logged in."

Ryan shook his head in disbelief. "That's just not possible. I mean, it must have been someone else using my login."

"Look again!"

Ted pointed out that, unless Ryan hadn't been at his desk at 2:52pm the previous afternoon or the logs showing Ryan's workstation name and network ID as being responsible were lying, it was indeed him.

Ryan couldn't deny it - he was at his desk the entire afternoon.

Ted chided, "Logging into a production server is strictly forbidden! You do not know our systems well enough. This isn't some college dorm room computer! All production requests need to go through me."

"But I'm telling you, there's no way I could have been on the production server, I logged into the test server, just like your documents showed, see?" and with that, Ryan thrusted out Ted's own documentation.

Suddenly Ted realized what had happened. Ted had given Ryan the production server's information instead of that of the test server he was supposed to be working on. To add insult to injury, Ryan had carefully compared the Visio diagram against the server information Ted had sent him, not realizing that the Visio diagram, too, was out of date. At this point, Ted changed his tone and, in a follow-up message to the group, managed to talk his way out of trouble.

Ted stated that it wasn't directly Ryan's fault but rather the fault of his inexperience. He argued, "We can solve this in the future with more developer education. Since I don't have time in my busy schedule for training, perhaps one of the other operators can train him..."

Shortly thereafter, a backup of the server was loaded onto production and all was back to normal. In the end, Ryan's boss agreed with Ted's assessment and decided that it had been a good learning experience. Ryan also learned to double-check anything Ted told him with other members of the Net Operations team - not that Ted and Ryan spoke much after this, as Ted was rarely at his desk.
