"Welcome back, Susan!"  one of Susan’s co-workers said as she walked in the door.  As she strolled to her desk, the co-worker continued, "Hope you enjoyed your vacation.  You missed quite a fireworks show while you were away.  We lost an entire day’s worth of orders, bit of a nightmare."

Susan had spent last week at a sunny resort with her family, and was trying to adjust back to the cold January mornings in Massachusetts.  Being greeted with the news that their ordering system had been offline for almost a day was not something she wanted to hear.

"What exactly happened?" Susan asked, preparing herself for a response she could almost see coming.

"Oh... well, Alex happened."

Alex had been at the company almost a year now, but his knowledge would suggest it was still his first day on the job.  The company normally employed two senior sysadmins, Susan being one of them, and despite being fresh out of college with no experience, Alex was hired as the second.  Susan had lobbied vehemently against this approach but after management got wind of Alex’s salary requirement, or rather lack thereof, he was quickly hired with the idea that Susan was so wonderful she would train him to do the job.

One morning while Susan was away, Alex was working from home, in part due to a snowstorm, and decided it was important for the web server’s network connection to be restarted.  Despite the fact that Susan had instructed him to schedule these kinds of tasks after hours, he had just made a minor change to the network configuration and decided now was as good a time as any.  From a remote SSH session he issued the following command:

 /etc/init.d/network stop ; /etc/init.d/network start

And with that first command, the network came down, killed Alex’s SSH session, and the server sat there idle.  While her co-worker recounted this part of the story, Susan felt the sudden need to find Alex and slap him.

"Oh, but it gets better," Susan’s co-worker told her.

After realizing his mistake, Alex panicked and tried a wide variety of tasks to fix a remote computer with no internet, which Susan was told probably involved praying to some deity.  After some calming down, Alex decided to call the hosting company and ask them to restart the network on the server.  He gave the hosting company the root password over the phone, but it was rejected. 

After that, Alex had them reboot the server and try to log in again, but still no luck.  Alex really started to panic now and his mind raced with tons of fears.  He instructed the tech support at the hosting company to boot directly into the shell and reset the root password, aka the init=/bin/sh trick.  The staff on-site was then able to log in, but the server was still down and Alex could not connect remotely.

Alex finally realized he would need to get up, brave the winter storm, and drive the three and a half hours to the hosting company’s office.  With the storm under way, it took Alex nearly five hours but he finally arrived and the staff took him to the server rack.

Initially, everything looked fine on the computer.  The box was responsive and Alex could log in with the newly reset password.  Alex then realized tons of files were missing on the server, the entire database installation for instance.  On the verge of a nervous breakdown, Alex began imagining every worst-case scenario he could think of, from hard-drive failure, to viruses, to a hacked server.

Alex looked down, about ready to run away and tender his resignation, when he noticed the model on the side of the server did not match the model he had been told the server was running on.  Suddenly, Alex realized the problem: the hosting company had him working on another company’s server.  A few rows down, Alex found his computer, logged in and saw it was just waiting for someone to restart the network connection.  The hosting company now had the pleasure of contacting the other client and explaining why their server was restarted multiple times and the root password changed.

"In the end, Alex fixed the problem caused by Alex and it only took most of the day," Susan’s co-worker summarized. 

Susan was speechless, not sure which action in this tale of incompetence was the biggest WTF.  Susan paused for a moment, and then said, "So are they thinking of firing Alex, then?"

"Not quite," her co-worker replied.  "They have scheduled a week of training for you with him, and want you to spend the next month updating the standard operating procedure document so that an incident like this never happens again."

Susan shrugged.  While her body may have then returned to her desk, her mind returned to the beach she had been on only two days earlier.
