Admin
It seemed that the user in question didn't have sufficient privileges, but being in a special group meant he had access to root, so he did what anyone would do...
and then ran the command (not realising that when he switched user, he shifted to the home directory for the root user, which for some reason on this system was simply /).
The command appeared to kick off nicely, until suddenly
Ouch!
Admin
Incidentally, the co-worker who seems to know what's going on ("wait it gets better" implies they understand the hilarity of the situation) is TRWTF. Sounds like they should be the Senior SysAdmin [flame] - or because they were more intelligent, were they therefore better suited to a programming role... [/flame]
Admin
Who would want girl to be sysadmin?
Admin
I miss Irish Girl.
Admin
hapen in American movie all the times.
Admin
Susan needs only show the boss (the one who collects the amount shown in the company's bottom line) this inequality:
(Alex's Salary) + (One Day's Orders) > (Good Sysadmin Salary)
This is a win-win. If the boss fires Alex, Susan has less to worry about and looks like a hero to the boss for saving him money.
If the boss doesn't fire Alex, it is clear the boss is an incompetent businessman. This means the boss can be exploited in so many other ways, and Alex is the perfect scapegoat to take the blame for any scheme that requires a sacrificial lamb.
Admin
Do you guys watch any professional sports? Do you think that any of the 32 NFL teams have drafted a kid "straight out of college" and now has them as their backup quarterback?
Admin
Better?
You mean not following clear instructions; modifying the network config and remotely restarting the network during office hours? I suppose I can accept that a programmer with a "fresh out of college" ego might make such a mistake. But mistake or not, losing an entire day's worth of orders because he was screwing around in the network config is not okay.
Admin
Obviously he thought that the second command would be executed even if his ssh session went away. It's an understandable mistake; after all, when you press return both commands are sent to the shell, so it's reasonable to think it would execute both.
Admin
The command sequence "/etc/init.d/network stop ; /etc/init.d/network start" DOES work.
The system will not immediately kill the connection just because the network device goes down; it would only time out after a while. Also, the shell has already received the complete command sequence and will execute it to completion.
Just try it, it works. So this story does not make sense from the beginning and is probably just an untrue tale. Sad.
Admin
Well, that's exactly what happens. Try it. So he thought right and the story is troll.
Admin
Good idea...she should have CYA-ed this anyway.
Admin
I think too many "it's not Alex's fault" commenters are forgetting that he did this during business hours. The technical aspects? Sure, he's new and he'll learn, but he fucked up big time by doing this outside of a scheduled outage.
Admin
He probably typed the two commands in by hand, one after the other, rather than send them both through on the same line.
Admin
Of course sports is not a good analogy here.
Admin
I switched our remote server from 10Mbps to 100Mbps once.
One way to find out that your company is too cheap to buy a 100Mbps switch, in 2001...
Admin
That's assuming you made a sane network change.
I have not (so far) bricked a machine, not from when I was a young whippersnapper and of that "just out of college" age, to now, 16-ish years later. My boss, on the other hand, at one of my previous gigs created a new lvol to use as a new raw partition for this app's database to eat, and promptly chose the previous partition number to tell the application to use. You know, the partition that existed, and was in use. App crashed beautifully. 6 hours later, we had lost 3 transactions, made nurses use pen and paper, and I had learned a powerful lesson. What you do as root matters. AKA Peter Parker Syndrome (with great power, etc.)
I had an applications admin once remove all permissions to the C: drive of a Windows box. As in "this SYSTEM guy doesn't need to have permissions to see the C: drive, we're going to have sensitive information here". Worked until a reboot. That was a nice pretty Win2K brick. Box would boot, you'd be asked for credentials... but since the system wasn't allowed to read its own filesystem, it couldn't authenticate anyone locally to log a user on. Or really do any other thing, since all the useful utilities were in c:\windows\system32. No NIC, no domain authentication, etc. I think our manager ended up charging his department heftily for the 2nd Windows install.
Admin
Yup. Learned the hard way: http://www.srijan.in/blog/analysis-screw
Admin
It's unfortunate you haven't had the pleasure of knowing one but they are out there.
Admin
Was more responding to your "WTF did he think was going to happen?" Mistakes happen, and people need to be held to account, but you seemed to suggest that he should have known the shit was going to hit the fan. I think it's perfectly reasonable to think that he didn't realise there was an issue - in fact I've seen (and perhaps even perpetrated sometimes) the same sort of thing more times than I'd care to count....
Admin
Yes, I've seen you. You ARE a robot.
Admin
I would like to point something out:
"The company normally employed two SENIOR sysadmin’s, Susan being one of them, and despite being fresh out of college with no experience, Alex was hired as the second."
I blame the company for not taking Susan's advice and hiring someone fresh out of college for a SENIOR position. That is the first WTF. The second is a hosting company with some major security/incompetence issues.
So no, you can't sit there and yell at Susan for being told she was going to break a rookie into a Senior position. Senior means experienced and qualified (typically), which Alex clearly was not.
Admin
This story reads like bad internet porn or a Jr. HS English project. Whoever wrote it should be fired from life.
Admin
I'll give it a crack.
Ganesh had just started his 4am shift at the call center he worked for in India. It was his 2nd week on the job but he was confident. His poor understanding of the English language, his thick accent, and his online database of scripted responses would see him through the day.
His first call of the day: a man named Alekshe (as far as Ganesh could tell) was speaking to him in a panic, and all Ganesh could catch was "bla bla bla server reboot". Ganesh asked the man to calm down in his best English accent, flipped through the dropdown box, and pulled up the server reboot instructions.
After getting the 3 digit server identification number from Alekshe, he was able to look up the IP address of the server causing the problem. The server was online, but the customer was requesting a hard reset, so he did what he was told. He pressed the hard reset command to momentarily activate the power relay on the remote server (because having Americans on site to do this manually was way more expensive).
His screen indicated that the server had gone down and after a moment came back up. Ganesh now instructed Alekshe to try logging into the server again. "Bla bla bla something incoherent USERNAME something something PASSWORD RESET", was the response he got. Flipping back to the dropdown box, he found the script and instructions for resetting the password on the server.
Upon typing the commands into the software and hitting send, his software made a pleasant "ding" which indicated that the server had accepted the command. "Bla bla bla bla bla something incoherent some long english words something something".
Not sure what to do, Ganesh simply transferred Alekshe to his supervisor. Another server problem solved, with Ganesh representing the hosting company as the hero. He pressed the phone button to take his next call.
Admin
Technical incompetence can be trained away, but absent-mindedness can't. IMO he couldn't be qualified for any senior position that requires working without supervision. If Susan's company wants to keep him, they have to bring in a third technical support person, so that there would be at least one competent senior to supervise him at any time.
Admin
In a misguided attempt to save money, Initech Hosting had hired largely incompetent techs for their "Managed" hosting support line. Of course, solving this was a mere matter of providing a little booklet of step-by-step instructions, "The Book."
So, when Brian received a call requesting a remote reboot, he knew exactly what to do:
One might argue that following instructions literally would result in poor support, but Brian had found that competently following "The Book" to the letter resulted in fewer messes than incompetently following "The Book" to the letter, which is what his coworkers did.
"Alex" was the listed contact name for only one server, CXInc, so Brian "knew" to reset CXInc's server, and did so. Susan, being the listed contact name for a different server, did not have her machine reset.
Brian was unsurprised when Alex called back in a panic, but happily did whatever Alex asked on "his" server. By the time Alex arrived at Initech, Brian had already informed his security officer, as per the instructions written in "The Book," which server Alex would need access to. Well, actually Brian violated "The Book" by pointing the security officer to a different, fresh machine and merely claimed it was the CXInc machine.
This was not the first such data breach. As was common on such consulting jobs, Brian's report had about 100 pages of detail to justify his high consulting fee. Still, his concluding advice (hire competent staff) was neatly summarized in a 10-page PowerPoint.
In Brian's experience, 10% of his clients followed his advice and 90% of his clients spent their "hire competent staff" budget hiring consultants to turn 100-page reports into updated versions of "The Book."
Brian prided himself on his ethics: more jaded consultants used their PowerPoint presentation to recommend updating "The Book" (for a low additional fee).
Admin
They can fire workers in the US for something like that? It could happen to anyone. Here in Finland you should prepare for a lawsuit if you fire people for making a few mistakes at work.
Admin
Weeeell, I managed to bork my server like this - mind you, the commands completed, but I just happened to make a minor change in the networking config right before (which was the reason I wanted to restart it). Guess whether the new config worked ;) But indeed, this was 1) my "home server" (read: an old beige box, almost literally scavenged from trash), 2) running nothing important, and 3) it required me to get off my lazy ass, walk two rooms over and fix it on the physical console. Everyone has done goofed sometimes; alas, I'd expect better of a sysadmin on a remote production server in the middle of the business day.
Admin
Now, now. Let's not go flinging charges of racism around. What we see here is technically ethnocentrism. See, he's not linking the negative traits to physical or genetic causes, but rather to cultural ones. Let's keep our prejudices straight, people.
Admin
I've actually been in many situations where I've had to do this. Granted, I was never willing to do this on a production server when I'm not right at the co-lo.
The restart option can be just as useless as stop; start. It really depends on when the characters are flushed to the console. If the shell tries to flush the characters 'network....stop' to the console when the network is dead, it is going to stall, and the connection will eventually be dropped, killing the session. Piping all of the output to /dev/null would probably make 'stop; start' work, but the real solution is to do the whole operation in 'screen' ;-)
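For anyone without screen handy, here is a minimal sketch of the /dev/null idea, assuming a POSIX shell with nohup available. The helper name run_detached is made up for illustration, and the init script paths are simply the ones quoted earlier in the thread. It only addresses the "session dies mid-sequence" problem; it will not save you if the new network config itself is broken.

```shell
#!/bin/sh
# run_detached is a hypothetical helper name, not a standard utility.
# It runs a whole command sequence in the background, immune to SIGHUP,
# with stdin/stdout/stderr detached so nothing blocks on a dead terminal.
run_detached() {
    nohup sh -c "$1" </dev/null >/dev/null 2>&1 &
}

# Illustrative usage only (paths taken from the thread, not executed here):
# run_detached '/etc/init.d/network stop; /etc/init.d/network start'
```

Because the complete sequence is handed to a separate sh process, the start step no longer depends on the login shell surviving the stop step.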
Admin
Reminds me about my coworker's script
test.sh
Admin
Better late than early
Admin
Is there any good sysadmin who hasn't made such a mistake? I for one have made mistakes like that twice.
One time I had to upgrade sshd on a frontend of a computing cluster due to a vulnerability. So I log in and type:
service sshd stop
service sshd restart
It turned out that did not work as well as I had hoped. I could have typed service sshd stop followed by service sshd start and it would have worked. Or I could just have typed service sshd restart, which is a shortcut for stop followed by start. However, since I used stop followed by restart, what I did was two stop commands followed by a start, which incidentally never happened.
When using stop the first time the script would find the sshd server process and shut it down, but all the child processes handling open connections would stay up. When using stop for the second time the script would not find the sshd server process, because it had already been shut down. So it falls back to killall sshd thereby killing all open connections, including the one in which I had typed the restart command.
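To make the sequence easier to follow, here is a toy model of the behaviour described above. It is only a sketch: the two variables stand in for real processes, the function names are invented, and it is not the actual init script from that system.

```shell
#!/bin/sh
# Toy model only: two flags stand in for real processes.
master=running   # the listening sshd server process
session=alive    # the child process serving our own login

sshd_stop() {
    if [ "$master" = running ]; then
        master=stopped   # normal path: only the server process is shut down
    else
        session=dead     # fallback path (killall sshd): open connections die too
    fi
}

sshd_start() {
    # A command issued from a dead session never runs.
    if [ "$session" = alive ]; then
        master=running
    fi
}

sshd_stop    # "service sshd stop"
sshd_stop    # the stop half of "service sshd restart"
sshd_start   # the start half of "service sshd restart"

echo "master=$master session=$session"
```

In this model the script ends with the server stopped and the session dead, which matches the outcome described: two stops, and a start that never happens.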
I managed to fix that remotely. I happened to have VPN access to the internal nodes of the cluster from where rsh was permitted to the frontend. That way I was able to bring up ssh remotely.
Another time I messed up the iptables configuration on my server at home. Luckily it was only a few hours before I came home and could fix it. It turned out that if there was a syntax error in a rule, the script to load the config would apply the default policy for each chain but leave the chains empty. With an INPUT chain with a default policy of DROP, there was not much that could be done to resolve that remotely.
After having made such mistakes I got to do system administration in a company that takes availability very seriously. Of course mistakes can happen, the important part is to learn from them.
At one time a segment of the network had been reconfigured to change the IP prefix used for that segment. Unfortunately one important server had not been updated. By the time the maintenance window was over and I had to bring this server back into the redundant pool of servers, I found out about this mishap. The server was located in a different timezone, and everybody who had been working on the reconfiguration had gone to bed. So I had a server that didn't know its own IP, and I had to fix it remotely without having any access to the router configuration. I succeeded in doing that. I won't spoil the story by telling you how I managed to ssh to a machine with a network configuration so hosed I couldn't even ping it.
I have made a few mistakes, but nothing so bad I couldn't fix it myself. I have learned to use screen whenever I am doing remote administration.
I think learning from mistakes and being able to bring a hosed system back into a proper state using whatever tools still happen to be working is more important than having made a few mistakes.
And I agree with other comments, that the hosting company providing access to the wrong machine is a bigger WTF than a little typo.
Admin
But he wasn't new, he'd been there a year, and should have learnt that much by then if he was ever going to be suitable for the job.