Admin
Also, the guy in this article is called Alex, who had to do a 5-hour drive to, and another 5-hour drive back. Today's article was posted by some guy named Scott instead of Alex. That means Alex must've been out and about somewhere. And there's an Alex who's doing a 10h trip to and from a hosting company. A coincidence? I think not!
Admin
Yes, how can I help you?
Admin
Uh, we have one like this where I work. I am sorry but I have to deal with remote hands all the time. One of the first things I always do is have them read the hostname listed on the login screen if they have to cart a machine, to make sure it's the right machine.
If the guy had to go by the model of the server, then it sounds like they don't have things set up well, or things aren't correct in the DB, or he doesn't even notice the different hostname on the machine.
Also, as others said, the ; was the problem. I have seen those fail (with things like ethtool, etc.): when the connection dies, it sometimes doesn't run past the ;, so it has to be run in something like screen. That lesson was learned a long time ago. Someone who had already worked there for a year should have known better.
Also weird that the DC doesn't have any type of KVM over IP setup, as driving should really never be necessary when the server is several hours away. Alas, the guy where I work who has been an employee for 2.5 years will often suggest driving out to the other DC (even one 1 hour away) when the issue can easily be fixed without going out, or via remote hands.
Also you guys should really be using IPMI or serial console or something, as that greatly reduces the need for remote hands and avoids mistakes (like the one the DC made), since you can take the DC/remote hands out of the picture.
If the servers were just set up for serial console (with nothing on them), at the very least you could have used another machine in the same rack and had the DC run a serial cable between them.
Admin
It probably didn't kill your SSH connection because you're using DHCP. A lot of times, stopping the network when it's DHCP will just kill the process and leave the IP on the interface. Sometimes it might remove the gateway, but I am thinking you might have been on the LAN with the machine, so while that would have borked normal remote access, it wouldn't have from the LAN.
Admin
Yes people do indeed make mistakes, it happens. Of course there are some people who make mistakes all the time and a lot more than others. Sounds like Alex was one of these.
Also, how you deal with fixing the problem after the mistake shows how good you are as well. IMHO he should not have had to drive out there and fix it in the slowest/most inefficient method possible. Also he was ignoring what a senior employee told him not to do (make changes during business hours).
I have a guy who has worked at the company for 2.5 years. I have tried showing him, a good 15 times, how to fix something that anyone else on the admin team can handle. I have shown others how to do this with no problems, but he just never gets it. He never remembers how to handle things and won't take notes.
I basically have to hold his hand/babysit him through every little thing and it gets really annoying fast. Just the other day I decided I will never help/walk him through anything again and just will wait for him to give up and pass it off to me to fix when he rebooted a server after I explicitly told him not to.
Admin
Wow. I assumed someone out there actually DID watch the NFL and would detect my sarcasm. There are many examples of this, perhaps the best being the latest SuperBowl champion quarterback.
http://en.wikipedia.org/wiki/Aaron_Rodgers#Backup_seasons_.282005.E2.80.932007.29
And for the record, I think sports is a fine analogy here. Unbelievable: a registered user with a rather long post linking to Wikipedia and Akismet rejects it.
Admin
Alex made the mistake of accepting the title but not the pay of a senior position. I would bet a vast majority of new grads would accept a good job and discount the title as some industry thing. An auditor may wonder why they have 2 senior admins with wildly different pay rates. If this is his 1st mistake, firing him would squander the training expense, i.e. a lost day; or do you think the company will learn and not replace him with another rookie?
Admin
No, usually the restart script is just running stop and start anyway. Whether they're being run in a subshell or not, when the connection dies, sshd will send SIGHUP to the shell.
So you could run: nohup /etc/init.d/network restart
Or even:
The sleep is just to make sure that it's disowned before the command can switch off your network.
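The command after "Or even:" did not survive in this copy of the thread. As a rough sketch of the general pattern being described (detach the restart from the SSH session, with a short sleep so the job is safely in the background before the network drops), something like the following might work. The helper name `run_detached`, the log path, and the default delay are assumptions for illustration, not the original poster's command.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): run a command line so
# that it keeps going even if the SSH session that launched it is cut off.
#   $1: command line to run
#   $2: log file for its output
#   $3: seconds to wait before starting (default 2)
run_detached() {
    cmd=$1
    log=$2
    delay=${3:-2}
    # nohup makes the child ignore SIGHUP; stdin, stdout and stderr are
    # redirected so nothing stays attached to the terminal.
    nohup sh -c "sleep $delay; $cmd" >"$log" 2>&1 </dev/null &
}

# On a box like the one in the article this might be called as
# (shown as a comment so the sketch itself does not touch the network):
#   run_detached '/etc/init.d/network restart' /tmp/net-restart.log
```

Because the whole command line runs inside the detached `sh -c`, a stop followed by a start would not be cut in half when the connection drops.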
Admin
Quite simple really ... just f*up your net config files, restart service and boom you're toast.
Did that the first time I played with iptables - 5 minutes later I had a script rewriting the iptables rules every 5 minutes from a verified .sh - in case I made another mistake on the iptables-config.sh I was working on.
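A variant of that safety net can be sketched as a one-shot timer instead of a recurring five-minute job: arm a rollback first, apply the new rules, and cancel the rollback only if you can still get in. This is a sketch under assumptions, not the poster's script; the function name `apply_with_rollback`, the confirm-file mechanism, and the file paths in the usage comment are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical dead-man's switch (a variant of the idea, not the original):
#   $1: command that applies the new, untested config
#   $2: command that restores the known-good config
#   $3: path of a file the operator creates to confirm the change is OK
#   $4: seconds to wait for that confirmation
apply_with_rollback() {
    apply_cmd=$1
    rollback_cmd=$2
    confirm_file=$3
    timeout=$4

    rm -f "$confirm_file"
    # Arm the rollback before applying, so it still fires if the change
    # locks us out. nohup keeps it alive if the session dies.
    ROLLBACK=$rollback_cmd CONFIRM=$confirm_file TIMEOUT=$timeout \
        nohup sh -c 'sleep "$TIMEOUT"; [ -e "$CONFIRM" ] || sh -c "$ROLLBACK"' \
        >/dev/null 2>&1 </dev/null &
    sh -c "$apply_cmd"
}

# Possible use with the two scripts mentioned above (paths are assumptions):
#   apply_with_rollback 'sh ./iptables-config.sh' \
#       'sh ./iptables-verified.sh' /tmp/fw-ok 300
#   touch /tmp/fw-ok   # only if you can still log in afterwards
```

The design choice here is that doing nothing is the safe path: if the operator is locked out and cannot create the confirm file, the known-good rules come back on their own.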
Oh and . the only wtf here is another r*tarded client who thinks they can get real competent people for no moneys --
Oh i see, 3k for a professional website is too much ? NP, i'll give you a few contacts who can do it for cheaper ... using joomla, dreamweaver or another fail toy.
Admin
I once worked for a company (which, when I started there, was 3 people: boss, designer and me, the developer). They boasted 10 years of experience, but the designer was there for 2 years or so, and was the first employee. I was at least the 3rd dev. they saw passing by.
Most of their sites were done in Contribute (a dumbed-down version of Dreamweaver), and either done by the designer, the first developer, or the boss during those 8 or so years he was the only employee there.
In other words: 10 years of web dev. experience... but only through a WYSIWYG editor without knowing anything 'bout the technical aspects behind it, nor about HTML or anything at all. Joomla is, without a doubt, several steps higher up the ladder than that. At least that requires some technical background.
Oh, and they charged about 10 - 15K for a site that's basically an online business card. So, definitely not cheap at all...
Admin
I worked for something similar, too.
This price could include SEO.
But yes, clients mostly paid for luxury. Business luxury, not design luxury.
Admin
Eh . several steps higher up the ladder still gets VB nowhere (oooh yes there is much much worse still ... unfortunately).
I don't think anyone should be nice to Joomla-using people ;)
After all it's just an overly complicated cms that enables non-developers to create/update (often) template-based websites.
I mean . that tool is supposed to help people and even doing the most basic editing looks like a 15-step action ... when simple in-place editing would do the trick far better.
And templates .. don't get me started.
I recently had the option to pick Joomla to offer content editing features to a client .. and realized it would be faster and better to just provide them with a jeditable-enabled version of their website rather than trying to get a real website to work with Joomla.
Admin
Ring, ring.
Larry's mind filled with dread, as he was already engaged in a rather delicate process that could not be interrupted.
Ring, ring.
This was supposed to be a two-man operation. Larry thought of several elven curses for his supervisor who was basking on some beach somewhere, avoiding his involvement in what was likely the worst crisis since December 31, 1999.
Ring, ring.
Larry finished, barely avoiding a re-enactment of the scene from Something About Mary, and rushed to the phone. Before he could even say "Hello, Initech Web Hosting", he was interrupted by a frantic voice on the other end.
"This is going to cost me my #$#$-#$&ing job, you #&$**! You have to restart the network now!"
"Please calm down, sir, and we can get this problem resolved," Larry said as he unlocked his computer. "Can you tell me your account number and password?"
"There's no time! Restart the network now!"
"Sir, I need your account number to log into the PC, and then your account password," Larry patiently replied.
"Fine, it's 8838-384738-384773, and the password is 'admin'".
Larry rolled his eyes...why did no one ever change their default password? "Ok, the number you gave me was 8838-384738-384773".
"Yeah, whatever, just log in and restart the network."
Larry checked the account number, and it matched the account password. He was told the root password and tried to log in. The password did not work.
"Restart the server! I'm the customer, restart it now!!!"
Larry tried to explain that restarting the server was not going to change the root password, but he got shouted down. It didn't matter what he did: he wasn't able to log in. Suddenly the screaming on the other end stopped, and he heard a dial tone.
5 long hours later, a bleary-eyed warrior was startled to consciousness by the sound of a door being ripped off its hinges. Instinctively, Larry grabbed for his sword to battle the murderous, snow-covered maniac that was advancing toward him. Then he remembered: he had no sword.
"Take me to my server!" the wild-eyed madman shouted. Larry tried to help, but the man didn't know his account number. Then it clicked: "Are you the one who wanted me to restart your network?"
"Yes, that's me. I'm going to lose my job. Help me, Larry, you're my only hope."
Put that way, there was no denying his request. Larry pulled up his notes on the ticket and was able to locate the machine. Larry went back to working on his tickets, keeping watch out of the corner of his eye. He wasn't sure how this guy was going to be able to log in unless he had suddenly remembered his password on the trip over.
"You blew it up! Damn you! Damn you all to hell!" Somehow, he had logged in, but reported that key files and even a database were missing. Larry had to make another toilet run, but when he came back the man was sitting at another computer and beaming. "I found the correct computer, and fixed everything!"
"How'd you get in to the other computer?"
"Oh, I used this USB drive. It has a tool on here that allows me to reset my password. That's why I had to come down in person."
In an instant, Larry had the man on the ground, and bound his hands and ankles with zip ties. First, the police were summoned, and then Larry had to explain to the server's owner why their website had gone down several times that day. (Fortunately, it was a website for a lawn service, which understandably did not get much traffic).
"Thanks for catching this guy," the policeman said, "he owes the state thousands in unpaid parking tickets."
The End.
Admin
Of course not. But it does handle a number of situations, including the one mentioned here. (There is no mention that the network config was messed up.)
IF you are so curious to know, however, it IS possible to set up a system that can recover from that sort of error. I worked for a company that had 2000 servers spread across the US. If a change was pushed that caused any of them to lose the ability to phone home, it would roll back the change.
Fun bit of code, but more than what is required when you have a body on the other end that is supposed to be competent to do a system reset.
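The "roll back if it can't phone home" idea might look roughly like this in shell. It is only a sketch under assumptions: the function name `change_or_rollback`, the retry count, and the example check in the comment are invented here and say nothing about how that company's system actually worked.

```shell
#!/bin/sh
# Hypothetical sketch: apply a change, then undo it unless a health check
# (the "phone home") succeeds within a few attempts.
#   $1: command that applies the change
#   $2: health check command (exit status 0 means healthy)
#   $3: command that restores the previous state
#   $4: number of attempts (default 3)
change_or_rollback() {
    apply_cmd=$1
    check_cmd=$2
    rollback_cmd=$3
    attempts=${4:-3}

    sh -c "$apply_cmd"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if sh -c "$check_cmd"; then
            echo "change kept"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    # Every check failed: restore the previous state.
    sh -c "$rollback_cmd"
    echo "rolled back"
    return 1
}

# The check could be something like (hostname is an assumption):
#   ping -c 1 home.example.com
```

Since the check runs on the server itself, the rollback does not depend on anyone being able to reach the machine from outside.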
Admin
I don't know what the API / framework they offer is like, but paying for plugins or not being able to do anything really useful with your CMS says enough for me.
That's why I'm sticking to Drupal. Ok, it's not perfect either and I'd die for in-place editing, but at least it offers decent modules for free and gives me the freedom to create them myself as well -- which I end up doing for pretty much every project, to "glue" everything together.
I've seen plenty of Joomla folks come over to Drupal, but almost nobody go from Drupal to Joomla (except those that are used to Joomla, go over to have a look, don't bother learning anything and leave back for Joomla with a big huff).
Admin
I for one do most of my online shopping after hours (when I'm not working, you know).
Admin
You don't need to die for in-place editing (although it's a worthy cause), just ask and I'll quote you whatever you need ;)
Admin
When .. or is it IF ?
Like in :
"If management and HR knew a competent IT dude when they see one, I'd get lotsa moneys" ?
There are competent people on the other end, sometimes ... on the other hand, in this reality it's more often not the case.
A major fraction of the available tools have at least some degree of "retard-handling" code - and it's a good thing.
Also that bit of code did more than just restart their servers, it brought them back to a "last known good settings" state - not something you can ask any helpdesk person.
But ... I don't think their script can restore from anything unless they actually have a network reboot and all in there ;)
Admin
Excellent title.
Admin
Not hoodaticus here, but I have been a fan of BOFH for a long time.
Admin
Was it entered as one line, just like that?
You'd think that shell would hold the entire semicolon-separated line and work off of the whole line, despite the network not working after the first half of the command.
Would the following work?
( sleep 5 ; sh <<< " /etc/init.d/network stop ; /etc/init.d/network start" ) &
Admin
He is not my/our father/brother/sister/uncle/aunt/zunesis/grandfather/whatever; he is just the father of a retarded son and this is no laughing matter!
Admin
In my eyes best submission so far!
Admin
The Rule of changing network configuration Settings is as follows:
1: During a Major snowstorm or related weather event, you don't change networking configuration settings on a major system.
2: During Working Hours, you don't change networking configuration settings on a major system.
3: Without a plan to change the settings, and to get the system back up lickety-split if those plans fail, you don't change networking configuration settings on a major system.
I don't care if it's wintel, linux, oracle, cisco, juniper, BigIP.
There are 2 caveats to this rule.
A: Working hours are defined as 24/7/365. At which point you beat the accountant who came up with the idea of working everyone 12hrs per day to within an inch of their lives, and proceed to look elsewhere for work. We all know the same accountant will also insist on the cheapest hardware, no vendor support, using the hardware until it implodes, and a big personal paycheck for the effort.
B: If you're working in a production 24/7/365 environment. At which point, either it's down hours or whenever you're at work and don't !$!$!# up because it's your !$!#rse if you do.
Admin
The SOP already was written correctly - don't reboot the network connection during office hours. The SOP was not followed. It doesn't matter how many SOPs are written or how often they are updated, when someone does not follow the SOP, the problem is not with the SOP, it is with the employee.
If the SOP wasn't followed one time, fine. But if this is a pattern of not following the SOP, Alex should be fired for incompetence. Period!
Admin
Exactly, if the hosting company had gotten the correct server in the first place Alex would have had the problem resolved in minutes instead of hours. Poor noobie Alex.
Admin
I'd like to know the rest of the story. I've been the rookie before, and it's almost always been the case for me that my superior sees me as a potential threat rather than as a potential help, has been hostile about training, then behaves as if I'm the idiot rookie when I don't know what to do.
Admin
My mistake.
Admin
For some reason, you didn't like my answer, so you called it speculation. I said I didn't see how. You wouldn't come right out and tell me, just rambling meaningless details instead (when you shop online is meaningless). I shot down your hints, so you acted like a dick (look that up in the dictionary!). And so I acted like a dick back, and you said I must be talking rubbish.
I'm not sure what facts you think I need to prove to you. Maybe I just don't understand your original question - I thought you were asking what the significance of running maintenance tasks "after hours" was. I guess that was speculating.
Admin
I prefer to think of my story as a comical parody using cultural stereotypes that exist because of cultural differences.
If you can't laugh at yourself and you can't laugh at others for being different, then you end up resenting them and starting a group based on their eradication.
Also, as writers, we often try to put ourselves into different people's perspectives. In this case Ganesh, my not so bright Indian in his call center in India, is very stereotypical, and it's always a neat challenge to tell a story from another's (fictional) perspective.
If you can't get past that without getting a racist stick up your ass, you are too high strung for society and for that I prescribe marijuana. But I've gone off on an off topic tangent and will stop now. Smile more and relax people. Enjoy it for what it is, an entertaining story to pass some time.
Admin
Sounds like Susan was a co-worker, not a supervisor.
This dude was hired as a senior system admin right out of college because he was willing to work for cheap. Completely an upper-management screw-up there. Also, he ignored the advice he did get from Susan about restarting after-hours.
Admin
For one thing, it was important enough to be noted in the story.
Who said this was a public-facing website? It may very well have been an internal web-app for employees to create and track orders. A substantial amount of b2b commerce occurs that way.
Admin
guess who just remotely restarted a server after writing the wrong gateway address ^^
and i have 10 years of experience, and 3 years in that particular job...
sometimes, you just make mistakes...
Admin
Please show some sensitivity! My son was not american once and it was an oblivious matter! (Hint: Time to date zune!)
. . .
There is no guess from the story that maintenance work outside business hours caused the loss of orders in those hours (only that maintenance work during business hours caused the loss of orders).
Maybe processing of orders happened only during business hours and orders outside of business hours were saved for later processing?
Admin
Next time put "Nagesh" in your name field!
Admin
Hmm, you would think that the First/Frist/Frixth/whatever jokes would get old after a few years, but I see that TDWTF commentards just won't let it die.
Admin
"And we see the princess, scraping off the royal pudding from the prince's arms, as is of course tradition. "
Admin
shutdown -r now
At least the server would have been back up in a couple of minutes, with completely correct networking configuration ;-)
Admin
However, in context, what were you asking? Why everyone was going on about peak hours? That's what I was answering, and though my choice of words makes it look like a guess, it's actually based on experience.
You're missing the point. Peak hours are defined as the hours when the volume of traffic is highest. I don't need data, because I'm relying on the definitions of the words. Who said anything about being offended? (Now who's speculating?) I only thought your dickish behavior was worth bringing up just to explain my own dickish behavior. Did I hit a nerve? Let me rephrase that - WTF exactly am I supposed to be proving? That the reason people avoid maintenance during peak hours is because it affects the most users? I guess I can't prove that, since it's impossible to really prove anyone's motives. But I bet a lot of people could confirm it; I, for one, can confirm that every time I (or anyone I've worked with) have ever scheduled outages after hours, it was to minimize the affected customers. That's completely stupid. Worrying about who's on support at the hosting company is hardly a concern when compared to that of losing money. Peak hours = lots of users = if your site is down you lose money. How are you still not getting that?