• Erick (cs)

    Millions of dollars in fines plus the risk of being shut out of business vs. the purchase of a single generator.

    Talk about stubborn.

  • andyl (cs)

    It amazes me how companies are willing to risk millions of dollars in liability to save a few bucks here and there. Even after there are frequent problems they still don't get the idea.

  • Reed (unregistered)

    Hmm, my $30 home UPS has this sort of serial cable thing? And I can plug it into my computer? And if power goes down it can schedule a system shutdown? I've always wondered why anyone would want such a feature... :)

  • yakko (unregistered) in reply to Reed
    Anonymous:
    Hmm, my $30 home UPS has this sort of serial cable thing? And I can plug it into my computer? And if power goes down it can schedule a system shutdown? I've always wondered why anyone would want such a feature... :)

    The PAM needed to be up for the whole outage; a nice shutdown only made the return time better. An extended outage would have the same issues whether a nice shutdown was done or not. One of our New Orleans offices was without power for 7 days; it would be nice to know about gas leaks during an outage like that.

  • CD (unregistered)

    Sounds like they need a bigger gas tank for that UPS!

  • /b/ (unregistered) in reply to CD

    More like: "A UPS is fine too"

  • frosty (cs) in reply to CD

    Anonymous:
    Sounds like they need a bigger gas tank for that UPS!

     

    What happens when all the gas stations are bone dry from everyone evacuating, and shipments in are delayed by blocked roads?  Do you call in a refueling helicopter or something? 

  • ParkinT (cs)

    P.A.M.

    Pain in the A**, Man

  • Anonymous (unregistered)

    This one really makes me wince. Yikes. Please tell me where this company is so that I can NEVER MOVE THERE.

  • marvin_rabbit (cs) in reply to frosty
    frosty:

    Anonymous:
    Sounds like they need a bigger gas tank for that UPS!

     What happens when all the gas stations are bone dry from everyone evacuating, and shipments in are delayed by blocked roads?  Do you call in a refueling helicopter or something? 

    An excellent point!  How will they detect gas leaks when there is no gas left!

     ...  hey,  wait a minute .... 

  • Rich (unregistered)

    We had a hard time convincing management to let us set up the servers in our bright new shiny server room with big fancy UPSs so that they would automatically shut down after a period of time. It's not much good having fancy UPSs if your servers go down hard when they finally give out (And since they are overloaded, that's not much time anyway) and you end up with disc corruption (They finally gave in when they had a DR consultant come in and he expressed a few "WTF"s of his own)

     Then there was the WTF of the portable AC unit (main AC was underspecced of course) that shut down when the power went out and couldn't power back on automatically. Mmmm, toasty...

    We're now moving to a new building. AC and UPS have finally been specced correctly (apparently) but a generator was requested but squashed by the higher-ups. Guess they have faith that there won't be any repeat of the 4-day power outage of a few years ago.

     

    Rich 

     

     

  • Raymond Chen (cs)

    I still don't get it. The next time there's a power outage, the PAM stays up, the router stays up, but James can't sign on because he has no electricity!

  • BA (unregistered) in reply to marvin_rabbit
    marvin_rabbit:
    frosty:

    Anonymous:
    Sounds like they need a bigger gas tank for that UPS!

     What happens when all the gas stations are bone dry from everyone evacuating, and shipments in are delayed by blocked roads?  Do you call in a refueling helicopter or something? 

    An excellent point!  How will they detect gas leaks when there is no gas left!

     ...  hey,  wait a minute .... 

    Natural gas, not gasoline.
     

  • Rich (unregistered) in reply to Rich

    I should add that all the bits-and-pieces for installing and wiring a generator are still going in. So clearly they are planning in advance to lock the door after the horse has bolted.

     

    Rich     

  • fork(2) (unregistered)

    Here's what I thought the real WTF was: During an extended power outage, problem monitoring for a utility like this seems like a very important thing to keep running. So if the power is out for too long they SHUT THE FUCKING THING DOWN??!!? People who think their money is more important than public safety don't deserve the right to make ANY decisions about safety.

  • Dazed (unregistered) in reply to Erick
    Erick:
    Millions of dollars in fines plus the risk of being shut out of business vs. the purchase of a single generator.

    Talk about stubborn.

    More to the point: what about the risk of people being killed?

    Appalling.

  • GoatCheez (cs)

    I'm still left wondering how these things are overlooked when designing the system. It's like they didn't even create any scenarios in their head to test against what they thought would work. I gotta say that I'm not surprised though. I'm sure this kind of stuff happens all the time. I'm just the type of person that thinks decisions related to engineering should be left to engineers. Am I crazy for thinking that?

  • Anony Moose (unregistered)

    I love the whole "we haven't had a problem in a long time - see, there was no need for all that, er, hey, what happened to the lights?" aspect to that story.

    And even if James can't sign on, I thought they were supposed to have staff on site 24/7 - and one would assume that they're competent to handle emergencies, and that a night-shift janitor wouldn't qualify as being fully staffed. So where were the on-site staff who could get there in 5 minutes?

    I just hope they perform regular maintenance of the UPS so that they never have the unfortunate case of having a power failure reveal the fact that the shiny new high end UPS is dead. But I wouldn't be that optimistic - they don't even verify that their technicians can access the server remotely.   ;)

  • ptomblin (cs) in reply to Raymond Chen

    Raymond Chen:
    I still don't get it. The next time there's a power outage, the PAM stays up, the router stays up, but James can't sign on because he has no electricity!

     

    You never hear of laptops? I have my cable modem and wireless router on my UPS just in case the cable company aren't complete fuck-ups and they manage to keep the cable modem going through a power outage.  And if they don't, there's always dial-up - at least the phone company usually keeps going through power outages.

  • |+| (unregistered)

    The Real WTF is that James didn't report this to regulators.

  • Tyler Durden (unregistered)

    Woman on plane: Are there a lot of these kinds of accidents?
    Narrator: You wouldn't believe.
    Woman on plane: Which car company do you work for?
    Narrator: A major one.

  • Wickity (unregistered) in reply to |+|

    Agreed.

     

    Well, one of the many WTF's in there anyway. 

  • darin (cs) in reply to andyl

    andyl:
    It amazes me how companies are willing to risk millions of dollars in liability to save a few bucks here and there. Even after there are frequent problems they still don't get the idea.

    Yeah, but think how stupid the management would have looked if they had spent thousands or tens of thousands of dollars on better reliability, and it turned out it wasn't needed?  It's a twisted style of risk management.  People who save money get promoted, but people who spend money with no return on investment have a tougher time justifying their jobs.  The same people who have no problem buying insurance for their car or home may have difficulty applying the same rationalization at work.

     

  • darin (cs) in reply to GoatCheez

    GoatCheez:

    I'm just the type of person that thinks decisions related to engineering should be left to engineers. Am I crazy for thinking that?

    You're already an outlier case just for thinking.

  • Jay (unregistered)

    Recently our 1800 watt UPS gave out in the server room hard. Batteries didn't just die, whole unit died, taking 8 servers with it. In the meantime our primary net admin set up one of our 300 watt UPSes to just act as a surge protector (since 300 would keep 8 servers alive for, oh, a couple of milliseconds, give or take). Not a bad idea considering.

     A few weeks later the replacement UPS arrived. The net admin shut down 7 servers and plugged them into the new UPS. The 8th one, the all-important file and print server, remained on the 300 watt UPS. He decided he would unplug the UPS from the wall and plug it into the big UPS "real quick" and wouldn't worry about shutting it down until after business hours. It sounded flawless.

    That's when he discovered, though, that a power-hungry server will run on a 300 watt UPS for all of 4 seconds before draining the battery. The users were not especially happy that afternoon.
     

  • zerrodefex (unregistered) in reply to Rich

    And I thought it was absurd when the Electrical Contracting Company I had fixed a server for was too stubborn to purchase a UPS unit for it when they were having frequent power outages of their own, crashing the server almost daily.

    captcha: wtf 

  • MisFit (unregistered) in reply to Raymond Chen

    Raymond Chen:
    I still don't get it. The next time there's a power outage, the PAM stays up, the router stays up, but James can't sign on because he has no electricity!

    No, actually Alex garbled up the context.

    James couldn't sign on because the UPS technician came early, and the management decided to switch off PAM without knowing how to do it, or even getting James to do it. That eventually resulted in the next crash.

     

    captcha: creative; Oh yes, WTFs are !!

  • stevekj (cs) in reply to Jay
    Anonymous:

    Recently our 1800 watt UPS gave out in the server room hard. Batteries didn't just die, whole unit died, taking 8 servers with it. In the meantime our primary net admin set up one of our 300 watt UPSes to just act as a surge protector (since 300 would keep 8 servers alive for, oh, a couple of milliseconds, give or take). Not a bad idea considering.

     A few weeks later the replacement UPS arrived. The net admin shut down 7 servers and plugged them into the new UPS. The 8th one, the all-important file and print server, remained on the 300 watt UPS. He decided he would unplug the UPS from the wall and plug it into the big UPS "real quick" and wouldn't worry about shutting it down until after business hours. It sounded flawless.

    That's when he discovered, though, that a power-hungry server will run on a 300 watt UPS for all of 4 seconds before draining the battery. The users were not especially happy that afternoon.
     

    What kind of ancient cheap-a$$ UPS technology (and software) is your company using that doesn't calculate the expected battery runtime for you by dividing the battery's charge level by the current power drain?  When I got this feature on my not-quite-bottom-of-the-barrel unit from a respectable brand name manufacturer, I figured it must be pretty standard.  If I'm drawing more than about half the rated power, it even tells me helpfully that I might want to consider getting a bigger UPS, as my runtime will probably be too short for a safe shutdown.
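    For anyone wondering what that calculation looks like, here is a rough sketch in Python. It is not any vendor's actual firmware or monitoring software: the function names, the 90% inverter efficiency default, and the example numbers are all assumptions made for illustration. Only the idea (remaining charge divided by current drain, plus a warning above roughly half the rated power) comes from the comment above.

```python
def estimated_runtime_minutes(battery_wh, charge_fraction, load_watts,
                              inverter_efficiency=0.9):
    """Estimate runtime as usable stored energy divided by the current load.

    battery_wh          -- nominal battery capacity in watt-hours (assumed known)
    charge_fraction     -- current charge level, 0.0 to 1.0
    load_watts          -- power currently drawn by the attached equipment
    inverter_efficiency -- assumed conversion loss factor, not a measured value
    """
    if load_watts <= 0:
        raise ValueError("load_watts must be positive")
    if not 0.0 <= charge_fraction <= 1.0:
        raise ValueError("charge_fraction must be between 0 and 1")
    usable_wh = battery_wh * charge_fraction * inverter_efficiency
    return usable_wh / load_watts * 60.0


def should_suggest_bigger_ups(load_watts, rated_watts):
    """Warn when the load exceeds about half the unit's rated power."""
    return load_watts > 0.5 * rated_watts


if __name__ == "__main__":
    # Hypothetical unit: 100 Wh battery, fully charged, rated for 300 W.
    minutes = estimated_runtime_minutes(100, 1.0, 250)
    print(f"Estimated runtime: {minutes:.1f} minutes")
    if should_suggest_bigger_ups(250, 300):
        print("Load is above half the rated power; consider a bigger UPS.")
```

    Real units also account for battery age and the fact that batteries deliver less energy at high discharge rates, so treat the result as an upper bound.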

     

     

  • krisztian (unregistered)

    good god. this thing must be up 25/8, and those guys talk about how to make it shut down smoother. hope they don't design airplanes, as they certainly would put an emergency stop button near the door.

  • Local view (unregistered) in reply to frosty
    frosty:

    Anonymous:
    Sounds like they need a bigger gas tank for that UPS!

    What happens when all the gas stations are bone dry from everyone evacuating, and shipments in are delayed by blocked roads?  Do you call in a refueling helicopter or something? 

     Something like that. About 10 years ago, I worked for a company that got it. The whole building was on a UPS for surge protection. They had 3 very large turbines as a triple backup, a 30,000 gallon tank to keep them humming, standing contracts with multiple geographically dispersed fuel suppliers, and monthly drills to fire the whole thing up for 10 minutes, and then switch back - live - midday. Except for the clouds of black smoke billowing out of the exhaust stacks, you couldn't even tell that the switch had occurred. All of this was at both the primary AND disaster sites. It was the one thing to which nobody ever said no (not the business, not the executives - it was $whatever-you-need - *very* rare!)

  • Freman (unregistered) in reply to Local view

    That sounds so much like management at my place. Instead of PAM we have 'Accounting Cache'.

    In essence it's responsible for everything that is our source of income.

     The Accounting Cache is a high performance machine that is built with speed in mind (ie: raid0). All the data on the Accounting Cache is volatile and only exists for 15 days before being erased. Over those 15 days other databases log in and fetch the data for their own accounting purposes, as the Accounting Cache is essentially a dump site for all the accounting data in the network (Don't underestimate the speed at which we go through a few million rows...)

    Every time there's a power outage it takes me 4 hours to recover this box - provided the only thing that gets messed up is the database. Then there's all the lost accounting data - which we rely on for an income stream, oh... and other companies rely on for their income stream. 

     Sure, it was on a UPS at one point, a cheap 2200 va that held it for 20-30 minutes, and sure we have a generator (we have 3). Problem is, these generators are the outback kind (IE: you run your fridge, maybe your TV on them, not $50,000 worth of computer equipment). Did I mention the cheap 2200 va UPS doesn't like the output from the generator?

  • mcguire (unregistered) in reply to Anony Moose

    Anonymous:
    I love the whole "we haven't had a problem in a long time - see, there was no need for all that, er, hey, what happened to the lights?" aspect to that story.

     

    Never, ever, ask "I wonder what would happen if the power went off?" while you are standing in a machine room.  Ever.  Really.  I mean it.

     

  • rmg66 (cs) in reply to Tyler Durden

    Anonymous:
    Woman on plane: Are there a lot of these kinds of accidents?
    Narrator: You wouldn't believe.
    Woman on plane: Which car company do you work for?
    Narrator: A major one.

     Best Movie EVER!

  • Anonymous (unregistered) in reply to GoatCheez

    GoatCheez:
    I'm just the type of person that thinks decisions related to engineering should be left to engineers. Am I crazy for thinking that?

    Yes, engineers are for doing technical day-laborer type work.  Engineering should be performed by supervisors and executive assistants.

  • RonBeck62 (cs)

    I like the picture of the gas powered water pump.  Shouldn't the pump be on a wooden table?

  • Wes (unregistered) in reply to RonBeck62

    100000!!!

  • Jer (unregistered) in reply to |+|

    Anonymous:
    The Real WTF is that James didn't report this to regulators.

    Best point made yet. 

    How will James feel when a gas leak blows up an apartment building and kills 100 people, when he knew about the problems all along. 

    How will management feel when the lawsuits (both civil and criminal) come knocking.

    We all hate regulation, but there is a reason for it (at least in this case).

     

  • Grimoire (cs) in reply to Wes

    Anonymous:
    100000!!!

    Close, but no cigar... 

  • cavemanf16 (cs) in reply to GoatCheez

    A common problem I have found at my work place is that there seems to be some level of leadership (and I can't pinpoint where exactly) that believes that quality is not linked to speed and cost, because measuring quality takes a little more forethought and effort to implement than simply measuring time and cost.

    So, for instance, we are required to work projects and always get them done on time and around the original budget, but once that project is done we don't really have to justify any cost savings, additional revenue, or improved reliability in the system. So guess what suffers on projects? That's right, the quality.

    The same decision is what has been made in James' case. Management decided that they could easily and immediately measure cost (how much will a better UPS cost) and time (how long will it take to make this change) with frightening disregard for quality (are there safety concerns, payback concerns, regulatory fines, or other quality issues that will either NOT be addressed, or negatively impacted by making this change). Sadly, this WTF really isn't that off-the-wall if you ask me. This is just sub-par business as usual.

    It doesn't even take an engineer to think about balancing cost-time-quality like I propose; just someone with more common sense than your average person.
     

  • tin (cs)

    Can I say only in America? Well, I guess I can, but it's probably not true.

     
    I do find it funny that a gas company doesn't even want to spend money on a gas fired generator. It'd be like a petrol company making all their company vehicles pedal-powered.

  • Franz Kafka (unregistered) in reply to frosty
    frosty:

    Anonymous:
    Sounds like they need a bigger gas tank for that UPS!

     

    What happens when all the gas stations are bone dry from everyone evacuating, and shipments in are delayed by blocked roads?  Do you call in a refueling helicopter or something? 

     

    Uh, yes? If it's really that important, you send in a chopper. 

  • Franz Kafka (unregistered) in reply to Raymond Chen

    Raymond Chen:
    I still don't get it. The next time there's a power outage, the PAM stays up, the router stays up, but James can't sign on because he has no electricity!

     

    Heh heh, so cynical, and entirely correct. 

  • N (unregistered)

    I worked for a minor phone company of a small middle european country (minor means 20% market share). There were 240 employees, one main office complex shared with some other companies, and of course BSCs and four MSCs and a bunch of mobile phoning service servers, all reachable by local support within 20 minutes. A power loss in the main center would have meant a 10 minute service shutdown until the backup MSCs and, especially, all routing equipment swallowed the last routing table backup (maximum age of 2 minutes. All data is shared on multiple RAID volumes, every process fully mirrors to a large backup cluster (which, for those who would not guess, backs up to yet another cluster)).

    The whole system runs on a massive UPS (basically a floor full of batteries) for 4 hours. If after 3 hours the power is not back on, 3 diesel engines in the basement start. It takes about 15 minutes until all are up and producing electricity. Those run for 47 days with the gasoline stored below the basement. Every week all engines are started twice and one of the backup clusters switches to diesel generator power source to ensure all works and then switches back. That meant a funny 30 minutes of incredibly loud, booming sound accompanied by shattering and shaking of the whole house twice a week. No one ever even _thought_ about shutting down the backup system for convenience or to reduce costs.

    So a company providing services of minor importance for a limited number of people is able to provide practically indestructible services (excluding, say, a bomb[1]), even if the lights go out throughout the country (and, just to give some idea: Even the headquarters of our military works for only 4 weeks before the fuel runs out - there are large tanks to refuel, though), which is why we have been (and they still are, I think) the second fall-back for emergency communication (police, medics, firefighters, military and parliament, basically), while a major gas supplier can't?

     

    What is wrong with those people.

     Yet there was a WTF. Once a very clever co-worker shut down the main data cluster. The transition to the still working clusters went flawlessly, only 2 minutes service loss. But the WTF was: There was no alarm, no e-mail to anyone, the system silently corrected the error and reconfigured all services to cluster two. We would not have noticed if another co-worker had not tried to access cluster one after the shutdown. We probably would not have noticed until the diesel engines started in case of a real power surge. We quickly set up an alarm system after that, though.
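     The missing piece here is simple to state: a failover that corrects itself silently should still tell somebody. A minimal sketch of that check in Python follows. It assumes a hypothetical monitor that periodically records the name of the active cluster; the function name, the cluster names, and the message format are made up for illustration and are not what was actually deployed.

```python
def failover_alerts(active_cluster_samples):
    """Return one alert message for every change of the active cluster.

    active_cluster_samples -- names of the active cluster as recorded by a
    (hypothetical) periodic health check, oldest sample first.
    """
    alerts = []
    previous = None
    for sample_index, current in enumerate(active_cluster_samples):
        # A change between consecutive samples means a failover happened,
        # even if every service kept running.
        if previous is not None and current != previous:
            alerts.append(
                f"sample {sample_index}: active cluster changed "
                f"from {previous} to {current}"
            )
        previous = current
    return alerts


if __name__ == "__main__":
    samples = ["cluster1", "cluster1", "cluster2", "cluster2"]
    for message in failover_alerts(samples):
        # In a real setup this would page or e-mail someone instead.
        print(message)
```

     The point of alerting on the transition rather than on service health is that the services stayed up the whole time, so a health check alone would have reported nothing.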
     

     

    ad 1: As this story is long already, only a short postscriptum: Funnily enough, if something within one of the server rooms spread across the house catches fire, the whole room is full of N2 within 5 seconds (after a 20 second emergency alarm so people can get out to prevent dying before the doors close) to put the fire out. And just in case the house collapses, there is a small cluster in a city 200 miles from the one the headquarters are located in which can keep up emergency services (as stated above) for 4 weeks and all services for 2, depending on how big the crisis is and what officials decide. 

  • jetcitywoman (cs) in reply to N

    I used to do programming and system support for a busy 911 dispatch center.  As I later learned is typical of small government, dispatch was shoved into a dark corner of the basement of the old government building.  The computers were in another room, also in the basement, but they had a very large battery-powered UPS system.  While I worked there, we had several total power failures in downtown that lasted several hours, and our servers hummed merrily along.

    The room that the dispatchers were in, on the other hand, was on city power with no UPS whatsoever.   When we had downtown blackouts, our 911 dispatchers were suddenly without any phones, radios, or computer access.  Oh, and no emergency lighting either.  When it went black and silent, there would be a momentary pause as everybody realized what happened, before they'd start scrambling for flashlights and cell phones.


     

  • cheesy (cs) in reply to Raymond Chen
    Raymond Chen:
    I still don't get it. The next time there's a power outage, the PAM stays up, the router stays up, but James can't sign on because he has no electricity!

    Sure he does... he's on another UPS.

    Oh...right...
  • Kay (unregistered) in reply to Jay
    Anonymous:

    That's when he discovered, though, that a power-hungry server will run on a 300 watt UPS for all of 4 seconds before draining the battery. The users were not especially happy that afternoon.
     

    That's about 1200Ws or 1000As at 1.2V or 300mAh at 1.2V. Your UPS was either powered by a single, half drained AAA cell or broken to begin with. Or you might be talking out of your ass, I'm not completely sure.
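    The arithmetic behind that estimate can be checked in a few lines of Python. This is only a sketch, and it assumes the server really drew the UPS's full 300 W rating for the whole 4 seconds, which the original story does not state. The function and variable names are made up for illustration.

```python
def implied_battery_capacity(load_watts, runtime_seconds, cell_volts):
    """Work backwards from load and runtime to the charge the battery held.

    Returns (energy in watt-seconds, charge in amp-seconds, charge in mAh).
    """
    energy_ws = load_watts * runtime_seconds   # energy = power * time
    charge_as = energy_ws / cell_volts         # charge = energy / voltage
    charge_mah = charge_as / 3600 * 1000       # amp-seconds -> milliamp-hours
    return energy_ws, charge_as, charge_mah


if __name__ == "__main__":
    energy_ws, charge_as, charge_mah = implied_battery_capacity(300, 4, 1.2)
    print(f"{energy_ws} Ws, {charge_as:.0f} As, {charge_mah:.0f} mAh at 1.2 V")
```

    The exact figure works out to about 278 mAh, which the comment rounds up to 300 mAh.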

  • N^2 (unregistered)

    Those of you that work for utilities know how frighteningly common this sort of "pinch pennies to spend dollars" mentality is. The rest of you ... well, it's what I consider irrefutable proof that the universe is innately benign. Sleep tight.

  • anonymouse (unregistered) in reply to Jay
    Anonymous:

    Recently our 1800 watt UPS gave out in the server room hard. Batteries didn't just die, whole unit died, taking 8 servers with it. In the meantime our primary net admin set up one of our 300 watt UPSes to just act as a surge protector (since 300 would keep 8 servers alive for, oh, a couple of milliseconds, give or take). Not a bad idea considering.

     A few weeks later the replacement UPS arrived. The net admin shut down 7 servers and plugged them into the new UPS. The 8th one, the all-important file and print server, remained on the 300 watt UPS. He decided he would unplug the UPS from the wall and plug it into the big UPS "real quick" and wouldn't worry about shutting it down until after business hours. It sounded flawless.

    That's when he discovered, though, that a power-hungry server will run on a 300 watt UPS for all of 4 seconds before draining the battery. The users were not especially happy that afternoon.
     

     This is where redundant dual power supplies in servers are handy.  You can unplug and replug each PSU without taking the server down.  Very handy when your UPS starts reporting a battery fault...

     
