• Michael Brecht (unregistered)

    Poor Dave. I feel sorry for him.

  • matthewr81 (cs)

    Ah... no red light that kept asking while he fixed it "What are you doing Dave..."

  • WoeIsMe (unregistered)

    Now we know why Dave was leaving: Karma Escape

    captcha: worry

  • (cs)

    His database had arrived at the Perly gates.

  • (cs) in reply to matthewr81
    matthewr81:
    Ah... no red light that kept asking while he fixed it "What are you doing Dave..."
    "I'm going to fix you." "Without your knowledge of Perl? You're going to find that rather difficult."

    (...it's a fair use of the quote given that in both scenarios the most likely outcome is some dude's head exploding)

  • (cs)

    He probably should have tested those backups and buffer overflows even if Dave wasn't worried about it.

  • Dave (unregistered) in reply to Michael Brecht
    Michael Brecht:
    Poor Dave. I feel sorry for him.

    Eh... don't worry 'bout it.

  • wk2x (unregistered)

    It's the story of Dave and Bowie, and a labyrinth of wtf-ridden code.

  • DAL1978 (cs)

    Wouldn't the quickest way to get the system back up again be to remove the last name added and re-run the script? Removing the last name would reduce the size of the file back below the limit so it should therefore run OK. Obviously until the script is rewritten you wouldn't be able to add anybody new, but at least existing users would be able to gain access.

    I am interpreting this as a problem migrating the list of users around the network, so the original list would be intact but too long. However, I may be entirely wrong on this....

  • ozyman (unregistered)

    You know, it's really Bowie's fault for not looking more closely at his system. We all inherit crap from time to time, but he had a couple of months to check up on all those bad feelings he had... What was he doing while it was all working well? Playing ZAngband?

  • Matthew (unregistered) in reply to Michael Brecht

    Ugh! I HATE DDS/DAT tapes. They are so SLLOOOOOWWWWW. Especially when they're unreadable.

  • SomeCoder (unregistered) in reply to ozyman
    ozyman:
    You know, it's really Bowie's fault for not looking more closely at his system. We all inherit crap from time to time, but he had a couple of months to check up on all those bad feelings he had... What was he doing while it was all working well? Playing ZAngband?

    Agreed. Every Unix admin should know Perl so Bowie was kind of unqualified in my opinion. Had I been in that position, I would have spent the first few weeks automating all the stuff that could be automated and fixing the Perl script along with whatever else needed to be fixed from Dave's feeble attempt at being an admin.

    After you do that, the failure would not have occurred, and Bowie could go to surfing the internet and fixing the occasional random problem.

  • Matthew (unregistered) in reply to DAL1978
    DAL1978:
    Wouldn't the quickest way to get the system back up again be to remove the last name added and re-run the script? Removing the last name would reduce the size of the file back below the limit so it should therefore run OK. Obviously until the script is rewritten you wouldn't be able to add anybody new, but at least existing users would be able to gain access.

    I am interpreting this as a problem migrating the list of users around the network, so the original list would be intact but too long. However, I may be entirely wrong on this....

    Since he pushed a broken database to all the machines, the scripts may not have been able to authenticate to push out a fixed copy.

  • brian j. parker (unregistered)

    I have to at least respect someone who takes responsibility for their WTF.

  • real_aardvark (cs)

    Interesting version of TDWTF's "anonymization" on this one.

    I'm sure that alarm bells, both on the interesting method of updating network passwords, and on the issue of verifying backup tapes (c'mon -- how hard can that be, even for a SysAdmin?) started ringing.

    Equally, I'm sure they didn't start ringing during the handover. I suspect they started ringing at just about the point that Dave's phone exploded. In other words, an ex post facto justification if ever I saw one.

    And what's this about DB2, for Chrissake? Are we adding a DBAdmin WTF to a multiple SysAdmin WTF?

    Still, we live and learn. I'm damn sure that Dave has...

    ed: as with most articles, only names were changed

  • voidy (cs)

    floccinaucinihiliblackmetalipilificationbicarbonateblutausnord?

    nihil? black metal? blut aus nord?

    i get the feeling someone enjoys the dark side of thrash hehe

  • real_aardvark (cs) in reply to real_aardvark
    real_aardvark:
    Interesting version of TDWTF's "anonymization" on this one.

    I'm sure that alarm bells, both on the interesting method of updating network passwords, and on the issue of verifying backup tapes (c'mon -- how hard can that be, even for a SysAdmin?) started ringing.

    Equally, I'm sure they didn't start ringing during the handover. I suspect they started ringing at just about the point that Dave's phone exploded. In other words, an ex post facto justification if ever I saw one.

    And what's this about DB2, for Chrissake? Are we adding a DBAdmin WTF to a multiple SysAdmin WTF?

    Still, we live and learn. I'm damn sure that Dave has...

    Oh, and if that sounds deprecatory to Dave, I'd like to balance it with one of my own SysAdmin screw-ups.

    I was installing some new terminal drivers (or something, I forget what) that required a company-wide reboot. (Not Unix, but VOS, which is similar in some respects.) Unfortunately, it took me until 11:30 am before I'd got everything set up for the reboot, and the pub down the road was about to open ...

    ... and I think you can see where this is going. "No problem," thought I, "I'll just stick the reboot in the batch queue for mid-day, when everybody else will be down the pub."

    Turns out that everybody else, up to and including the company's owners, were indeed down the pub. In fact, they'd all done a pub crawl. Not for beer: just looking for me. (I hadn't mentioned where I was going, but they were pretty sure it was one of the twenty or so pubs in town.)

    Apparently, the first thing that VOS does when it starts up is to restart the batch queue. Including the last command in the batch queue, which it assumes didn't complete. Which would be the forced reboot.

    In those days, it took about two and a half seconds to restart the batch queue, although you could cancel a job from the console. Luckily I'm a fast typist.

  • (cs)

    Dave and Bowie? I assume the servers were the Spiders from Mars then?

  • Strider (cs) in reply to real_aardvark
    real_aardvark:
    real_aardvark:
    Interesting version of TDWTF's "anonymization" on this one.

    I'm sure that alarm bells, both on the interesting method of updating network passwords, and on the issue of verifying backup tapes (c'mon -- how hard can that be, even for a SysAdmin?) started ringing.

    Equally, I'm sure they didn't start ringing during the handover. I suspect they started ringing at just about the point that Dave's phone exploded. In other words, an ex post facto justification if ever I saw one.

    And what's this about DB2, for Chrissake? Are we adding a DBAdmin WTF to a multiple SysAdmin WTF?

    Still, we live and learn. I'm damn sure that Dave has...

    Oh, and if that sounds deprecatory to Dave, I'd like to balance it with one of my own SysAdmin screw-ups.

    I was installing some new terminal drivers (or something, I forget what) that required a company-wide reboot. (Not Unix, but VOS, which is similar in some respects.) Unfortunately, it took me until 11:30 am before I'd got everything set up for the reboot, and the pub down the road was about to open ...

    ... and I think you can see where this is going. "No problem," thought I, "I'll just stick the reboot in the batch queue for mid-day, when everybody else will be down the pub."

    Turns out that everybody else, up to and including the company's owners, were indeed down the pub. In fact, they'd all done a pub crawl. Not for beer: just looking for me. (I hadn't mentioned where I was going, but they were pretty sure it was one of the twenty or so pubs in town.)

    Apparently, the first thing that VOS does when it starts up is to restart the batch queue. Including the last command in the batch queue, which it assumes didn't complete. Which would be the forced reboot.

    In those days, it took about two and a half seconds to restart the batch queue, although you could cancel a job from the console. Luckily I'm a fast typist.

    BRRILLIANT!! An infinite reboot loop. wait, I thought happy hour wasn't till 4p. I like how everyone knew where to go looking for you.

  • Coyo7e (unregistered) in reply to voidy
    voidy:
    floccinaucinihiliblackmetalipilificationbicarbonateblutausnord?

    nihil? black metal? blut aus nord?

    i get the feeling someone enjoys the dark side of thrash hehe

    ..And David Bowie, as well.

  • Crash Magnet (unregistered)

    Prepare three letters...

    Crash Magnet

  • (cs) in reply to Strider
    Strider:
    BRRILLIANT!! An infinite reboot loop. wait, I thought happy hour wasn't till 4p. I like how everyone knew where to go looking for you.
    Believe me, in England, happy hour is whenever the pubs are open. And only, ever, then.

    And try typing that fast when you've got four pints of HSB inside you ....

  • Ken (unregistered) in reply to Coyo7e
    Coyo7e:
    voidy:
    floccinaucinihiliblackmetalipilificationbicarbonateblutausnord?

    nihil? black metal? blut aus nord?

    i get the feeling someone enjoys the dark side of thrash hehe

    ..And David Bowie, as well.

    ... And David Bowie's pants as well.

    captcha: muhahaha

  • Zach Beane (unregistered)

    Wow, this sounds just like Bowie Poag!

    Here's some fine C code from Bowie Poag's "pogo" project:

    system("rm -f ~/.pogo-3.0/.pogo-lock");
    system("touch ~/.pogo-3.0/.pogo-lock");

    The ultimate in portability!

  • AgentConundrum (cs) in reply to Zach Beane
    Zach Beane:
    system("rm -f ~/.pogo-3.0/.pogo-lock");
    system("touch ~/.pogo-3.0/.pogo-lock");

    The ultimate in portability!

    Actually, that would work just fine on my Windows machines. UnxUtils 4tw!

    http://unxutils.sourceforge.net/

    Edit: if anyone clicks the link and likes what they see, I had some trouble with man not working. I replaced it with man.bat:

    @echo off
    %1 --help
    
  • (cs) in reply to Ken
    Ken:
    Coyo7e:
    voidy:
    floccinaucinihiliblackmetalipilificationbicarbonateblutausnord?

    nihil? black metal? blut aus nord?

    i get the feeling someone enjoys the dark side of thrash hehe

    ..And David Bowie, as well.

    ... And David Bowie's pants as well.

    captcha: muhahaha

    Well, the "nihil" is part-Shakespearean, as in "floccinaucinihilipilification." I dread to think what "Bard-metal" might be. On the other hand, many people claim that metal started in Birmingham, just down the road from Stratford-on-Avon, and the original home of Ozzy Osbourne, so perhaps Will was on to something. Or on something.

    Either way.

  • Admiral Justin (unregistered) in reply to AgentConundrum
    AgentConundrum:
    Actually, that would work just fine on my Windows machines. UnxUtils 4tw!

    http://unxutils.sourceforge.net/

    Edit: if anyone clicks the link and likes what they see, I had some trouble with man not working. I replaced it with man.bat:

    @echo off
    %1 --help
    

    Actually, I prefer the package selection from http://gnuwin32.sourceforge.net/. I've had no problems with them, other than a crash bug a couple of years ago when I tried to grep a folder containing about 15,000 text files, but it's been fixed.

  • Jonathan (unregistered)

    This WTF is almost exactly like my experience a couple years ago. Fortunately, I tore out and replaced the backups and the user/authentication system within the first month.

    My bad experience was that I had to be more strict about things, and I kept being compared to the previous guy, because he'd let them get away with things, and didn't have a regular maintenance period where they couldn't use the HPC clusters. Before I arrived, the update policy was "Don't break things with new updates."

  • Ambo (unregistered)

    Any time we point out to my boss something that is currently working but will eventually (usually sooner than later) blow up, no matter how we try to explain how much cheaper, faster, and easier it would be to fix it today than after it's broken, we get the same response: "That sounds like work." So we are constantly trying to recover from crap that should have been prevented in the first place.

  • WC (unregistered)

    You know, these things are getting more boring all the time. It's like your Uncle telling stories of when he was a kid, but he can't help but elaborate on every little detail, no matter how inconsequential to the story, simply because he has a captive audience.

    Try to cut these things down to just the original story, instead of adding all the extra crap. You're boring us.

  • Trix (unregistered)

    One, as others have pointed out, he knew there were problems with the code and potential problems with the backups - and didn't test them in advance?

    Second, as much as he would've loved to point the finger of blame on Dave, professional integrity prevented him from doing so.

    Since when? If I inherit a PoS system from someone, I'm not going to shuffle my feet and grin when it fails on me and I have to shovel the resulting crap, or I have to spend significant amounts of time getting it into a decent state.

    The guy should have had the balls to say "Unfortunately the system wasn't in a great state when I started, and even more unfortunately, I didn't rectify the potential issues in time." and included a huuuuge mea culpa about not testing the backups! Of course, there is a built-in time limit for that kind of thing.

    The best thing in those situations is to spend some time writing a report to management right at the start identifying potential issues (and hopefully fixing the worst ones as you go along). While dropping someone else in it is never fun, getting some leeway from management for you to deal with the issues is better. Also, you can write it in such a way that you don't sound all "X was an idiot, and look at the mess he made".

  • Mitchell T (unregistered) in reply to Trix

    He didn't blame the prior admin because he had at least a month to do HIS due diligence.

    Which to me would have been:

    Test that the backups work by restoring to a separate system and bring up a database. Start learning perl (honestly, what admin doesn't know perl?) and rewrite at worst, update the existing script to gracefully handle errors.
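
    A minimal sketch of that first check, assuming the backup is an ordinary tar archive created with `tar -cf ARCHIVE -C SOURCE_DIR .` rather than the tape from the story. `verify_backup` is a hypothetical helper: it restores into a scratch directory and compares the result against the live tree, so a bad archive is found before it is needed.

```shell
#!/bin/sh
# Hypothetical helper: restore a tar archive into a scratch directory
# and compare it with the source tree it was taken from.
# Usage: verify_backup ARCHIVE SOURCE_DIR
verify_backup() {
    archive=$1
    source_dir=$2
    scratch=$(mktemp -d) || return 1

    # A restore that fails is already a failed backup.
    if ! tar -xf "$archive" -C "$scratch" 2>/dev/null; then
        rm -rf "$scratch"
        return 1
    fi

    # diff -r exits non-zero if any file differs or is missing.
    diff -r "$source_dir" "$scratch" >/dev/null
    status=$?
    rm -rf "$scratch"
    return $status
}
```

    Run it against a quiet copy of the data; files that change after the archive is taken will show up as differences.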

    You cannot blame your predecessor for things you had time to fix before they broke. That is his job. Period. He also should have done what you said and alerted management at the first opportunity. That way they at least have a good idea of what might have happened when things hit the fan.

    But blaming your predecessors for things you are paid to maintain/fix? That is a wtf in itself. If you want a real wtf I have seen, take this lovely abbreviated snippet of Korn shell:

    VAR=some command that normally returns a path, except when it doesn't
    ...
    ...
    rm -Rf /${VAR}

    (For you non-Unix people: rm -Rf means recursively force removal of everything under the path specified. Since the variable happened to be null, it dutifully removed the entire OS while it was running. Fortunately HP-UX doesn't let you remove files with processes attached, so it took a while to notice nothing worked anymore. :)

    And yes, there were NO checks of the variable or of the return code of the command. This happily brought an HP-UX DB server down quite perfectly. And this by one of the company's "best programmers".

    Riiight.
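
    The missing check could be sketched like this. `safe_remove` and `find_path` are hypothetical names, not from the original script, and the guard only covers the empty-variable and `/` cases described above.

```shell
#!/bin/sh
# Hypothetical guard around the rm from the snippet above: refuse to run
# when the target is empty, is the root directory, or is not a directory.
safe_remove() {
    target=$1
    case $target in
        ''|/)
            echo "refusing to remove '${target}'" >&2
            return 1
            ;;
    esac
    if [ ! -d "$target" ]; then
        echo "not a directory: $target" >&2
        return 1
    fi
    rm -rf -- "$target"
}

# Usage, checking the command's return code as well as its output
# (find_path is a stand-in for the original, unnamed command):
#   VAR=$(find_path) || exit 1
#   safe_remove "$VAR" || exit 1
```

    A shorter alternative is `rm -Rf "/${VAR:?}"`, which makes a non-interactive shell stop with an error when VAR is unset or empty.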

  • RedNek (unregistered)

    The real WTF: his hacking rhythm "Hack, hack, hack, breathe, hack, blink, hack, hack, breathe, breathe, hack," that's just ridiculous; everyone knows you should always "Hack, hack, hack, breathe, hack, blink, hack, hack, hack, breathe, hack, blink ..."

  • tfug (unregistered) in reply to Zach Beane

    "This guy's kind of a dick, thought Bowie."

    Ding ding ding, it's Bowie Poag!

    Bowie, why don't you tell them about the tube top.

  • dkf (unregistered) in reply to Mitchell T
    Mitchell T:
    And this by one of the company's "best programmers".
    Did they become one of the company's best ex-programmers? Please?
  • Tias (unregistered) in reply to real_aardvark
    real_aardvark:
    real_aardvark:
    Interesting version of TDWTF's "anonymization" on this one.

    I'm sure that alarm bells, both on the interesting method of updating network passwords, and on the issue of verifying backup tapes (c'mon -- how hard can that be, even for a SysAdmin?) started ringing.

    Equally, I'm sure they didn't start ringing during the handover. I suspect they started ringing at just about the point that Dave's phone exploded. In other words, an ex post facto justification if ever I saw one.

    And what's this about DB2, for Chrissake? Are we adding a DBAdmin WTF to a multiple SysAdmin WTF?

    Still, we live and learn. I'm damn sure that Dave has...

    Oh, and if that sounds deprecatory to Dave, I'd like to balance it with one of my own SysAdmin screw-ups.

    I was installing some new terminal drivers (or something, I forget what) that required a company-wide reboot. (Not Unix, but VOS, which is similar in some respects.) Unfortunately, it took me until 11:30 am before I'd got everything set up for the reboot, and the pub down the road was about to open ...

    ... and I think you can see where this is going. "No problem," thought I, "I'll just stick the reboot in the batch queue for mid-day, when everybody else will be down the pub."

    Turns out that everybody else, up to and including the company's owners, were indeed down the pub. In fact, they'd all done a pub crawl. Not for beer: just looking for me. (I hadn't mentioned where I was going, but they were pretty sure it was one of the twenty or so pubs in town.)

    Apparently, the first thing that VOS does when it starts up is to restart the batch queue. Including the last command in the batch queue, which it assumes didn't complete. Which would be the forced reboot.

    In those days, it took about two and a half seconds to restart the batch queue, although you could cancel a job from the console. Luckily I'm a fast typist.

    Wait, you were drinking during work hours? Is that, uhm, cool over there? 'Cause if it is, I'm moving to England!

  • Ed (unregistered)

    Two things about this site:

    Fix the line wrap in Full Article mode

    Make a cookie or whatever so the site comes up in Full Article mode when I come back tomorrow.

    Thank you!

  • (cs)

    Let me get this right... guy who seems a bit unqualified takes over as a sysadmin.

    He knows the add users script is likely to fail. He knows the tape backup is likely to fail.

    He does nothing about either of these things, sits on his backside and waits for them to fail, then panics because he doesn't know how to do his job properly.

    I'm glad Bowie isn't running my network!

  • Mike Dimmick (unregistered) in reply to real_aardvark
    real_aardvark:
    And what's this about DB2, for Chrissake? Are we adding a DBAdmin WTF to a multiple SysAdmin WTF?

    Unless this is some highly weird IBM implementation, NIS/YP uses Berkeley DB. Now that is a WTF in itself. A proper database would have either applied all the changes or rolled them all back.

  • (cs) in reply to Strider
    Strider:
    real_aardvark:
    Apparently, the first thing that VOS does when it starts up is to restart the batch queue. Including the last command in the batch queue, which it assumes didn't complete. Which would be the forced reboot.

    In those days, it took about two and a half seconds to restart the batch queue, although you could cancel a job from the console. Luckily I'm a fast typist.

    BRRILLIANT!! An infinite reboot loop.

    I just recovered my workstation from an infinite reboot loop. TIP: Don't tell Windows that you have an "APC Back-UPS" on COM1 when you don't. If the drivers fail to find that model of UPS on that port, or even if there's nothing plugged in, they assume you have no power and shut down the PC. Start up, 5 sec wait, automatic shutdown. I managed to change it to the correct UPS in the end ("APC Smart-UPS"), but the shutdown was already queued, so the PC shut down anyway.

  • Pooma (unregistered) in reply to voidy
    voidy:
    floccinaucinihiliblackmetalipilificationbicarbonateblutausnord?

    nihil? black metal? blut aus nord?

    i get the feeling someone enjoys the dark side of thrash hehe

    Floccinaucinihilipilification (sic?) is "a thing of no use", thus spawning a new class of words which are what they mean - meanomaticpayees if you like. It's also very hard to spell, and is, as a lot of us probably already know, the longest word you are likely to find in most dictionaries. There's some kind of silicon-in-lung disease that isn't considered a 'real' word.

    Yes, we are trivia monkeys here.

    Is Blut aus Nord "Blood of the North"?

  • steve (unregistered) in reply to Mitchell T

    rm -Rf /${VAR}

    I had the same thing in some legacy code... slightly better in that it was

    cd ${VAR}
    rm -Rf .

    First people came to me and said that they couldn't log into the machine... then they told me, oh yeah, their home directory disappeared seemingly randomly on that machine...
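
    One likely explanation: with an empty, unquoted ${VAR}, `cd` runs with no argument and changes to $HOME, so the rm then works on whoever's home directory that is. A minimal sketch of a guarded version, with `clean_dir` as a hypothetical name; it assumes the goal is to empty the directory while keeping the directory itself.

```shell
#!/bin/sh
# Hypothetical wrapper: empty a directory only if a name was given
# and the cd into it actually succeeded.
clean_dir() {
    dir=$1
    if [ -z "$dir" ]; then
        echo "clean_dir: no directory given" >&2
        return 1
    fi
    (
        # Subshell, so the caller's working directory is untouched.
        cd -- "$dir" || exit 1
        # Direct children only (-prune stops descent), hidden ones
        # included; "." itself is never handed to rm.
        find . ! -name . -prune -exec rm -rf -- {} +
    )
}
```

    Quoting "$dir" matters as much as the check: quoted, an empty value makes cd fail instead of silently going home.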

  • Cloak (unregistered) in reply to SomeCoder
    SomeCoder:
    ozyman:
    You know, it's really Bowie's fault for not looking more closely at his system. We all inherit crap from time to time, but he had a couple of months to check up on all those bad feelings he had... What was he doing while it was all working well? Playing ZAngband?

    Agreed. Every Unix admin should know Perl so Bowie was kind of unqualified in my opinion. Had I been in that position, I would have spent the first few weeks automating all the stuff that could be automated and fixing the Perl script along with whatever else needed to be fixed from Dave's feeble attempt at being an admin.

    After you do that, the failure would not have occurred, and Bowie could go to surfing the internet and fixing the occasional random problem.

    I must agree with that. You wouldn't let things run like that just because the guy who left said "Don't worry". And also, how come the user name could be longer than 1024 characters??? Or was it the entire list of allowed users??? But then why is there a limit? And who creates those user names? There must still be a second person who should have known about the limit, or was Bowie the "system administrator"? Then it would be his fault.

    After all, Bowie seems not to be the best person for this job. He didn't take action when it was needed and wasn't up to date when the catastrophe started. He relied too much on "Don't worry" and didn't check for himself whether he really had nothing to worry about.

  • (cs) in reply to real_aardvark
    real_aardvark:
    (Not Unix, but VOS, which is similar in some respects.)
    Wow, someone else who's worked with VOS. I've even used it at THREE different jobs (though #1 was as a consultant to the company of #2)....
  • (cs)

    While we're confessing our own little WTFs: I once did some work on a server by ssh'ing in from home, and wanted to reset the main Ethernet device. Out of sheer force of habit (I rarely ssh'ed at the time, but occasionally needed to reset an interface), I did "ifdown eth0". Yes, eth1 existed on the hardware, but no, it was not enabled.

    Fortunately this was experimental, not production, so neither the client nor my boss was peeved when I called the client and asked him to have someone reboot the server. (Yes, I could have asked him to have someone log in and "ifup eth0", but that would have either required directions or assumed competent personnel. Rebooting, I figured, they probably had to do once in a while anyway, and would know how to do safely....)
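
    The usual way around this is to hand both halves of the reset to one detached command, so the "up" still runs after the SSH session drops. A minimal sketch, assuming the ifdown/ifup commands from above are available on the server; `bounce_iface` and its optional command arguments are hypothetical, added so the function can be tried without touching a real interface.

```shell
#!/bin/sh
# Hypothetical helper: take an interface down and bring it back up in a
# single detached command, so losing the SSH session cannot strand it.
# Usage: bounce_iface IFACE [DOWN_CMD] [UP_CMD]
bounce_iface() {
    iface=$1
    down_cmd=${2:-ifdown}
    up_cmd=${3:-ifup}
    # nohup keeps the command alive after the terminal hangs up;
    # the up command runs regardless of how the down command exits.
    nohup sh -c '"$1" "$3"; sleep 1; "$2" "$3"' sh \
        "$down_cmd" "$up_cmd" "$iface" >/dev/null 2>&1 &
}
```

    Called as `bounce_iface eth0`, it returns immediately and the reset happens in the background; the session may still stall for a moment while the interface is down.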

  • (cs)

    Ah.... the mark of a "true" professional - quit before it blows. The blame will always fall onto the person who took over your job.

    I sometimes wished I owned a gun and knew the address of past developers who wrote code I have had to maintain and fix.

  • (cs) in reply to WC
    WC:
    You know, these things are getting more boring all the time. It's like your Uncle telling stories of when he was a kid, but he can't help but elaborate on every little detail, no matter how inconsequential to the story, simply because he has a captive audience.

    Try to cut these things down to just the original story, instead of adding all the extra crap. You're boring us.

    Don't read them then. Man, you whiners just don't get it, do you? If you don't like the posts, don't read them. It's that simple.

    What gets posted and how the content is written isn't up to you, or me, or anyone else but Alex and people he thinks should decide. If you don't like it, don't read it.

    Have I mentioned that if you don't like it you shouldn't read it? I figure if I say it enough, you might understand. You know.... Sorta like training a not-too-bright puppy. You tell him "No!" when he starts to pee on the carpet, and you put him outside to finish. Eventually, he learns that he shouldn't do that. The less intelligent the puppy, the more times the process has to be repeated. So have you got it yet, or do you need to be told more times?

  • Dan (unregistered) in reply to Tias

    Yeah, that's quite common, albeit a bit less so now than it used to be.

  • anon (unregistered)

    And the lesson is, there really was nothing to worry about. The users didn't come with pitchforks and torches. They never do. The world wasn't pitched into economic chaos. It never is. Sysadmins are too uptight, and need to relax a little. Maybe smoke some dope.
