• akatherder (unregistered) in reply to Mark Ransom
    Mark Ransom:
    I know this varies between companies of differing sizes, but if my company lost half a day's data, that's millions of dollars down the drain and someone's going to lose a job over it. Does the general feeling of "a day or two of data loss isn't so bad..." that has been somewhat prevalent on this thread strike anyone else as a little WTF'ish?

    Mark

    A day or two is unacceptable, but by the nature of backups you're going to lose that half a day's worth of data.

  • Schmed (unregistered) in reply to Mark Ransom
    Mark Ransom:
    I know this varies between companies of differing sizes, but if my company lost half a day's data, that's millions of dollars down the drain and someone's going to lose a job over it. Does the general feeling of "a day or two of data loss isn't so bad..." that has been somewhat prevalent on this thread strike anyone else as a little WTF'ish?

    Mark

    The trouble seems to be that most of the discussion is geared towards a "home" or "company with <5 employees" situation. They aren't serious solutions for serious businesses. I know our daily backup is one of the most important things in my job; if things start going wrong with it, I start flapping.

  • (cs) in reply to Schmed
    Schmed:
    Loren Pechtel:
    Ok, the price of tapes has dropped relative to drives since the last time I priced them. The last I was looking the tapes were slightly less per GB but there was no way that would ever overcome the price of the tape drive.

    Yeah, the only way I can justify the cost of three tape drives (the last one was £850) and the software to handle the backups is by using so many tapes that the £/GB savings completely outweigh the cost of the drives.

    My previous job used tape backup, and this was a policy my boss and I imposed.

    A DDS-4 40 GB DAT tape costs about 75 pesos (something like 7 bucks).

    And those using CDs or DVDs for backup ... just wait and see when those things fail. I lost about 6 GB and 5 years' worth of data because of CRC failure.

    I've totally mistrusted CDs/DVDs since then, and I'm currently saving for a DAT 72 drive.

    External Hard Disks are a good alternative ... but not by abusing RAID like these dudes!!!!! (and I still prefer tape over other methods)

  • Crypl (unregistered) in reply to Jerim
    Jerim:
    I have to agree. I can see an "I have never seen that before" but not a WTF. The goal of any backup system is to be able to recover most of the data if something breaks. It seems to me that that requirement was met. You may not like the means, but the results are all that matters.

    By this logic (ends justify the means) if everyone is so concerned about ending the conflict in the Middle East, all military presence should be removed and we should just nuke the hell out of it until everyone is dead. No life = no conflict.

    Just because it satisfies 1 requirement does not mean that it should be considered a solution.

  • Abigail (unregistered) in reply to Rimbaud
    Rimbaud:
    The REAL WTF here is that a network support guy uses Fahrenheit to measure temperature. In 2007!

    In America! Shocking. :)

    Captcha: muhahaha (should I be worried?)

  • (cs) in reply to Tigress
    Tigress:
    Jimmy Jones:
    Ummm....so you put the previous day's disk as "master" and there's all your files.

    YMMV but personally I'd call that a "backup".

    PS: It worked! They only lost half a day's files!

    I love you guys! This is why I make so much money and get to tell WTF-stories to my friends. :)

    To all you "it works, it's a backup" people... consider what would've happened if they HADN'T pulled the drives on a daily basis. The "master" would've died and the "slave" would've kept the array alive. Result, NO data lost, not even that half a day's worth. (Note that this does not replace a proper backup solution).

    Disk based backups are an excellent alternative these days and I strongly recommend those in any environment. They should be used in conjunction with other forms, such as off-site storage or tape backups though. And, they should be set up in a proper way that ensures data integrity.

    But, by all means, please keep abusing systems for your backups. I'll be there to pick up the pieces after you're gone. :D

    "What do you mean, just use the tank cap? Replacing the tank works fine!"

    If mirroring is done correctly, the two disks stay updated at all times. When you write to one, you write to the other at the same moment. The second the master drive goes down, you simply pull the master, replace it with the backup, and restart. You lose about 2 minutes. The problem here is that a lot of time went by before anyone noticed the problem, which shouldn't happen. There are a lot of tools out there to alert someone of a hard drive/system failure. A simple ping alert system would tell you the moment the server stopped responding. At my old office, we used an audible system that would beep loudly if there was a problem.

    I am not saying RAID is the best way to do backups, but it certainly isn't a WTF. A WTF is someone using floppy disks for backups or using the network to back up everything to the secretary's computer. We can quibble over the specifics all day, but there is nothing here that seems so far out there that you wonder what the person was possibly thinking. There are many ways to do backups. You may have a personal preference, but I don't believe in the "my way is the only way" mentality.
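    For what it's worth, the "simple ping alert" idea can be sketched in a few lines. This is only a rough illustration, not anyone's actual monitoring setup: it uses a TCP connection attempt rather than a real ICMP ping (which usually needs elevated privileges), and the function names and the `alert` callback are made up for the example. Note that a check like this only tells you the server stopped answering; a mirror that is merely degraded keeps answering just fine.

```python
import socket


def server_responds(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_and_alert(host: str, port: int, alert=print) -> bool:
    """Probe the server once and call alert(message) if it does not respond.

    `alert` is a hypothetical hook: print by default, but it could just as
    well send a page or trigger the loud beeper mentioned above.
    """
    ok = server_responds(host, port)
    if not ok:
        alert(f"ALERT: {host}:{port} is not responding")
    return ok
```

    Run from cron or a loop every minute or so, something like `check_and_alert("mailserver", 25)` would at least shorten the time before anyone notices.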

  • (cs) in reply to Mark Ransom
    Mark Ransom:
    I know this varies between companies of differing sizes, but if my company lost half a day's data, that's millions of dollars down the drain and someone's going to lose a job over it. Does the general feeling of "a day or two of data loss isn't so bad..." that has been somewhat prevalent on this thread strike anyone else as a little WTF'ish?

    Mark

    I would be curious how your company avoids ever losing any data. Even with tapes, half a day or more would be lost, unless you routinely back up every hour or so. I think if your business is dealing in the millions, you probably have a pretty robust/expensive system of redundant servers, and various forms of backups.

  • Tim (unregistered)

    I'd like to know where you guys are finding such cheap hard drives. An LTO2 drive is under $1,300, and 200/400GB tapes are $33 apiece. On the off chance that you'll find a hard drive that comes anywhere near that price, you're dealing with a rebate, which means you'll only get one or two at that price.
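    The tape-versus-disk argument is really a break-even calculation, so here is a rough sketch of it. The drive and tape prices are the ones quoted above; the price of a comparable hard drive is NOT from this thread and is a made-up placeholder, so plug in whatever you can actually buy. It also ignores compression, backup software, and caddies.

```python
import math


def break_even_media(drive_cost: float, tape_cost: float, disk_cost: float):
    """Smallest number of media at which tape is strictly cheaper than disks.

    Tape total  = drive_cost + n * tape_cost
    Disk total  = n * disk_cost
    Returns None if a disk costs no more than a tape, since tape then never wins.
    """
    if disk_cost <= tape_cost:
        return None
    return math.floor(drive_cost / (disk_cost - tape_cost)) + 1


# Figures from the comment above: $1,300 LTO2 drive, $33 per 200/400GB tape.
# The $100 per hard drive is a hypothetical number used only for illustration.
n = break_even_media(drive_cost=1300, tape_cost=33, disk_cost=100)
print(n)  # 20
```

    Under that assumed disk price, tape only pays for itself once the rotation needs about 20 pieces of media, which fits the "fine for a small shop" point.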

    External drives are fine for a small shop, but for anything requiring a real sustainable backup solution, tape is the only way to go.

  • MJ Fan (unregistered) in reply to dan s
    dan s:
    Looks like the company previously hired Montell Jordan for the network gig.

    Agreed. Yet MJ never came whack on an old-school track (i.e. all hail, the RAID config only failed after MJ bailed).

    CAPTCHA = DUBYA (I think Jeb is next!)

  • Blair (unregistered)

    What got me is "Ove" never even saw the "server room" until there was a problem.

    That's just stupid...

    And using RAID like that is just asking for trouble. You're working the master disk more than you need to with all those rebuilds every day, shortening its life, as the story illustrated.

    I have no problem putting data on a removable hard disk to store off site permanently, but certainly not on a weekly basis, cost be damned, you're creating more work for yourself, and more chance for data loss / corruption. Now if the disks were dirt cheap, and I mean DIRT cheap, then I could MAYBE see it.. but then that "master drive", would need to be RAID 5 or higher for full redundancy.

    And what about once they run out of disk space???

    WTF indeed.

  • postitnote (unregistered)

    Am I the only one who caught:

    "When Ove arrived onsite, he was lead -- for the fist time -- over to the server room."

    ?

    Hilarious.

  • (cs) in reply to Antoine B.
    Antoine B.:
    Argh !

    I had to work on a setup like that.

    <snip>

    One year after that, I heard from a friend working there that the setup had failed and that they said it was my fault.

    Assigning blame like that is basic human nature ... at least you got out before it all blew up in your face.

  • (cs) in reply to Mark Ransom
    Mark Ransom:
    I know this varies between companies of differing sizes, but if my company lost half a day's data, that's millions of dollars down the drain and someone's going to lose a job over it. Does the general feeling of "a day or two of data loss isn't so bad..." that has been somewhat prevalent on this thread strike anyone else as a little WTF'ish?

    Mark

    I agree, look at this from "Protecting your data" from Fujifilm in the UK (http://www.fujifilm.co.uk/recmedia/site/documents/ProtectDataGuide.pdf ) for re-creating 20 MB (twenty megabytes) of data (I converted the original money amounts from British pounds to US$):

    department            cost (US$)    time (days)
    sales & accounting     33,328.40    19
    production             37,249.40    21
    engineering            52,933.40    32
    marketing             192,128.00    42

    Again: this is just for 20 MB of data .....

  • matt s (unregistered) in reply to Alan

    That's why I'm a RAID 5 guy. I understand and appreciate (and under the "wrong" conditions give thanks to) RAID 1, but it sure bugs me that I paid for 2 and only got 1. Paid for 5 and got 4, that I can live with.

    Now if I can only convince my wife that it is "the way things are done" to have a terabyte at home. (And what do you know, my captcha is Bling, just right for a terabyte of RAID 5.)

  • (cs)

    I read this and I kept thinking back to the software company I worked for when I first got out of college. They had the exact same setup, except they only used 3 drives on a rotating basis. I asked about this when I got assigned to switch out the backups. I was told, what else, "that's just how it's done." After working there for a while I realized that they should have added "because we are cheap" to the end of that statement.

    In case you are wondering, I worked there for 3 years, and the server crashed 2 times. People lost days' worth of work when it happened.

  • (cs) in reply to matt s
    matt s:
    Now if I can only convince my wife that it is "the way things are done" to have a terabyte at home. (And what do you know, my captcha is Bling, just right for a terabyte of RAID 5.)

    Hmmm. Personally I'd be bringing home CDs/DVDs of content that my co-workers have downloaded and be trying to convince my wife "that's just how things are done" in the bedroom.

  • jim (unregistered) in reply to Tukaro
    Tukaro:
    Stories like this make me weep for what colleges and certifications spit out.

    Stories like this also mean that I'll have a much better time in the job market than previously thought.

    Not so much. Decent companies now just assume everyone they look at is an idiot unless proven otherwise. So you have an uphill battle to get hired.

  • (cs) in reply to Jerim
    Jerim:
    Tigress:
    Jimmy Jones:
    Ummm....so you put the previous day's disk as "master" and there's all your files.

    YMMV but personally I'd call that a "backup".

    PS: It worked! They only lost half a day's files!

    I love you guys! This is why I make so much money and get to tell WTF-stories to my friends. :)

    To all you "it works, it's a backup" people... consider what would've happened if they HADN'T pulled the drives on a daily basis. The "master" would've died and the "slave" would've kept the array alive. Result, NO data lost, not even that half a day's worth. (Note that this does not replace a proper backup solution).

    Disk based backups are an excellent alternative these days and I strongly recommend those in any environment. They should be used in conjunction with other forms, such as off-site storage or tape backups though. And, they should be set up in a proper way that ensures data integrity.

    But, by all means, please keep abusing systems for your backups. I'll be there to pick up the pieces after you're gone. :D

    "What do you mean, just use the tank cap? Replacing the tank works fine!"

    If mirroring is done correctly, the two disks stay updated at all times. When you write to one, you write to the other at the same moment. The second the master drive goes down, you simply pull the master, replace it with the backup, and restart. You lose about 2 minutes. The problem here is that a lot of time went by before anyone noticed the problem, which shouldn't happen. There are a lot of tools out there to alert someone of a hard drive/system failure. A simple ping alert system would tell you the moment the server stopped responding. At my old office, we used an audible system that would beep loudly if there was a problem.

    I am not saying RAID is the best way to do backups, but it certainly isn't a WTF. A WTF is someone using floppy disks for backups or using the network to back up everything to the secretary's computer. We can quibble over the specifics all day, but there is nothing here that seems so far out there that you wonder what the person was possibly thinking. There are many ways to do backups. You may have a personal preference, but I don't believe in the "my way is the only way" mentality.

    Sorry I'm with Tigress on this one.

    RAID is used for integrity of the system, not backups. You do not intentionally break the RAID mirror on a daily basis to force a rebuild. If you do, then during the rebuild you have no assurance of system integrity. RAID is there so that if one drive fails, the others keep going until you replace the failed one.

    BACKUPS are used for archival and offline retrieval of data with a secondary purpose of data integrity. If your entire system goes belly-up you can restore from the backups and get right back to work (given the time to restore, this could be hours or days), but if something is simply deleted you also still have the option of pulling it back from your backups.

    These two technologies have differing purposes yet work well in tandem; do not make one of them try to do the job of both.

  • Brainwater (unregistered) in reply to KattMan

    Using RAID for backup is a total WTF! The most dependable solution is backing up to 3.5" 1.44MB disks. We've been doing it for years. We have one person working what we call "the disk swap shift".

  • Tom Harrison (unregistered)

    Oh my god. I laughed out loud (sorry, no acronym in my lexicon for that) reading this. A [b]classic[/b] WTF backup tale of woe. Awesome.

  • Jack Mercier (unregistered)

    Retainer plus time and materials is common insofar as contract work goes, especially when bringing on a new client and it's not well established just how much of your time they're likely to consume. Retainer ensures availability.

  • Anonymous (unregistered) in reply to waefafw

    "I find it funny that people actually recommend tape backups. The failure rate for tape is unsettlingly high. A major oil company recently had an email server failure, and their tape backups didn't work."

    This is incorrect. Failure rate is high for a system that is never checked. At the bank we clean the tape drives every week, check our tapes every month with test restores, and shred the tape when it is over a year old. I've had the occasional tape break or go bad, but I have had about six restore requests in my 3 years at the bank and ALL were successful.
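    A test restore is only worth something if you actually compare what came back against what went in. Below is a minimal sketch of that comparison, assuming the restore has already been done into a scratch directory by whatever backup software you use. The function names are made up for this example, and this is certainly not the bank's actual procedure.

```python
import hashlib
from pathlib import Path


def sha256_of(path) -> str:
    """Hash a file in 1 MiB chunks so large files don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(original_dir, restored_dir) -> list:
    """Return relative paths that are missing from, or differ in, the restored copy."""
    original_dir, restored_dir = Path(original_dir), Path(restored_dir)
    problems = []
    for src in sorted(original_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original_dir)
        dst = restored_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(str(rel))
    return problems
```

    An empty list means every file in the original tree came back bit-for-bit identical. Files that changed after the backup was taken will show up as mismatches too, so this is best run against a snapshot or a quiet directory.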

    Know your tech, use it, and you'll own it. Don't know or don't use your tech and IT will own YOU.

  • Gizmo (unregistered) in reply to iwan
    iwan:
    I googled it, and it turned up a bunch of sleazy British massage parlors. I don't know what was meant by "Punter-net hardware", and I don't want to know.

    He might have meant this rather than the massage parlor. That still doesn't really explain the 'punter-net hardware' though.

    Nope. Here's a definition I found for "punter":

    Punter - Punters are customers. Originally came from the betters at the racetracks but has extended in use to mean anyone who should be persuaded to part with their money.

    Source: http://www.effingpot.com/people.shtml

  • Eric (unregistered)

    RAID is not a backup solution. It never has been, is not now, and never will be. It is a solution to increase the on-line availability of data.

    The Gold Standard of backups is to back up to a good quality tape. That is especially true for archival media that you might need ten or more years from now.

    Those of you who complain about tapes either used them far too often without replacing them or you are using those little toy tape units that are nearly worthless.

    My preference is to use a tiered approach where the first level of backup is to a separate computer dedicated to backups and from there to high quality tape. You always keep your latest backup on-line and ready for immediate restore, if needed. If you need something older, go through the logs (you do keep backup logs, don't you) to find the proper tape(s) and restore from there.

    DVDs aren't bad if your backup needs are not too great. If you have too many gigabytes of data, dealing with DVDs is a royal pain. If you must use DVDs, make sure you use Taiyo Yuden DVD+R media - DVD-R, DVD-RW, and DVD+RW are considerably less reliable.

    And pay attention to your storage conditions.

    And keep good logs so that you know what you have and where to find it.

  • kelv (unregistered) in reply to PGB

    Petty, I know, but calling yourself a grammar nazi is asking for it: possessive nouns have an apostrophe, plurals do not, and the apostrophe is also used for contractions

    it's - it possessive
    it's - it is
    its - not correct by any definition of english, google it :)

    You're welcome to my time

  • Ian (unregistered)

    Actually, I wish full restore operations were as easy as grabbing yesterday's media and rebooting... No mussing with full and diff tapes, loaders, stacks of this or that, recovery cd media, etc, etc... Heck, at least it worked. And probably faster than any other method!

    Push a button and reboot: the kind of thing previously found only in expensive SANs and the like! Now available to paper MCSEs everywhere!

  • timecop (unregistered)

    Thanks for the tip. I've been looking for a low-cost weekly backup solution and looks like this is perfect!

  • randompasserby (unregistered) in reply to Gareth Martin
    Gareth Martin:
    Although it might have been saner if it was a 3-disk raid 1, with two masters and a rotating backup, that way if a disk failed it wouldn't destroy that day's data.

    Not in this case, as the guy didn't discover the weirdness of the setup during his 'remoting' for new accounts, and he didn't notice the daily degraded RAID arrays (mdadm, anybody, or does Windows not have something like that?), thus he simply would have gotten the call after the death of 2 disks instead of one.
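    On Linux software RAID, spotting a degraded array doesn't take much. Here is a rough sketch that scans the text of /proc/mdstat for a member-status field such as [U_], where an underscore marks a missing or failed member. This assumes Linux md only; the server in the story was quite possibly Windows or hardware RAID, where none of this applies, and the sample text in the usage note is illustrative rather than captured from a real machine.

```python
import re


def degraded_arrays(mdstat_text: str) -> list:
    """Return names of md arrays whose member-status field contains '_'.

    Expects the general layout of /proc/mdstat: a line starting with the
    array name ("md0 : active raid1 ..."), followed by a line carrying a
    status field such as [UU] (healthy) or [U_] (one member missing).
    """
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        name = re.match(r"^(md\S*)\s*:", line)
        if name:
            current = name.group(1)
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded
```

    Feeding it `open("/proc/mdstat").read()` from a daily cron job and mailing any non-empty result would have flagged this setup every single day, which is rather the point.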

  • cbciv (unregistered) in reply to kelv

    [quote user="kelv"]Petty, I know, but calling yourself a grammar nazi is asking for it: [/quote]

    Criticising someone else for being a grammar nazi and then spouting provably incorrect information is really asking for it.

    [quote user="kelv"]possessive nouns have an apostrophe, plurals do not, [/quote]

    Not to be rude, but is English your native language? If not, I suggest that you throw out your books on the subject and get some new ones.

    Possessive pronouns:

    my, mine
    your, yours
    his/her/its, his/hers/its
    our, ours
    your, yours
    their, theirs

    [quote user="kelv"]and the apostrophe is also used for contractions[/quote]

    That's the only correct statement thus far.

    [quote user="kelv"] it's - it possessive [/quote]

    No. The possessive is "its". This is a common misconception. I cite the following:

    http://dictionary.reference.com/browse/its

    (Aside: check http://www.m-w.com/ too. I tried to paste in a URL, but they've changed their site and don't seem to have a way to link directly to a particular definition.)

    [quote user="kelv"] it's - it is [/quote]

    This is correct.

    [quote user = "kelv"] its - not correct by any definition of english, google it :) You're welcome to my time[/quote]

    You should have checked an actual reference instead of googling. Far too many people make this mistake all of the time, along with confusing the words "here" and "hear", the words "there", "their" and "they're" and then of course there's the horrid use of "there's" for the plural (it should be "there're", e.g. "There're five dogs outside."). Just because you find an abundance of these mistakes online, doesn't mean that they're correct.

  • yalu (unregistered) in reply to Mark Ransom
    Mark Ransom:
    I know this varies between companies of differing sizes, but if my company lost half a day's data, that's millions of dollars down the drain and someone's going to lose a job over it. Does the general feeling of "a day or two of data loss isn't so bad..." that has been somewhat prevalent on this thread strike anyone else as a little WTF'ish?

    Mark

    Dude, if you're doing a daily (nightly) backup of all your data, half a day of data is statistically the loss to expect if the server in question dies.
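    The "half a day" figure is just the average of a uniform distribution: if the server is equally likely to die at any moment between two backups, the expected loss is half the backup interval and the worst case is the full interval. A quick sketch of the arithmetic, where the uniform-failure assumption is mine and real failures and real working hours won't be that tidy:

```python
def average_loss_hours(interval_hours: float, samples: int = 10_000) -> float:
    """Average hours of work lost if a failure is equally likely at any
    point between two consecutive backups."""
    step = interval_hours / samples
    # The midpoint of each slice is the time since the last backup
    # for a failure that happens within that slice.
    losses = [(i + 0.5) * step for i in range(samples)]
    return sum(losses) / samples


print(round(average_loss_hours(24), 3))  # nightly backup -> 12.0
print(round(average_loss_hours(1), 3))   # hourly backup  -> 0.5
```

    So the only way to push the expected loss down is to shorten the interval, or to add something like mirroring on top of the backups rather than instead of them.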

  • Bajs (unregistered)

    So the list of stupid things, besides being really really ugly, is:

    • RAID degraded on a daily basis. When your RAID is degraded you should feel a bit sick and worried. That's healthy. Running degraded every day isn't.

    • Unnecessarily wearing out your master HD. Yeah, not good, but not horrible if the rest of the backup system works.

    • Disk-image level backup. Every restore would now become a simulated power outage, unless you've managed to freeze the filesystem, flush the OS, RAID card, and HD controller buffers, and pray the HD buffers aren't lying about flushing (yes, they often do). What the hell? Read files, not disks.

    Point is, it could have gone worse. Much worse. And the fact that it didn't is NOT due to having a proper backup process.

    That said. Backups to disk are of course perfectly fine, but say you take a 3TB backup every month for transfer to a remote location. Sure, it's only four drives (expensive ones), but drop a drive from 10cm and while it's not obviously broken, you can't really rely on it. And a backup routine that can't be done perfectly every time is a backup routine that needs to be fixed.

  • Kuba (unregistered) in reply to eight days a week
    The difference is that they were abusing RAID. It's not meant to be used for backup, just redundancy. Otherwise you're right - there isn't much difference between backing up on tape and backing up on a hard drive.

    There are way too many people here who like things "just so", without any technical reason for the "just so" approach. Like having hardcoded data (storrary), etc -- I kinda lose faith in IT when I read all those "it ain't my way" comments. You may not like using RAID that way, but there's no serious technical reason for not doing it that way. Given that the details are correct (here they weren't - DUH).

    I'd do it with more than two hard drives, so that the redundancy wouldn't be lost due to everyday backup drive swapping. I'd also make sure that the caddies use connectors that were designed for 10+ kcycles. Most of the server caddy connectors these days aren't -- they are cheap plastic types that are easy to damage.

    I'd also look into getting caddies that have some shock-absorbing mounts for the HDs.

    Using the RAID controller is pretty clever as it'll likely make sure that the backup does happen, as long as the controller was configured to auto-rebuild, and won't lose that configuration (BTDT). With hardware RAID, you lose a couple of layers of "it can go wrong": the OS is outta the picture, a software upgrade (maybe an automated one?) won't make that odd warning when your script is run become an error (BTDT too), and so on. It's quite possibly the lowest overhead solution: the system bus won't even see all the backup traffic. Given hardware RAID, of course.

    The only point left is how corrupt the filesystem is when you take those drives out. With a journal for data & metadata, it should be good enough to consider as a backup.

  • Kuba (unregistered) in reply to TimmyT
    Anyone who thinks that putting a RAID into critical state by replacing a drive and rebuilding is NOT a WTF is a friggin moron.

    No, you're the one, for thinking that getting the array critical is a necessity in this scenario. Part of the WTF was that the array had at least one HD too few.

    Again, an example of someone who cringes just because things aren't done his way. Sad.

  • Kuba (unregistered) in reply to Coditor
    Coditor:
    • Tapes have fewer moving parts and are easier to fix if physically broken than hard drives.

    I don't know, because I didn't check. I'll take one of our older LTO-II tapes and splice a couple inches out of it. I wonder if the drive'll still read it.

    An LTO-II drive, for example, is way more complicated, both mechanically and electrically, than a hard drive. Hard drive's operating environment is sealed against dust, doesn't wear so quickly, and so on.

    • Critical data is kept off-site for years and years.

    I hope that's on some ancient "standard" 9-track tape or something, because otherwise after those years and years the only source of a working drive to read that tape may be a once-in-two-months eBay auction that ends at $10k.

    Hard drives have the benefit of integrating a rather standard interface to access ever changing technology. As long as the drive works, you'll be able to read it. You might have a working tape, but last time I checked tapes don't come with SCSI or IDE connectors on them.

    And one'd think this is all obvious stuff . . .

  • anonymous (unregistered) in reply to Clod
    Clod:
    I seriously hope the company took the previous sysadmin to court, particularly since he was effectively screwing them into paying him twice.

    How so? If you're willing to pay, and I'm willing to accept the payment, how's anyone getting screwed? Remember, the company always had the option to go to someone else, as they ultimately did.

    And you should talk to a lawyer about how those kinds of fees work - you'd really be surprised, though, at that point, you may have a point about being screwed over, yet again, by a landshark :0

  • Limpalot (unregistered) in reply to Jimmy Jones

    So what you are saying is: the RAID didn't work as expected because they were using it as a backup, but because MOST of the data could be saved this was really OK? If they hadn't taken backups, EVER, they wouldn't have lost any data. Mirroring (as used here) isn't backup, it's redundancy; one failed drive shouldn't cause any data loss AT ALL. That's the point of RAID 1. One more time: they wouldn't have lost any data if they hadn't used the RAID set as backup. It was lucky, btw, that they had their mail on this server so they actually discovered the error so fast; more data could have been lost if not....
