Admin
Yeah, because that would be such a great investment in time and money, huh? Forget the fact that the company agreed to pay both his retainer and hourly rate. Forget the fact that they said they'd turned down an offer to have their systems inspected.
Even if they tried to get their money back because the consultant didn't really perform his job, what hope do they have? Do you think they are going to collect? And how much money would it cost to go to court?
Some of you need to meet the real world where saying "I'm suing you!!" isn't the first response to any problem you face.
Admin
As I see it, it was the means that precipitated the failure, though the means also had its own "recovery solution" at the cost of half a day's work.
Admin
I'm surprised someone named 'oven' doesn't like the fact that the server room is 90+ degrees.
Admin
Ummm... losing data is pretty much the definition of NOT working. RAID is meant to mitigate a hardware failure. It didn't in this case. So, it was a failure of concept, execution and pretty much everything. You shouldn't ever lose data, especially on a critical system.
Given what you think about "working" backups, I'm LOATH to even think about using your "backup" software. Please don't ever touch a mission-critical system, mmkay?
Admin
Correct me if I'm wrong, but it seems odd that several people have complained about the wear and tear on the master disk. It seems like a good thing if you can predict which drive is going to be the first to go. If failure is predictable you can plan for it. Nothing lasts forever anyway.
If I understand the story, the only reason that company lost any data at all is that the previous sysadmin didn't fully train the guy who was switching the drives. If he had simply left Tuesday in, and replaced the (dead) master with Wednesday, and ordered a new drive to arrive before next Tuesday, they probably would have never needed to call the new guy. It's also possible that the old sysadmin did train them and they lost the knowledge or simply screwed it up by not paying attention.
Granted, the issues about interrupting a write seem potentially valid, and I have doubts about the performance implications of this strategy, but I'll leave cost analysis to people who work in the field. (IANASA)
(captcha: muhahaha)
Admin
I really love reading my daily WTF here. Yes, these stories are all really, really horrible. Just that for a change, they happen to other people ;-)
Captcha: burned
Admin
That's a pretty manual implementation of split-mirror backups :) For those who don't know: RAID can be used for backups, and it is a valid way of doing things. The controller splits off a mirror at a certain point in time, the split mirror is backed up to tape or a remote location or whatever, and then it is added back into the array. But it is done using special hardware/software, and without mechanical rotation of drives, of course.
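For the curious, here's a minimal sketch of the same split-mirror idea using Linux software RAID instead of a dedicated controller. The array, member device and backup path are all assumed for illustration, and a production setup would quiesce the filesystem before splitting:

```python
#!/usr/bin/env python3
"""Minimal split-mirror backup sketch using Linux software RAID (mdadm).

Assumes a two-way RAID-1 at /dev/md0 with member /dev/sdb1 and a mounted
backup destination -- all hypothetical names. Controller-based split
mirrors do this in hardware; this only illustrates the sequence.
"""
import subprocess

ARRAY = "/dev/md0"          # hypothetical RAID-1 array
MEMBER = "/dev/sdb1"        # hypothetical member to split off
TARGET = "/backup/md0.img"  # hypothetical backup destination

def run(*cmd):
    subprocess.run(cmd, check=True)

# 1. Split the mirror: mark one member as failed, then remove it.
#    (A real setup would flush/quiesce the filesystem first.)
run("mdadm", ARRAY, "--fail", MEMBER)
run("mdadm", ARRAY, "--remove", MEMBER)

# 2. Copy the split-off member while the array keeps serving I/O
#    on the remaining disk.
run("dd", f"if={MEMBER}", f"of={TARGET}", "bs=4M")

# 3. Re-add the member; the array resynchronizes in the background.
run("mdadm", ARRAY, "--add", MEMBER)
```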
Admin
You're kidding, right? Even cheap backup tapes are at least 10 times the price of audio tapes.
Which brings me to the original point of posting... I've been looking at buying 5 or so portable hard disks for backups at a school. The VXA-2 tapes they had crapped out after less than 5 runs each, and replacement tapes are more expensive than external hard disks of the same size. Reading up on VXA drives, it seems we aren't the only place to find them useless.
Admin
Okay, this is just one of my pet peeves, and since everyone makes this mistake (even me, if I'm not paying attention), I thought I'd just point it out again.
red is a color; read (rhymes with red) is the past tense of the verb read (rhymes with deed).
lead (rhymes with red) is an element; led -- not "lead" -- is the past tense of the verb lead (rhymes with deed).
Admin
The trick to making this sort of backup work (obviously not as your only backup method, but as a measure to quickly bring up a failed server) is to down the server, remove the drive gracefully and swap in another.
The array gets rebuilt and you have a working off-line copy of the entire server, it's OS and it's data. It means being able to quickly bring up a failed server rather than restoring data from tape in an emergency.
It's handy if the server blows up and you can get an identical server shipped in quickly to use the spare disk you keep in a fireproof safe.
Admin
Okay, this is just one of my pet peeves, and since everyone makes this mistake (sometimes I do, if I'm not paying attention), I thought I'd just point it out.
it's = it is
its = the possessive form of it
It's wrong here twice: "it's OS and it's data." Should be: "its OS and its data."
That writer gets it right in the next paragraph: "It's handy if the server blows up..."
Thanks for your time!
Admin
On a completely unrelated note: anyone have any good recommendations for a home system backup plan?
I would like to back up about 40GB on a regular basis (my photos, website, source code). I've found a couple of inexpensive online solutions, but one has a limit of 30GB. Other sites are just too expensive.
Friends have suggested the DVD route.
Admin
Abusing RAID isn't really the problem, although it is part of it. Since you are knowingly degrading the array every day, you are opening a window of opportunity to lose the system quite often. Any failure during the rebuild and the server goes down.
Secondly, those connectors aren't designed for switching drives in and out that often. I'm amazed that the drive failed before the connectors.
Third, you don't have a clean backup. By yanking out the disk while it is running, you could end up with a corrupted filesystem. There is no way to know if you have a good backup until you try to use it.
Fourth, drives don't like being transported around a lot. Since they are mechanical, the shock of taking them out and putting them on the counter daily greatly increases their risk of failure compared with just leaving it in the system.
Finally, other than the weekly offsite backup, their backup plan actually increases the risk of losing data over just using the raid1 the way it was intended. Just swapping a drive out weekly would give them much better performance, lower chance for failure and the same offsite backup without going through the motions of having a real backup.
Admin
I don't think using HDs as your backup is so bad. In fact I would rather use them than tapes. I've had horrible luck with tapes and hate them with a passion.
But that particular strategy should probably be adjusted a bit to avoid placing so much load on the master. The other two problems are these:
First it leaves the array in a purposefully degraded state for some period of time, during which a failure could lose data.
Secondly, it does nothing at all to preserve filesystem integrity, though with a journaled filesystem this shouldn't be a horrible thing. But if you have a database or something on there, it might be nice to be a little more careful about making sure everything is in a defined state before the backup.
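One common way to get that defined state, at least on Linux, is to freeze the filesystem for the moment the snapshot or mirror split is taken. A sketch, assuming util-linux's fsfreeze and a made-up mount point:

```python
import subprocess

MOUNT = "/srv/data"  # hypothetical mount point holding the database files

# Flush pending writes and block new ones so the on-disk state is consistent.
subprocess.run(["fsfreeze", "--freeze", MOUNT], check=True)
try:
    # ...take the snapshot / split the mirror here, as quickly as possible.
    pass
finally:
    # Always unfreeze, even if the snapshot step fails.
    subprocess.run(["fsfreeze", "--unfreeze", MOUNT], check=True)
```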
Admin
The fact that you would even compare analog compact cassette technology, which uses 1/8" tape running at 1 7/8 ips past a stationary head, to modern digital tape technologies like SDLT640 and LTO, which run media at several feet per second, indicates that you are wholly unqualified to even comment on backup methodologies.
I mean, that's like saying hard disks and dual-layer DVDs are crap because RL02 disk packs and CD-Rs are obsolete.
Admin
But unfortunately, there are just too many companies that would say, "that's how it's done!"
Captcha: Craaazy (very appropriate I might add!)
Admin
I would argue it's more likely to have been an 'enthusiast' without ANY industry certifications. People with industry certifications have at least had to (in part) drink the Kool-Aid; they are more likely to follow the practices they were trained in.
This is what frightens me about so many of the Linux crowd. Many just knock things together and call them solutions without any idea of the down-the-track repercussions. (Many, not all; the good ones are GOOD.)
Admin
My head hurts...
There are several things wrong here, as far as I can determine:
Hardware level:
Organizationally:
Based on the above, it is my impression that the customer's management are your typical PHBs and Ove's predecessor was just your run-of-the-mill small-time operator who managed to close a better deal than most.
As for the tape-versus-disk controversy in the comments: the current trend is to go for disk-to-disk backup even in smaller setups now. Disks are easier to handle. Granted, they cost more initially, but they are (in my experience) much more reliable and easier to verify.
My recommendation is that Ove's customer should invest in disk-to-disk backup (USB- or LAN/NAS-based) with additional off-site backups with an online storage provider. They have the money for that, too: Ove is not taking a monthly retainer. Ove should also be retained for two hours/week to check the backup integrity.
Have fun.
Admin
A network system designed for the Commodore Business Machines 8032, Super-Pet, etc. It used the IEEE-488 bus to let more than one computer access shared printers or disk drives (the slaves). It was not designed for communication between computers (masters), but rather between multiple computers and external hardware. Without the supporting hardware you could still hook up more than one computer to a device, but IEEE-488 had no support for more than one master trying to access the bus at the same time.
For some reason a customer decided they needed to hook up some HP device (a plotter?) to more than one PC. They got someone to modify the Commodore edge-connector version of IEEE-488 to match the PC connectors, which again were not the proper IEEE D-flat pin type.
Admin
RAID is not a backup method. It is for reliability, which was nonexistent in this setup. And remember that they lost one day of work.
Admin
Ummm....so you put the previous day's disk as "master" and there's all your files.
YMMV but personally I'd call that a "backup".
PS: It worked! They only lost half a day's files!
Note that nobody in the building was looking for red lights on the RAID array. If they'd done it "properly" then nobody would have noticed when one drive in the RAID failed, leaving them working with a single drive. When that went down, who knows what would have happened. As it was, somebody noticed the problem within a day and service was restored quickly/easily.
Compared with most companies out there, this is a dream backup solution.
Admin
This kind of customer can be really scary for a consultant.
If you don't stand up to them, you will end up supporting stuff that will crash sooner or later, and when it crashes they will blame it on you.
Been there, done that. These days I would tell the customer that unless we are allowed to do it right, we will not do it at all. Losing a customer is not nice, but I would rather lose a customer than wait until the brown stuff hits the fan and lose my own reputation.
Admin
I just did a price calculation (yesterday!) and hard disks are now cheaper per gigabyte than blank DVDs (not to mention re-usable).
Any backup solution which requires you to sit there inserting disks all afternoon will get done about three times before you start finding excuses not to do it.
A complete disk image is easy to create and easy to restore in case of disaster. You can fit several generations of "30Gb" images on an external disk, though I'd recommend cycling between two external disks just in case one of them fails.
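As a rough sketch of that rotation (assuming a Linux box, plain dd for imaging, and two external disks mounted at made-up paths):

```python
#!/usr/bin/env python3
"""Two-disk image rotation sketch; all device and mount names are assumed.

Alternating between two external disks means a single failed backup disk
never leaves you without a usable image.
"""
import datetime
import subprocess

SOURCE = "/dev/sda"                           # hypothetical disk to image
TARGETS = ["/mnt/backup_a", "/mnt/backup_b"]  # two external disks

# Pick a target by ISO week number: even weeks get disk A, odd weeks disk B.
week = datetime.date.today().isocalendar()[1]
target = TARGETS[week % 2]

image = f"{target}/disk-image-{datetime.date.today().isoformat()}.img"

# Raw image of the whole disk; restoring is the same command with
# if= and of= swapped.
subprocess.run(["dd", f"if={SOURCE}", f"of={image}", "bs=4M"], check=True)
```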
Admin
Taking the contract without a prior on-site (or at least remote) check-up of the situation is attempted suicide. What if he had never solved the case and had spent 24 working hours on it? Do you think the client would have gladly paid his bill? NO WAY!
About the retainer: we have the same system and it works just fine. The clients who want to be able to count on our time, experience and spare parts (availability is what they pay for) can do so, and they choose whether they want all working hours included or excluded. THEY decide what risks to insure and what not. Good hardware and set-up? -> No incidents? They only pay the retainer...
About the RAID set-up: if you work in this business long enough, you meet a lot of people without any budget, or without any idea how much they rely on their data nowadays. It's up to us to earn our keep and inform them of the risks (then make them sign that they're aware of the risks they are taking with THEIR data ;-)
Admin
Argh!
I had to work on a setup like that. When I said that this was definitely not a good way to back up, my boss just told me to "f*ck off" and do my job... Well... after a month, I quit.
A year later, I heard from a friend still working there that the setup had failed and that they said it was my fault.
Admin
I love you guys! This is why I make so much money and get to tell WTF-stories to my friends. :)
To all you "it works, it's a backup" people... consider what would've happened if they HADN'T pulled the drives on a daily basis. The "master" would've died and the "slave" would've kept the array alive. Result: NO data lost, not even that half a day's worth. (Note that this does not replace a proper backup solution.)
Disk based backups are an excellent alternative these days and I strongly recommend those in any environment. They should be used in conjunction with other forms, such as off-site storage or tape backups though. And, they should be set up in a proper way that ensures data integrity.
But, by all means, please keep abusing systems for your backups. I'll be there to pick up the pieces after you're gone. :D
"What do you mean, just use the tank cap? Replacing the tank works fine!"
Admin
My 2 cents...
A great WTF :D
Coditor
Admin
Well, just for the discussion, I once drafted a backup plan which had some similarities to this one. Anyone care to tell me if I'm completely wrong about it?
Well, the server has RAID 5 and all that stuff you would expect from it...
Now, having had a look at the price tags for tape robots, I thought: "hey, hard disks are just sooo much cheaper, why not use them for backup..."
So my idea was to set up a dummy server (physically separated by at least two fire walls, and connected to the main server by a dedicated Gigabit Ethernet line) with no fewer than 7 hard disks.
Then set up Retrospect to back up to this server, and use a different disk each day.
Looked fine to me. Well, we had the budget for the tape robot, so we bought that, but it would have been worth a try. What do you guys think?
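For the sake of argument, the disk-per-day rotation in that plan might look something like this if scripted by hand with rsync instead of Retrospect (all paths invented for illustration):

```python
#!/usr/bin/env python3
"""Disk-per-day rotation sketch; Retrospect would manage this itself.

Assumes seven backup disks mounted at /mnt/backup0 (Monday) through
/mnt/backup6 (Sunday) on the dummy server -- hypothetical paths.
"""
import datetime
import subprocess

SOURCE = "/srv/data/"  # hypothetical data to protect

# Pick today's disk: weekday() is 0 for Monday through 6 for Sunday.
target = f"/mnt/backup{datetime.date.today().weekday()}/"

# -a preserves permissions and timestamps; --delete mirrors deletions so
# each disk holds an exact copy of that day's state.
subprocess.run(["rsync", "-a", "--delete", SOURCE, target], check=True)
```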
Da' Man
Captcha: Stinky
Admin
Actually, it's not odd at all. Assuming that the "slave" drives aren't damaged from the handling, they are each used one seventh of the amount of time the master is used, hence the master is seven times more likely to fail than any one of the slaves. The array is also subjected to increased wear and tear as it has to resync once a day. Assuming a 250GB array with a write-speed of 40MB/s, the array has to spend almost two hours every day resyncing. Compared to normal usage, this increases the risk of failure even more.
Under this scenario, any one of the slaves can fail without the data being lost, assuming the failure is detected and it's replaced. The master, on the other hand, is not only seven times more likely to fail than any one of the slaves, but is also subject to at least a two hour window every day where data will be lost if it would fail.
In reality, it's more likely that one of the slaves will fail, given the stress the hardware is put through and the fact that there are seven of them, but that still doesn't change the fact that the master is more likely than any individual slave to fail due to this behaviour.
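For anyone who wants to check the arithmetic, here it is spelled out, using the 250GB array and 40MB/s write speed assumed above:

```python
# Back-of-the-envelope check of the figures above.
array_bytes = 250 * 10**9   # assumed 250GB array
write_speed = 40 * 10**6    # assumed 40MB/s sustained resync speed

resync_hours = array_bytes / write_speed / 3600
print(round(resync_hours, 2))  # ~1.74 hours of resync per day

# The master spins every day; each of the seven slaves spins one day in
# seven, so the master accumulates running hours seven times faster.
master_days_per_week = 7
slave_days_per_week = 1
print(master_days_per_week / slave_days_per_week)  # 7x the wear per slave
```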
Admin
Okay, so the master fails and the slave keeps right on ticking. Then what? We already know they don't pay attention to failed drives... if they did, they wouldn't have lost anything in the first place. So instead of noticing right away and losing half a day, they would run on a single drive, oblivious to the fact that their redundancy is gone, until the slave dies too. Then instead of half a day, it's a month.
I mean, if they're going to ignore the failure anyway, what difference does it make?
The problem is not the 'backup' scheme. The problem is they weren't paying attention... maybe better alarms would have fixed that. Or maybe they HAD better alarms, but disabled them because they were annoying.
If the guy had been PAYING ATTENTION, he might have noticed the light is RED today, when normally it's green. Or the flashing dialog box on screen. Or the beeping speaker. Or the incessant emails it should have been firing off.
The Real WTF is the people who blindly assumed it was ok to rip a drive out at any time, apparently with their eyes closed, ears plugged, and brain off.
Understand, I'm not advocating this as a good solution. BUT, as a BACKUP solution, it DID work. Daily backups, on average, should cause a loss of about a 1/2 day of work if they're ever needed. That's the price... their system met that price, so it's a success. If they'd somehow lost more than 1 day of work, it would be a failure.
Obviously it could be better (normal RAID-1, with a real backup scheme and attentive operators). But again, NO daily backup system could do better than an average of 1/2 day lost in case of failure. If you need to restore from a backup, that's the price you pay. Too high? Back up more often.
Normally, RAID is used to reduce the frequency at which restores are needed. If instead you use RAID as a backup solution directly (rather than augmenting a backup solution), normal backup solution rules/prices apply and you give up some of the benefits of RAID. Caveat emptor...
Admin
Not to mention the time the hardware was in a basement when one of the pipes burst, the tech (not me) found the server floating on one of the styrofoam packing pads, and the power was still on!
Bullshit. What kind of server and how much styrofoam? I'm guessing a light one and a lot; otherwise it's bullshit.
Admin
Sure it worked, as does playing Russian roulette, as long as you don't bite the bullet.
There's a marginal difference between something that's working and something that's sane. :)
Admin
In my opinion, they didn't have any of the benefits of RAID. Instead of having a RAID, they essentially replaced it with a (poor) backup solution.
As I don't know the specifics, I can't say for sure, but I'd be willing to bet even money on the RAID being less reliable than a single drive due to this kind of abuse (that is, the RAID alone would be less reliable; there'd still be a "backup" available).
And, yes, I agree that if both drives had failed, they would've been even worse off. That's why I wrote "this does not replace a proper backup solution." ;)
Admin
Lovely... :)
I have also noticed in my present place of work that the ex-employees are always blamed for the WTFs.
Admin
The only WTF in this story is that they are paying someone to back up data for them when they could learn how to do it themselves for free. Sounds like they don't even need to back up their network that often anyway, as it's mostly email.
I would recommend they use Outlook, which has an auto-archive feature that is free, and I find if I have data I need to back up, like Word documents or program code files, I attach them to an email too before I archive it. Simple, cheap, and I have never had the level of problem mentioned in this story.
Also, because email is networked, your backups are spread far and wide across an entire network and not just on your single point of failure machine. You don't actually need to spend money on fads like tape drives and such, which are hard to read anyway. As a last resort you can always back up your data on a 2GB USB pen (or even 10GB if you can spare the extra). All computers have USB ports, but I have only seen a couple with cassette drives.
True backup is ultimately impossible, though. Every time you read data on a hard drive it spins round, causing some of the bits to be permanently worn down by the reader head. Backing up data might replace some of this lost storage over time, but most people won't be able to do it fast enough to make much difference, so you are going to lose all your data eventually anyway, whether it's two years or five.
Admin
Assuming a rack server weighing in at 5 kg and 800 mm deep, this would require a styrofoam pad roughly 1.3 cm thick: 5 dm³ of displaced water spread over a 4.75 dm x 8 dm pad is about 0.13 dm, assuming a solid pad covering the entire bottom of the server.
Plausible.
Heavier servers would, of course, require more packing. :)
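For anyone checking the maths, the same buoyancy estimate spelled out (server weight and pad dimensions are the assumptions from the post above):

```python
# A floating server displaces its own weight in water: 1 kg per litre (dm^3).
server_mass_kg = 5.0   # assumed rack server weight
pad_width_dm = 4.75    # ~19-inch rack width
pad_depth_dm = 8.0     # 800 mm deep

displacement_dm3 = server_mass_kg            # 5 litres of water to displace
pad_area_dm2 = pad_width_dm * pad_depth_dm   # 38 dm^2 of pad
submerged_dm = displacement_dm3 / pad_area_dm2
print(round(submerged_dm * 10, 1))  # ~1.3 cm of the pad under water
```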
Admin
"Tomb RAIDer" might have been a good title too :-p
Admin
Tapes are the enterprisey solution and disks are the small shop solution.
At my previous employer, we had an RS/6000 backing up all of our servers to tape. We were primarily sysadmins, but I was doing some developer work to help a customer out. The customer accidentally deleted a file we were using, and our backup/restore guys told him it would take 12 hours to restore. Everything was spread across different tapes and we were doing incremental backups (and the TSM guru was out of town that day). I had zipped up the file as my "backup" the previous day, so I just unzipped that and we were good to go.
Admin
I like happy stories like this one. The good guy didn't have to suffer too much and the victim didn't, either.
Admin
For those saying that tape prices are comparable to HDD prices, I have some calculations that might interest you. Currently the backup I run does full daily backups to 200GB and 400GB tapes (14 day/10 tape cycle). Additionally there is a daily incremental backup onto 160GB tapes each day (5 tapes a week, generally requires 20 tapes a year). I also archive off a 200GB and a 400GB tape at the end of each month. Excluding the cost of the tape drives (which is very high), the running costs for this system are (approximately) as follows:
1st year: 22x 200GB, 22x 400GB, 20x 160GB = £1348
Following years: 12x 200GB, 12x 400GB, 20x 160GB = £908
That's a lot of money to spend on media, but if I were to do the same with hard disks:
1st year: £3509
Following years: £2226
In other words, a 400GB tape costs less than £22, and a 400GB HDD costs £70. Not to mention that tapes are more compact, and are designed as removable media.
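A quick sanity check of those figures, using only the numbers quoted above (no other prices assumed):

```python
# First-year media capacity bought: 22x 200GB + 22x 400GB + 20x 160GB tapes.
media_gb = 22 * 200 + 22 * 400 + 20 * 160  # 16,400 GB of media

tape_first_year = 1348.0  # GBP, as quoted
hdd_first_year = 3509.0   # GBP, as quoted

print(round(tape_first_year / media_gb, 3))  # ~0.082 GBP/GB on tape
print(round(hdd_first_year / media_gb, 3))   # ~0.214 GBP/GB on disk

# Per-unit comparison quoted above: GBP 22 vs GBP 70 for 400GB.
print(22 / 400, 70 / 400)  # 0.055 vs 0.175 GBP/GB
```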
Admin
The REAL WTF here is that a network support guy uses Fahrenheit to measure temperature. In 2007!
Admin
I just recently solved this problem for my home office.
I have about 120GB of data to back up from two Windows XP systems. I considered finding/writing some sort of FTP mirror process that would upload changed files to a private area of my webserver. The webserver has plenty of extra space and bandwidth, so this online backup would just take time.
But what I ended up doing was getting a fat external hard drive for Christmas. I wrote a couple of robocopy scripts that run at 3am and mirror the source data to the external drive. (Check out robocopy.exe; it's xcopy on steroids.)
It isn't what I would do for a critical application, but for my home office it works well. Plus I have some nice extra space on an external hard drive.
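For anyone wanting to copy the idea, a nightly mirror job along those lines might look like this when driven from Python; the source and target paths are invented, and it's worth reading up on robocopy's flags before trusting it with real data:

```python
#!/usr/bin/env python3
"""Nightly robocopy mirror sketch; schedule with Windows Task Scheduler."""
import subprocess

JOBS = [
    (r"C:\Data", r"E:\Backup\Data"),  # hypothetical source/target pairs
    (r"C:\Users\Me\Photos", r"E:\Backup\Photos"),
]

for source, target in JOBS:
    subprocess.run(
        ["robocopy", source, target,
         "/MIR",          # mirror the tree: copy changes, delete orphans
         "/R:2", "/W:5",  # retry twice with a 5-second wait, not the huge defaults
         r"/LOG+:E:\Backup\robocopy.log"],
        check=False,      # robocopy exit codes below 8 mean success, not failure
    )
```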
Admin
External HD. I'm finding multiple units on Newegg that would do it for under $100. Going the DVD route you're talking 10 discs per shot; at the cheapest I've seen blanks, that means you get only 50 backups for the price of a drive, not to mention that the DVD route means a lot of handling.
Admin
Schmed: In other words, a 400GB tape costs less than £22, and a 400GB HDD costs £70. Not to mention that tapes are more compact, and are designed as removable media.
OK, the price of tapes has dropped relative to drives since the last time I priced them. The last time I looked, tapes were slightly cheaper per GB, but there was no way that would ever overcome the price of the tape drive.
Admin
Sure! Easy, reliable, nearly idiot proof, and requires no special software.
Get a hard drive the size you need and put it in an external case with a USB connection. When you want a backup, plug the USB cable in, turn on the drive and copy-paste the data. For redundancy, use two external drives and cycle them. Since it is a copy-paste over the old data, deleted data never disappears from your backups.
Admin
Yeah, the only way I can justify the cost of three tape drives (the last one was £850) and the software to handle the backups is by using so many tapes that the £/GB savings completely outweigh the cost of the drives.
Admin
I know this varies between companies of differing sizes, but if my company lost half a day's data, that's millions of dollars down the drain, and someone's going to lose a job over it. Does the general feeling that "a day or two of data loss isn't so bad", which has been somewhat prevalent in this thread, strike anyone else as a little WTF-ish?
Mark