• (cs)

I looovveeee hazards! (NOT!!!)

Once upon a time there was this program that had two input parameters: U1 (update type 1) and U2.

Normally, the program was run with one or the other or both set to Y:

  U1=Y U2=N ==> Do update type 1
  U1=N U2=Y ==> Do update type 2
  U1=Y U2=Y ==> Do both updates in one pass



One day we ran the program, but we just wanted the report - no updates. That was when we discovered the result of the (normally unused) remaining combination:

  U1=N U2=N ==> Update all database rows with garbage



Turned out this was the gist of the code:

  for each database row
  {
    if (U1='Y') calculate_output_1();
    if (U2='Y') calculate_output_2();

    if (U2='N') column_1 = output_1;
    if (U1='N') column_2 = output_2;
    update_row();
  }



Sigh.
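
For anyone who wants to see the hazard without a database, here is a minimal shell sketch of the flag logic above. It is an illustration only: the function names and the trick of printing column names instead of updating anything are assumptions, not the original program. Taken literally, the posted gist would also write nothing when both flags are Y, so the real code presumably differed in its details.

```shell
#!/usr/bin/env bash
# Hypothetical reconstruction. Both functions only print the names of the
# columns they would write; nothing is updated anywhere.

# Mirrors the gist: each column is guarded by the *other* flag being N.
columns_written_buggy() {
    local u1=$1 u2=$2 written=""
    if [ "$u2" = "N" ]; then written="$written column_1"; fi
    if [ "$u1" = "N" ]; then written="$written column_2"; fi
    echo "${written# }"
}

# One possible fix: write a column only when its own update was requested.
columns_written_fixed() {
    local u1=$1 u2=$2 written=""
    if [ "$u1" = "Y" ]; then written="$written column_1"; fi
    if [ "$u2" = "Y" ]; then written="$written column_2"; fi
    echo "${written# }"
}
```

Calling `columns_written_buggy N N` reports both columns, which matches the garbage update in the story; `columns_written_fixed N N` reports none.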

• (cs) in reply to Zapp Brannigan
Zapp Brannigan:
Stuart:
This isn't the first company this has happened to; it must have happened at some of Sun's customers, because they built in anti-rm-rf/ protection in the OS.

One of the cool features of Solaris 10 is that "rm -rf /" refuses to work. The Sun guys said they knew people don't type this deliberately, but often scripts intend to do "rm -rf $var1/$var2" and forget to set var1 and var2.

That protection has been put into almost all unixes, including Linux. Log in as root and try it if you don't believe me.
HA! You don't know what you're talking abo%!$*%& [NO CARRIER]

• Anonymous Coward (unregistered) in reply to heltoupee
heltoupee:
People always blame the messenger. It's in our nature.

Thank you for bringing this to my attention. Guards, have Mr. Heltoupee executed.

• (cs) in reply to EFH
EFH:
Y'know, being root on one machine doesn't give you any special access to a drive NFS mounted from another machine.

But it does get you potential access, which is often enough.

EFH:
And I can't imagine why the script would become root to do the cleanup.

Your imagination needs a great deal more experience, because it clearly has none.

EFH:
I enjoy a good story as much as the next guy, and I like the $var1/$var2 "hook", but I'm thinking this story was invented to go with the hook after somebody thought it up.

Holy good night, why would he have to invent it, it happens frequently. Even if you don't know of the famous incidents --- this one comes to mind http://groups.google.com/group/comp.unix.admin/browse_thread/thread/f1834a4fa74980d4/af2749af87216d18 (many good stories worth reading, but search for "Have you ever" to see the one I'm thinking of) --- and why on earth would you think people don't make stupid mistakes in shell scripts?

• (cs) in reply to Bim Job
Bim Job:
Pedantic:
The original script developer is a moron, not the co-op, some sanity checks on the variables before a recursive, forced, rm would be ABSOLUTELY NECESSARY.

  if [[ -z $var1 || -z $var2 ]]; then
    echo -e "\aOnly a fool would forget to specify \$var1 or \$var2, I shall abuse you verbally."
  else
    rm -rf $var1/$var2
  fi

I'm very glad you saw fit to explain that; I was just about to waste several precious seconds by either (a) trying to remember how to check a variable in a shell script or (b) typing "bourne shell tutorial" into Google.

Otherwise: if you make something idiot-proof, somebody will come up with a better idiot. This is a superb WTF, with minimal verbosity.
I particularly love the idea of "sudo su -", which has got to be the shortest possible sysadmin fuckup of all time. (Alternative candidates solicited...)

The root thing is annoying: scripts should never need to run as root, just set up your users and groups intelligently. But the rm -rf thing is inexcusable. It's the exact same problem as above: they're too lazy to do it right. But rm -rf always goes astray at the worst possible time. Put that in an automated script, and you're guaranteed a failure like this.

• Anon (unregistered) in reply to Once a scapegoat
Once a scapegoat:
Dan:
I love how the outgoing/recently departed employee is the easy target.

Not wanting my good name dragged through the mud, I gave a trusted colleague all of my saved unfinished project mail. Six months later over beers with former colleagues, I found out that I was being scapegoated for another division's lack of planning and pending disaster. My saved e-mail messages outlining the risks of not planning for that particular issue (written at least a year before I left) exonerated me and left egg on the face of the accuser/actual culprit. Passing blame to employees who have left the company is a sad tactic by bad management to get out of taking responsibility for their own screw ups.

Same thing happened to me.

Happened here too.

Actually, it was the guy who just quit that did it.

• DLJessup (unregistered) in reply to Dave Carrigan
Dave Carrigan:
The first rule of shell scripting is set -u
The second rule of shell scripting is rewrite it in a non-write-only scripting language, such as Python or Ruby.

FTFY

• bored (unregistered) in reply to Zapp Brannigan
Stuart:
That protection has been put into almost all unixes, including Linux. Log in as root and try it if you don't believe me.

We see what you are doing there, you almost got us.
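
For readers who have not met Dave Carrigan's first rule, here is a minimal sketch of what `set -u` buys you, and of the one case it misses. The helper names are made up for illustration, and both helpers only print the path that would be removed; neither calls rm. `var1` and `var2` are taken from the calling shell, as in the story.

```shell
#!/usr/bin/env bash
# Hypothetical helpers. The ( ) function bodies run in a subshell, so a
# fatal expansion error ends only that subshell, not the caller.

# set -u makes expanding an *unset* variable a fatal error.
# It does not object to a variable that is set but empty.
cleanup_target_u() (
    set -u
    echo "${var1}/${var2}"
)

# ${name:?message} rejects variables that are unset *or* empty.
cleanup_target_strict() (
    echo "${var1:?var1 is unset or empty}/${var2:?var2 is unset or empty}"
)
```

With both variables unset, `cleanup_target_u` fails before printing anything. With both set to the empty string it happily prints `/`, which is the case the `:?` form is there to catch.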
• (cs) in reply to realmerlyn
realmerlyn:
My favorite mistake (which I see far too frequently) is something like:

  cd /some/directory/which/we/presume/is/there
  rm -rf *

Yes, works ok when the directory is there, but what if it isn't? Always include some conditional on a cd:

  cd /some/dir || exit 1

or some other abort even if the unexpected happens.

This reminds me of a similar mistake I made on my DOS ver. 5 or so, in the early '90s.

  C:>cd a:\dir\to\wipe
  C:>del .
  Are you frikking sure? Yes

... "Hmm... why does this take so long, but the diskette drive isn't working? OMFG, it is still in C: !"

(And yes, I used to like keeping lots of files/games in C:\ at some age)

• M (unregistered)
Hey, it happens in Windows, too! If you're running a batch file from Task Scheduler, and do a 'del .' to clean up your work, and you didn't set the startup directory in Task Scheduler, your current working directory is c:\windows\system32. I fielded that call (worked MS support for malware) many times. Told many customers that nope, it wasn't a virus, it was your script that deleted your operating system.

• (cs) in reply to Frank
Frank:
pjt33:
What's an explitive?

Yes, lets al b priks about speling.

FTFY

• Jay (unregistered) in reply to EFH
EFH:
Y'know, being root on one machine doesn't give you any special access to a drive NFS mounted from another machine. And I can't imagine why the script would become root to do the cleanup. I enjoy a good story as much as the next guy, and I like the $var1/$var2 "hook", but I'm thinking this story was invented to go with the hook after somebody thought it up.

Yes, I just can't believe that any programmer in the history of the world has ever logged in as root rather than take the trouble to get all the permissions set so he could do what he needed to do as an ordinary user. Someone taking a short cut because doing it the right way would be too much trouble? That's just unbelievable.
Why, next they'll be trying to tell us that employees sometimes call in sick when really they just don't feel like coming to work today, or that there are politicians who don't live up to their campaign promises.

• Anonymous Coward (unregistered) in reply to Pedantic

  if [[ -z $var1 || -z $var2 ]]; then ...

Ehm, no:

  var1="../../../../../../.."
  var2="."

Oops...

• (cs)
Someone doesn't use unix very much.

First off, nobody calls it "a Bourne shell script". Just call it a shell script, or maybe even a bash script.

That's not a flaw in sudo, it's a feature. The flaw is when they set up sudo access, they added a line like "luser ALL = NOPASSWD: ALL", which is braindead beyond belief.

A "system set up with NFS" doesn't grant access to the entire network. The people who set up those NFS exports were morons, because they would have had to export / in read-write mode on every single machine affected, with the no_root_squash option no less. Which is braindead beyond belief. In other words, the admins are fucking morons, and they are responsible -- not NFS -- for allowing a remote process to delete the root directory on every server.

Anyone who writes a script that executes "rm -rf $var1/$var2" and fails to check the values of $var1 and $var2 should be fired instantly and never allowed near a computer again. His manager should be fired for not requiring code reviews.

sshing into your own machine is retarded, and I'm not buying it.

If you don't verify your backup plan, you get exactly what you deserve.

• (cs)
Wait a sec.... if the script is ssh'ing into each machine, why is NFS a factor? If it can remotely access files through NFS, why the need to ssh at all? If it can ssh, why bother with NFS? Something is fishy.

• Matt (unregistered)
It's so incredibly sad and reprehensible when I come across code that uses user-input variables and doesn't check the values for obvious things like empty strings or characters that don't match a certain data type.
And don't get me started on the lack of range checking I've seen. People: sanitize your inputs. And not just the Bobby Tables inputs. Use lookup tables for things like states. Use pre-defined values for users to select if you can, instead of letting them enter values free form. I could go on and on. But Jakob Nielsen has already beat this dead horse deader. If you're a programmer and you've never heard of him or at least Donald Norman, you've got some work to do.

• Older & Wiser (unregistered)
I was sure that most people had learned way back in the 1980s to first check for the folder/directory you want to work in. Then change your default to that folder/directory. Then verify you're actually in that folder/directory. And then happily delete away. Hell, we even knew this in VMS' heyday.

• Tzafrir Cohen (unregistered)
In recent Un*x varieties this no longer works.

  # rm -rf /
  rm: cannot remove root directory '/'

rm checks for the explicit string '/' and refuses a recursive removal of it. rm -r '//' or '/*' (expanded by the shell to all of the top-level directories) will still work, but this still prevents screw-ups such as the one in the shell script. IIRC this is a fairly recent change (c. 2004) of POSIX and it's Sun we need to thank for pushing the change through the standards committees.

• (cs)
Kudos for bloating a 30 year old one liner into an 800 word snoozefest! Maybe tomorrow we can get a 50,000 word set up for "to get to the other side".

• Tzafrir Cohen (unregistered) in reply to realmerlyn
Actually bourne shell's simplest error handling mode is "bomb out on error". This is done by setting '-e' ("set -e" or "-e" at the command line of /bin/sh). This means that basically every command that fails causes the whole script to exit. The idea is that you don't want to spend too much time thinking about error conditions. This prevents errors such as that 'cd'.
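
To make the "bomb out on error" mode and realmerlyn's `cd /some/dir || exit 1` guard concrete, here is a small sketch of both side by side. The function names are hypothetical, and the destructive step is replaced by an echo so nothing is ever deleted. The first helper starts a fresh `bash -e`, which is one way to get the "-e at the command line" behaviour described above.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for a cleanup script; they echo instead of deleting.

# Variant 1: run the body in a fresh bash started with -e, so the failed cd
# ends the script before the next command runs.
run_cleanup() {
    bash -e -c '
        cd "$1"
        echo "would delete everything in $PWD"
    ' cleanup "$1"
}

# Variant 2: the explicit guard, which does not depend on -e at all.
run_cleanup_guarded() (
    cd "$1" || exit 1
    echo "would delete everything in $PWD"
)
```

One caveat worth knowing: bash ignores `-e` for commands run as part of an `if` condition or on the left of `&&` and `||`, so the explicit guard is the more predictable of the two when a function might be called from such a context.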
• (cs) in reply to Once a scapegoat
Once a scapegoat:
Passing blame to employees who have left the company is a sad tactic by bad management to get out of taking responsibility for their own screw ups.

I worked for a small business once where the owner blamed me (behind my back) for various screw-ups on projects I wasn't even assigned to (e.g. "I think he messed up the Cornerstone database").

My last month of employment there, he refused to give me work to do, putting me off constantly with "in a minute" and "I'll be there after this phone call" and whatnot. Yes, an entire month. I made myself useful by replacing large swaths of a co-worker's job with some very small shell scripts (for which he thanked me profusely). When I ran out of things to replace with shell scripts, I played most of Half-Life 2.

I decided not to show up to my last day of work. Gee, I wonder why.

Former co-workers sent me messages every few months to let me know when the boss was blaming me for yet another thing I couldn't possibly have done...

That same friend (for whom I wrote those shell scripts) quit a month later without notice; the boss had left vital steps out of a series of instructions, then blamed my friend for the Bad Things(tm) that happened as a result, saying something like "you should have just known to do that".

There were a lot of WTFs at that place. I should post about it sometime...

• AdT (unregistered) in reply to heltoupee
heltoupee:
People always blame the messenger. It's in our nature.

We should crucify you for telling us that.

• WC (unregistered)
TRWTF is that someone thinks a documented feature is an 'unpatched flaw'. No, the idiot that wrote that code proved his idiocy many times over. SSH is not at fault in the least.

• Anonymous (unregistered)
This is exactly like that show Wordpress did a couple of weeks ago when they released a new version:

  rm -rf $TEMP_UPDATE_DIR/*

Guess what happens if $TEMP_UPDATE_DIR is null and you're running the httpd as root? Yeah, that's right. Charlie Foxtrot.

• Erayd (unregistered) in reply to EFH
It does if you have no_root_squash set in your export definition. Cleaning up as root would make sense if the test suite also runs as root - there are a few (although not many) tasks which would require that.

• (cs) in reply to Bonce
Bonce:
This reminds me of an incident in my first ever job as a helpdesk tech. I got a panicked call from a lady in accounts saying that all her data had "just gone". "Vanished". "Help!".

So I went to her desk and asked her to explain firstly what the data was from, where it was stored and then what she had been doing before it "vanished".

She explained that some absolutely vital section of the company accounts were maintained on a legacy system on a standalone 286 PC. Because it wasn't networked, and because of the business-critical nature of the data it contained, my predecessor had taught her how to regularly back up the data to a floppy disk, but she confessed that because it was year-end she hadn't had time to do the backup for a few weeks, so weeks' worth of hard work had gone. And ironically, the data actually vanished whilst she was performing the backup!

So I asked her to demonstrate the steps she'd been doing at the DOS prompt, but without pressing the enter key unless I said it was OK so that I could be sure of maximising my chance of an undelete. She showed me how she changed folder into the place where all the accounts data was stored, did a directory listing, and then copied all the files to her backup 3.5" floppy.

"Oh, but I did some housekeeping first," she remembered. And then pointed at a file in the directory listing that she had been trying to get rid of when all the data went missing. "It's the one named dot, it's still there look!".

She'd typed "del ." followed by an unthinking "Y". A little knowledge can be a dangerous thing!
haha, at least you could make careful use of undelete and save the day... as long as you know what the first character of each filename is supposed to be!

... at least my backup script fuckup only managed to fill the remote filesystem and then corrupt it by powering the remote host down while it was still writing to the disk... all my data on the local machine was still present :)

• Tzafrir Cohen (unregistered) in reply to Skizz
Skizz:
What? Are you going to activate the speech synthesiser and tell the user their mother was a hamster and their father smelt of elderberries?

Skizz

  wget http://downloads.asterisk.org/pub/telephony/sounds/asterisk-extra-sounds-en-wav-current.tar.gz
  tar xzf asterisk-extra-sounds-en-wav-current.tar.gz
  play tt-monty-knights.wav

(The tt-* sounds are sound files for the "telemarketing torture". See extra-sounds-en.txt for the full listing).

Homework: what does this file contain in the French version of this sound set?

• scott (unregistered)
We'll call it QuikProtect.

http://www.dilbert.com/strips/comic/1995-09-17/

• (cs) in reply to EFH
By default, NFSv3 (and earlier) map userid X on the client machine to userid X on the server; NFSv4 does the same, but with usernames not ids.

That includes root, unless some option is set to prevent it. On Linux, it's set by default (option root_squash), but it can be turned off (no_root_squash) and other Unix variants may not enable that option by default, or have it at all. Or the sysadmin could have turned it off.

• (cs) in reply to wee
wee:
Someone doesn't use unix very much.

First off, nobody calls it "a Bourne shell script". Just call it a shell script, or maybe even a bash script.

Bourne Shell != Bourne Again Shell

There's nothing wrong with specifying the shell for the sake of the readers.

• David Emery (unregistered)
This reminded me of a similar problem, where (on a test machine, thankfully) someone coded a DCL command script that accidentally had an infinite loop in it.
It was launched from an account that had the EXQUOTA (ignore all disk quotas) privileges. When we got in Monday AM, the machine had crashed pretty hard, because the logfile from this script used up -every byte- of free space on the system disk drive.

dave (p.s. it wasn't me who coded the script, but it was me who figured out what happened :-)

• Anon (unregistered) in reply to Heron
Heron:
Once a scapegoat:
Passing blame to employees who have left the company is a sad tactic by bad management to get out of taking responsibility for their own screw ups.

I worked for a small business once where the owner blamed me (behind my back) for various screw-ups on projects I wasn't even assigned to (e.g. "I think he messed up the Cornerstone database").

My last month of employment there, he refused to give me work to do, putting me off constantly with "in a minute" and "I'll be there after this phone call" and whatnot. Yes, an entire month. I made myself useful by replacing large swaths of a co-worker's job with some very small shell scripts (for which he thanked me profusely). When I ran out of things to replace with shell scripts, I played most of Half-Life 2.

I decided not to show up to my last day of work. Gee, I wonder why.

Former co-workers sent me messages every few months to let me know when the boss was blaming me for yet another thing I couldn't possibly have done...

That same friend (for whom I wrote those shell scripts) quit a month later without notice; the boss had left vital steps out of a series of instructions, then blamed my friend for the Bad Things(tm) that happened as a result, saying something like "you should have just known to do that".

There were a lot of WTFs at that place. I should post about it sometime...

I think you just did.

• A Gould (unregistered) in reply to campkev
campkev:
amischiefr:
Wow, now THAT would be a great parting gift: remove all data from the entire company. How many of you out there wouldn't mind doing that as YOUR parting gift?
Actually, I would mind. And, I would like to personally beat the crap out of anyone who has ever intentionally done this when leaving a job.

Agreed. Although I wouldn't knock anyone for doing the mental exercise, just to convince themselves they could...

• Elvis (unregistered) in reply to snoofle
Actually I think the real worse than failure is turning such a buggy and complex shittastrifuck of a script loose on a production network. Oh and the ops guys not being bothered to test their backup strategy is just the icing on the cake.

An old boss of mine gave me an excellent bit of advice. "No one gives a shit about backups. They do however give a shit about restores."

• A Muffin (unregistered) in reply to Anonymous Coward
Anonymous Coward:

  if [[ -z $var1 || -z $var2 ]]; then ...

Ehm, no:

  var1="../../../../../../.."
  var2="."

Oops...

It's meant to protect against accidents and stupidity, not malice.

• (cs) in reply to Frank
Frank:
pjt33:
What's an explitive?

Yes, let's all be pricks about spelling.

I hope that someday Alex manages to get the DWTF team into the late 20th century by teaching them how to use a spell-checker.

• (cs) in reply to pjt33
pjt33:
Frank:
pjt33:
What's an explitive?

Yes, let's all be pricks about spelling.

I hope that someday Alex manages to get the DWTF team into the late 20th century by teaching them how to use a spell-checker.

I personally prefer a spelling checker to a spell checker, in view of the fact that I don't have many spells at my disposal, but frequent problems with my spelling...

• Squid (unregistered) in reply to Frank
Frank:
What would REALLY piss me off is that even if you're a responsible person fixing someone else's major screw up (And this is in fact what happened), you would forever be remembered as the person who took down the system or couldn't restore.
Never mind that someone else didn't run backups correctly, verify that backups were in stable and working condition, and that a script was running without any safeguards to ensure that the variables were in place (Call me paranoid, I rarely trust variables where file IO is concerned.. and that stems from a co-worker bringing a server down doing a recursive grep piped to a file - 800GB output, which was grepped to itself in a loop).

Do I know You?

• Carmen (unregistered)
Jasper:
Bosluis:
What does SNAFU'd stand for?

Here you go.

Ah, the famous "why don't you Google it" comment on a forum! Now somebody already searching Google with "What does SNAFU'd stand for" will end up finding this site, telling them to look on Google.

StackOverFlowError...

• Jamison (unregistered) in reply to dpm
dpm:
EFH:
Y'know, being root on one machine doesn't give you any special access to a drive NFS mounted from another machine.

But it does get you potential access, which is often enough.

EFH:
And I can't imagine why the script would become root to do the cleanup.

Your imagination needs a great deal more experience, because it clearly has none.

EFH:
I enjoy a good story as much as the next guy, and I like the $var1/$var2 "hook", but I'm thinking this story was invented to go with the hook after somebody thought it up.

Holy good night, why would he have to invent it, it happens frequently. Even if you don't know of the famous incidents --- this one comes to mind http://groups.google.com/group/comp.unix.admin/browse_thread/thread/f1834a4fa74980d4/af2749af87216d18 (many good stories worth reading, but search for "Have you ever" to see the one I'm thinking of) --- and why on earth would you think people don't make stupid mistakes in shell scripts?

Because they don't. It's not possible. I refuse to believe.

• m0ffx (unregistered) in reply to Bim Job
Bim Job:
I particularly love the idea of "sudo su -", which has got to be the shortest possible sysadmin fuckup of all time.
Possibly off-topic - what's the difference between sudo su and sudo -i ?

• Stir (unregistered) in reply to wee
wee:
Someone doesn't use unix very much.

First off, nobody calls it "a Bourne shell script". Just call it a shell script, or maybe even a bash script.

That's not a flaw in sudo, it's a feature. The flaw is when they set up sudo access, they added a line like "luser ALL = NOPASSWD: ALL", which is braindead beyond belief.

A "system set up with NFS" doesn't grant access to the entire network. The people who set up those NFS exports were morons, because they would have had to export / in read-write mode on every single machine affected, with the no_root_squash option no less. Which is braindead beyond belief. In other words, the admins are fucking morons, and they are responsible -- not NFS -- for allowing a remote process to delete the root directory on every server.

Anyone who writes a script that executes "rm -rf $var1/$var2" and fails to check the values of $var1 and $var2 should be fired instantly and never allowed near a computer again. His manager should be fired for not requiring code reviews.

sshing into your own machine is retarded, and I'm not buying it.

If you don't verify your backup plan, you get exactly what you deserve.

Try the other side of your bed tomorrow.

• Haha (unregistered) in reply to David Emery
David Emery:
This reminded me of a similar problem, where (on a test machine, thankfully) someone coded a DCL command script that accidentally had an infinite loop in it. It was launched from an account that had the EXQUOTA (ignore all disk quotas) privileges. When we got in Monday AM, the machine had crashed pretty hard, because the logfile from this script used up -every byte- of free space on the system disk drive.

dave (p.s. it wasn't me who coded the script, but it was me who figured out what happened :-)

In my experience, the first to work out what happened is usually the one who (accidentally, no doubt) made the blunder.
• Jimbo (unregistered) in reply to m0ffx
m0ffx:
Bim Job:
I particularly love the idea of "sudo su -", which has got to be the shortest possible sysadmin fuckup of all time.

Possibly off-topic - what's the difference between sudo su and sudo -i ?

sudo su: executes the 'su' (switch user) command as root. If you use 'sudo su -' this will attempt to start a new shell as root (but you could specify any other user too).

sudo -i: opens you a new shell as root.

sudo su - is much the same as sudo -i (which is roughly equivalent to sudo sh)

• (cs) in reply to Anon
Anon:
Heron:
There were a lot of WTFs at that place. I should post about it sometime...

I think you just did.

Oh, there were far more than I posted here... a small part of me wants to curl into a fetal position and cry whenever I think about it. The rest of me wants to start a competing business and do what they do but better, possibly by stealing their clients.

• (cs) in reply to Dave Carrigan
Dave Carrigan:
The first rule of shell scripting is set -u
The second rule of shell scripting is rewrite it in perl

The first rule of Perl scripting is to wonder why it wasn't done in Bash.

• vera (unregistered)
the situation sounds familiar... almost the same thing happened at my work, except on a way smaller scale.

We put out a new release of our document management system, with one little bug that caused one client's DB to get wiped... The default ID for a document under some circumstances was set to "0", which also meant "the root"... so, when someone deleted a document with ID "0" it wiped their entire tree structure.

Luckily we had a backup...

• (cs)
I've accidentally run "rm -rf" while in / before. It was on a server. I was root. I realised my mistake about 2 minutes in when I saw a pile of permission denied errors.

Thankfully in my case it was actually a new server that I'd just partially configured... All in all I lost about 1 hour's work. But it made me a heck of a lot more careful about running "rm".
• m0ffx (unregistered)
I rm -fred my system once. I was using a livecd to make a backup, and it went wrong, so I did rm -fr . to delete the backup. But I was in the mounted root partition, not the backup location. DOH! Reinstall time.

• moz (unregistered) in reply to DLJessup
DLJessup:
Dave Carrigan:
The first rule of shell scripting is set -u
The second rule of shell scripting is rewrite it in a non-write-only scripting language, such as Python or Ruby.

FTFY

The last thing we need here is a readable script language. It would be far better in Perl (especially with something like Acme::EyeDrops), or compiled Java for which you had somehow lost the source.

That way, it could never have been extended so much that it was no longer obvious that it needed to be told what $var1 and $var2 were, simply because no-one (possibly including the author) would know how to hook the new code in.
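
On Anonymous Coward's `var1="../../../../../../.."` counterexample from earlier in the thread: an emptiness check cannot catch it, but resolving the path first and checking that it still lies under an expected base directory can. Below is a minimal sketch of that idea. The function names and the notion of a single "base" directory are assumptions for illustration, and it only handles directories that already exist.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; nothing here deletes anything.

# Print the physical (symlink-free, ..-free) path of an existing directory.
resolve_dir() (
    cd -P -- "$1" 2>/dev/null && pwd -P
)

# Succeed only if the target resolves to a directory strictly below base.
is_inside() {
    local base target
    [ -n "$1" ] && [ -n "$2" ] || return 1
    base=$(resolve_dir "$1") || return 1
    target=$(resolve_dir "$2") || return 1
    case "$target" in
        "$base"/*) return 0 ;;   # strictly below base
        *)         return 1 ;;   # base itself, or anywhere else
    esac
}
```

A cleanup script could then run something like `is_inside "$base" "$var1/$var2" || exit 1` before any rm. As A Muffin says, this is a seat belt against accidents, not a defence against someone actively trying to break it.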