Admin
delete all where comment = first!
Admin
this is a real WTF, this so-called consultant is crazy to make this mistake in a mission-critical system
and another WTF is that there is no QA involved or simulation tests to make sure everything is ok before production or deployment.
Admin
I forgot where I saw this --
Bosses want to make money between now and next week, not in a year or ten years. Quality is not something that people want to pay for. That's why our clothes are made by Chinese slaves and our code is made by Indian offshorers. Nobody cares about quality or long-term issues; immediate needs and return on investment is all that is taken into account. For example: Hard drives are horrible for bad quality; You can make a higher-quality drive and have it cost $5 more. Nobody will buy it, since it's more expensive than a competitor's drive with the same visible features. This means that the hard drive manufacturers are pressured into producing the lowest-cost, lowest-quality drives possible.
Admin
I suppose it could have been worse...
//TODO: Uncomment Later
//targetSpoutBeforePouringMoltenSteel();
Incidentally, one would think that failure to receive a required signal would result in proper handling of the condition (the apparent purpose of the watchdog software) instead of a system crash... donchya love cascading WTFs?
Admin
#warning "TODO: Uncomment" !!
Admin
I'll probably feel sick the rest of the day.
Admin
Worst production failure?
In my first internship I worked as a FIB tech, doing post-wafer modifications to chips (so the customer could test their hardware fixes before running another revision of the chip). The specimen (IC) was placed in a vacuum chamber on a platform controlled by a 3D joystick (all those years of River Raid on Atari 2600 paid off--take that, Mom & Dad!). At the push of a button, the software would insert a needle right up to the surface of the chip to spray gases over it, which would leave behind metal (for patches) or remove it (for cuts).
I was working a long job and had to break for the night, but the software saved the entire setup so I'd just have to reload in the morning.
In the morning I hit "restore settings" and the platform came up and jammed the IC into the needle, ripping a long scrape across the chip (a $40k mistake, by my reckoning). Luckily I didn't damage the FIB.
Turns out that the cleaning crew had come in the night before, cleaned the chamber, and that had moved everything around slightly.
Oops. From then on, I always manually lowered the platform and brought it up toward the needle very carefully.
Admin
Before:
if (command=="self-terminate") {
  t2.speak("I cannot self-terminate...");
}
t2.speak("Committing Hari-Kari");
After:
//TODO: Uncomment
//if (command=="self-terminate") {
//  t2.speak("I cannot self-terminate...");
//}
t2.speak("Wtf?");
Admin
I did this:
GPS.long_deg = getNumber();
GPS.long_deg <<= 4;
GPS.long_deg |= getNumber();
...
GPS.long_dec = getNumber();
GPS.long_dec <<= 4;
GPS.long_dec = getNumber();
Worldwide recall.
The tenths digit of longitude was always a 0, for an error of about 800m on average. They had to re-tranquilize a bunch of tigers and redo hundreds of collars.
It slipped through my tests, production's tests, and even the tests sent out to a special-case client that said they were perfect. He happened to work where the tenths digit was 0.
Addendum (2007-03-14 12:10): The problem was noticed when a "wolf" researcher in "California" noticed that her wolves were spending an inordinate amount of time out in the ocean.
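For anyone who missed it, the difference between the two snippets is a single character: `|=` versus `=` on the last line. Below is a minimal Python sketch of the same nibble packing. It assumes each getNumber() call returns one decimal digit and that two digits are packed into one byte; the function names are made up for illustration and are not from the collar firmware.

```python
def pack_digit_pair(high_digit: int, low_digit: int) -> int:
    """Pack two decimal digits into one byte, high digit in the upper nibble."""
    value = high_digit
    value <<= 4
    value |= low_digit  # OR keeps the shifted high digit in place
    return value


def pack_digit_pair_buggy(high_digit: int, low_digit: int) -> int:
    """Same steps, but the last line assigns instead of ORing."""
    value = high_digit
    value <<= 4
    value = low_digit  # plain assignment throws the high digit away
    return value


def unpack_digit_pair(value: int) -> tuple[int, int]:
    """Split a packed byte back into (high_digit, low_digit)."""
    return (value >> 4) & 0xF, value & 0xF
```

With the buggy version the high digit always unpacks as 0, and the two versions agree whenever the high digit really is 0, which fits the story of the test client whose tenths digit happened to be 0.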
Admin
I'd say that a serious WTF is using a DB that doesn't support transactions.
Admin
i've done that SQL error... though with an UPDATE, instead of a DELETE. presto, all the records are the same!
i've never burned-out millions of dollars worth of heavy equipment, though. that's above my pay grade.
onomatopoeia!
Admin
I've often thought that WHERE clauses should be compulsory, and that they should introduce a WHEREVER keyword, pronounced similarly to "whatever!" as spoken by kids today (bah humbug, bring back national service).
Admin
No offense but I find that hilarious.
"Why the bloody hell are my wolves out in the ocean??"
Luckily you were able to fix it and it didn't endanger lives but... yeah that just strikes me as funny.
Admin
And BTW, it -is- a complete pain to try to debug with a watchdog timer running. You only get one breakpoint, and as soon as you resume the system resets.
Admin
Seriously, why did they have to re-tranquilize the animals? Unless they are animals with a range of <800m, I would think it wouldn't matter, as long as it was consistent (assuming the 'm' is for meters and not miles)
Admin
Also, note the semi-recent Side Bar thread "What's your biggest screw up?" for more stories of production systems taken down by oversights and good intentions... not quite on this scale, though. Impressive.
Admin
I didn't say wolves don't swim. It's just that they don't spend 8-15 hours a day out dog-paddling. If they were piloting a boat, sure, I'd say that's reasonable, but they can't get licences because they're colour-blind.
You'd try changing the collar on a non-tranquilized tiger?
They eat meat.
People are meat that doesn't run very fast.
The problem, if you can visualize it, is when you're downloading the data later. Imagine all these GPS fixes imported into a GIS application to map the paths that these animals have taken. You can see where they've gone and when. Do they go out at night to the watering hole? Do they travel more than a few miles to chase prey?
The way the programming worked as sent meant that all the animals traveled in lines. Like there was a roped queue for the tigers to follow to the watering hole, and then they'd teleport 1.6 kliks away to catch the next meal.
That 800 meters was on average. They could be out as much as a mile. (1.6 kilometers)
Admin
My worst WTF wasn't mine, but I had to find it and fix it.
I used to work for a company that designed and built card printers. Most of our business went to companies that put together license printing systems for state DMVs. One such state DMV had a special twist on one of our features. We could read a pre-encoded magnetic strip on a blank card, and report that data to the application, which would then have to store and approve that number before printing. The state wanted this to be done with a barcode reader instead. The cards were scanned after printing in order to be matched with the correct mailer before stuffing into an envelope.
At some point, one of these printers would report a serial number that had already shown up. Every card thereafter was off by one number. This resulted in people getting mailed the wrong licenses! Obviously bad.
Now, the guy who wrote the firmware for the printer had about two weeks to learn the barcode reader's protocol, while being focused on a different project. In addition, the printer was mostly completed at the time of our buyout and cash infusion, so we cheaped out on things like source-level hardware debuggers. All we had was a serial output on the printer's mainboard. I spent days debugging this in the lab and was able to reproduce it, but management decided that politically it was better to send me halfway across the country to continue working on it.
It was a simple race condition. At one point, the firmware read from its barcode buffer before the reader had a chance to finish writing the new value. It could be fixed with an easy mutex. Unfortunately, there were 3 QA technicians for a company of 3400 people, and we started this project AFTER the due date to the customer.
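A rough sketch of the kind of mutex fix described above, written in Python rather than firmware C. The class and method names are hypothetical, and the real barcode protocol is not shown in the story, so treat this only as an illustration of the idea: the reader must not see the buffer while the writer is halfway through a new value.

```python
import threading


class BarcodeBuffer:
    """Holds the most recent serial number; all access goes through one lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._chars: list[str] = []

    def write(self, serial: str) -> None:
        # The value arrives one character at a time. Holding the lock for
        # the whole update means a reader never sees a half-written value.
        with self._lock:
            self._chars.clear()
            for ch in serial:
                self._chars.append(ch)

    def read(self) -> str:
        # Blocks until any in-progress write has finished.
        with self._lock:
            return "".join(self._chars)
```

Without the lock, read() could run between the clear() and the last append() and return a stale or partial serial, which is roughly the off-by-one symptom in the story.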
Ugh....
Admin
I've done the DELETE/UPDATE without a WHERE myself, although it was never an oversight but accidentally hitting Enter before finishing typing.
Lesson learned though, I now always force myself to first type the WHERE clause or use a SELECT first to change it into a DELETE only when I'm sure it's operating on the records I need.
Admin
Yeah, I didn't laugh too hard at this one. It's sort of funny now, but not at the time.
Part of the problem was that the download software hiccoughed on a 0 in any leading place. That bug got fixed with a better job of Format(), but it wasn't the bug we were really looking for.
I did endanger lives, just not human ones. Big difference, I know, but when you're talking about tigers, even a few makes a really big difference to the species.
Admin
No, because it was the watchdog itself that had the failure (compounded by the control system's failure).
The control device rebooted, which caused it to stop sending the heartbeat to the watchdog. The PROPER behavior would be for the watchdog to notice this, and trigger a shutdown/alarm bells/etc.
INSTEAD the watchdog was completely disabled, so it didn't notice any problem, so the lack-of-control system went unnoticed and things went haywire.
The watchdog is there to prevent problems with the control system's failure, but who watches the watchdog? This also shows that even if the watchdog sent out a heartbeat, it still could have this same detection stuff commented out and "appear" to act properly even though it doesn't.
CAPTCHA: craaazy. indeed.
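The PROPER behavior described above can be sketched in a few lines. This is a hedged illustration in Python, not the mill's actual watchdog: the class name, the timeout value, and the on_expire callback are all assumptions, and the time is passed in explicitly so the example does not depend on a real clock.

```python
from typing import Callable


class Watchdog:
    """Trips once if no heartbeat arrives within `timeout_s` seconds."""

    def __init__(self, timeout_s: float, on_expire: Callable[[], None], now: float = 0.0) -> None:
        self._timeout_s = timeout_s
        self._on_expire = on_expire
        self._last_heartbeat = now
        self._tripped = False

    def heartbeat(self, now: float) -> None:
        # Called by the control device each time it reports that it is alive.
        self._last_heartbeat = now

    def check(self, now: float) -> bool:
        # Called periodically by the watchdog itself. Commenting out this
        # comparison is the failure in the story: heartbeats stop arriving
        # and nothing ever notices.
        if not self._tripped and now - self._last_heartbeat > self._timeout_s:
            self._tripped = True
            self._on_expire()  # e.g. shut down, sound the alarm
        return self._tripped
```

Usage would be something like `Watchdog(2.0, trigger_shutdown)` with check() called on a timer, where trigger_shutdown is whatever safe-stop routine the plant provides.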
Admin
#ifndef _DEBUG
Admin
I worked for a steel company during most of the 90's. Besides writing most of the accounting software, I also wrote the payroll/HR application. It was used at 4 locations throughout the US. You haven't known REAL pressure until you have 1500+ United Steelworkers of America members waiting for their paychecks.
Admin
That's the funniest thing I've read all day, thanks!
Admin
My boss one day:
Update tPalletData set iLanePosition = 2
16,000 pallets of cheese logically placed into the second position of a storage rack. Not a pretty thing.
Admin
Yeah, and I meant no offense. In your position, I would have had that sick feeling in the pit of my stomach over something like that. It's one of those things that you can laugh about now but at the time, it's absolutely horrible.
Back when I was just learning SQL (years ago), I was trying to run some updates on this table. I was an idiot and didn't use a transaction. I forgot to include the WHERE clause on my update and the entire table got updated...
It wasn't too big of a deal for this particular table, but I was sick about it. However, from that point on, the importance of transactions was burned into my memory and now I never run any data changing queries without a transaction, no matter how "trivial" they may seem.
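Here is one way the "always use a transaction" habit can look in practice, sketched with Python's built-in sqlite3 module. The helper name and the expected_rows idea are my own assumptions, not something from the comment above: run the statement, compare the affected row count with what you expected, and roll back if they differ.

```python
import sqlite3


def guarded_execute(conn: sqlite3.Connection, sql: str, params: tuple = (), expected_rows: int = 1) -> bool:
    """Run one data-changing statement; commit only if it touched the expected number of rows."""
    cur = conn.execute(sql, params)  # sqlite3 opens a transaction before UPDATE/DELETE
    if cur.rowcount != expected_rows:
        conn.rollback()  # a forgotten WHERE clause ends up here
        return False
    conn.commit()
    return True


def make_demo_db() -> sqlite3.Connection:
    """Build a small in-memory table to try the helper on (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pallets (id INTEGER PRIMARY KEY, lane_position INTEGER)")
    conn.executemany("INSERT INTO pallets (id, lane_position) VALUES (?, ?)", [(1, 1), (2, 2), (3, 3)])
    conn.commit()
    return conn
```

On the demo table, `guarded_execute(conn, "UPDATE pallets SET lane_position = 2")` touches three rows instead of one, so it is rolled back and returns False; the same statement with `WHERE id = ?` and one parameter goes through.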
Admin
I had a colleague once who was asked to troubleshoot a particular database problem at our facility in Columbus, Ohio. So, he logs into the OHCOL server and starts mucking around, running tests, starting and stopping jobs, and pretty much putting the system through its paces. Since the site was apparently non-functional, it didn't really matter what he did with it. However, he is unable to reproduce the problem.
Then and only then does he discover that we also have a facility in Columbia Station, Ohio....
Admin
Those weren't wolves, those were Nethack Tengus!
(Sorry, I couldn't find a better description.)
Admin
rm * ~
on a new module that hadn't been checked into version control yet
for a two-man company whose hosting was through a commercial isp that charged for recovery
half the fee came out of my next paycheck
Admin
Indeed. That friggin' cracked me up. It's one for the funny pages, that.
Admin
I find it hard to believe that a steel mill wouldn't test the component on a virtual system before deploying out to the real world, and that they wouldn't have noticed that the timer wasn't tripping in the virtual test. I guess that's why I make the big bucks!
Admin
My biggest screwup was presorting some Canadian Mail in 1994.
I added the last digit of the postal code to the beginning of the street address.
So, if a postal code was M5E 1E4, and the street address was 123 Main Street, the printed address became 4123 Main Street.
This resulted in 34000 pieces of undeliverable mail.
It cost me $12,000.
My customer asked "Who's going to pay for this?". I replied, it's my fault, I'll pay.
I have since recouped that $12,000 many times over. But it was sure hard to see the "opportunity" in the "problem" at the time.
Admin
Sometimes it's all you have. I spent years on one project saying "we need a development system, and a production system; the production system must never ever ever be edited live". Management didn't want to spend the $1500 on a new server, or waste any time installing and configuring a development system, or lose the ability to call development and say "I have a meeting in five minutes and our ROI figures don't look high enough, change the math NOW". They also didn't want to pay for backups and system administration, and thought source control was a stupid waste of time and money.
I can lead clients to reason, but I can't make them think.
Admin
It has been explained to me that dogs are NOT colour blind. They are weak in the red/green areas, but NOT colour blind.
I would suppose there's not much difference between dogs and wolves with respect to colour vision. But with the red/green problem, they definitely wouldn't be out nights cruising.
Admin
My biggest failure was this.
Mid 80s...I'm 19..and a contract programmer to small businesses. I wrote an invoicing/billing system for a small parts distributor. I finally convinced them to upgrade to a nice shiny external 20MB drive. Back then an external drive was the size of a decent suitcase. And it had a 'write protect' button. The OS was Xenix (an MS/SCO variant of Unix of all things).
Customer told me they had backed up right before I got there. So I formatted the new drive: Format Drive 1. Confirm that Drive 1 is the one you want to format. Yes. Format done. All is well. Copy my application + customer's data to the new drive. Push in the cool write protect button because I can.
So I decide to reinstall the latest OS while I am at it and upgrade some of the serial board drivers (12 serial ports--woohoo). Hmmm..slight problem. Looks like some bad drive spots are showing. Crap..some hard drive errors. So I reformat drive 0 to mark the bad spots so I can finish the install. Format done. Install done.
Mount drive 1. What? Drive 1 is empty? WTF?
Empty sinking feeling. I hadn't formatted drive 0, I had formatted drive 1 by mistake. And for some reason the OS had installed fine the second time on drive 0.
no application, no data on drive 0. no application, no data on drive 1.
The 'write protect' button did not work.
Oh F*********K.
Backup..where is the backup. Turns out the customer had used the same set of 8" floppies for the past 2 years. So naturally one of them was bad...and the one that was bad? The one with the source for order entry (the system I used was interpreted). Hosed.
so the customer walks in the next morning to no order entry and partial order data.
I still lose sleep 20 years later over this one. Oh that pit of your stomach feeling...
Moral. Never decide to 'reinstall the OS' on a whim. Don't do mission critical stuff after being awake 25 hours. Don't trust the user to back up (hard to do with the big systems we have now). And don't trust the 'write protect' button.
--Steve
Admin
My code once generated estimated tax amounts that were too low. We didn't find the problem until AFTER the quarter ended and the data had been sent to the IRS... which could have resulted in massive penalties for our clients. Fortunately, the IRS and our company talked it over, and explained that the numbers were due to a glitch and would soon be re-submitted. Everything worked out, and nobody got a nastygram.
I don't recall the exact numbers any more, but I ran some calculations at the time. If I worked for minimum wage with the difference sent to the IRS, I'd have been able to work off the penalties in about three HUNDRED years.
Admin
CS students at my school were heavily recruited for internships (read free labor) to work at a Pittsburgh based steel company (you figure it out). Part of that recruitment was taking a test that was copyrighted in 1970, I only graduated in 2005 so it was a pretty old test.
Admin
My mistake was a very simple one: we were troubleshooting a developer's machine. A machine with the only complete copy of the source code of our flagship application in the process of being rewritten, to the tune of a couple thousand man-hours of work. I was reconnecting the hard drive while working under the desk, and didn't realize that those keyed hard drive power connectors are only "keyed" in the academic sense, and it sometimes doesn't take very much force to put one in backwards.
Fortunately, somebody managed to come up with an identical model of hard drive, and we swapped the controller board on the bottom and recovered the data. I'm much more careful about that now.
Admin
In the same vein, I did for i in *; do cd $i; rm -r *; cd ..; done last week. Fortunately the backup of my home directory was only two days old.
PS: For the shell challenged, the cd $i fails if it encounters a regular file and the "cd .." will climb up one level in your directory. The rm -r * will destroy everything then.
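For comparison, here is a hedged sketch of the same cleanup in Python, where there is no "current directory" to end up in by accident. The function name is made up. Note one difference from the shell loop: `*` skips dotfiles, while iterdir() does not.

```python
import shutil
from pathlib import Path


def empty_subdirectories(parent: Path) -> None:
    """Delete the contents of each real subdirectory of `parent`; files directly in `parent` are left alone."""
    for entry in parent.iterdir():
        # Skip regular files and symlinks. This is the check the shell loop
        # lacked: a failed `cd $i` there still ran `rm -r *` and `cd ..`.
        if entry.is_symlink() or not entry.is_dir():
            continue
        for child in entry.iterdir():
            if child.is_dir() and not child.is_symlink():
                shutil.rmtree(child)
            else:
                child.unlink()
```

Every path is built from `parent`, so a stray regular file just gets skipped instead of shifting where the deletes happen.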
Admin
Like the rest I have done the UPDATE without WHERE - in fact, it happened just about 2 months ago. However, when I first escaped West Lafayette my first job was with a Machine Tool Company located in Cincinnati (93) - I worked on Plastics Injection machines.
Like the other software developers I had my own machines I helped through production. I was out on the floor helping with tests while everyone else was at lunch. I went to set up for the next test and fat-fingered a -1 in one of the fields where it was expecting a positive number. This number was used as the slope for the speed of the ejector plate.
This plate is several hundred pounds of cast metal. I started up the machine to run the dry test. After tonnage was released and it called for ejector movement the negative slope doomed me. The software was trying to reach point 0. Instead the slope calculations kept moving the plate backwards - thus it called for more speed to reach point 0 - which made it move farther backwards.
This ejector plate moved on 2 rails - which had large metal stops at the end. In a split second that ejector plate was moving so fast that it blew right through those stops, flew out the back end of the machine, past the open walkway behind the machine where it was eventually stopped when it slammed into some empty lockers (about 4-5 meters from the end of the rails).
Not to be outdone, some of the bearings packed into the rig were moving fast enough that they acted as shotgun pellets into the lockers as well. Thankfully everyone was at lunch so nobody got hurt.
I still keep some of those bearings in my desk to remind me of that.
Admin
Don't trust backups, period; test them before you need them, if at all possible.
I was working for IBM, circa 1990. Had a PC RT (IBM's first RISC Unix workstation) running AOS (an IBM BSD port for the academic market). I was running out of room on my two 40MB hard drives, and there was a spare 70MB drive in the Room Of Misfit Equipment.
So, I backed everything up onto quarter-inch tape, took out one of the 40MB drives, installed the 70MB drive. Formatted them both. Installed AOS. Popped the first backup tape into the drive to do a restore.
None of the backup tapes were readable.
Fortunately, all my real, paid work was safe in RCS archives on our AFS fileserver. All I lost were my own pet projects, including an X11 window manager I had thrown together in my spare time.
That AFS fileserver often came in handy. Every night it backed itself up to a second disk partition, which it then exported read-only under /yesterday. One day one of my coworkers accidentally deleted a huge chunk of our sources; I just mounted /yesterday and restored them. Fortunately nothing in the erased area had been changed since last night.
-- Michael Wojcik
Admin
... "introduce a WHEREVER keyword, pronounced similarly to "whatever!" as spoken by kids today (bah humbug, bring back national service)." ...
I work with both SQL Server and DB2 every day. A nice feature of the DB2 SQL query builder on the AS/400 is that if you ever write something with no where clause, it prompts you for an extra key press to confirm you really want to run that...
of course, I'm used to using SQL server, so I already double, and triple check myself...
Admin
you two are supposing that:
you are new here, aren't you ?
Admin
Time for zsh evangelism! Had you been able to say "for i in *(/)", nothing would have gone wrong!
Admin
In my previous life(before working in IT) I worked in a steel mill. It was a horrible, dirty, dangerous place to work. Someone I knew died while I was working there.
Admin
Wow, that's like a scene straight out of an action movie.
But the real WTF is the lack of a check for invalid slopes that would have prevented a lot of grief.
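The missing check could be as small as the sketch below. It is only an illustration in Python: the function name and the upper limit are assumptions, since the story says only that the field expected a positive number.

```python
import math


def parse_ejector_slope(raw: str, max_slope: float = 100.0) -> float:
    """Parse an operator-entered slope and reject anything that is not a sane positive number.

    `max_slope` is a made-up limit for illustration; a real machine would
    take its limit from the mechanical spec.
    """
    value = float(raw)  # raises ValueError on non-numeric input
    if not math.isfinite(value) or value <= 0 or value > max_slope:
        raise ValueError(f"ejector slope must be in (0, {max_slope}], got {raw!r}")
    return value
```

A fat-fingered "-1" raises ValueError here instead of ever reaching the motion controller.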
Admin
Yet another case for enabling that thingy in SQL where the WHERE clause is required.
Admin
// This turbine keeps crashing. Hmmmmmmmmmmmm!!!!!!!