Admin
i like this very informative error...*sigh* oh wait it's still reserved, probably for future version errors...
on the topic, we are maintaining an application that integrates our client's system with external systems. one day one section broke because an external vendor had placed null characters (0x00) in their file's fillers. i created a mapper which uses a format similar to the original file structure but without the fillers. it worked fine, and the file is being parsed successfully. after several weeks, i noticed that i had mapped the old file to the new one but forgot to use the new file for parsing.
up to now i haven't figured out why the hell it worked... unintentional workaround perhaps?
Admin
Well, many of you described 'ghost bugs' that disappear if you add some extra code/variables/dummy exception handling. My guess is you've got some stack-overflow issues (such as appear when incorrectly passing arguments to printf) which disappear if there is some extra memory allocated (for handling this extra code/variables/etc). I've really seen that in my own code, and it was always a problem with the stack.
Admin
There's a product that my company resells and supports -- I shan't name it to avoid upsetting my employer, but it's a "CRM" system -- that will occasionally pop up a message on startup complaining that there is not enough disk space even when there is plenty.
Through trial and error we discovered that this problem is caused by the amount of disk space being somewhere around a multiple of 2GB. We suspect, though the manufacturer will not confirm, that there is a check in there that fetches the amount of disk space into an unsigned integer and then sees if it's less than some arbitrary threshold. Since today's disks are generally much larger than the number of bytes that can fit in an unsigned integer this thing often fires false positives.
The workaround? The support techs always just tell the customer to find a big file and copy it a few times to move the disk space count out of the trouble zone. I did suggest that we simply patch the product to bypass the check completely, but apparently this is prohibited by the licence!
Admin
The Borland C compiler had some problems with code lines. I went to a C course around that time and used Borland C for looking at the example code. Most programs needed some line reordering to compile.
For example, the compiler did not like "exit" statements everywhere.
if (some_error) {
exit(1);
}
<some more code>
would not work, but
if (!some_error) {
<some more code>
} else {
exit(1);
}
worked fine.
I finally gave up trying to get the example programs to work, as fighting the faulty compiler was a bit too much for a beginner C course.
Admin
Oh - That makes it all better!
Admin
The algorithm sounds like one based on UUID, which uses a combination of (hopefully) unique network identity and time to generate universally unique identifiers - the problem arises from the fact that there is a fixed allocation rate of the identifiers. Assuming that the allocation routine is faster than the minimum possible timeslice, then you have two choices - block the requestor until a new timeslice becomes available, or rob Peter to pay Paul, i.e. just steal the next tick - the former is safe, but can be annoying if a large number of requests show up in a short period of time - the latter is a sure recipe for disaster, because you're mucking about with the algorithm's first principles - it's actually a credit to Notes, in a twisted way, that they did this and didn't actually duplicate IDs...
The real WTF is that the solution for this problem has been known for years - UUID/GUID values are a function of time, i.e. even if you don't allocate the guid, any given guid is tied to a particular time and machine - if you ask for a guid at time T, you get it - if you don't, nobody gets it. The solution is simple, you put a pool allocator in front of the generator - pool holds 2x projected peak max allocation (or whatever safety margin you want) and refills at tick interval rate - all readers get their ID's from the pool - problem gone.
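A minimal Python sketch of the pooling idea described above, under simulated assumptions: `TickIdPool`, `on_tick` and `allocate` are hypothetical names, and the clock tick is modelled as a plain counter rather than a real timer. It only illustrates the "bank IDs during quiet ticks, spend them during bursts" idea; it is not how Notes or any UUID library actually works.

```python
from collections import deque


class TickIdPool:
    """Banks one ID per clock tick so bursts never need to borrow future ticks."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity      # e.g. 2x the projected peak burst
        self._pool = deque()
        self._tick = 0                # stand-in for the real clock value

    def on_tick(self) -> None:
        """Call once per tick: bank this tick's ID if there is room, else drop it."""
        if len(self._pool) < self.capacity:
            self._pool.append(self._tick)
        self._tick += 1

    def allocate(self) -> int:
        """Hand out a banked ID; an empty pool means the caller must wait."""
        if not self._pool:
            raise RuntimeError("pool empty: wait for the next tick")
        return self._pool.popleft()


pool = TickIdPool(capacity=3)
for _ in range(5):          # five idle ticks; only three IDs fit in the pool
    pool.on_tick()
burst = [pool.allocate() for _ in range(3)]   # a burst served without stealing ticks
print(burst)
```

When the pool runs dry the caller still has to block, so the pool size is what decides how large a burst can be absorbed.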
Admin
I've encountered the exact same problem. The issue was a memory leak in a local variable that was corrupting the return value for the function on the stack. It didn't happen for debug releases because additional data was stored on the stack -- that data was corrupted instead of the return address. Adding debug code or more variables prevented the program from crashing as well -- the local memory for the dialogue boxes was corrupted, not the return address on the stack.
Admin
The power-on timer (Attribute 9 raw value) on my Maxtor disk acts strange.
... - In Maxtor disks that use the raw value of Attribute 9 as a minutes counter, the hour time-stamps in the self-test and ATA error logs are calculated by right shifting 6 bits. This is equivalent to dividing by 64 rather than by 60. As a result, the hour time stamps in these logs advance 7% more slowly than they should. ...
from http://smartmontools.sourceforge.net/
You know, I'm with Maxtor on this; having 64 minutes per hour, 64 seconds per minute and 1024 milliseconds per second leaves us with the nice result of using a whole number of bits for milliseconds in an hour. An 8 hour workday is just 2^25 milliseconds.
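To make the arithmetic concrete, here is a small Python sketch comparing the shift-by-6 conversion quoted above with a true division by 60. The function names are made up for illustration; only the shift and the division come from the quoted FAQ text.

```python
def hours_by_shift(minutes: int) -> int:
    """Hours as the quoted logs compute them: right shift by 6, i.e. divide by 64."""
    return minutes >> 6


def hours_by_division(minutes: int) -> int:
    """Hours as a wall clock would count them: divide by 60."""
    return minutes // 60


minutes = 6000  # exactly 100 real hours of power-on time
print(hours_by_shift(minutes), hours_by_division(minutes))

# The "64 minutes, 64 seconds, 1024 milliseconds" workday from the comment above:
workday_ms = 8 * 64 * 64 * 1024
print(workday_ms == 2 ** 25)
```

With 6000 minutes the shifted value is 93 hours against a true 100, which is the slow drift the FAQ describes.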
Admin
A help authoring app I use had a similar problem. To be 'helpful', it would check the amount of free RAM available, and if RAM was low, pop up a message box EVERY SINGLE TIME YOU SWITCHED BETWEEN TOPICS. Of course the problem wasn't too little RAM, but too much RAM (1 gig). However they were measuring it, it must've been overflowing and returning a low value.
The 'workaround' was to buy a $250 upgrade to their latest version even though I didn't need it.
Admin
You've done this lots of times? I wouldn't. IMHO, if you need the current time to generate unique ID's, as Notes seems to do for "Replica ID's and document UNID's", you are doing it wrong. Generate your unique ID's some other more reliable way, and attach the appropriate timestamps (say, for transactions, using your example) to the appropriate ID.
Admin
That was my first thought, too, but a line-by-line inspection of the code path leading up to the crash showed that all buffers involved were being properly allocated and properly sized. Any buffer overflow would have had to have been in the C standard library.
Admin
Because 4096 was too small, obviously.
Admin
Yeah, don't you just wish people would use real applications. Preferably written in VB or some other real language.
And I'm not saying Domino is the best thing since sliced bread, but calling it crap "just because..."? And I do spend some of my days cursing Notes/Domino to hell and back.
"This is a real Notes killer!", Microsoft 1991, 1993, 1999, 2001, 2003, 2004
Admin
Pick your favorite meaning of POS in the following sentence: A while back, I had the pleasure of working with a POS application that had the following attributes:
1. Written in Borland C++;
2. Required MS-DOS clients (hence couldn't use more than 640k);
3. Was so large that if you compiled with debug settings on, or attempted to use the debugger, it wouldn't work.
4. Used a proprietary and unwieldy app framework for generating menus, boxes, etc.
There were many, many times where I would get strange stack errors (due to bugs in the framework) and couldn't correct the bug, so I'd put in a few calls to printf() to make the error go away.
Admin
(1) Agreed that a "unique timestamp" isn't really a timestamp, but Notes is using it as a timestamp -- the document creation time. Either Notes is actually incrementing the system clock or insisting on globally unique creation times, either of which is a WTF in my book.
(2) And yes, I do have a better idea. Or rather the IETF did.
Admin
I once did a cross-frame Javascript that would sometimes give a null error, and sometimes not. Adding alert boxes as breakpoints would remove the error.
Turns out that a loaded frame may begin script execution while another is still loading and has no DOM tree yet.
Solution:
Admin
I have a WTF workaround experience and I'm experiencing it right now. There is this forum that I'm posting comments to, and the CAPTCHA is broken...
oops, I mean the CAPTCHA is "broken".
Admin
Yeah, but those binary watches that us geeks get to show off just how geeky we are, would be significantly LESS geeky. (I'm sure we'd just make watches based on a decimal system and show off how to convert from one to the other.)
Admin
Workaround2
Makes you wonder, "how many are there?"
Admin
Why not 42?
--Douglas Adams fan
Admin
Sage KHK: A bug-ridden, stupid, incompetent business software based on Access.
We wanted to hide the prices for items/components in the bill and only show the total. They said we should just set the colour to white.
Admin
Not worth it in this case, but in a slightly more serious case it would be worth checking with a lawyer. AIUI in some countries you have a legal right to patch problems which the supplier is unable or unwilling to fix, regardless of what any licence says.
Practically speaking, why not:
Any supplier daft enough to start throwing lawyers at you in that situation probably isn't going to be around much longer anyway.
Admin
Heh! I had a similar experience a few years ago. I was developing the Windows client app for a new device that processed 8 channels of data in real-time from some VERY sensitive SPREETA sensors.
We thought it was cool the way you could watch the graphs fluctuate when a cloud passed by the window. Apparently, this wasn't a good thing as far as the intended use was concerned and, eventually, a mechanical shutter was placed on the device to keep light out.
This project had LOTS of other WTF! work-arounds (mostly related to mechanical side-effects) and eventually ended up in the "it seemed like a good idea at the time" scrap heap.
Admin
I suspect that the amount of free space is being returned into a signed (not unsigned) 32-bit integer. If there is two to four GB free, it will show as negative. This will also happen for six to eight, ten to twelve, etc.
Sincerely,
Gene Wirchenko
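The signed 32-bit hypothesis above can be checked with a short Python sketch. Python integers do not overflow, so the wrap-around is simulated by hand; `as_int32` is a hypothetical helper, and nothing here is taken from the actual product, whose code nobody in this thread has seen.

```python
GIB = 1 << 30  # bytes in one gibibyte


def as_int32(value: int) -> int:
    """Keep the low 32 bits and read them as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# Free space in GiB -> what a signed 32-bit variable would end up holding.
wrapped = {free_gib: as_int32(free_gib * GIB) for free_gib in (1, 3, 5, 7)}
for free_gib, seen in wrapped.items():
    verdict = "looks full" if seen < 0 else "looks fine"
    print(f"{free_gib} GiB free -> {seen} ({verdict})")
```

Under this assumption 3 GiB and 7 GiB free come out negative, while 5 GiB free is read as only 1 GiB, which matches the repeating trouble zones described in the comment.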
Admin
I've known a couple of companies use Notes very successfully - but in situations where things could be done with documents and forms and just a little bit of scripting. If you're handling numerous variable-sized pieces of text, it's a lot nicer for the user than things like Access with fixed-size form fields.
But if one needs much more than a couple of hundred lines of script I'd agree Notes is the wrong tool.
Admin
The real WTF is that Notes was using a timestamp as BOTH a timestamp and a unique identifier. There's nothing wrong with using a timestamp as a basis for a unique ID, but you shouldn't use it as a timestamp then, because you'll have to make it unique (which will change the effective time). At least Lotus didn't just put in a delay loop to wait until the time changes before yielding a new ID.
I'm guessing that way back in the early days of Notes they decided to use the creation date as the unique ID. Then one day computers got fast enough that they started to get duplicates, so they decided that they had to have a function return the next time in the sequence so they don't repeat. At the point they wrote this function they should have realized that the ID has to be separate from the timestamp. The ID could just be a 32-bit counter in this case.
I have an application where many different computers submit jobs into a queue to be processed in chronological order. By naming the files in the queue with the timestamp (seconds since 1970), the earliest file will come up first and will be processed first without having to coordinate filenames across the computers. Of course multiple people could submit jobs at the very same second, so I also append the job name (123456789.Lisa). That way a dup could only happen if two people submitted the same job at the same second. At that point they're the same job, so it wouldn't matter if one overwrote the other anyway.
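A rough Python sketch of that naming scheme, with hypothetical helper names (`queue_name`, `processing_order`). One assumption is added that the comment does not state: the timestamp prefix is compared as a number, so ordering stays chronological even if two names ever have a different number of digits.

```python
def queue_name(submitted_at: int, job: str) -> str:
    """Build a queue file name from seconds-since-1970 and the job name."""
    return f"{submitted_at}.{job}"


def processing_order(names: list[str]) -> list[str]:
    """Oldest first; ties within the same second are broken by job name."""
    return sorted(names, key=lambda name: (int(name.split(".", 1)[0]), name))


queue = [
    queue_name(123456790, "Bob"),
    queue_name(123456789, "Lisa"),
    queue_name(123456789, "Al"),
    queue_name(123456789, "Lisa"),   # same job, same second: same file name
]
ordered = processing_order(set(queue))  # a directory holds each name only once
print(ordered)
```

The duplicate "Lisa" submission collapses into one entry, which mirrors the "one overwrites the other" behaviour the comment considers harmless.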
Admin
Hmmm...
I'm thinking of creating a software application which processes financial business transactions. One of the licensing models I've looked at is a capacity-based scheme. Now this WTF gives me a great idea:
50,000 transactions/hour = sleep(500) = $50,000
100,000 transactions/hour = sleep(100) = $100,000
250,000 transactions/hour = sleep(0) = $250,000
Admin
I have a better idea... Don't use "Unique Time Values" to begin with. If you need a timestamp, that is fine, but don't use that timestamp as a unique identifier. A timestamp in conjunction with another attribute may be OK though.
Admin
This one's pretty boring, but it just happened today, so:
Debian Sarge is still using mysql 4.1.11. In fact, they just released a patch for it on the 22nd. 4.1.11 was originally released in April 2005, so it's over a year now, and it's 8 subversions behind the most recent mysql release: 4.1.19. freenet's #mysql got a big kick when they heard that.
We've recently started using mysql to replicate some of our databases. However, there's a long standing bug in mysql (over 3 years old now):
http://bugs.mysql.com/bug.php?id=352
If you restart a replication slave, it deletes all its temporary tables, which almost always causes problems when you restart, as it loses the context for any future queries on said temporary tables.
Well, our databases kept having replication failures because of missing temporary tables. Turns out mysql was crashing at 1:30 AM every morning, which triggered a restart, which caused mysql to dump all its temporary tables. Why was mysql dying at 1:30 AM? Well, that's when the backup process runs, which runs a "STOP SLAVE;" before dumping the slave database to a file. And in certain states, "STOP SLAVE;" results in a crash.
The workaround? Don't "STOP SLAVE;" before backing up the database at night -- now we just lock the entire slave database for the duration of the backup instead.
Admin
[idea type="better"]
Try an incremental counter that resets to zero everyday. Append this counter to the date/time like so: "2006-05-26 14:55:01 - 69". You no longer have to worry about a transaction snatching the same "id" from the date/time alone.
[/idea]
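The idea above could be sketched in Python roughly as follows. `DailyCounterId` is a made-up name, the clock is passed in by the caller so the example is deterministic, and the sketch assumes a single process handing out the IDs (no locking or persistence).

```python
from datetime import datetime


class DailyCounterId:
    """Appends a per-day counter to the timestamp so same-second IDs stay distinct."""

    def __init__(self) -> None:
        self._day = None
        self._count = 0

    def next_id(self, now: datetime) -> str:
        if now.date() != self._day:      # first ID of a new day: reset the counter
            self._day = now.date()
            self._count = 0
        self._count += 1
        return f"{now:%Y-%m-%d %H:%M:%S} - {self._count}"


ids = DailyCounterId()
same_second = datetime(2006, 5, 26, 14, 55, 1)
first = ids.next_id(same_second)
second = ids.next_id(same_second)             # same clock reading, different ID
next_day = ids.next_id(datetime(2006, 5, 27, 0, 0, 0))
print(first, second, next_day, sep="\n")
```

The timestamp part stays an honest timestamp; only the counter carries the uniqueness.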
This is just one of many ways to make this issue a less worthy WTF.
You'd think with all of those brain cells humming at "Big Blue", they'd squeeze out a clue every now and then. Please don't get me started on the argument that this is simply "Lotus garbage" and not IBM's fault. They bought 'em! They definitely do get the WTF award - or - at least a nomination or honorable mention. :)
Admin
In a developer product I was working on, customers would get random and unreproducible crashes in the apps they wrote. This went on for months. To make a very long tale of woe short, it was finally narrowed down to some printing scenarios. After finding a short, reliable repro (division by zero in a printing scenario), the cause was found: print drivers from some companies were resetting the hardware error handling in the FPU for some unexplained reason. If a developer generates a div/0 error in their app (or usually something more complex than that), the FPU thinks the software will handle it, and when it doesn't: crash.
Workaround: sprinkle calls to _fpreset anywhere one might think the print driver has fiddled with the FPU.
Admin
When running Adobe Acrobat Professional under OS X 10.4, there's an issue where 17 error messages per second are generated to one of the system logfiles.
17 error messages every second that you are using the application.
The logfile is set to a max of 128k, and so it fills up in about 3 minutes.
Here is Adobe's official solution for the problem:
http://www.adobe.com/support/techdocs/331663.html
8. When you locate the first "Invalid color..." message, select the entire entry and then press the Delete key.
Admin
There was a time when a rank amateur like myself could read one book by Peter Norton and know just about all there is to know about PCs. That era had clearly ended when I migrated from Windows 95 to XP.
After installing my perfectly legal copy of Access 97, it wouldn’t execute, displaying an error message saying there was no license for it. Fortunately the workaround was documented in Knowledge Base Article #141373:
Step 1: Go to your font folder.
Font?
Step 2: Find the file for the Haettenschweiler font.
Haettenschweiler??
Step 3: Change the extension of this file.
How can this possibly be related to registering a license???
Step 4: Reinstall Access. You can revert the extension after installation is complete.
WTF????
It worked perfectly. But it was then that I realized that I had no hope of really understanding computers anymore.
--RA
Admin
This is an old one. The developer product is Microsoft Visual FoxPro. The "some companies" include at least Hewlett-Packard.
HP's drivers do not correctly indicate number of pages printed for my printer. How far along is that twenty page print job? It is on page 0. Some versions of their drivers have worked, but most do not.
I do not print on my home system very much, so I tend to keep my printer off. A number of times, I have tried to print something when my printer was off. The result is an error which hoses the printer handler task and requires a reboot. A printer not ready error would be a lot friendlier.
HP writes quality printer drivers: poor-quality. My next printer will almost certainly NOT be an HP. They did it to themselves. I used to love HP stuff. The same also applies to Microsoft. Sigh, and sigh again.
Sincerely,
Gene Wirchenko
Admin
IBM is constantly making up for its extremely broken software with lovely euphemisms - in this case, "Time creep". Notice how it is made out to appear as a legitimate action to take without acknowledging the (obvious?) underlying flaw. Would you like to see more? There's a reason I left that hole. I resent charging people thousands (more even) of dollars for software that I would never pay a single cent for myself - it's all (yes all) junk.
Admin
Well, providing a workaround is better than what FileNet does when a bug is reported to them. One time we reported that a button that should open a document from a certain screen was just popping up a debug message without opening the document. Their reply: why do you want to open the document from that screen?
Admin
I've come across C++ compiler problems where writing code like the following stops it from segfaulting (presumably something is being incorrectly optimised away):
void int doSomething(...) {
...
return retval;
retval += 1;
}
Admin
Hey man. They put the "Open this file" button there. But that doesn't mean you should use it!
Admin
I once ran into a similar issue. It turned out that in some very obscure way I smashed the stack, and that adding just a declaration of a volatile int i = 0; without touching it would fix it. After better debug tools were developed, I finally found the error and could fix it...
Admin
Actually, when we stopped "wanting" to open the file using that button, our problem was solved :D
Admin
What is "void" doing there?
Sincerely,
Gene Wirchenko
Admin
I had something very similar happen recently. Turns out the compiler (gcc) was bungling an optimization. The fix was to compile with "-fno-strict-aliasing".
Admin
I wanted to check whether the Lotus Notes client is logged in and whether it's locked. None of the COM APIs covered that, so the workaround was to use the window title to guess the status. Even this got us into a lot of trouble (oh yeah, we had to go with this workaround), since many screens had only the title "Lotus Notes".
Admin
Dangit, forgot the quote, here we go again:
I had something very similar happen recently. Turns out the compiler (gcc) was bungling an optimization. The fix was to compile with "-fno-strict-aliasing".
Lee
Admin
A classic case is the "CR/LF" sequence, which may have originated with the need for a teletype to have time to get from the right margin to the left . . . one character was not enough.
Lee
Admin
I used to have a dual Athlon box, so the CPU Affinity thing is something I know well. I've had to use that trick since I first tried to play GTA: Vice City on that system.
RunFirst is the program I use. Most of my games have shortcuts that run it to run them. Pity there's no easy Windows option for doing this...
Admin
Both a Win client and a Linux box access
a directory on a Win 2003 box. The Win
client writes a file. The Linux box
needs to read the file. The Linux box
sees the dir via mount.cifs. The file
SHOULD be there.
It is not.
It can show up in 10 seconds or 2 hours.
The Linux admin assures me there is nothing
that can be done.
The Windows admin is silent.
I'm the Perl programmer.
I discover that rereading the directory will
force the file to appear.
So here is my workaround:
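# bug() is the poster's own logging helper (not shown).
# Poll until the "done" file becomes visible through the CIFS mount.
while (! -f $spooler_done_file) {
    $wait_pass++;
    bug("[$wait_pass]: No [$spooler_done_file] - waiting $SPOOLER_WAIT_TIME");
    sleep($SPOOLER_WAIT_TIME);
    dir_flush($spooler_done_dir);
}

# Rereading the directory forces the file to show up in the listing.
sub dir_flush {
    my $dir = shift;
    bug("Flushing dir: [$dir]");
    opendir(my $dh, $dir) or die "Can't open dir for flush [$dir] - $!";
    my @files = readdir($dh);    # result is discarded; the read itself is the point
    closedir($dh);
}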
Admin
This reminds me of a certain game <cough>Tiberian Sun</cough> which refuses to run on Cyrix processors. My Mum's PC had a Cyrix CPU. She couldn't play it.
The beauty of this though is that you don't find out that it's not supported until you go to run the game and it says "Cyrix CPU not supported". No documentation on it whatsoever. Just bought the game? Have fun returning it.