The Daily WTF: Curious Perversions in Information Technology

DaveK · 2009-01-13

Eyes Gouged Out:
In the meantime, do not click on this link. Or use longjmp.

AAAHHHHGGGGGGGGGGGG!!!!!! I was steeled for some tubgirl variant, but I was horribly horribly mistaken. Can I wash my forebrain out with soap??? The goggles - they did nothing!

Futurama narrator's 'Voice of God' over:
YOU WATCHED IT - - YOU CAN'T UN-WATCH IT.

But you can borrow my belt-sander for your retinas, if you want.

Kuba · 2009-01-13

vinnybad:
I can see where goto can be useful...especially if the task is OS code...but long jump? where can a thing like this be useful?
The guys who make these instructions are smart...not dumb...can anyone think of a case where it can help?

Ever tried implementing cooperative multitasking using nothing but C?

Kuba · 2009-01-13

Mcoder:
Now, about the WTF, and answering to Bob (238533). No optimizer can work well with goto statements. Most of optimizing theory simply stops being useful when you start using it.

Said a firm believer in the cargo cult approach to computer science. What you say implies that there are no optimizations that span basic blocks. Even crappy compilers (my favorite: Zilog's ZDS II) can (or seem to) optimize across basic blocks. Popular compilers like gcc 4, msvc 2008, icc, and even less popular ones, like IIRC cmucl, can do it.

Kuba · 2009-01-13

adiener:
Alex:
dfk: No. A 'goto' becomes just a branch, and is very fast on modern computers. The test would be part of an 'if'...
Dude, there's a test at the top of every for loop. Thus, test and branch.
Not every for loop, only ones for which the second expression (after the first semicolon) has something in it.

At least on "good" digital signal processors, simple loops across a vector/matrix are free, as in they don't use any clock cycles. DSPs typically have hardware loop units, so that you set up the loop by loading some configuration registers, and then the CPU loops automatically and transparently -- there need be nothing in the code to say "jump back to beginning" or "increment the index by 4". There are typically a couple loop units, so that you can loop across multiple dimensional structures.

This is possible even in the presence of pipelines: the loop hardware feeds the "next" instruction pointer value to the first pipeline stage, so that fetches always come from the right location and "looping" is free, because the pipeline does not have to be cleared. This is in spite of there being (typically) no branch prediction on the same DSPs: a branch opcode is not predicted/anticipated and causes a pipeline stall, but any hardware-executed looping is "known" in advance and addressed properly at the throat of the pipeline.

Cheers, Kuba

Kuba · 2009-01-13

Helix:
Bobblehead Troll:
I so love the smell of cargo cult in the afternoon.
What about those programmers who actually write compilers, or library code? Hardly any of the C standard libraries are platform-optimized in any way and few compilers use more than 20% of the instruction set of the processor anyway, excepting some simple RISC machines. There is only one C compiler I know that does any kind of instruction-level vectorizing. With all the rest, guess what, you have to write it in assembler.

Really, it is rather amusing to read comments by people who have never written or even read any assembly code to complain how slow it is, as well as those quoting the holy words of Knuth and obviously skipping the first long word that is outside their own jargon and it just comes out as 'optimization is the root of all evil'.

In this example, yes... and there are many cases where people look at what the compiler made from their code and went 'WTF'.

Trust no one. Not even yourself.
Lemme guess, you have only used GNU compilers. You are far from the truth, and some people on here work for high quality compiler vendors.

Last time I checked, the Mersenne Project guy didn't yet abandon his codebase. He did precisely what no compiler, high quality or not, could deliver (to this day): he hand-optimized the FFT-based integer squaring routine, and got a speed improvement "by a couple times" over anything the compilers could produce. IIRC for some platforms, his code was almost an order of magnitude faster. The difference is between a primality check taking a month vs. it taking a year.

Of course, some newer architectures are more compiler-friendly, so perhaps any gains he'd get on most modern hardware would be small-ish, but with such long execution times, even a 20% improvement is worth fighting for.

One problem with "high quality" compiler vendors is that they typically cater to Fortran/C/C++ folk. There are no "decent" say Haskell or OCaml compilers out there, even if typical Haskell code is usually easier to optimize than C. Heck, if we're into less esoteric, let's take say Python: the only way to get it "optimized" is to use IronPython and have the CLR run its JIT on it. I knowingly exclude research projects like PyPy here.

So, if you're into something higher level than C/C++/Fortran, you're facing a shortage of decent compilers with the exception of perhaps CLR or Java platforms. Heck, even those don't (AFAIK) really do the vectorization you'd expect for numerical code. As for their scheduling performance on "streaming" code that would nominally benefit from hand-optimization with platform (say Intel) documentation in hand, I'm not impressed with that either.

Cheers, Kuba

Alexis de Torquemada · 2009-01-13

adiener:
Also, while is not a function, so it may be misleading to say it takes a bool. It evaluates the expression between its parentheses on each iteration and skips if the result of evaluation is false/0.

TRWTF is that while is not a function. In Haskell terms, it would be just that, e.g.

doWhileM_ :: Monad m => (a -> m Bool) -> m a -> m ()
doWhileM_ test action = do
  result <- action
  continue <- test result
  when continue (doWhileM test action)

(Following Haskell convention, the underscore indicates that the result type is (), i.e. void even though the result type of "action" may not be.)

or, for advanced (ab)users,

doWhileM_ test action = fix $ \d -> action >>= test >>= (`when` d)

Now we can define whileM_ with the slightly cumbersome type

whileM_ :: Monad m => a -> (a -> m Bool) -> m a -> m ()
whileM_ a test action = test a >>= (`when` doWhileM_ test action)

Usage example (assuming stopVar :: Control.Concurrent.MVar.MVar a)

whileM_ stopVar isEmptyMVar $ do
  putStrLn "It hurts."
  putStrLn "Make it stop."
  return stopVar

So we reimplemented two important C control flow statements in two lines of Haskell code (not counting the optional explicit type signatures)...

SCNR

Addendum (2009-01-13 03:03): The recursive call to doWhileM is missing an underscore.

2009-01-13

Mcoder wrote: "No optimizer can work well with goto statements"

Are you sure? I seem to recall that a standard compiler technique was to convert for/while-loops into gotos.

On the other hand, if the gotos are used in complex ways, optimization does become more difficult. But this is not an inherent property of the goto. In fact, I'm writing a compiler for a subset of C, for embedded systems, and it has no problems doing loop invariant hoisting, constant propagation, common subexpression elimination, and other stuff, even when gotos are used.
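To make the point about loops and gotos concrete, here is a minimal sketch of the same loop written twice: once as an ordinary for loop and once lowered by hand into a test plus gotos. The function and label names are made up for illustration, and this only shows the shape of the transformation, not what any particular compiler emits.

```c
/* Sum 0..n-1 with an ordinary for loop. */
int sum_with_for(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += i;
    }
    return total;
}

/* The same loop lowered by hand: one test, one conditional exit,
 * one unconditional jump back. Control flow is identical. */
int sum_with_goto(int n)
{
    int total = 0;
    int i = 0;
loop_test:
    if (!(i < n))
        goto loop_done;
    total += i;
    i++;
    goto loop_test;
loop_done:
    return total;
}
```

Both versions have a single entry, a single back edge, and a single exit, which is the structure that loop optimizations look for regardless of which keyword produced it.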

2009-01-13

Ah yes, but I remember creating a multi-threading framework under DOS using Borland C++ and setjmp/longjmp; you can't do that with your wimpy loops :-).

2009-01-13

Zylon:
clickey McClicker:
On one chip I programmed it had two very different jumps. One was for nearby jumps within like +-127 instructions and a long jump for going further.
6502? All conditional branch instructions on that chip were limited to +/-127 bytes. To jump to any arbitrary location you had to use the JMP or JSR instructions.

6502? Try almost any 8-bit risc processor ever made. I have worked on lots of different micro-processors (basically cpu+memory+simple I/O on one chip), and they all had these short jumps. Usually called short (relative) jumps. rjmp, sjmp, srjmp or simply jmp in their flavor of assembly.

2009-01-13

Kuba:
Mcoder:
Now, about the WTF, and answering to Bob (238533). No optimizer can work well with goto statements. Most of optimizing theory simply stops being useful when you start using it.
Said a firm believer in the cargo cult approach to computer science. What you say implies that there are no optimizations that span basic blocks. Even crappy compilers (my favorite: Zilog's ZDS II) can (or seem to) optimize across basic blocks. Popular compilers like gcc 4, msvc 2008, icc, and even less popular ones, like IIRC cmucl, can do it.

I think he may have confused goto's with conditional jumps. Optimizing over these is pretty annoying. But certainly not impossible or unheard of. Lint-like systems come to mind.

Maybe I'm confusing semantic trees with execution-flows. It's been too long since my Compiler Construction classes. :)

2009-01-13

Charles400:
I am long-jumping out of these comments.

Mod parent up.

2009-01-13

DaveK:
I don't get their reasoning. I would have it the other way round.

Isn't it obvious? Management had heard that "goto's are evil", and therefore banned them. Dijkstra never wrote a paper named "longjmp considered harmful", and it wouldn't make as good a soundbyte anyway, so of course longjmp was not banned. It has nothing to do with common sense, it is just rote application of rules.

Andy Goth · 2009-01-13

Kuba:
Ever tried implementing cooperative multitasking using nothing but C?

Yup, many times. It's kinda hard. All the "tasks" have to periodically return to the scheduler, and when they are rescheduled they themselves have to figure out how to pick up where they left off. As far as I know, longjmp() isn't suitable for this, since it blows away the execution stack. (Although I'd be delighted to hear of a way to make it work!) I've done it successfully using the {get,set,make,swap}context() functions in <ucontext.h>, but that is in System V and descendants, not standard C. But if you want to do it in standard C, go have a look at this.
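For readers who have not seen the <ucontext.h> approach, a rough sketch of two tasks taking turns under a round-robin scheduler might look like the following. It assumes a System V style platform that still ships these functions (for example Linux with glibc); the stack size, the trace buffer, and every function name other than the four context calls are invented for the example.

```c
#include <stddef.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)   /* assumed to be plenty for this toy task */

static ucontext_t scheduler_ctx;      /* where the scheduler left off */
static ucontext_t task_ctx[2];        /* one saved context per task */
static char task_stack[2][STACK_SIZE];/* each task gets its own stack */
static char trace[16];                /* records which task ran, in order */
static size_t trace_len;

/* Save the calling task's context and resume the scheduler. */
static void yield_to_scheduler(int id)
{
    swapcontext(&task_ctx[id], &scheduler_ctx);
}

/* A task keeps its loop counter on its own stack across yields. */
static void task(int id)
{
    for (int step = 0; step < 2; step++) {
        trace[trace_len++] = (char)('A' + id);
        yield_to_scheduler(id);
    }
    /* Returning here resumes uc_link, i.e. the scheduler. */
}

/* Run both tasks round-robin and return the order in which they ran. */
const char *run_tasks(void)
{
    trace_len = 0;
    for (int id = 0; id < 2; id++) {
        getcontext(&task_ctx[id]);
        task_ctx[id].uc_stack.ss_sp = task_stack[id];
        task_ctx[id].uc_stack.ss_size = STACK_SIZE;
        task_ctx[id].uc_link = &scheduler_ctx;
        makecontext(&task_ctx[id], (void (*)(void))task, 1, id);
    }
    /* Two rounds record the steps; the third lets each task finish. */
    for (int round = 0; round < 3; round++) {
        for (int id = 0; id < 2; id++) {
            swapcontext(&scheduler_ctx, &task_ctx[id]);
        }
    }
    trace[trace_len] = '\0';
    return trace;
}
```

Because every task has a private stack, a task resumes in the middle of its loop with its locals intact, which is exactly what a plain longjmp() cannot offer.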

2009-01-13

Thief^:
Sir Twist:
No, "for(;;)" is used for endless loops because "while (1)" makes the MSVC compiler complain "warning C4127: conditional expression is constant."
I can't test it myself, but what about "while(true)"? While takes a bool after all.

jawdrops.... try reading the error again. What part of "constant" don't you understand?

2009-01-13

Paul W. Homer:
Setjmp has to save the full "runtime" context (so it can return to it later), so it would always be way way more expensive than a loop or a simple goto. It's very strange that anyone using it would think this was somehow faster than just a simple loop (more work is always more time). Paul.

I'm not sure, but if the original code were on a single process machine, perhaps it would not be necessary to save so much context - just the registers? I haven't messed around at this level for 20 years at least, so I may be confused. Of course, this leaves such niceties as memory management up to the programmer as well. Ah, the good old days. :)

2009-01-13

Hans:
Isn't it obvious? Management had heard that "goto's are evil", and therefore banned them. Dijkstra never wrote a paper named "longjmp considered harmful", and it wouldn't make as good a soundbyte anyway, so of course longjmp was not banned. It has nothing to do with common sense, it is just rote application of rules.

You know, given the number of programmers and managers who have an iron-clad mantra of 'goto is always evil under all circumstances, rewrite this even if it is 10 times more code', I could easily see this being the case.

2009-01-13

Dijkstra never wrote a paper named "goto considered harmful." What he wrote was "non-local goto considered harmful." He was specifically talking about longjmp-type control flow. For example, using a local goto to simulate a loop will most likely result in equivalent code because the intention is the SAME. These routines (as an old C programmer) do have their uses (error handling that allows you to save out before crashing with a catastrophic failure.) The essential thing is that it is not acceptable for control flow.

DaveK · 2009-01-13

Terrier:
Dijkstra never wrote a paper named "goto considered harmful."

Comm. ACM:
Edsger Dijkstra (March 1968). "Go To Statement Considered Harmful" (PDF). Communications of the ACM 11 (3): 147–148. doi:10.1145/362929.362947. http://www.cs.utexas.edu/users/EWD/ewd02xx/EWD215.PDF

Terrier:
What he wrote was "non-local goto considered harmful."

http://www.cs.utexas.edu/users/EWD/ewd02xx/EWD215.PDF:
Copyright Notice The following manuscript EWD 215: A Case against the GO TO Statement was published as a letter entitled Go-to statement considered harmful in Commun. ACM 11 (1968), 3: 147–148. It is reproduced here by permission.

Terrier:
He was specifically talking about longjmp-type control flow. For example, using a local goto to simulate a loop will most likely result in equivalent code because the intention is the SAME. These routines (as an old C programmer) do have their uses (error handling that allows you to save out before crashing with a catastrophic failure.) The essential thing is that it is not acceptable for control flow.

How exactly is exception handling (what you are describing here) not "control flow"?

TopCod3rsBottom · 2009-01-13

Ape-Inago:
return; might be a typo (forgot the return value)...
RETURN_NOTHING is unambiguous as to the intention.

The function is declared void, so RETURN_NOTHING is useless. If the function were not void, it would still be useless since it would be less ambiguous to return the correct something, i.e. "return 0". No matter what, RETURN_NOTHING is noise.

2009-01-13

Anonymous:
I see three major WTFs here: Mark Bowytz - willing Vista user.

I cried because I had no Linux. Then I met a man who had Windows Vista.

2009-01-13

Alex:
Uh.... why did you even need to test which would be faster? A for loop compiles to a test and branch. A longjmp compiles to a freaking function call, one that has to inspect data structures and modify the return address on the stack.

Lots of things that seem obvious turn out to be false when studied scientifically. It's obvious to many people that heavy objects will fall faster than light objects, that mass is independent of speed, or even that the Earth is flat.

Captcha: "nulla". A female null.

2009-01-13

DaveK:
longerjmp:
DaveK:
Yeah. Your colleague got burned. What does a week in Hawaii cost? Three grand? I'd have asked for 1% of whatever it saved them, ongoing. And then retired. To sit on a gold throne. On an island made entirely from beer, money and women.
Money? WTF do you need money for, in this situation?
Just to rub it in about what a lucky bastard I am... ;-)
Addendum (2009-01-12 15:35): EDIT: Also, I might need to buy a cushion. Solid gold is not always the comfiest thing to sit on.

That's what the women are there for.

tgape · 2009-01-13

GB:
I'm not sure, but if the original code were on a single process machine, perhaps it would not be necessary to save so much context - just the registers? I haven't messed around at this level for 20 years at least, so I may be confused. Of course, this leaves such niceties as memory management up to the programmer as well. Ah, the good old days. :)

20 years ago, longjmp() wasn't that bad, as it only had to save the registers. Of course, it was still horrible in an overall sense, because it just sort of forgot everything that had happened between setjmp() and it.

Now, longjmp() is very slow, as it has to save ALL of the registers, and all of the register-like things (they may be called registers, too; I don't know - I don't really work at this level myself. The point is, there's hundreds, or possibly even thousands, of these things now.) Also note that you've also blown your pipeline, and you may blow significant chunks of cache. (Admittedly, most of that cache contains stuff you're not going to be using anymore, since you're going to be returning across several levels of function calls in any event. But it's still going to be stressful to the system.)

Zatanix · 2009-01-13

DaveK:
Terrier:
Dijkstra never wrote a paper named "goto considered harmful."
Comm. ACM:
Edsger Dijkstra (March 1968). "Go To Statement Considered Harmful" (PDF). Communications of the ACM 11 (3): 147–148. doi:10.1145/362929.362947. http://www.cs.utexas.edu/users/EWD/ewd02xx/EWD215.PDF
Terrier:
What he wrote was "non-local goto considered harmful."
http://www.cs.utexas.edu/users/EWD/ewd02xx/EWD215.PDF:
Copyright Notice The following manuscript EWD 215: A Case against the GO TO Statement was published as a letter entitled Go-to statement considered harmful in Commun. ACM 11 (1968), 3: 147–148. It is reproduced here by permission.
Terrier:
He was specifically talking about longjmp-type control flow. For example, using a local goto to simulate a loop will most likely result in equivalent code because the intention is the SAME. These routines (as an old C programmer) do have their uses (error handling that allows you to save out before crashing with a catastrophic failure.) The essential thing is that it is not acceptable for control flow.
How exactly is exception handling (what you are describing here) not "control flow"?

While the "go to" statement may be considered harmful, the more popular "go from" statement should be considered even more harmful.

The "go from" statement ("gofrom" in short) is written at the destination of the jump rather than at the source, and its argument refers to the place in the code there should be jumped from.

The gofrom-statement is thus not written at the place in the code to be jumped from (as is the case of the goto-statement), but it is written at the code-line to be jumped to. An example of its use:

	// do something
	int a=7;
myCode:	int c=8;
	for (int i=0; i>-2; i*=-1);
	...
	...

	// somewhere else in another source file
	gofrom myCode;
	exit(1);

After the assignment "int a=7" has been executed, the program jumps to the exit(1) statement. The programmer probably intended for a usual infinite loop to occur, but instead he is left with a program returning to the operating system! This is a critical bug that would be hard to detect by just looking at the code.

(in this case it is simple enough, but imagine a whole program tangled with gofroms all over the place)

Since gofroms make it very confusing to analyse the control-flow of the code, i propose that this statement should be removed from all languages (except -perhaps- machine code).

Thank you

2009-01-14

Zylon:
clickey McClicker:
On one chip I programmed it had two very different jumps. One was for nearby jumps within like +-127 instructions and a long jump for going further.
6502? All conditional branch instructions on that chip were limited to +/-127 bytes. To jump to any arbitrary location you had to use the JMP or JSR instructions.

Also, 65xx has an unconditional branch.

dkf · 2009-01-14

Zatanix:
The "go from" statement ("gofrom" in short) is written at the destination of the jump rather than at the source, and its argument refers to the place in the code there should be jumped from.

That construct has been known of for many years, and is conventionally written "COME FROM". If you're using a language that does not have the benefit of this statement, then you should switch to something more advanced, such as INTERCAL.

2009-01-14

Jumpman Jr.:
Funny, the other day I was just reading about setjmp/longjmp in The Standard C Library by P.J. Plauger:
The C Standard legislates the kind of expressions that can contain setjmp as a subexpression. The idea is to preclude any expressions that might store intermediate results in dynamic storage that is unknown (and unknowable) to setjmp. Thus you can write forms such as: switch (setjmp(buf))....., if (2 < setjmp(buf)) ....., if (!setjmp(buf)) ....., and the expression statement setjmp(buf).
You can write no forms more complex than these. Note that you cannot reliably assign the value of setjmp, as in n = setjmp(buf). The expression may well evaluate properly, but the C Standard doesn't require it.

(emphasis added)

Well, you can get the same effect by creating a switch statement with every single integer in. And yes, on architectures where assigning an integer needs a temporary register, this will in fact have to generate code for each possibility to avoid breaking the C standard.

I've used setjmp before, but only in programs which were deliberately hard to read; and I tend to restrict uses to returning only 0 or 1, even then.
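As an illustration of the forms Plauger lists, here is a small sketch that dispatches on the value handed to longjmp using the permitted switch (setjmp(buf)) shape, with no assignment of the result. The function names and the return codes 10, 20 and 99 are arbitrary choices for the example.

```c
#include <setjmp.h>

static jmp_buf buf;

/* Hypothetical helper: jump back to the setjmp in classify(). */
static void jump_with(int value)
{
    longjmp(buf, value);
}

/* Dispatch on the longjmp value without ever storing setjmp's result. */
int classify(int value)
{
    switch (setjmp(buf)) {
    case 0:                 /* direct return from setjmp: do the jump */
        jump_with(value);
        return -1;          /* not reached */
    case 1:
        return 10;
    case 2:
        return 20;
    default:
        return 99;
    }
}
```

One detail worth knowing: calling longjmp with a value of 0 makes setjmp return 1, so classify(0) takes the same branch as classify(1).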

Zatanix · 2009-01-14

dkf:
Zatanix:
The "go from" statement ("gofrom" in short) is written at the destination of the jump rather than at the source, and its argument refers to the place in the code there should be jumped from.
That construct has been known of for many years, and is conventionally written "COME FROM". If you're using a language that does not have the benefit of this statement, then you should switch to something more advanced, such as INTERCAL.

So it has already been invented..? :( Damn, i thought i had invented something beautiful and that i was the first to do so :'(

You are right it seems... (Direct wiki link: http://en.wikipedia.org/wiki/COMEFROM)

I will certainly give INTERCAL a shot! I can't believe i have been using inferior languages without this amazingly |3e+ construct until now!

Hopefully this will find its way into the next C or C++ standard. Imagine the fun of maintaining a code-base with gofrom as the main method of transfer of control. Bliss! :))

2009-01-14

dkf:
Zatanix:
The "go from" statement ("gofrom" in short) is written at the destination of the jump rather than at the source, and its argument refers to the place in the code there should be jumped from.
That construct has been known of for many years, and is conventionally written "COME FROM". If you're using a language that does not have the benefit of this statement, then you should switch to something more advanced, such as INTERCAL.

ais523:
I've used setjmp before, but only in programs which were deliberately hard to read; and I tend to restrict uses to returning only 0 or 1, even then.

OK, this is getting weird. The program I was referring to was C-INTERCAL, one of the two foremost INTERCAL implementations around at the moment. (It contains a lot of deliberate WTFs as a joke, whilst trying to still keep it maintainable; it's a fine line. There are probably some non-deliberate ones in there too, although hopefully not in the code I wrote...) Anyway, strange to see two consecutive comments mention it!

DaveK · 2009-01-14

Zatanix:
dkf:
Zatanix:
The "go from" statement ("gofrom" in short) is written at the destination of the jump rather than at the source, and its argument refers to the place in the code there should be jumped from.
That construct has been known of for many years, and is conventionally written "COME FROM". If you're using a language that does not have the benefit of this statement, then you should switch to something more advanced, such as INTERCAL.

So it has already been invented..? :( Damn, i thought i had invented something beautiful and that i was the first to do so :'(

You are right it seems... (Direct wiki link: http://en.wikipedia.org/wiki/COMEFROM)

In fact, it was invented some time before it finally gained an implementation in C-INTERCAL; it was first proposed in 1973. As a response to Dijkstra's GOTO paper, which I think brings us back to where we came in!

http://www.fortran.com/fortran/come_from.html

2009-01-14

<quote> C'mon people,seriously, C is a low level language. If you want to call logic in another function why not, oh I don't know, just CALL THE BLASTED FUNCTION </quote>

Wouldn't this make more sense if it said "high level language" instead of "low level language"? (asm, for example, is low level.)

2009-01-14

No idea what all this "code" stuff is, but longjmp is always used when going to places like Altair IV and Aldebaran, while a simple goto is used in-system. If you're just nudging the ship around in dock, impulse power only - right Scotty?

2009-01-14

I think that in this particular example the compiler can optimize the code A LOT. Has anyone tried to look at the EXE file to see if it really contains JNE/INC etc. loop instructions? Because if you do this in VC6:

int a=0; while(a<50000) a++;

the compiled and optimized output will be: MOV [something],50000

So only one instruction with the RESULT of the loop (a ends up at 50000, the first value that fails the test). I think that the loops in this example could be computed by the compiler itself.

2009-01-16

Exactly, that is (i think) what longjmp is most useful for.

The source code for Lua, the small-and-fast scripting language that is written entirely in ANSI C, uses longjmp in exactly one place -- to jump to an error handler when an error is 'thrown' by calling 'lua_error'.
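A minimal sketch of that error-handler pattern in plain C follows. It is not Lua's actual code: protected_call, throw_error, pending_error and the two sample tasks are hypothetical names invented here to show the general shape of "throw" and "catch" built on setjmp/longjmp.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

static jmp_buf *current_handler = NULL; /* innermost active handler */
static int pending_error;               /* code passed from throw to catch */

/* "Throw": unwind to the innermost protected_call, or abort if none. */
void throw_error(int code)
{
    if (current_handler == NULL)
        abort();
    pending_error = code;
    longjmp(*current_handler, 1);
}

/* "Try/catch": run fn, return 0 on success or the thrown error code. */
int protected_call(void (*fn)(void))
{
    jmp_buf here;
    jmp_buf *previous = current_handler; /* set before setjmp, never changed */
    int status;

    current_handler = &here;
    if (setjmp(here) == 0) {
        fn();
        status = 0;
    } else {
        status = pending_error;
    }
    current_handler = previous;          /* restore the outer handler */
    return status;
}

/* Sample tasks: one fails two calls deep, one completes normally. */
static void parse_input(void) { throw_error(42); }
void failing_task(void) { parse_input(); }
void succeeding_task(void) { }
```

The error code travels through a static variable rather than through setjmp's return value, which keeps the setjmp call inside a simple comparison, one of the forms the C standard allows.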

2009-01-16

jan de vos:
Exactly, that is (i think) what longjmp is most useful for.
The source code for Lua, the small-and-fast scripting language that is written entirely in ANSI C, uses longjmp in exactly one place -- to jump to an error handler when an error is 'thrown' by calling 'lua_error'.

Damn. I forgot the quote. I was replying to a post about longjmp being used for doing things like exception handling in pure C.

2009-01-19

Completely Agreed.

Premature optimisation is as embarrassing as premature .....

2009-02-08

Actually a lot of C's function names hearken back to the early days of UNIX where (because of a linker limitation?) only the first six characters were significant. With this in mind, "setjmp" was likely preferable to "setjump" or "set_jump"... think of these being confused with something like "setjumble" or "set_justify", then what would happen if one of these was called instead before a longjmp(). As for names like "creat()" though, even the creators of UNIX admit that if they did it all over again they'd spell things better.

2010-05-19

on my linux machine, the jumping code in fact runs faster.

Longjmp - FOR SPEED!!!
