Admin
This use of exceptions is documented in Joshua Bloch's excellent book, "Effective Java"
http://java.sun.com/docs/books/effective/
Rule 39, P. 171
In a comment for the sample code, he says "Do not use this hideous idiom..." :)
Admin
Some coders will use a new concept without bothering to understand it
Admin
What's to understand? Do you think this is sophisticated?
Admin
I'm not arguing this case. I'm arguing that the pattern itself is just fine in its place. This is not the place for the pattern though.
What if the example was this?
try {
    while (true) {
        bar = foo.getNext();
        if (checkSomething(bar)) {
            // the unchecked part can't be checked until after something else already in foo is checked
            foo.append(bar.uncheckedPart());
        }
    }
} catch (EndOfDataException e) {
    // nothing
}
Not to mention that you are assuming Java. That appears to be the case, but if it were C++ we could realloc the array in mid-stream (this is ugly, but we can do it).
Even in Java we can set the array larger than needed, and throw an out-of-bounds exception when we get an invalid data marker, treating it like a list with a maximum length. Sure, you could check for all that in your code, but it makes for ugly code at this point: if you know you are checking later anyway, why do it here as well? (Though I would agree that not using a simple linked list instead is a big WTF.)
Remember, I'm defending the pattern as useful and better in some cases. It is clearly not a good solution for all situations. Sure, an exception is slower. If you are getting into this code many times a second, you should optimize it. If this is typical code, though, it accounts for less than 1% of the run time. In that case, who cares that it isn't as fast as it could be, if it is more readable? (A for loop over an array is more readable, but that doesn't always apply.)
Admin
Unfortunately, there are some situations where you have no choice but to use this hideous idiom. For example, when doing reflection, there's no other good way to determine whether a class exists than to do a Class.forName() and wait to see if it throws a ClassNotFoundException. Similarly, the best way to determine whether a class implements a method is to call getMethod() on the class object and see whether it throws a NoSuchMethodException. I suppose you could call getMethods() and sort through the returned array, but this seems much less efficient.
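For concreteness, here's a minimal sketch of both probes; the class and method names ("com.example.Plugin", "initialize") are hypothetical, chosen only for illustration:

public class ReflectionProbe {
    // Returns true if the named class can be loaded.
    public static boolean classExists(String name) {
        try {
            Class.forName(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Returns true if the named class has a public no-arg method with this name.
    public static boolean hasNoArgMethod(String className, String methodName) {
        try {
            Class.forName(className).getMethod(methodName, new Class[0]);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(classExists("com.example.Plugin"));
        System.out.println(hasNoArgMethod("com.example.Plugin", "initialize"));
    }
}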
Admin
Generally, you'd have a method on foo called hasMore() or something and do:
while(foo.hasMore())
I think you are missing the danger in that code. If one of the methods you call in the loop throws the exception, but not on purpose, you'll just continue on as if everything is fine. That the exception ever occurred is lost in the ether.
I don't understand how you throw the exception on an invalid data marker without checking for it.
I don't think it's readable. If I see this, it will disrupt my work as I try to figure out why the hell the code is catching an IndexOutOfBoundsException. It's a back-assward way to terminate a loop. Using exceptions as glorified break statements is just going to lend credence to the people who argue they are just 'gotos'.
Admin
Or just have getNext() return null if there ain't no more. There's lots of ways to skin a cat.
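A minimal sketch of both alternatives; Foo here is an illustrative stand-in, and hasMore()/getNext() are the names suggested above:

import java.util.LinkedList;
import java.util.Queue;

class Foo {
    private final Queue<String> items = new LinkedList<String>();
    Foo(String[] values) { for (int i = 0; i < values.length; i++) items.add(values[i]); }
    boolean hasMore() { return !items.isEmpty(); }
    String getNext() { return items.poll(); } // poll() returns null when empty
}

public class LoopIdioms {
    public static void main(String[] args) {
        // Explicit query method: ask before you take.
        Foo a = new Foo(new String[] { "x", "y" });
        while (a.hasMore()) {
            System.out.println(a.getNext());
        }

        // Sentinel return value: null means no more data.
        Foo b = new Foo(new String[] { "x", "y" });
        String bar;
        while ((bar = b.getNext()) != null) {
            System.out.println(bar);
        }
    }
}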
Admin
If it were simple, there wouldn't have to be this huge debate.
Actually the problem is more that people THINK they understand, but underestimate the underlying complexity and end up with vaguely-remembered half-truths. In this case, the question of performance leads straight to runtime optimizations done by a JIT compiler, which is pretty damn sophisticated indeed.
Admin
Not true. Try ClassLoader.getResource(). If it returns null then the resource (and hence the class) is not loadable. Now, Class.forName() will throw the exception if the resource is found but can't load, like when a static initializer fails, but at least you've eliminated an exception being thrown in the most likely (and therefore least exceptional) instance.
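A minimal sketch of that probe; the class name is hypothetical:

public class ResourceProbe {
    public static void main(String[] args) {
        String className = "com.example.Plugin"; // made-up name for illustration
        String resourceName = className.replace('.', '/') + ".class";
        // getResource() returns null instead of throwing when the resource is absent.
        boolean loadable =
            ClassLoader.getSystemClassLoader().getResource(resourceName) != null;
        System.out.println(className + " loadable: " + loadable);
    }
}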
And I think the gist of "hideous" was that there was a much better and more efficient way to solve the problem. The exception handled was in no way exceptional. It was guaranteed to happen every time!
Admin
As far as I am concerned, the performance argument was only to dispel the myth that catching the IndexOutOfBoundsException was faster. I don't see it as an overwhelming reason to not do this.
I think the main reason not to do this is that it makes no sense. Why use an indirect method of loop termination instead of a direct one? Would anyone advocate this:
try {
    foo.doSomething();
} catch (NullPointerException e) {
    // foo is null
}
over:
if (foo != null) foo.doSomething();
This is definitely a case of using a sledgehammer to kill a fly.
Admin
It would still be a WTF. Every good iterator pattern has a means of divining the last element without inciting an exception.
And there is a very simple reason for this:
When you rely on exceptions, whether it's an EndOfDataException or an IndexOutOfBoundsException, you've extended the control of your loop far beyond the minimum possible locality. If some code within the loop generates one of these magic exceptions and does not handle it, your loop will suddenly behave incorrectly.
The performance doesn't matter. The integrity of your control structure matters, and this anti-pattern does an excellent job of undermining that integrity.
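A minimal sketch of that failure mode (all names illustrative): the loop's own termination handler silently absorbs an out-of-bounds bug from an unrelated array.

public class LoopLocality {
    public static void main(String[] args) {
        int[] data = new int[10];
        int[] lookup = new int[5]; // bug: shorter than data
        try {
            int idx = 0;
            while (true) {
                data[idx] = lookup[idx]; // lookup overflows at idx == 5...
                idx++;
            }
        } catch (IndexOutOfBoundsException e) {
            // ...and its overflow is taken as "end of data"; the bug is hidden
        }
        System.out.println("loop ended quietly; only half of data was filled");
    }
}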
Admin
Very interesting article, thanks for the information
Admin
The array access has to check for the out-of-bounds condition anyway. Handling the exception is basically free (a jump out of the loop), checking the loop condition is extra work. I wouldn't hesitate to write such code, IF profiling shows the loop as the actual bottleneck. Which is unlikely.
Interestingly, writing the explicit loop might allow the compiler to elide the out-of-bounds check completely. This is quite hard to do and I'm not sure it is done in Java. However, an internal iterator (map, for_each or whatever it is named) makes the code safe and efficient at the same time.
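For illustration, a hand-rolled sketch of such an internal iterator (all names hypothetical): the bounds check lives in exactly one place, and the caller can neither get it wrong nor terminate via a stray exception.

interface Action {
    void apply(int value);
}

class IntSequence {
    private final int[] data;
    IntSequence(int[] data) { this.data = data; }

    // The internal iterator: the collection drives the loop,
    // so the bounds test is written (and verified) exactly once.
    void forEach(Action action) {
        for (int i = 0; i < data.length; i++) {
            action.apply(data[i]);
        }
    }
}

public class InternalIterator {
    public static void main(String[] args) {
        new IntSequence(new int[] { 1, 2, 3 }).forEach(new Action() {
            public void apply(int value) {
                System.out.println(value);
            }
        });
    }
}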
Admin
I hope you'd also profile afterwards and notice that the code is now slower, because (as has been pointed out several times) handling the exception is anything BUT free.
It has been done since version 1.4
Admin
It's not. Setting up the exception handler (i.e., entering the "try" block) is free, but actually throwing/catching the exception involves a lot of object creation and transfers and is VERY expensive.
Admin
OK, so I wasn't entirely correct. My reasoning only applied to the HotSpot VM in server mode. The benchmark results for client mode (I slightly modified it, putting the loops in a function and running the whole thing 50 times instead of 10):
A - arr.length: 5481
B - int val: arr: 5676
C - store size: 5311
D - ugly try/catch: 4438
Server mode:
A - arr.length: 2951
B - int val: arr: 2705
C - store size: 3014
D - ugly try/catch: 5736
Bottom line: In server mode, the exception method is a LOT less efficient. In client mode, it is the most efficient way, but all the other arguments brought forth still apply (ugly idiom, "real" exceptions may be suppressed etc).
Admin
In .NET -- and this is information provided by Microsoft -- you should not use exceptions to control program flow, as they are very slow and result in a lot of overhead.
Of course, in this case, it makes the code so much clearer. NOT.
Admin
Some inner function is checking for the invalid marker, therefore it is redundant for anything outer to check as well.
That is because you are not used to the pattern. The first time I encountered a linked list (back as a freshman) it took me a while to understand what was going on. Now I don't have to think about it.
Likewise, for(int i=0; i < FOO_END_OF_DATA; i++) is recognized as a whole statement by any good C/C++/Java programmer, without any need to parse the whole thing. In fact, just changing it to for(int loop_index = 0; loop_index < FOO_END_OF_DATA; loop_index = loop_index + 1) is harder for any good programmer to parse, just because it doesn't fit the pattern, even though it is equivalent. (To make this even harder, set your browser window to a width where it will have to wrap to the next line.)
Remember, this code is not where the program is spending most of its time. So the fact that exceptions take a long time to deal with is irrelevant: this is not the bottleneck. If this code were a bottleneck where you were spending 90% of your time, AND THE PROGRAM IS TOO SLOW, then you are fully justified in anything you can come up with to make it faster, and getting rid of the exception is a good place to start.
Don't make the mistake of assuming the indexOutOfBoundException is a standard exception in the language you are using. It should be a user-defined class, so you cannot catch it by mistake. Change the name if your language or library already has an exception of this name.
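A minimal sketch of that suggestion; EndOfDataException follows the earlier example's name, and the rest is illustrative:

class EndOfDataException extends RuntimeException {
    // Our own type: no library call can throw this by accident.
}

class DataSource {
    private final int[] data = { 1, 2, 3 };
    private int pos = 0;

    int getNext() {
        if (pos >= data.length) {
            throw new EndOfDataException();
        }
        return data[pos++];
    }
}

public class UserDefinedTermination {
    public static void main(String[] args) {
        DataSource foo = new DataSource();
        try {
            while (true) {
                System.out.println(foo.getNext());
            }
        } catch (EndOfDataException e) {
            // expected: the normal end of data
        }
    }
}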
Admin
These patterns are called idioms, and every language has some of them. C and its descendants are especially rich in idioms, because the language design is sparse. For example, there is no counting loop, but there is the "for (int i=0; i<foobar; i++)" idiom. There are no reference parameters, but there is the &outparam idiom.
Admin
I think you lost sight of the original debate. The point is that INTRODUCING the exception is supposed to make it fast, but is criticized on the grounds that it's bad style AND slower in most cases.
Admin
Not for ints, but in languages with operator overloading, the compiler may actually not be allowed to perform such optimizations, even if it could.
Writing x++ when ++x would do is, in my eyes, a bad habit, even if it may not hurt in that particular case. Writing ++x when the value is discarded anyway is never wrong (performance-wise and in terms of code clarity - taking temporary copies only to discard can be confusing), writing x++ often is. Preincrement should thus be the safe default under all circumstances.
Admin
This is an oversimplification. In the case of C++, for builtin types, the compiler is not required to perform side-effects (e.g. incrementing the integer) until the next sequence point (typically a semicolon). Even if a temporary is introduced, compilers can easily figure out it's never used after initialization, and optimize it away.
Overloaded operators are another story, and operator ++(int) typically must take an (in this case) superfluous copy, which the compiler most often can't optimize away. For example, if the operator is not inlined (though it's deplorable style, operator ++(int) may in theory do something completely different than operator ++(), and the only way for most compilers to know what it's doing is parsing the actual code), or if the copy's constructor or destructor have or may have additional side-effects, the compiler cannot remove the temporary without risking to alter program semantics.
In practice, postincrement on objects of class type will almost always be slower than the corresponding preincrement. It takes both a very smart compiler, and a lot of inline functions (or far-reaching global optimizations) to even achieve a draw. On the other hand, postincrement will never be faster. It requires exactly the same coding effort. In the words of Sutter and Alexandrescu, needlessly using postincrement on objects of class type - or objects that may have class type (templates anyone?) - is thus a premature pessimization.
Admin
that's the coding equivalent of the nurse leaving the room to turn on the x-ray machine at the dentist office.
"don't change horses in mid stride." - origin unknown
"don't swap uniforms in mid stride." - denis diderot, 1782, france
"don't switch clothes in mid stream." - william blake, 1824, england
"don't stop till you get enough." - michael jackson, 1981
"don't switch badgers in mid stream." - charles kuralt, 1986, san francisco
"charlie, you're spooning my cat!" - frank zappa, 1986, san francisco
"don't realloc memory in mid stream." - emptyset, 2005, thedailywtf.com
Admin
It is a bad habit, but as bad habits go, it's very mild.
It's roughly akin to putting on your socks and shoes like this:
Put on left sock
Put on left shoe
Put on right sock
Put on right shoe
as opposed to
Put on left sock
Put on right sock
Put on left shoe
Put on right shoe
Sure, it adds a few seconds to your morning routine. But unless you're a firefighter, it probably won't harm anything.
Admin
Nonsensical comparison. There are 1.3 billion people on Earth who can read Chinese fluently. But there is no one in the entire world who can read Perl fluently.
Admin
Well, read that Intel manual again: the DEC command conveniently sets (or resets) ZF, which means that a CMP command (or its equivalent) can be omitted in most backwards-counting loops, because the test for zero comes for free with the counter decrement. Also, LOOP and REP can sometimes be useful, and they implicitly count backwards to zero. Though at the time I tested it, LOOP was slower than DEC ECX followed by a JNZ, so there doesn't seem to be much of a point in using it.
That said, in general, backward-counting loops are a case of premature optimization. It would need two things to justify the extra effort (I presume) of counting backward: First, a performance measurement that says the loop iteration time is actually of any importance. Second, another performance measurement that shows that backward iteration does actually have benefits (at least on that particular platform and compiler).
Admin
You forgot to mention that swinging a dead chicken over your desk will drive away bad spirits and improve the program.
Admin
I sleep quite well knowing that true Scotsmen eat their porridge without sugar, while those who are clueless Java evangelists at heart blame C/C++ programmers for WTFs committed in their own idolized language.
Admin
Ed,
The idx is incremented. What do you think the idx++ does?
Admin
Intel made LOOP slow on purpose, because older programs used it to adjust their busy-waiting loops (for commands like "sleep"), and they crashed when LOOP became too fast (division-by-zero errors, etc.)
Admin
Note that I was responding to someone denigrating Java for encouraging you to look at the big picture while promoting C++ for encouraging attention to details - while this WTF is exactly the result of misguided (and misinformed) attention to performance details. Of someone writing Java while thinking C, if you will.
So WHO is clueless? And is your comment based on anything except an inability to suffer any criticism of your favourite language?
Admin
I was told once that under certain circumstances this is faster than a for loop. It's purely hearsay, but I was told that IBM had done some tests (circa 1999 - maybe a 1.1 JVM) that showed this was more efficient for long cases.
It reminds me of the SQL "in" versus "join" or statement-versus-stored-procedure debate. There are certain circumstances under which one is faster than the other, and obvious concerns about the maintainability of either solution. But if you've ever been tasked with finding efficiencies in code that is executed non-stop and is a bottleneck, you need any clicks you can get. I once found that checking an object for null improved runtime even though the object was never null (we removed it and the code ran slower, so we put it back in).
Admin
You misspelled heresy.
Admin
To shed some light on the preincrement vs. postincrement discussion, I made a test incrementing an iterator of a vector in a large loop. And sure enough, with no optimization turned on, the preincrement version was a bit faster. But once I turned on optimization (i.e. -O2 with gcc), there was no difference in speed and no difference in the size of the assembly code generated.
Admin
The programmer shouldn't be taken out back, but SHOT right up front, pour encourager les autres.
Admin
Somebody mentioned that this code might be more efficient than a for loop. Even if it is, there is a functional problem with this code: what happens if displayProductInfo (or code called by it) happens to screw up and reference an invalid array index? Suddenly, the program doesn't crash like it's supposed to! Brilliant!
Admin
No no no. Brillant.
Admin
That's why long threads grow infinitely once they reach a critical length: the same observations are posted again and again, because no one wants to read 200+ postings before posting himself.
Admin
maybe that's the consequence of the forum software.
Admin
"How large was your loop? The try-catch overhead is large, but happens once. The redundant check overhead happens once per array item. For sufficiently large loops, it's provable that this "WTF" code is faster. This limit may be larger than java's MAX_INT array size limit, however. This also assumes that the JVM doesn't perform complex analysis to remove the redundant array bounds checks."
The limit was 40 (forty) on Java 1.1.
This code is just optimized code for the early JVMs. Nothing wrong.
Admin
I call bullshit. NO WAY is 40 int comparisons going to be slower than building a stack trace. To confirm it, I actually went and got the oldest JVM that Sun still has, which is 1.1.6_009. I ran this code:
public class ExceptionLoop
{
    public static void main(String[] args)
    {
        int iterations = 10000000;
        byte[] arr = new byte[40];
        long start = System.currentTimeMillis();
        for (int j = 0; j < iterations; j++)
        {
            try {
                int idx = 0;
                while (true) {
                    arr[idx++] = 42;
                }
            } catch (IndexOutOfBoundsException ex) {}
        }
        System.out.println("1st took " + (System.currentTimeMillis() - start));
        start = System.currentTimeMillis();
        for (int j = 0; j < iterations; j++) {
            for (int i = 0; i < arr.length; i++) {
                arr[i] = 42;
            }
        }
        System.out.println("2nd took " + (System.currentTimeMillis() - start));
    }
}
Result:
1st took 16500
2nd took 781
While for the JDK 1.5.0_04 server JVM, it's
1st took 2031
2nd took 281
So while the older JVMs may have been slower at array accesses, they were ALSO much slower at exception handling. And BTW, the numbers cannot be extrapolated to the 1st version being faster for arrays of a few thousand elements; in fact, the 2nd version is still faster for an array size of 4 million elements - on BOTH JVMs.
Admin
When exceptions become logic, don't they, by definition, cease to be exceptions?
What do you have to say about the following code?
Say I got a ResultSet "rs" from the database:
try {
    Long value = new Long(Long.parseLong(rs.getString("Column_Name")));
}
catch (SQLException sqe) {
    // do something
}
catch (NumberFormatException nfe) {
    // do the processing for when I got a string (non-numeric data)
}
The column "Column_Name" can contain either a string or a number, and I need to identify when it is a number and proceed accordingly...
So I have to rely on an exception here for my logic???? What say??
Admin
I say why the fuck do you have a DB column containing strings that MAY need to be parsed as numbers? That's a WTF all right, but in your DB design, not in that of the language or API.
Admin
Get it from the recordset as a string. Run the string through /^\d+$/ - if it succeeds, it's a number, in which case, make a number out of the string.
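A minimal sketch of that check in Java; the sample values stand in for rs.getString("Column_Name"):

public class NumericCheck {
    public static void main(String[] args) {
        String[] samples = { "12345", "abc" }; // stand-ins for rs.getString("Column_Name")
        for (int i = 0; i < samples.length; i++) {
            String raw = samples[i];
            // matches() tests the whole string, so "\\d+" is equivalent to /^\d+$/.
            // (Very long digit strings could still overflow Long.parseLong.)
            if (raw != null && raw.matches("\\d+")) {
                long value = Long.parseLong(raw);
                System.out.println("number: " + value);
            } else {
                System.out.println("not a number: " + raw);
            }
        }
    }
}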
Admin
That doesn't shed much light on the debate since vector iterators can in theory be plain pointers. And in some implementations, they actually are. In most other implementations, they are little more than thin wrappers around pointers (this allows users e.g. to use std::vector<whatever>::iterator::difference_type instead of the more general std::iterator_traits<std::vector<whatever>::iterator>::difference_type). If you are going to test this, at least use some "real", non-trivial iterator type, like those of std::map, or a complex, user-defined one.
Or try a "big integer" class, with, say, some hundred bytes of storage (e.g. for iterating through a range of cryptographic multi-kbit prime numbers). In this case, preincrement will most likely be a constant-amortized operation, independent of integer length, while postincrement may (unless the compiler figures out how to optimize the operation) be linear in the length of the integer.
It's simply ridiculous how some people defend the practice of useless postincrement. Preincrement takes the same effort to write as postincrement. Postincrement is never faster, but can actually be slower in numerous situations. It may also require temporarily allocated dynamic memory, which means that it can throw std::bad_alloc in situations where preincrement is "nothrow". There is simply no defense for using postincrement where preincrement will do, and it doesn't take more than a few braincells to realize that all those stories of "I tested it for some types and it wasn't slower in these particular cases" are worth exactly zilch when it comes to writing templates.
Admin
Oh well... this is a lot faster!
Comparing with zero is faster than comparing with any other number :P
Try my challenge if you dare! :))))
int i = x;
for (; --i >= 0;) { statements; }
Admin
Makes sense. Many processors have a "jump on compare to zero" set of opcodes. A comparison between i and x requires loading two registers (assuming they can't simply stay in registers throughout the whole operation), while the comparison between i and zero only requires the load of one register. Additionally, if the register were already holding something that needed to be preserved, then there would be additional overhead averted.
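For illustration, the countdown idiom applied to the earlier array-fill benchmark (a sketch; whether it actually wins still needs measuring on the target VM):

public class CountdownFill {
    public static void main(String[] args) {
        byte[] arr = new byte[40];
        // Test against zero instead of arr.length on every iteration.
        for (int i = arr.length; --i >= 0;) {
            arr[i] = 42;
        }
        System.out.println("filled " + arr.length + " elements");
    }
}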