Admin
I can't get less than one line but I can make it more obvious. In Ruby:
my_array.uniq
(or my_array.uniq! if you wish to modify the array in place instead of generating a new array).
Admin
The real WTF is that the original code finds the unique elements (at great cost) and then throws it all away without returning any information.
Admin
Doh! Now I'm the idiot.
Ok, hashtables don't use binary trees; I was thinking of a different data structure. Even still, the lookup time from a hashtable is faster than a full iteration: O(1) on average, O(n) at worst.
Admin
LOL. It's sad when people try to look smart and just end up exposing their ignorance. Let me give you a hint. It's not a tree either. Maybe it has something to do with a table...
Admin
I saw the same thing in Java, so I realised that any algorithm can be expressed using only try/throw/catch statements.
:-) Here is the simple proof: http://99-bottles-of-beer.net/language-java-866.html
Admin
Yeah, already self corrected, but thanks.
Admin
Since I also believe that exceptions should only be involved when there is a good reason for it, I've come up with a simple solution. Every time an exception is thrown for our website, it EMAILS THE ADMIN (me in this case) what the exception was, etc. etc.
To keep myself from getting spammed all the time, I make sure to have good, clean code that doesn't throw exceptions.
And if an exception is uncaught, I get a nasty gram from the application...
Admin
Doesn't every non-unique item take you one frame down the stack?
Admin
Thanks. That was the only response in this whole topic I felt was really worth reading.
Admin
You could use a set. I don't know about .NET, but in Java you could do:
Set deduped = new HashSet(ListOfDuplicatedValues);
A set, by its nature, contains only unique values.
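A fuller sketch of that set-based approach (Java with generics; `LinkedHashSet` is assumed here so first-seen order survives, which a plain `HashSet` wouldn't guarantee):

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class Dedup {
    // LinkedHashSet drops duplicates on construction while preserving
    // first-seen order; use HashSet instead if order doesn't matter.
    static Set<String> dedup(List<String> values) {
        return new LinkedHashSet<>(values);
    }

    public static void main(String[] args) {
        List<String> listOfDuplicatedValues = Arrays.asList("a", "b", "a", "c", "b");
        System.out.println(dedup(listOfDuplicatedValues)); // prints [a, b, c]
    }
}
```

Each insertion into the set is an average-case O(1) hash operation, so the whole dedup is O(n) expected time.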
Admin
To be fair it is an unedited reader submitted article and others already pointed out the problem with that method.
This is not a reflection on CodeProject, it's a reflection on the author of the article not revising it to improve that method.
After reading some of the comments on here I can see there are a lot of little wtf'ers waiting to happen. :)
This type of thing is not up to the usual WTF standards. Seriously, is there no worse code than this out there in actual production use, or is this site just running out of steam?
Admin
Not quite:
Admin
Ummm, individual hash operations are O(n) in the worst case, so using a hash insert operation N times is O(N^2). It doesn't actually matter whether you call contains(value) first, because 2 * O(n) is still O(n). The lookup cost is the same for insert as it is for contains, since the hash table must determine whether the value is already present so it can behave correctly (whether that be to overwrite the contained value, add a second contained value, throw an exception, return FILE_NOT_FOUND, etc.).
The reason hash operations are O(n), not O(1) as most people would like to believe, is the linear search once you've found a hash bucket. In practice, most hash buckets have only one element, so a carefully used hash table gets O(1) amortized performance; however, if you're in an "enterprise" situation you might find that there's no bucket size small enough, and your hashes will get all O(n) on you.
Admin
In .Net, as with other languages, i goes out of scope as soon as the for loop is exited. So:
FindUniqueElements(i+1);
would cause this method to not even compile. For this to work the try catch block would have to be on the inside of the for loop.
Admin
This is the actual implementation in .NET 2.0:
private struct bucket { public object key; public object val; public int hash_coll; }
public virtual bool ContainsKey(object key)
{
    uint seed;
    uint incr;
    Hashtable.bucket bucket;
    if (key == null)
    {
        throw new ArgumentNullException("key", Environment.GetResourceString("ArgumentNull_Key"));
    }
    Hashtable.bucket[] buckets = this.buckets;
    uint num3 = this.InitHash(key, buckets.Length, out seed, out incr);
    int num4 = 0;
    int index = (int) (seed % buckets.Length);
    do
    {
        bucket = buckets[index];
        if (bucket.key == null)
        {
            return false;
        }
        if (((bucket.hash_coll & 0x7fffffff) == num3) && this.KeyEquals(bucket.key, key))
        {
            return true;
        }
        index = (int) ((index + incr) % ((ulong) buckets.Length));
    } while ((bucket.hash_coll < 0) && (++num4 < buckets.Length));
    return false;
}
Admin
"Enterprise" here having one of two meanings:
Your data set is so huge that the hash table involves disk pages, and linear searches through disk pages suck. Actually, even O(1) searches through disk pages suck. This is why most DBs use some kind of balanced tree: it keeps similar values together, so you can keep a bunch of related ones on each page.
The hash container you are forced to use was written by weasels, and all data values end up in a single hash bucket, producing slightly worse results than linear search, but consuming more memory.
Admin
Throws stone
Admin
Call me crazy, but what's wrong with the way I do it:
?
I'm not a Java/.NET/whatever coder, but isn't there also a potential stack overflow problem from the recursive nature of the WTF?
Admin
This WTF is especially meaningful to me, because I have been there no more than a couple months ago.
I wrote a small utility class to parse an ini file. It was C# with .NET 2.0, btw.
So, I was using exceptions to handle syntax errors. Like, each time I found a syntax error, I'd throw an exception, catch it, log it, then analyse the rest of the file. Then I wrote an ini file full of possible mistakes to test. There were about 15 syntax errors to catch, and one exception for each one. Execution time : 47 to 50 seconds.
It was too much, so I decided I'd try to anticipate all the error causes (like duplicate key, malformed line, etc.) and return false with an error message instead. Execution time: less than what a DateTime object can measure.
Lesson learned: NEVER USE EXCEPTIONS unless you really have to. And most of the time when you think you need them, you actually don't.
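A rough Java sketch of that anticipate-the-errors style (the syntax rules, method names, and error messages here are invented for illustration; the poster's actual C# parser wasn't shown):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IniParser {
    // Instead of throwing on each syntax error, collect error messages and
    // keep going; the caller inspects the returned list.
    static List<String> parse(List<String> lines, Map<String, String> out) {
        List<String> errors = new ArrayList<>();
        int lineNo = 0;
        for (String line : lines) {
            lineNo++;
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith(";")) {
                continue; // blank line or comment
            }
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                errors.add("line " + lineNo + ": malformed line");
                continue;
            }
            String key = trimmed.substring(0, eq).trim();
            if (out.containsKey(key)) {
                errors.add("line " + lineNo + ": duplicate key '" + key + "'");
                continue;
            }
            out.put(key, trimmed.substring(eq + 1).trim());
        }
        return errors;
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        List<String> errors = parse(
            List.of("a = 1", "bogus", "a = 2", "; comment", "b = 3"), values);
        System.out.println(errors);
    }
}
```

No exception objects are constructed on the error path, which is the whole point of the poster's speedup.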
Admin
The quote is, "Let he who is without sin, cast the first stone."
if (!(he & SIN)) { first_stone = (Stone) stones[0]; }
Crash Magnet
Admin
Looks to me like a good way to avoid the "enterprisey" situations mentioned above. If my math is correct, your algorithm would be O(n log n) (sorting: O(n log n), iteration: O(2n) = O(n)). So IN THEORY you would be a good bit faster than the O(n^2) worst case of the hash tables. But since in practice the buckets contain only a few elements and the runtime often nears O(n), faster than you again, that probably only matters in extreme cases.
Feel free to throw exceptions at me if I'm wrong...
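The sort-then-scan algorithm being analyzed wasn't quoted, but a plausible version of it (a sketch, not the original poster's code) looks like this:

```java
import java.util.Arrays;

public class SortedDedup {
    // Sort a copy, then keep each element that differs from its predecessor:
    // O(n log n) for the sort plus a single O(n) scan.
    static int[] unique(int[] input) {
        int[] a = input.clone();
        Arrays.sort(a);
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            if (i == 0 || a[i] != a[i - 1]) {
                a[count++] = a[i]; // compact unique values to the front
            }
        }
        return Arrays.copyOf(a, count);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(unique(new int[] {3, 1, 3, 2, 1})));
        // prints [1, 2, 3]
    }
}
```

Unlike the hash-based approach, this loses the original element order but never degrades past O(n log n), no matter how hostile the input.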
Admin
Throwing fifteen exceptions DOES NOT TAKE 50 seconds.
Admin
Actually, were you running under a debugger? It could have spent that time loading symbols when the first exceptions were being thrown. Outside of a debugger, that wouldn't happen; it would still be on the order of a millisecond or two.
Admin
Here is the source code for the 2.0 version of Hashtable.ContainsKey. As you see, it does operate as a hashtable.
Public Overridable Function ContainsKey(ByVal key As Object) As Boolean
    Dim seed As UInt32
    Dim incr As UInt32
    Dim bucket As bucket
    If (key Is Nothing) Then
        Throw New ArgumentNullException("key", Environment.GetResourceString("ArgumentNull_Key"))
    End If
    Dim buckets As bucket() = Me.buckets
    Dim num3 As UInt32 = Me.InitHash(key, buckets.Length, seed, incr)
    Dim num4 As Integer = 0
    Dim index As Integer = CInt((seed Mod buckets.Length))
    Do
        bucket = buckets(index)
        If (bucket.key Is Nothing) Then
            Return False
        End If
        If (((bucket.hash_coll And &H7FFFFFFF) = num3) AndAlso Me.KeyEquals(bucket.key, key)) Then
            Return True
        End If
        index = CInt(((index + incr) Mod CULng(buckets.Length)))
    Loop While ((bucket.hash_coll < 0) AndAlso (++num4 < buckets.Length))
    Return False
End Function
Admin
Beyond putting Sleep() statements in the exception constructors, I really can't think of a way to make 15 exceptions take more than a couple milliseconds. Is this how you managed it?
Admin
Also, exceptions shouldn't be used for mainstream logic because they are supposed to occur in exceptional circumstances (hence the name) and disrupt standard control flow through the code. If you're reading code like this and see a try-catch clause, you automatically think that if the catch block is entered, something has gone wrong. Ignoring this pattern just makes it less intuitive.
Admin
I take exception to today's submission!
Admin
Sounds like a few experiments with exceptions could be useful.
How long do the following snippets run on your machine?
1:
2:
3:
Addendum (2007-03-16 14:38): oops, was referring to:
Admin
Admin
Admin
or actually
Admin
For Integer.Parse, it seems to take 100x longer to catch an exception than it would have taken if the parse completed successfully. In the IDE it takes 4000x longer.
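A crude way to reproduce that kind of measurement (absolute numbers vary wildly by machine, runtime, and whether a debugger is attached; this only illustrates the shape of the comparison, in Java rather than .NET):

```java
public class ParseCost {
    // The exception-driven way: let parseInt throw on bad input.
    static boolean parseWithCatch(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // The anticipate-errors way: check the characters up front
    // (digits only; signs and overflow are ignored in this sketch).
    static boolean looksNumeric(String s) {
        if (s == null || s.isEmpty()) return false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') return false;
        }
        return true;
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        for (int i = 0; i < 10_000; i++) parseWithCatch("oops");
        long caught = System.nanoTime() - t0;

        t0 = System.nanoTime();
        for (int i = 0; i < 10_000; i++) looksNumeric("oops");
        long checked = System.nanoTime() - t0;

        System.out.println("catch: " + caught + " ns, check: " + checked + " ns");
    }
}
```

The gap comes almost entirely from constructing the exception (stack trace capture), not from the try block itself.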
Admin
Do you mean to suggest it takes 50 seconds to throw 15 exceptions? You are mistaken, I believe.
What exactly do you believe to be the smallest accuracy of the DateTime structure?
Hogwash. Pure hogwash.
Admin
WTF!!!
This has gotta be one of the stupidest lists of comments I've ever seen.
Yes, it's absolutely moronic to write a process that actually depends on exceptions to work.
Haven't any of the above commentators figured out what would happen if the last element WASN'T unique?
Admin
Nice! You get a star :) </srsly>.
Admin
System dependent. It's probably about a millisecond. Nothing wrong with using the return false thing internally, but you'd probably want to throw an exception at the api level.
Admin
Admin
You might want to look at the ticks property of the DateTime structure. You're not just wrong, you're four orders of magnitude wrong.
Admin
Admin
What? Why? That variable is a static global.
Admin
Whether hash operations are O(n) or O(1) depends on the exact type of hash table you use. Using a fixed number of hash buckets will definitely get you O(n). Using a varying number of hash buckets (as .NET apparently does), or using open addressing, gives you O(1) expected time but not, as you mentioned, guaranteed O(1). However, cuckoo hashing gives you O(1) lookup time, guaranteed.
Admin
Especially in .NET
Throwing exceptions is an expensive operation in .NET, people! They're not meant to be used for things that you know will happen.
Admin
Your statement is clearly absurd, as measured against three possible yardsticks:
(a) It isn't concise enough. I'm sure that APL only needs, at most, five characters to do the same thing. (Plus you get lazy evaluation for free!) (b) It uses far too many individual characters. (I can see a WTF recursion coming at me right now...) Whitespace would be far better suited to the task. (c) It only uses fourteen different keystrokes. Your keyboard will get bored. Parts of it will jam up, just when you need that elusive "x".
It's hardly in the spirit of things to wonder what the underlying mechanism to "^aCollection asSet" might be. After all, it would detract from the many wonderful arguments we're all having about whether hash tables use trees, or lists, or have duplicates, or might optionally have duplicates, or are anyway a suitable .Net 2.0 implementation compared to Dictionary; which might in turn use trees, or lists, or have duplicates, or... God knows. It might even be implemented in terms of quantum computing by now.
There is nothing wrong with Smalltalk programmers. They tend to be academics, or else the people responsible for bringing the replacement HR system of America's third largest auto manufacturer crashing ignominiously to the ground; but there's nothing wrong with Smalltalk programmers.
Good Smalltalk programmers use Smalltalk appropriately and wisely. But remember:
In Soviet Russia, Smalltalk uses you!
Admin
pedantic
Of course, it would be easy to be even MORE pedantic here...
Admin
However, the only sense in which hashtables operate at between O(1) at best and O(n) at worst is that they operate at O(1). Trumpets please:
Knuth famously claims that there are only seven different categories of sorting algorithms. (I think he's quoting Aristotle here, or perhaps he's just indulging in that well known geek humour that we all love so much.) I can think of about four different hashing mechanisms, but they only really differ in terms of how they cope with collisions. Collisions are a higher-order problem.
Lots (not, obviously, all, so don't yell at me) of posters on this site seem to miss the point of hash tables.
Hash tables do one thing, and they do it well.
Basically, you take your key, and you take your value. The key (K) may or may not be related to the value (V). This is entirely irrelevant.
You apply a hashing algorithm to K and use it as an index into, basically, a vector/array/whatever. A perfectly decent algorithm in many cases is to XOR every byte in the key. Or every short int. Or whatever. (If you're insane enough to roll your own, I recommend co-primes.)
You now have (given more advanced usage as described by John, above) O(1) insertion and O(1) deletion and O(1) location. It is not possible to beat this.
What you lose is the ability to iterate in any meaningful manner. You lose sorting, you lose ... well, I'm getting bored. But you still have O(1) for what you want.
Even if you don't use C++, I strongly recommend that you look at the documentation for the STL. It makes all of this stuff fairly clear, and you can probably translate the ideas into your own code.
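A toy illustration of the XOR-every-byte hashing idea described above (a sketch only; real hash functions use stronger mixing, e.g. multiplying by co-primes as suggested, and must handle non-ASCII and sign issues):

```java
public class XorHash {
    // XOR every byte of the key together, as described above.
    // For ASCII keys the result stays in 0..127, so it is non-negative.
    static int hash(String key) {
        int h = 0;
        for (byte b : key.getBytes()) {
            h ^= b;
        }
        return h;
    }

    // Map the hash onto a bucket index in a fixed-capacity table:
    // this is the O(1) "location" step, independent of table size.
    static int bucketIndex(String key, int capacity) {
        return hash(key) % capacity;
    }

    public static void main(String[] args) {
        System.out.println(bucketIndex("FILE_NOT_FOUND", 16));
        System.out.println(bucketIndex("unique", 16));
    }
}
```

Note how weak this particular function is: any two keys that are permutations of each other collide, which is exactly the kind of behaviour that produces the "enterprisey" single-bucket degradation discussed earlier.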
Admin
I throw boulder.
Admin
I'm pretty sure you were running the code in a debugger. Compile for release, run it outside the debug environment, and again I'm pretty sure you will find that exceptions do not, in fact, take 3 seconds each on average. Consider also that .NET programs all over the world are throwing and catching exceptions all the time; don't you think it would be common knowledge by now if .NET exception handling were truly that slow?
Admin
Eric is right; however, what the OP probably meant is 'less than what the computer can measure', not 'less than what a DateTime can store'. The resolution of time measurement done by a PC depends on the RTC frequency, which I believe is set to 18.2 Hz. So yeah, 55 milliseconds is correct. Unless you want to be pedantic :)
Admin
Tell me about it. One time I threw a bunch of exceptions but they found their way back home and killed my dog. I will not rest until those fucking exceptions are caught.
Admin
For what it's worth, Michael, there are languages which do optimize small hashtables to linear lists, and somewhat larger ones to (possibly balanced) binary trees, only actually building a true hash table when the size and certain other properties of the table indicate a good probability of a good tradeoff between the extra space consumed and the running time.
(Yeah, that means that, if the hashtable starts with fewer than ten elements in its declaration and the compiler can't find initialization code that makes it big enough to be worth it, it could at different times during runtime have each of three different internal representations.)
(Oh, and I seem to remember that one implementation of the language actually added some other internal representations including trie and such.)
That doesn't mean, of course, that O(n^2) is going to be the typical search time for hashtables in said language, but if you've seen such languages you have an excuse for seeing things differently.