- Feature Articles
- CodeSOD
-
Error'd
- Most Recent Articles
- Secret Horror
- Not Impossible
- Monkeys
- Killing Time
- Hypersensitive
- Infallabella
- Doubled Daniel
- It Figures
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
Mmm nothing like optimized, efficient code :-)
Admin
it is amazing what you find while looking at legacy code
Admin
That hack seems incredibly illegal. I guess the guy didn't want to get fired, but he should have said something about the missing cents. Maybe the company wouldn't have gone down so fast ...
Admin
Oh well, at least the story has a happy ending... oh wait ...
Admin
Qsort runs in n^2 time itself on already-sorted data... The author would have had his new code be practically instantaneous if he'd used something more appropriate. :)
Admin
Depends on your implementation of qsort. If you choose the first element as the first pivot, then you are correct. If you choose a random element as the first pivot, you end up doing a lot better with already sorted elements.
Admin
It's interesting how "under half a second on a 486" is not considered practically instantaneous...especially when dealing with file I/O operations.
-ds
Admin
The qsort implementation I coded in VB was based on code from an old Pascal book I read in high school. This book recommended selecting the middle element from the array of elements to sort as the pivot, specifically to try to take advantage of partially sorted data. I used qsort because that was the only fast algorithm explained in the book - I didn't know about mergesort until later.
Admin
I did raise the issue about the whole purpose of the hack, that the proper place to fix it was in the mainframe, not in VB. But my bosses mentioned something about not messing with programs without proper approval - I was only authorized to fix the VB hack, not the COBOL program. I can only think the guy who wrote the original VB hack raised the exact same issue, and was met with the exact same objection.
Admin
If I were the pope, I'd promote Alex Villacis Lasso to sainthood.
Fregas
Admin
I once worked as a junior programmer in a once important bank in Ecuador, South America (which bank, I won't say, but its name starts with an "F").
but I will :
<FONT face=Verdana color=#660000 size=5>Filanbanco S.A. </FONT>
Admin
I recently saw a posting on a Tech-Help website, and by the question asked (and the un-edited code examples) I could tell it was a guy that worked at my bank.
This scares the crap out of me since the code gave away account numbers and the question was asked by an obvious moron.
I think its time to get another bank.
Admin
I looked this up as well, and came to the same conclusion. It's hard to see clues and not attempt (write or wrong) to solve the puzzle.
.jc
Admin
Just curious - did the hack written by you gave (correct) numbers in output, or was it still generating excel formulas? :)
Admin
qsort is O(n^2) in worst case, O(nLog n ) in average.
http://en.wikipedia.org/wiki/Quick_sort
Admin
Gotta love low-bitcount mantissa and the cumulative errors that ensue. I think we all would have been much better off if BCD had been the prefered standard with ultra-optimized FPUs, preferably on CPU from the first day they moved the ALU on to the CPU.
And someone invent a time machine and shoot whoever came up with banker's rounding.
Admin
*sigh* Well in the tradition of TDWTF, you didn't read what you quoted. I never denied that.
Admin
Big O notation is worst-case.
Sincerely,
Gene Wirchenko
Admin
It still generated the excel formulas, by explicit request. Recalculating the values would have been straightforward, but the program was supposed to be a drop-in replacement for the previous one, and the users *expected* to load the output in Excel and have it recalculate the values itself. In other words, I was forced to preserve the outer behavior of the program (Excel formulas and all), except for the new output file format.
Admin
but it´s not a matter of implementation, given any implementation one can give you an input that makes your qsort O(n^2), the O(nLog n) comes from empirical tests. qsort is always O(n^2) regardless the impl.
Admin
I hope you are not considering that floating-point quantities should compare exactly. That would be Bad. I, too, would like to see more BCD.
Banker's rounding is used to reduce bias in rounding. Otherwise, there is a bias to rounding up since 5 alway rounds up. By alternating which way the change is made, the bias is reduced.
Sincerely,
Gene Wirchenko
Admin
You're right :D I forgot that :#
Admin
Always remember: "It isn't bad news unless it happens to you"
Admin
I was hoping you would still go back and read what I was replying to originally. I was referring to what causes the worst case. If you choose the first element as the pivot, the worst case is realized in sorted data. If you have sorted data and you choose the middle element as the pivot, it will not perform in O(n^2), but rather closer to its average, O(nlogn).
I never meant that a specific implementation would do away with the worst case, I just mentioned that what causes the worst case is implementation specific.
Admin
Given that the input is 'mostly' sorted, if the qsort key direction was the opposite of the input direction (thus creating an almost completely unsorted list), then he could have achieved n Log n almost every time.
Admin
Whoops, I was thinking of heapsort.
Admin
Better known as WTFilanbanco.
Admin
<FONT face=Arial size=2>Someone must have killed all of the bank examiners and auditors. Idiots!</FONT>
Admin
People often say "were they being paid by the line." Based on your boss's response of "that can't be right, you must be missing something because it's so fast," they might have been getting paid by the time it'd take to run.
Admin
My, ummm, associate attended a security course, of which SQL injection was covered. Guess what, he tried it on his own bank, and voila! He promptly got arrested! Ha no, he brought it to the attention of the bank, but I did not hear how it turned out.
On another note, perhaps the boss didn't want the mainframe code to be fixed, 'cuz it was funnelling money into his account!
Admin
If I remember correctly, insertion sort goes to O(n) (and/or Omega(n), if you're feeling picky today) for nearly sorted data, which is probably better than trying qsort and forcing the average case (Omega (n lg n)). Aside from that, it's probably easier to code.
I like easy to code B
Cheers.
Admin
That'd scare the crap out of me too. I'd probably look into reporting the bank or directing proper authorities to the post (but then since I know a bank auditor I'd know right where to go).
Admin
That's just a straight shooter with upper managment written all over him.
Admin
All that quicksort talk reminds of an article about sorting algorithms (this was, years, years, years back, when ZX Spectrum was a "the norm"). Anyway, this article did some introduction on the sorting methods, and the even had some example code, that the author used to measure how good the each method is. His conclusion - Heap sort was quicker that Quicksort. And the culprit? Oh well, as we all know in theory, quicksort can use any element, as a pivot, and that is exactly what his code did. using Spectrum's notoriously "fast" and lousy Rnd function to get the pivot. Changing that to the usual middle element, changed made all the difference in the world.
Anyway, I'm sure all of you are just as amused by this story as I was (yeah right)...
Admin
And after obliterating all the copy paste errors
Admin
This conclusion is true for large arrays, and the worst case for the heap sort is better: O(n*ln(n)) vs. O(n²). On the other hand, heap sort is harder to understand.
Admin
Big Oh notation is an asymptotic upper bound for a given function. It is perfectly reasonable to discuss average running time as O(nlogn) if this gives you greater insight into the algorithm.
Admin
Correct. There are also modifications to quicksort that make Quicksort true O(n log n) worst case (by finding the true median value, which is takes O(n). However this is usually an "expensive" O(n) so in practice it's better to just pick a random pivot and be done with it.) Lastly we should probably be using "Theta" notation instead (Big O with a line in it, which is the asymptotically tight bound, as opposed to Big O which is an asymptotic upper bound...)
Admin
Interesting how the discussion turned to sorting algorithms... I thought the WTF in the case was the hack that solved the problem in the wrong place, compounded by a terrible I/O bound design (possibly because the previous culpr^H^H^H^H^Hprogrammer had never heard of ReDim), and a policy that prevented a more correct fix from being approved. The fact that I coded a quicksort as part of the "improved" solution was just a side note.
Admin
Welcome to The Daily WTF [:)]
Where the real WTF is that no one stays on topic [;)]
Admin
Believe me, man, your fix certainly isn't the WTF. Sorting algorithm debates are just one of one of those things people fall into.
Admin
Unless, like myself, you're stuck on a platform with a short call stack, in which case Shell is the only way to fly (bubble and simple insertion being the only alternatives, really, since recursion is a no-no). All we can argue about is the interval set....
Admin
Not to mention that you stated that your were a junior developer at the time, and that you chose an algorithm that you could reliably implement. A good algorithm implemented in a bug free manner trumps an optimal algorithm whose implementer did not understand some subtle nuance.
And finally, if you have fewer than 1000 elements, you probably could have done a bubble sort and still been plenty fast.
Admin
Yep, I don't see how something that is IO intensive can be significantly slowed down by your sorting algorithm of choice. Plus, I only fully understand bubble sort.
Admin
No. Big O notation is about eliminating implementation-specific constant factors. What cases you look at is a completely different matter. "Worst case" is the most common, but you can do "average case" just as fine. The problem is that defining "average" in a useful way is often not possible. On the other hand "worst case" often gives a wrong picture as well when talking about data structures, where if the worst case does occur, it cannot occur again for a long time.
The most useful measure is actually neither "worst case" nor "average case" but a combination of the two, where you take the average over a worst-case series of operations. This is called "amortized complexity".
This can't be used for quicksort, but still, for all practical purposes it's O(n*log(n)) if you use a random pivot element, because the input data cannot systematically cause worst case behaviour. It can only happpen by chance, and the change that it does happen decreases with the size of the input data.
Admin
Heap sort would do the trick too, unless I'm missing something. Actually, on http://www.azillionmonkeys.com/qed/sort.html, Paul Hsieh finds it to be somewhat faster than quicksort on medium-sized amounts of data on x86, through a clever implementation. His implementation of heap sort is at http://www.azillionmonkeys.com/qed/sorttest.c, about half way down. It's only iterative, not recursive, and has nice boring O(n log n) behaviour whatever input you give it.
Admin
This just occurred to me. So many people here are using big-O to argue about which sorting algorithm would be fastest. Big-O tells us how an algorithm scales, not how fast an algorithm is. Of course, there is some set size that the better algortihm (as measured by Big-O) will be faster than the worse algorithm, but that could be ten elements or ten-bllion elements. Big-O does not tell us that.
Basicly everyone is arguing about the top speed of a car, but using the argument "This car accelerates faster 0-60." The two measures are not necessarily related.
For small data-sets an algorithm with a worse Big-O measure can be faster than and algorithm with a better Big-O measure. Big-O is only a measure of relative speed as n gets very large. I think we were told that n was under 1000. Please confine future performance arguments to relative speed of algorithms while n is between 1 and 1000.
Admin
Yes, Heap would be usable. In real-world data sets I am almost always dealing with partially-sorted data (largish contiguous portions of the data have been processed before -- a nightmare for Qsort), and Shell is faster in that case. (New, never-sorted data sets are small enough that bubble wouldn't add significantly to the overhead of fetching the data.) Horses for courses -- I knew there was a reason we went Shell.
Admin
PETER
So when the subroutine compounds the interest, right, it uses all these extra decimals places that just get rounded off. So we just simplify the whole thing and we just round it down and drop the remainder into an account that we own.<p>
JOANNA
So you're stealing.<p>
PETER
Ah, no. No. You don't understand. It's, uh, very complicated. It's, uh, it's, it's aggregate so I'm talking about fractions of a cent that, uh, over time, they add up to a lot.<p>
...<p>
MICHAEL
Ok! Ok! I must have, I must have put a decimal point in the wrong place or something. Shit. I always do that. I always mess up some mundane detail.
Admin
In such cases, quicksort is O(1). Heapsort is O(1). Mergesort is O(1). WTFSort still has unbounded runtime.