• Kevin (unregistered)

    Mmm nothing like optimized, efficient code :-)

  • (cs) in reply to Kevin

    it is amazing what you find while looking at legacy code

  • root (unregistered) in reply to BlackTigerX

    That hack seems incredibly illegal.  I guess the guy didn't want to get fired, but he should have said something about the missing cents.  Maybe the company wouldn't have gone down so fast ...

  • (cs)

    Oh well, at least the story has a happy ending... oh wait ...



  • Jonathan Ellis (unregistered)

    Qsort runs in n^2 time itself on already-sorted data...  The author would have had his new code be practically instantaneous if he'd used something more appropriate. :)

  • (cs) in reply to Jonathan Ellis
    Anonymous:
    Qsort runs in n^2 time itself on already-sorted data...  The author would have had his new code be practically instantaneous if he'd used something more appropriate. :)


    Depends on your implementation of qsort.  If you choose the first element as the first pivot, then you are correct.  If you choose a random element as the first pivot, you end up doing a lot better with already sorted elements.
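
    In rough Python (nobody's actual code here, just a sketch of how the pivot rule decides which inputs hit the quadratic case):

        import random

        def quicksort(a, pick_pivot):
            """Out-of-place quicksort; pick_pivot(a) returns the pivot value."""
            if len(a) < 2:
                return list(a)
            p = pick_pivot(a)
            less  = [x for x in a if x < p]
            equal = [x for x in a if x == p]
            more  = [x for x in a if x > p]
            return quicksort(less, pick_pivot) + equal + quicksort(more, pick_pivot)

        first_pivot  = lambda a: a[0]               # already-sorted input degrades to O(n^2)
        middle_pivot = lambda a: a[len(a) // 2]     # handles (partially) sorted input well
        random_pivot = lambda a: random.choice(a)   # hits the bad case only by (unlikely) chance

        print(quicksort([5, 1, 4, 2, 3], middle_pivot))  # -> [1, 2, 3, 4, 5]
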
  • (cs) in reply to Jonathan Ellis
    Anonymous:
    The author would have had his new code be practically instantaneous if he'd used something more appropriate. :)


    It's interesting how "under half a second on a 486" is not considered practically instantaneous...especially when dealing with file I/O operations.

    -ds
  • (cs) in reply to Ytram

    The qsort implementation I coded in VB was based on code from an old Pascal book I read in high school. This book recommended selecting the middle element from the array of elements to sort as the pivot, specifically to try to take advantage of partially sorted data. I used qsort because that was the only fast algorithm explained in the book - I didn't know about mergesort until later.

  • (cs) in reply to root

    I did raise the issue about the whole purpose of the hack, that the proper place to fix it was in the mainframe, not in VB. But my bosses mentioned something about not messing with programs without proper approval - I was only authorized to fix the VB hack, not the COBOL program. I can only think the guy who wrote the original VB hack raised the exact same issue, and was met with the exact same objection.

  • (cs) in reply to a_villacis

    If I were the pope, I'd promote Alex Villacis Lasso to sainthood.

    Fregas

  • joykiller (unregistered)

    I once worked as a junior programmer in a once important bank in Ecuador, South America (which bank, I won't say, but its name starts with an "F").

    but I will :

    Filanbanco S.A.

  • (cs)

    I recently saw a posting on a Tech-Help website, and by the question asked (and the un-edited code examples) I could tell it was a guy that worked at my bank.

    This scares the crap out of me since the code gave away account numbers and the question was asked by an obvious moron. 
    I think it's time to get another bank.

  • (cs) in reply to joykiller
    Anonymous:

    I once worked as a junior programmer in a once important bank in Ecuador, South America (which bank, I won't say, but its name starts with an "F").

    but I will :

    Filanbanco S.A.



    I looked this up as well, and came to the same conclusion.  It's hard to see clues and not attempt (right or wrong) to solve the puzzle.

    .jc
  • (cs)

    Just curious - did the hack written by you give (correct) numbers in its output, or was it still generating Excel formulas? :)

  • (cs) in reply to Ytram
    Ytram:
    Anonymous:
    Qsort runs in n^2 time itself on already-sorted data...  The author would have had his new code be practically instantaneous if he'd used something more appropriate. :)


    Depends on your implementation of qsort.  If you choose the first element as the first pivot, then you are correct.  If you choose a random element as the first pivot, you end up doing a lot better with already sorted elements.


    qsort is O(n^2) in the worst case, O(n log n) on average.

    http://en.wikipedia.org/wiki/Quick_sort
  • (cs)

    Gotta love low-bit-count mantissas and the cumulative errors that ensue.  I think we all would have been much better off if BCD had been the preferred standard with ultra-optimized FPUs, preferably on the CPU from the first day they moved the ALU onto the CPU.

    And someone invent a time machine and shoot whoever came up with banker's rounding. 
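
    A rough Python illustration of the cumulative-error point (Decimal is not literally BCD, but it is the same exact-decimal-digits idea):

        from decimal import Decimal

        # add one cent a million times: binary floating point drifts, exact decimal doesn't
        float_total   = sum(0.01 for _ in range(1_000_000))
        decimal_total = sum(Decimal("0.01") for _ in range(1_000_000))

        print(float_total)    # something like 10000.000000171856 -- off by a hair
        print(decimal_total)  # 10000.00 exactly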

  • (cs) in reply to ZeoS
    ZeoS:
    Ytram:
    Anonymous:
    Qsort runs in n^2 time itself on already-sorted data...  The author would have had his new code be practically instantaneous if he'd used something more appropriate. :)


    Depends on your implementation of qsort.  If you choose the first element as the first pivot, then you are correct.  If you choose a random element as the first pivot, you end up doing a lot better with already sorted elements.


    qsort is O(n^2) in the worst case, O(n log n) on average.

    http://en.wikipedia.org/wiki/Quick_sort


    *sigh* Well in the tradition of TDWTF, you didn't read what you quoted.  I never denied that.
  • (cs) in reply to ZeoS
    ZeoS:
    qsort is O(n^2) in the worst case, O(n log n) on average.

    http://en.wikipedia.org/wiki/Quick_sort


    Big O notation is worst-case.

    Sincerely,

    Gene Wirchenko

  • (cs) in reply to elfz
    elfz:
    Just curious - did the hack written by you give (correct) numbers in its output, or was it still generating Excel formulas? :)


    It still generated the Excel formulas, by explicit request. Recalculating the values would have been straightforward, but the program was supposed to be a drop-in replacement for the previous one, and the users *expected* to load the output in Excel and have it recalculate the values itself. In other words, I was forced to preserve the outer behavior of the program (Excel formulas and all), except for the new output file format.
  • (cs) in reply to Ytram
    Ytram:
    ZeoS:
    Ytram:
    Anonymous:
    Qsort runs in n^2 time itself on already-sorted data...  The author would have had his new code be practically instantaneous if he'd used something more appropriate. :)


    Depends on your implementation of qsort.  If you choose the first element as the first pivot, then you are correct.  If you choose a random element as the first pivot, you end up doing a lot better with already sorted elements.


    qsort is O(n^2) in the worst case, O(n log n) on average.

    http://en.wikipedia.org/wiki/Quick_sort


    *sigh* Well in the tradition of TDWTF, you didn't read what you quoted.  I never denied that.

    But it's not a matter of implementation: given any implementation, one can construct an input that makes your qsort O(n^2); the O(n log n) comes from empirical tests. qsort is always O(n^2) in the worst case, regardless of the implementation.
  • (cs) in reply to Xepol
    Xepol:
    Gotta love low-bit-count mantissas and the cumulative errors that ensue.  I think we all would have been much better off if BCD had been the preferred standard with ultra-optimized FPUs, preferably on the CPU from the first day they moved the ALU onto the CPU.


    I hope you are not suggesting that floating-point quantities should compare exactly.  That would be Bad.  I, too, would like to see more BCD.

    And someone invent a time machine and shoot whoever came up with banker's rounding.


    Banker's rounding is used to reduce bias in rounding.  Otherwise, there is a bias toward rounding up, since a trailing 5 always rounds up.  By rounding ties to the even digit instead (so roughly half go up and half go down), the bias is reduced.
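
    A quick sketch of the effect in Python (illustration only; Decimal keeps the tie values exact):

        from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

        cent = Decimal("0.01")
        values = [Decimal(n) / 1000 for n in range(5, 10000, 10)]   # 0.005, 0.015, ..., 9.995
        true_total = sum(values)
        half_up    = sum(v.quantize(cent, rounding=ROUND_HALF_UP)   for v in values)
        bankers    = sum(v.quantize(cent, rounding=ROUND_HALF_EVEN) for v in values)

        print(true_total, half_up, bankers)
        # half-up overshoots the true total by half a cent per value;
        # banker's rounding lands on it, because the ties split evenly up and down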

    Sincerely,

    Gene Wirchenko

  • (cs) in reply to Gene Wirchenko
    Gene Wirchenko:
    ZeoS:
    qsort is O(n^2) in the worst case, O(n log n) on average.

    http://en.wikipedia.org/wiki/Quick_sort

    Big O notation is worst-case.
    Sincerely,
    Gene Wirchenko


    You're right :D I forgot that :#
  • (cs) in reply to Jens

    Jens:
    Oh well, at least the story has a happy ending... oh wait ...



    Always remember: "It isn't bad news unless it happens to you"

     

  • (cs) in reply to ZeoS
    ZeoS:
    But it's not a matter of implementation: given any implementation, one can construct an input that makes your qsort O(n^2); the O(n log n) comes from empirical tests. qsort is always O(n^2) in the worst case, regardless of the implementation.


    I was hoping you would still go back and read what I was replying to originally.  I was referring to what causes the worst case.  If you choose the first element as the pivot, the worst case is realized in sorted data.  If you have sorted data and you choose the middle element as the pivot, it will not perform in O(n^2), but rather closer to its average, O(nlogn).

    I never meant that a specific implementation would do away with the worst case, I just mentioned that what causes the worst case is implementation specific.
  • (cs) in reply to ZeoS
    ZeoS:
    Ytram:
    ZeoS:
    Ytram:
    Anonymous:
    Qsort runs in n^2 time itself on already-sorted data...  The author would have had his new code be practically instantaneous if he'd used something more appropriate. :)


    Depends on your implementation of qsort.  If you choose the first element as the first pivot, then you are correct.  If you choose a random element as the first pivot, you end up doing a lot better with already sorted elements.


    qsort is O(n^2) in the worst case, O(n log n) on average.

    http://en.wikipedia.org/wiki/Quick_sort


    *sigh* Well in the tradition of TDWTF, you didn't read what you quoted.  I never denied that.

    But it's not a matter of implementation: given any implementation, one can construct an input that makes your qsort O(n^2); the O(n log n) comes from empirical tests. qsort is always O(n^2) in the worst case, regardless of the implementation.


    Given that the input is 'mostly' sorted, if the qsort key direction was the opposite of the input direction (thus creating an almost completely unsorted list), then he could have achieved n Log n almost every time.


  • (cs) in reply to Anonymoose

    Whoops, I was thinking of heapsort.

  • (cs) in reply to joykiller
    Anonymous:

    I once worked as a junior programmer in a once important bank in Ecuador, South America (which bank, I won't say, but its name starts with an "F").

    but I will :

    Filanbanco S.A.



    Better known as WTFilanbanco.
  • (cs)

    Someone must have killed all of the bank examiners and auditors. Idiots!

  • (cs) in reply to rogthefrog

    People often say "were they being paid by the line."  Based on your boss's response of "that can't be right, you must be missing something because it's so fast," they might have been getting paid by the time it'd take to run.

  • (cs) in reply to Sweets
    Sweets:
    I recently saw a posting on a Tech-Help website, and by the question asked (and the un-edited code examples) I could tell it was a guy that worked at my bank.

    This scares the crap out of me since the code gave away account numbers and the question was asked by an obvious moron. 
    I think it's time to get another bank.


    My, ummm, associate attended a security course, in which SQL injection was covered.  Guess what, he tried it on his own bank, and voila!  He promptly got arrested! Ha, no, he brought it to the attention of the bank, but I did not hear how it turned out.

    On another note,  perhaps the boss didn't want the mainframe code to be fixed, 'cuz it was funnelling money into his account!
  • (cs) in reply to Anonymoose

    If I remember correctly, insertion sort goes to O(n) (and/or Omega(n), if you're feeling picky today) for nearly sorted data, which is probably better than trying qsort and forcing the average case (Omega (n lg n)). Aside from that, it's probably easier to code.
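
    Something like this, in rough Python (just an illustration, clearly not the VB in question):

        def insertion_sort(a):
            """In-place insertion sort; close to O(n) when the input is already nearly sorted."""
            for i in range(1, len(a)):
                key = a[i]
                j = i - 1
                # shift larger elements right; on nearly sorted input this loop barely runs
                while j >= 0 and a[j] > key:
                    a[j + 1] = a[j]
                    j -= 1
                a[j + 1] = key
            return a

        print(insertion_sort([1, 2, 4, 3, 5, 6]))  # -> [1, 2, 3, 4, 5, 6]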

    I like easy to code B

    Cheers.


  • (cs) in reply to Sweets
    Sweets:
    I recently saw a posting on a Tech-Help website, and by the question asked (and the un-edited code examples) I could tell it was a guy that worked at my bank.

    This scares the crap out of me since the code gave away account numbers and the question was asked by an obvious moron. 
    I think it's time to get another bank.


    That'd scare the crap out of me too.  I'd probably look into reporting the bank or directing proper authorities to the post (but then since I know a bank auditor I'd know right where to go).
  • dwayner (unregistered) in reply to Anonymoose
    Anonymoose:


    On another note,  perhaps the boss didn't want the mainframe code to be fixed, 'cuz it was funnelling money into his account!


    That's just a straight shooter with upper management written all over him.
  • mike5 (unregistered) in reply to Pete

    All that quicksort talk reminds of an article about sorting algorithms (this was, years, years, years back, when ZX Spectrum was a "the norm"). Anyway, this article did some introduction on the sorting methods, and the even had some example code, that the author used to measure how good the each method is. His conclusion - Heap sort was quicker that Quicksort. And the culprit? Oh well, as we all know in theory, quicksort can use any element, as a pivot, and that is exactly what his code did. using Spectrum's notoriously "fast" and lousy Rnd function to get the pivot. Changing that to the usual middle element, changed made all the difference in the world.


    Anyway, I'm sure all of you are just as amused by this story as I was (yeah right)...

  • mike5 (unregistered) in reply to mike5
    Anonymous:
    All that quicksort talk reminds me of an article about sorting algorithms (this was, years, years, years back, when ZX Spectrum was "the norm"). Anyway, this article did some introduction on the sorting methods, and even had some example code that the author used to measure how good each method is. His conclusion - HeapSort was quicker than QuickSort. And the culprit? Oh well, as we all know, in theory quicksort can use any element as a pivot, and that is exactly what his code did. Using Spectrum's notoriously "fast" and lousy Rnd function to get the pivot. Changing that to the usual middle element made all the difference in the world.

    Anyway, I'm sure all of you are just as amused by this story as I was (yeah right)...



    And after obliterating all the copy paste errors

  • (cs) in reply to mike5
    Anonymous:
    His conclusion - Heap sort was quicker that Quicksort.


    This conclusion is true for large arrays, and the worst case for the heap sort is better: O(n*ln(n)) vs. O(n²). On the other hand, heap sort is harder to understand.
  • Wes (unregistered) in reply to Gene Wirchenko

    Big Oh notation is an asymptotic upper bound for a given function.  It is perfectly reasonable to discuss average running time as O(nlogn) if this gives you greater insight into the algorithm.

  • ByteJuggler (unregistered) in reply to Wes
    Anonymous:
    Big Oh notation is an asymptotic upper bound for a given function.  It is perfectly reasonable to discuss average running time as O(nlogn) if this gives you greater insight into the algorithm.


    Correct.  There are also modifications to quicksort that make Quicksort true O(n log n) worst case (by finding the true median value, which takes O(n); however, this is usually an "expensive" O(n), so in practice it's better to just pick a random pivot and be done with it).  Lastly, we should probably be using "Theta" notation instead (Big O with a line through it, which is the asymptotically tight bound, as opposed to Big O, which is an asymptotic upper bound...)
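
    For the curious, a rough Python sketch of that worst-case O(n) selection (the classic median-of-medians; correct, but with the large constant factor mentioned above):

        def select(a, k):
            """Return the k-th smallest element (0-based) in worst-case O(n) time."""
            if len(a) <= 5:
                return sorted(a)[k]
            # median of each group of five, then recurse to find the median of those medians
            medians = [sorted(a[i:i + 5])[len(a[i:i + 5]) // 2] for i in range(0, len(a), 5)]
            pivot = select(medians, len(medians) // 2)
            lows   = [x for x in a if x < pivot]
            pivots = [x for x in a if x == pivot]
            highs  = [x for x in a if x > pivot]
            if k < len(lows):
                return select(lows, k)
            if k < len(lows) + len(pivots):
                return pivot
            return select(highs, k - len(lows) - len(pivots))

        print(select([9, 1, 8, 2, 7, 3, 6, 4, 5], 4))  # -> 5, the true median
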
  • (cs) in reply to ByteJuggler

    Interesting how the discussion turned to sorting algorithms... I thought the WTF in the case was the hack that solved the problem in the wrong place, compounded by a terrible I/O bound design (possibly because the previous culpr^H^H^H^H^Hprogrammer had never heard of ReDim), and a policy that prevented a more correct fix from being approved. The fact that I coded a quicksort as part of the "improved" solution was just a side note.

  • (cs) in reply to a_villacis
    a_villacis:
    Interesting how the discussion turned to sorting algorithms... I thought the WTF in the case was the hack that solved the problem in the wrong place, compounded by a terrible I/O bound design (possibly because the previous culpr^H^H^H^H^Hprogrammer had never heard of ReDim), and a policy that prevented a more correct fix from being approved. The fact that I coded a quicksort as part of the "improved" solution was just a side note.


    Welcome to The Daily WTF [:)]

    Where the real WTF is that no one stays on topic [;)]
  • Eric the .5b (unregistered) in reply to a_villacis

    Believe me, man, your fix certainly isn't the WTF. Sorting algorithm debates are just one of those things people fall into.

  • (cs) in reply to Eric the .5b
    Anonymous:
    Believe me, man, your fix certainly isn't the WTF. Sorting algorithm debates are just one of those things people fall into.


    Unless, like myself, you're stuck on a platform with a short call stack, in which case Shell is the only way to fly (bubble and simple insertion being the only alternatives, really, since recursion is a no-no). All we can argue about is the interval set....
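
    Roughly this shape, in Python for illustration (not the actual platform or code; the gap tuple is exactly the interval set that's up for argument -- Ciura-style values here):

        def shell_sort(a, gaps=(701, 301, 132, 57, 23, 10, 4, 1)):
            """In-place Shell sort -- purely iterative, so no call stack to blow."""
            for gap in gaps:
                # a gapped insertion sort for each interval in the sequence
                for i in range(gap, len(a)):
                    key = a[i]
                    j = i
                    while j >= gap and a[j - gap] > key:
                        a[j] = a[j - gap]
                        j -= gap
                    a[j] = key
            return a

        print(shell_sort([5, 3, 8, 1, 9, 2, 7]))  # -> [1, 2, 3, 5, 7, 8, 9]
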
  • (cs) in reply to a_villacis
    a_villacis:
    Interesting how the discussion turned to sorting algorithms... I thought the WTF in the case was the hack that solved the problem in the wrong place, compounded by a terrible I/O bound design (possibly because the previous culpr^H^H^H^H^Hprogrammer had never heard of ReDim), and a policy that prevented a more correct fix from being approved. The fact that I coded a quicksort as part of the "improved" solution was just a side note.


    Not to mention that you stated that you were a junior developer at the time, and that you chose an algorithm that you could reliably implement.  A good algorithm implemented in a bug-free manner trumps an optimal algorithm whose implementer did not understand some subtle nuance.

    And finally, if you have fewer than 1000 elements, you probably could have done a bubble sort and still been plenty fast.
  • (cs) in reply to RevMike
    RevMike:
    a_villacis:
    Interesting how the discussion turned to sorting algorithms... I thought the WTF in the case was the hack that solved the problem in the wrong place, compounded by a terrible I/O bound design (possibly because the previous culpr^H^H^H^H^Hprogrammer had never heard of ReDim), and a policy that prevented a more correct fix from being approved. The fact that I coded a quicksort as part of the "improved" solution was just a side note.


    Not to mention that you stated that you were a junior developer at the time, and that you chose an algorithm that you could reliably implement.  A good algorithm implemented in a bug-free manner trumps an optimal algorithm whose implementer did not understand some subtle nuance.

    And finally, if you have fewer than 1000 elements, you probably could have done a bubble sort and still been plenty fast.

    Yep, I don't see how something that is IO intensive can be significantly slowed down by your sorting algorithm of choice. Plus, I only fully understand bubble sort.
  • (cs) in reply to Gene Wirchenko
    Gene Wirchenko:
    ZeoS:
    qsort is O(n^2) in the worst case, O(n log n) on average.

    http://en.wikipedia.org/wiki/Quick_sort


    Big O notation is worst-case.


    No. Big O notation is about eliminating implementation-specific constant factors. What cases you look at is a completely different matter. "Worst case" is the most common, but you can do "average case" just as well. The problem is that defining "average" in a useful way is often not possible. On the other hand, "worst case" often gives a wrong picture as well when talking about data structures, where if the worst case does occur, it cannot occur again for a long time.

    The most useful measure is actually neither "worst case" nor "average case" but a combination of the two, where you take the average over a worst-case series of operations. This is called "amortized complexity".

    This can't be used for quicksort, but still, for all practical purposes it's O(n*log(n)) if you use a random pivot element, because the input data cannot systematically cause worst-case behaviour. It can only happen by chance, and the chance that it does happen decreases with the size of the input data.
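
    A toy illustration of the amortized idea, using a doubling array rather than a sort (the class below is hypothetical, purely for illustration):

        class GrowArray:
            """Append-only array that doubles its capacity when full."""
            def __init__(self):
                self.cap, self.n, self.buf = 1, 0, [None]

            def append(self, x):
                copied = 0
                if self.n == self.cap:                     # the occasional expensive append
                    self.cap *= 2
                    self.buf = self.buf + [None] * (self.cap - self.n)
                    copied = self.n                        # models copying n elements over
                self.buf[self.n] = x
                self.n += 1
                return copied

        a = GrowArray()
        total_copies = sum(a.append(i) for i in range(1000))
        print(total_copies / 1000)   # about 1 copy per append on average (amortized O(1)),
                                     # even though a single append occasionally copied 512 elements
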
  • Nick (unregistered) in reply to Stan Rogers
    Stan Rogers:
    Anonymous:
    Believe me, man, your fix certainly isn't the WTF. Sorting algorithm debates are just one of those things people fall into.


    Unless, like myself, you're stuck on a platform with a short call stack, in which case Shell is the only way to fly (bubble and simple insertion being the only alternatives, really, since recursion is a no-no). All we can argue about is the interval set....


    Heap sort would do the trick too, unless I'm missing something. Actually, on http://www.azillionmonkeys.com/qed/sort.html, Paul Hsieh finds it to be somewhat faster than quicksort on medium-sized amounts of data on x86, through a clever implementation. His implementation of heap sort is at http://www.azillionmonkeys.com/qed/sorttest.c, about half way down. It's only iterative, not recursive, and has nice boring O(n log n) behaviour whatever input you give it.
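
    Not Hsieh's C, just a rough Python sketch of the shape of an iterative heap sort:

        def heap_sort(a):
            """In-place heap sort: iterative sift-down, O(n log n) whatever the input."""
            def sift_down(root, end):
                while 2 * root + 1 <= end:
                    child = 2 * root + 1
                    if child + 1 <= end and a[child] < a[child + 1]:
                        child += 1                          # take the larger child
                    if a[root] < a[child]:
                        a[root], a[child] = a[child], a[root]
                        root = child
                    else:
                        return

            n = len(a)
            for start in range(n // 2 - 1, -1, -1):         # build a max-heap
                sift_down(start, n - 1)
            for end in range(n - 1, 0, -1):                 # move the max to the back, repeat
                a[0], a[end] = a[end], a[0]
                sift_down(0, end - 1)
            return a

        print(heap_sort([4, 10, 3, 5, 1]))  # -> [1, 3, 4, 5, 10]
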
  • (cs) in reply to brazzy
    brazzy:
    Gene Wirchenko:
    ZeoS:
    qsort is O(n^2) in the worst case, O(n log n) on average.

    http://en.wikipedia.org/wiki/Quick_sort


    Big O notation is worst-case.


    No. Big O notation is about eliminating implementation-specific constant factors. What cases you look at is a completely different matter. "Worst case" is the most common, but you can do "average case" just as well. The problem is that defining "average" in a useful way is often not possible. On the other hand, "worst case" often gives a wrong picture as well when talking about data structures, where if the worst case does occur, it cannot occur again for a long time.

    The most useful measure is actually neither "worst case" nor "average case" but a combination of the two, where you take the average over a worst-case series of operations. This is called "amortized complexity".

    This can't be used for quicksort, but still, for all practical purposes it's O(n*log(n)) if you use a random pivot element, because the input data cannot systematically cause worst-case behaviour. It can only happen by chance, and the chance that it does happen decreases with the size of the input data.


    This just occurred to me.  So many people here are using big-O to argue about which sorting algorithm would be fastest.  Big-O tells us how an algorithm scales, not how fast an algorithm is.  Of course, there is some set size at which the better algorithm (as measured by Big-O) will be faster than the worse algorithm, but that could be ten elements or ten billion elements.  Big-O does not tell us that.

    Basically, everyone is arguing about the top speed of a car, but using the argument "This car accelerates faster from 0 to 60."  The two measures are not necessarily related.

    For small data sets, an algorithm with a worse Big-O measure can be faster than an algorithm with a better Big-O measure.  Big-O is only a measure of relative speed as n gets very large.  I think we were told that n was under 1000.  Please confine future performance arguments to the relative speed of algorithms while n is between 1 and 1000.
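
    If anyone wants numbers instead of arguments, something like this rough sketch settles it for a given n (the crossover is machine- and implementation-specific, and the built-in sort's C-coded constant factor is exactly the kind of thing Big-O ignores):

        import random, timeit

        def bubble_sort(a):
            a = list(a)
            for i in range(len(a) - 1, 0, -1):
                for j in range(i):
                    if a[j] > a[j + 1]:
                        a[j], a[j + 1] = a[j + 1], a[j]
            return a

        data = [random.random() for _ in range(1000)]
        print(timeit.timeit(lambda: bubble_sort(data), number=10))   # O(n^2), pure Python
        print(timeit.timeit(lambda: sorted(data), number=10))        # O(n log n), C under the hood
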
  • (cs) in reply to Nick
    Anonymous:
    Stan Rogers:
    Anonymous:
    Believe me, man, your fix certainly isn't the WTF. Sorting algorithm debates are just one of those things people fall into.


    Unless, like myself, you're stuck on a platform with a short call stack, in which case Shell is the only way to fly (bubble and simple insertion being the only alternatives, really, since recursion is a no-no). All we can argue about is the interval set....


    Heap sort would do the trick too, unless I'm missing something. Actually, on http://www.azillionmonkeys.com/qed/sort.html, Paul Hsieh finds it to be somewhat faster than quicksort on medium-sized amounts of data on x86, through a clever implementation. His implementation of heap sort is at http://www.azillionmonkeys.com/qed/sorttest.c, about half way down. It's only iterative, not recursive, and has nice boring O(n log n) behaviour whatever input you give it.


    Yes, Heap would be usable. In real-world data sets I am almost always dealing with partially-sorted data (largish contiguous portions of the data have been processed before -- a nightmare for Qsort), and Shell is faster in that case. (New, never-sorted data sets are small enough that bubble wouldn't add significantly to the overhead of fetching the data.) Horses for courses -- I knew there was a reason we went Shell.
  • Drew Robinson (unregistered)

    PETER
    So when the subroutine compounds the interest, right, it uses all these extra decimal places that just get rounded off. So we just simplify the whole thing and we just round it down and drop the remainder into an account that we own.

    JOANNA
    So you're stealing.

    PETER
    Ah, no. No. You don't understand. It's, uh, very complicated. It's, uh, it's, it's aggregate so I'm talking about fractions of a cent that, uh, over time, they add up to a lot.

    ...

    MICHAEL
    Ok! Ok! I must have, I must have put a decimal point in the wrong place or something. Shit. I always do that. I always mess up some mundane detail.

  • javascript jan (unregistered) in reply to RevMike
    RevMike:

    For small data sets, an algorithm with a worse Big-O measure can be faster than an algorithm with a better Big-O measure.  Big-O is only a measure of relative speed as n gets very large.  I think we were told that n was under 1000.  Please confine future performance arguments to the relative speed of algorithms while n is between 1 and 1000.


    In such cases, quicksort is O(1). Heapsort is O(1). Mergesort is O(1). WTFSort still has unbounded runtime.
