• (disco)

    TIL Java's Vector isn't formally deprecated... Still, TR :WTF: is them not changing it to ArrayList since (as far as I can see) there's no need for synchronization on every single operation. I think that would be a good boost in performance, right?

  • (disco)

    With the recent influxes of sort-every-time articles, I wonder if there's an sortedAdd() function to speed things along? Granted, it would be all :wtf: as anything (i.e. probably assumes the array is already sorted), but it should work just fine, assuming that's the only way you start adding things to the list, right?


    Filed under: Obfuscating code by hand

  • (disco)

    This is pretty special. :wtf:

  • (disco) in reply to Tsaukpaetra
    Tsaukpaetra:
    the recent influxes of sort-every-time articles

    I knew I'd seen something like this recently.

  • (disco) in reply to Tsaukpaetra

    I think there's a SortedSet interface?

  • (disco) in reply to PleegWat

    Yep, IIRC TreeSet implements it.

  • (disco)

    http://thedailywtf.com/articles/sorting-cabinets

  • (disco)

    This, children, is why you do code reviews.

  • (disco) in reply to Vault_Dweller

    huh.... i thought that story looked familliar....

  • (disco) in reply to janvdbergh
    janvdbergh:
    This, children, is why you do code reviews.
    Vault_Dweller:
    http://thedailywtf.com/articles/sorting-cabinets

    Also, article reviews, apparently :trollface:

  • (disco)
    Vault_Dweller:
    http://thedailywtf.com/articles/sorting-cabinets

    Oh wow... um.... Sorry, I totally didn't notice when this was farmed out to me. I've informed my editor.

  • (disco) in reply to Yamikuronue

    Maybe we should build an article management app for him/her/it :P

  • (disco)

    Apart from the less-than-efficient use of types, this just seems like a simple case of a misplaced sort. Easy to do, it's a simple brainfart, but difficult to track down unless you're looking for it, because the code actually works. If the team is particularly busy and the workload is high, it's a fairly common-or-garden mistake. All in all, looks like a fairly successful deployment, where the team is alert for problems, is evaluating the performance, and have caught and fixed a performance issue in very short order. Job done, good work team.

  • (disco) in reply to Yamikuronue
    Yamikuronue:
    I've informed my editor.

    Nice umbrella! Is it home made?

  • (disco) in reply to Luhmann

    Yes, thank you, I stitched on the sequins myself before I coated it with shit-resistance spray :)

  • (disco) in reply to Yamikuronue

    That sounds like a glory market hole!

  • (disco) in reply to Yamikuronue

    I guess you filed the tip off as well so it doesn't get accidentally stuck in the fan?

  • (disco) in reply to LB_
    LB_:
    I think that would be a good boost in performance, right?
    Depends on how large that list gets. LinkedList might be preferable to avoid array resizing shenanigans... Still doesn't sort the unecessary sort.
    LB_:
    Yep, IIRC TreeSet implements it.
    I was having a problem with multi million sized datasets. It was faster to use a hashset and then convert to a sorted structure in one go later on.
  • (disco) in reply to Tsaukpaetra
    Tsaukpaetra:
    With the recent influxes of sort-every-time articles, I wonder if there's an `sortedAdd()` function to speed things along?
    I can't remember the exact code, but there are bugs in Bugzilla regarding the speed at which Thunderbird updates some sorted array somewhere. Even though the list starts off sorted, it's still quicker to append all the new elements and then re-sort. That is, always assuming your sort algorithm hasn't been "optimised" to use bubble sort on an almost sorted array...
  • (disco)

    Isn't the real WTF that nobody looked at the "clever" intern's code before it got deployed to production?

  • (disco) in reply to Tsaukpaetra

    I think pretty much every first-year programming class on Java or C# is making a List/Vector/whatever class that stays sorted as shit's inserted.

  • (disco)

    Hmmm...should have gone with the bubble sort.

  • (disco) in reply to PleegWat

    But SortedSet has a different semantic from a sorted List (in particular with regards to preserving duplicate entries)...

  • (disco) in reply to Hasteur
    Hasteur:
    Isn't the real WTF that nobody looked at the "clever" intern's code before it got deployed to production?

    It compiled. It ran. It worked[1]. What more do you need?

    [1] Okay, it worked on the test data, which was conveniently a very small set which was already sorted, but why would you want to test with anything more than that?

  • (disco) in reply to RandomStranger

    Then you want a sorted multiset. A sorted list doesn't make sense because a list is, by definition, sorted by index, not value.

  • (disco) in reply to Tsaukpaetra

    Let's say you have an array-backed list and you're adding a billion integers to it in reverse sorted order. You could:

    • Add all billion elements (O(n)) and then sort the list (O(n log n))
    • Add all billion elements (O(n)), sorting after each add (O(n2 log n))
    • Add all billion elements (O(n)), but do a binary search each time (O(n log n)) and insert the element where it would go in sorted order (O(n2))

    Now let's put it in numbers:

    • 30897352853
    • 29897352854986261130
    • 1000000030897352853

    So it looks like your plan is slightly more efficient than the one used in the beginning of the article, but many orders of magnitude less efficient than the one at the end.

  • (disco) in reply to ben_lubar

    Still useful if (any of):

    1. you expect to consult the container significantly more often than you add elements

    2. you use a sorted container that has an interface to add a whole container

    3. you choose not to use an array-backed container but something more like a b-tree.

  • (disco) in reply to Lawrence
    Lawrence:
    1) you expect to consult the container significantly more often than you add elements

    It's still O(n + log n) for inserts, whereas a TreeSet is O(log n).

    Lawrence:
    2) you use a sorted container that has an interface to add a whole container

    What exactly do you sort a set of sorted sets by?

    Lawrence:
    3) you choose not to use an array-backed container but something more like a b-tree.

    In which case it's not a list, unless you're making a really weird list implementation that uses binary trees. And you'd still need to update the index of everything after the index you inserted.

  • (disco) in reply to ben_lubar
    ben_lubar:
    What exactly do you sort a set of sorted sets by?

    Well, you can sort-of sort the set of sets by the sort of the sets.

    [/silly]

    Or by the cardinality, or the maximum element, or something like that. Whatever makes sense. (Java lets you specify a custom comparator.)

  • (disco) in reply to ben_lubar
    ben_lubar:
    - Add all billion elements (O(n)) and then sort the list (O(n log n)) - Add all billion elements (O(n)), sorting after each add (O(n2 log n)) - Add all billion elements (O(n)), but do a binary search each time (O(n log n)) and insert the element where it would go in sorted order (O(n2))

    That last one should be O(n2 log n), shouldn't it?

  • (disco) in reply to Dreikin

    You calculate the position using binary search, but you don't have to recalculate it once you know it. You just have to shift up to n elements one space to the right.

  • (disco) in reply to Dreikin
    Dreikin:
    That last one should be O(n2 log n), shouldn't it?
    uhh....

    doo an n logn and an n^2 operation n times.....

    i think that would be an n^3 operation wouldn't it?

    or am i misreading the algorithm spec?

  • (disco) in reply to ben_lubar
    ben_lubar:
    You calculate the position using binary search, but you don't have to recalculate it once you know it. You just have to shift up to n elements one space to the right.

    Right, so (n * log n * n = n2 log n), right? That is, you put them together as (add * search * insert)?

  • (disco) in reply to Dreikin

    You have to do (search+insert) add times.

  • (disco) in reply to ben_lubar
    ben_lubar:
    You have to do (search+insert) add times.

    Ah, okay.

  • (disco) in reply to Quite
    Quite:
    Apart from the less-than-efficient use of types, this just seems like a simple case of a misplaced sort. Easy to do, it's a simple brainfart, but difficult to track down unless you're looking for it, because the code actually works. If the team is particularly busy and the workload is high, it's a fairly common-or-garden mistake. All in all, looks like a fairly successful deployment, where the team is alert for problems, is evaluating the performance, and have caught and fixed a performance issue in very short order. Job done, good work team.

    http://thedailywtf.com/articles/sorting-cabinets

    "Even better- that call to sort happens before the object is added to the collection, which means the list returned will always be sorted except for the last element in the list. So not only is it inefficient, but it doesn’t actually work."

    So no... the code doesn't work. It seems to work, but in all honestly, it doesn't. (Like the aforementioned article submission process)

  • (disco) in reply to WernerCD
    WernerCD:
    (Like the aforementioned article submission process

    Oh, the submission worked fine. The article was well and truly submitted. Yup.

  • (disco) in reply to Onyx
    Onyx:
    Maybe we should build an article management app
    I'm sure Discourse can do it! <as well as it does everything else, at least>
    Quite:
    Job done, good work team.
    Also this.
  • (disco) in reply to ben_lubar
    ben_lubar:
    Add all billion elements (O(n)) and then sort the list (O(n log n)) Add all billion elements (O(n)), sorting after each add (O(n2 log n)) Add all billion elements (O(n)), but do a binary search each time (O(n log n)) and insert the element where it would go in sorted order (O(n2))

    The optimum is to do the first option, assuming sorting is inevitable. But I wondered if this is SQL based and, if so, why not have the sort engine return them ordered (think: index).

Leave a comment on “Sort, Then Add, Forever”

Log In or post as a guest

Replying to comment #458020:

« Return to Article