Admin
I love the use of qsort while completely ignoring the fact that the memory barrier is one of the main reasons why the performance of low-level functions degrades. I rate this two small rubber ducks out of ten.
Addendum 2023-04-18 07:02: I know this is C-style code with no shuffle function ready at hand, but modern languages do provide one by default (std::shuffle in C++11, Random.Shuffle() in C#). However, as always with random things, one has to be very, very careful with pseudo-RNGs or simple algorithms like Fisher-Yates: they are not suited for real randomness, since both have predictable distribution features, so doing a real shuffle should be left to the people who implement proper crypto libraries :-)
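To illustrate, here is roughly what that one-liner looks like (a minimal sketch of my own, not from the article; seeding from std::random_device is an assumption):

#include <algorithm>
#include <random>
#include <vector>

int main() {
    std::vector<int> values = {1, 2, 3, 4, 5};

    // Seed a Mersenne Twister from the system entropy source; std::shuffle
    // is typically implemented as a Fisher-Yates pass over the range.
    std::mt19937 rng(std::random_device{}());
    std::shuffle(values.begin(), values.end(), rng);
}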
Admin
Custom compiler best compiler. Because custom compilers' WTFs don't HALT.
Admin
For each position in the array, swap its element with an element at a random position of said array. Rinse and repeat 3 times for extra randomness. </LOL>
Admin
No, it isn't.
int n = rand() % nb
means there are only nb possible weights, so there will (very likely) be collisions, and elements with the same weight won't get shuffled. E.g. if nb = 2, half the time the weights will either both be 0.0 or both be 0.5, and qsort won't do anything. Hence you'll get [0, 1] three times as often as [1, 0].
Using
rand() / (double) RAND_MAX
as the weight (or better yet, skip the division and leave it as an int) would do a lot better. (At least until nb gets to a reasonable proportion of sqrt(RAND_MAX) and you start chancing collisions again.)
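A quick way to see that bias for yourself (a hypothetical standalone demo of mine, not the article's code; it assumes qsort leaves equal elements in place, as typical implementations do):

#include <cstdio>
#include <cstdlib>

// One element to shuffle: its original index plus its random weight.
struct Tagged { int index; double weight; };

static int cmp_weight(const void* a, const void* b) {
    const Tagged* ta = static_cast<const Tagged*>(a);
    const Tagged* tb = static_cast<const Tagged*>(b);
    return (ta->weight > tb->weight) - (ta->weight < tb->weight);
}

int main() {
    constexpr int NB = 2, TRIALS = 100000;
    int counts[NB] = {};  // counts[i]: how often element i ended up first

    for (int t = 0; t < TRIALS; ++t) {
        Tagged a[NB];
        for (int i = 0; i < NB; ++i)
            a[i] = { i, (std::rand() % NB) / double(NB) };  // only NB distinct weights
        std::qsort(a, NB, sizeof a[0], cmp_weight);
        ++counts[a[0].index];
    }
    // Expect roughly a 3:1 split ([0,1] vs [1,0]) instead of 50:50.
    std::printf("[0,1]: %d  [1,0]: %d\n", counts[0], counts[1]);
}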
Admin
Maybe this code runs really efficiently after being built by their compiler? (No, I don't think so either; it's probably even worse than if it were built by a good compiler.)
Admin
Nah, this is impossible. First you have to allocate the scratch array, which is a non-zero-cost operation; then you have to fill it, which is O(n); then you have to qsort it, which is O(n log n); then you have to copy the result back, which is again O(n); and then you have to free the allocated array, which again has a non-zero cost.
In comparison, a Fisher-Yates shuffle is a simple O(n) operation.
So it is literally impossible for this algorithm to be faster or more resource-efficient on any binary computer :)
Admin
This said, I'd love to peek under the hood of that compiler. It's been a while since I did something like that.
Admin
Yes, I completely agree. I was just trying to make a joke out of both the presented code and the compiler that this type of programmer would come up with!
Admin
Let's not. I enjoy the posts on this site, but I have a limit, and one glance at today's code shut me down... I get enough of that at work.
Admin
Why do so many programmers, when confronted with a problem that clearly is not a new one (like shuffling an array of numbers), jump to "I need to code my own solution to this from first principles" instead of "there's probably an algorithm that already does this"?
Admin
If you are interested in modern compilers, there are actually a lot of competent ones out there, like GCC and the Roslyn .NET compiler.
If you are interested in bad compiler implementations... ehm, those are harder to find :-)
Admin
That depends on the random number source. If that's good, Fisher-Yates is optimal in that every possible permutation has the same probability. (That algorithm also minimises memory usage and number of swaps; it's really very good indeed.)
If you're using a junk RNG, that's a separate problem.
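For reference, a minimal Fisher-Yates sketch (my own illustration, not from the thread; drawing a fresh uniform_int_distribution each step avoids modulo bias):

#include <random>
#include <utility>
#include <vector>

// Classic Fisher-Yates: one pass from the back, one swap per position.
// Given a uniform RNG, each of the n! permutations is equally likely.
void fisher_yates(std::vector<int>& v, std::mt19937& rng) {
    for (std::size_t i = v.size(); i > 1; --i) {
        std::uniform_int_distribution<std::size_t> pick(0, i - 1);
        std::swap(v[i - 1], v[pick(rng)]);
    }
}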
Admin
The worse the better.
Admin
The real crime here is the use of rand().
Admin
Ideally I'd want a C++ compiler (because I like C++) that technically obeys all the standards to the letter BUT violates any and all unwritten rules, including the principle of least astonishment, common sense, and common decency.
It should abuse every opportunity to make the code perform poorly. This includes generating code which, whilst technically compliant, is as inefficient as possible without breaking said compliance.
And I don't just mean memory mishandling or extra operations here. I mean like how modern compilers take advantage of knowing how modern CPUs work, with all their unique quirks and features, to optimize toward their architecture; this compiler would do the absolute opposite and optimize away from them.
Also it should generate code that is as spaghettified as possible, deliberately abusing long jumps to make decompilation for the purposes of debugging as painful as possible. This should be marketed as a security feature.
Finally, bonus points if it can also bloat the size of compiled .exe files through sheer inefficiency (no cheating like padding allowed).
Admin
"they are not suited for real randomness"
How many applications need "real" randomness?
Admin
Kinda reminds me of those ways to generate a "random order" of an SQL table's rows by sorting on a random value. Uh, yeah. Right.
Admin
Quicksort is an unstable sort. If two numbers have the same weight, it's just as fair for them to swap positions each time the sort is run.
If you want to make sure the numbers won't swap positions despite having the same weight, you need to use a stable sort.
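In C++ terms (my illustration, not from the thread): std::sort makes no promise about tied keys, while std::stable_sort does.

#include <algorithm>
#include <cstdio>
#include <utility>
#include <vector>

int main() {
    // Two elements that tie on the sort key; .second records original position.
    const std::vector<std::pair<int, int>> input = { {0, 0}, {0, 1} };
    auto by_key = [](const auto& a, const auto& b) { return a.first < b.first; };

    // std::sort may emit the tied pair in either order...
    auto unstable = input;
    std::sort(unstable.begin(), unstable.end(), by_key);

    // ...while std::stable_sort keeps tied elements in their original
    // relative order, so {0,0} always comes out first here.
    auto stable = input;
    std::stable_sort(stable.begin(), stable.end(), by_key);

    std::printf("stable: (%d,%d) first\n", stable[0].first, stable[0].second);
}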
Admin
Smithers gets it. The code shown, with the random number taken modulo nb, will have collisions in the variable being sorted. The collisions will cause a noticeable bias in the resulting permutations, because qsort() presumably defaults to "do nothing" if the sort variables are compared as equal.
Admin
I suspect the following would be better than the example code while being worse than every correct implementation. Thoughts?
int f_cmp_valweight(const void *pv1, const void *pv2) { return rand() % 3 - 1; }
Admin
I think an unstable sort just means it isn't defined how it will handle tied values, rather than it being 50/50 each time. E.g. when sorting two 0's it will either always swap them, or never swap them.
Admin
Another WTF is naming a type *_t. This is done a lot in POSIX, and guess what, POSIX actually reserves *_t "to the implementation".
Admin
For a couple of seconds I believed somebody had submitted some part of the code I inherited (still there after more than 12 years). Fortunately it avoids the bias Smithers noticed. JiffyPop's idea is funny for aesthetics, but it is also probably slower than the original, because it calls rand() more often.
Admin
That would be undefined behavior, which is worse than the code in the article. Your version can report that a < b, b < c, and c < a, or even that a < b and b < a, both of which are impossible for a consistent ordering.
Admin
Or sometimes it will do it and sometimes it won't. The algorithm simply isn't required to be consistent at all (and might not be if it is doing things in multiple phases). In a stable sort, it's as if you have the original index of the value as an auxiliary key so that values are never truly tied.
Admin
Forgive my ignorance (I'm not a developer) but what's wrong with this?
Admin
LOL, I had the same idea.
Serious question: are there any sorting algorithms which rely on consistent comparison results, such that a random comparator may send them into an infinite loop?
Not a problem. Just cache the results of each comparison in a map.
Admin
This isn't even the correct use of rand(). As described by the ANSI C committee, the low-order bits are much less random than the high-order bits, so the sequence generated by taking the remainder after division by n is not uniformly distributed. To get a number from 0 to n-1, correct usage is:
((unsigned long long)rand() * n) / ((unsigned long long)RAND_MAX + 1)
And if there isn't an srand() call to change the seed, you'll get the same sort order every time you run the program.
This quote from the online Numerical Recipes in C notes (https://www.unf.edu/~cwinton/html/cop4300/s09/class.notes/LCGinfo.pdf) is enlightening:
"Be very, very suspicious of a system-supplied rand() that resembles the one just described. If all scientific papers whose results are in doubt because of bad rand()s were to disappear from library shelves, there would be a gap on each shelf about as big as your fist."
Admin
Actually this may come from a very old submission of mine under a different name (I'd be sorry to learn that someone else has to deal with a similar piece of crap): the context really rings a bell.
So, assuming this comes from my program, here is what I can tell: it is a niche language targeted at non-programmers. The core implementation was made some 26 years ago by an intern, who did the best they could with the help of one of the Dragon books. Over the years, additional developers added warts upon warts to get additional features.
Of course it is full of bugs, limitations, quirks and mishandled cases, but fortunately the users are actually quite smart and they know what to avoid, although this has a cost in boilerplate code.
At last the powers that be have allowed a refactoring of the code. We cannot start from scratch because of the tremendous amount of work that relies on the program, so this is going to take a while, but with good unit tests it should work nicely in the end.