Admin
What do you mean by "more random"? Is a Gaussian or Poisson distribution "less random" than a uniform distribution? Why? By what measure?
N.B. For simulations, I've written code (well... adapted from pseudocode of standard algorithms known to be correct and fast) that generates random numbers with Gaussian and Poisson distributions (as well as others). This is a very typical use of non-uniformly distributed random numbers. These algorithms use a uniformly distributed random number generator as a source, so the quality of the generated numbers is delegated to the quality of the source generator, and pretty good ones can be found. (On Linux, this can be based on /dev/random -- a true source of randomness.) Are you saying that an exponentially distributed random number generated algorithmically from a high-quality uniformly distributed source is "not very random"?
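Those recipes can be sketched in a few lines of Python: the Box-Muller transform for the Gaussian case and Knuth's product method for the Poisson case are standard, and both consume nothing but a uniform source (`random.random` stands in here for whatever high-quality uniform generator you actually use):

```python
import math
import random

def gaussian(uniform=random.random):
    """Box-Muller transform: two uniforms -> one standard normal."""
    u1, u2 = uniform(), uniform()
    # 1 - u1 is in (0, 1], so the log is always defined
    return math.sqrt(-2.0 * math.log(1.0 - u1)) * math.cos(2.0 * math.pi * u2)

def poisson(lam, uniform=random.random):
    """Knuth's method: count uniforms until their product drops below e^-lam."""
    limit = math.exp(-lam)
    k, prod = 0, uniform()
    while prod > limit:
        k += 1
        prod *= uniform()
    return k
```

Any source of uniforms on [0, 1) can be passed in as `uniform`, including one ultimately fed by /dev/random.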
Admin
Not to mention that calling random so many times in a row for every random number you produce massively shortens the time until repetition.
Putting these together means that even if implemented optimally, doing this is just a tradeoff between length of random number and the "randomness" of the number.
If you want random numbers from -inf to +inf, it's better to just use a random generator that outputs 64 random bits and reinterpret them as a double (handling NaNs etc. as necessary).
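That reinterpretation is a one-liner in most languages; here is a minimal Python sketch, assuming the NaN/infinity bit patterns are simply rerolled, as the parenthetical suggests:

```python
import math
import random
import struct

def random_double():
    """Draw 64 random bits and reinterpret them, bit for bit, as an
    IEEE-754 double; reroll the patterns that decode to NaN or +/-inf."""
    while True:
        bits = random.getrandbits(64)
        (value,) = struct.unpack("<d", struct.pack("<Q", bits))
        if math.isfinite(value):
            return value
```

Note that this is uniform over bit patterns, not over the real line: roughly half of all finite doubles have magnitude below 2, so small magnitudes dominate.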
Admin
You want blazing inefficiency? Try this if you want random numbers on a website (it just popped into my head):
Create a table in a database with a single integer column, and set the value of the first row to 0.
For each person who visits your website and needs a random number, hand them the value of that row and then increment it by 1. Once it reaches the maximum value, reset it to 0.
Obviously, if you want a number in the range 0->X, divide the value in the row by the maximum value and multiply by X.
Ultra Random!
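For what it's worth, the scheme takes about ten lines with an in-memory SQLite table. A deliberately terrible sketch (MAX_VALUE is a made-up wrap-around limit, and of course the output is a counter, not a random number -- that's the joke):

```python
import sqlite3

MAX_VALUE = 100  # hypothetical wrap-around limit

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counter (value INTEGER)")
conn.execute("INSERT INTO counter VALUES (0)")

def ultra_random(x):
    """Hand out the current counter value scaled to 0..x, then
    increment it, wrapping back to 0 at MAX_VALUE."""
    (value,) = conn.execute("SELECT value FROM counter").fetchone()
    nxt = 0 if value + 1 >= MAX_VALUE else value + 1
    conn.execute("UPDATE counter SET value = ?", (nxt,))
    return value / MAX_VALUE * x
```

Consecutive "random" numbers come out in ascending order, which is about as far from random as a sequence can get.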
Admin
I remember that with Pretty Good Privacy, to generate a private key, you were given the option of mashing the keyboard to create a random number. If I recall correctly, it used the milliseconds (modulo some number smaller than human reaction time) to seed a random number generator.
I recall that a few cracks came out for RSA because it was possible to reduce the search space by a few orders of magnitude by exploiting limitations in the random number generators of the machines that made the private keys. This is why PGP implemented this random number generator. For typical statistical and engineering programming, though, I think the standard random functions are fine.
Admin
It went to sleep four hours before I posted.
Admin
You pay peanuts, you get monkeys - sometimes literally.
They are cheap, right?
Admin
I think that's what we call a "page counter"
Admin
I think they should just use my 4dc5 sequence:
http://dudegalea.co.uk/articles/4dc5.html
which is clearly random by definition, and therefore brillant. :-)
Admin
Of course, if they are worried about repetition, then simply implement an RNG with a larger period. I would recommend the Mersenne Twister MT19937 (http://en.wikipedia.org/wiki/Mersenne_twister) with its period of 2^19937 - 1. Then they could generate as many of these dooper-random numbers as they like without repetition (at least within this universe's lifetime).
Implementations are available in a number of languages. It's not cryptographically secure in its basic form, but it's ideal for large-sample Monte Carlo simulation.
This comment brought to you by the society to ruin the joke by being serious ;)
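To pile on the seriousness: CPython's standard random module is already an MT19937 implementation, so the Monte Carlo use case needs no extra code. A quick sketch, with a pi estimate as a stand-in workload:

```python
import random

def estimate_pi(samples, seed=12345):
    """Monte Carlo estimate of pi driven by an MT19937 stream
    (CPython's random.Random is the Mersenne Twister)."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y < 1.0:   # point fell inside the quarter circle
            inside += 1
    return 4.0 * inside / samples
```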
Admin
And just in case compounding Random() isn't enough, re-entrancy issues will probably introduce their own unpredictability into this...
Admin
Not after I send my application off to the USPTO!
Admin
In case anyone missed it, the major WTF here is that the distribution represented by this class is NOT the uniform distribution (which most people expect: "1" should be as likely as "100001"). In particular, shorter numbers are more likely to occur than longer ones, because the length is chosen from a uniform distribution and then each digit is chosen independently from a uniform distribution.
While the programming isn't stellar, the real problem here is the author's complete and utter misunderstanding of "randomness". Hopefully they're not doing anything that depends on good random numbers!
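A minimal sketch of the algorithm as described (uniform digit count, then uniform digits; the 10-digit cap and the no-leading-zero rule are assumptions for illustration) makes the bias easy to measure:

```python
import random

def wtf_random(max_digits=10):
    """Pick a length uniformly, then each digit uniformly; the result
    is heavily biased toward short numbers."""
    n_digits = random.randint(1, max_digits)
    first = random.randint(1, 9)  # no leading zero
    rest = "".join(str(random.randint(0, 9)) for _ in range(n_digits - 1))
    return int(str(first) + rest)
```

A uniform draw from 1 to 10^10 - 1 would land on a single-digit number with probability about 9e-10; here it happens a full 10% of the time.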
Admin
Lollollollollllloooll If they aren't random then why use them?
Admin
This does not change the randomness but the probability density function. If Math.random() is uniformly distributed, this function returns logarithmically distributed random values (which is probably not what the author intended).
Admin
Why don't they just do this?
int Random() { return Random(); }
Admin
Not as long as you might think. The distribution of getRandomNumberOfDigits is heavily biased toward smaller numbers, so you're actually more likely to get a number from 1 to 10 than from 11 to 100, which in turn is more likely than a number from 101 to 1000, and so on. So you see, it's really an optimization to make sure their looping algorithm runs faster. After all, who knows how long it would run if they just did "while (x > 10) x = Math.Random()*10000000000;"?
Admin
Who knows? Of course, man (not just God) knows.
Let Y be the number of times Math.Random() executes in one call of this code fragment. Assume the precondition x > 10 on entry, and assume further that Math.Random()*10000000000 returns an integer uniformly distributed between 0 and 9999999999 (inclusive).
The loop exits when x <= 10, which covers 11 of the 10^10 equally likely values, so each iteration succeeds with probability p = 11/10^10 = 1.1e-9. Y is therefore a geometrically distributed random variable:
P(Y=y) = p * (1 - p)^(y-1)
The mean of Y is E(Y) = 1/p ≈ 9.1e8, i.e. the body of the loop runs roughly 900 million times on average, under the assumptions stated above.
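Running the actual loop a billion times per call is impractical, but the geometric argument can be sanity-checked at a smaller scale: with a multiplier of 10^4 instead of 10^10, the exit condition x <= 10 covers 11 of 10^4 values, so the expected iteration count is 10^4/11, about 909. This sketch mirrors the quoted while-loop, just shrunk:

```python
import random

def iterations_until_small(limit=10000):
    """Count draws of a uniform integer in [0, limit) until one is <= 10,
    mimicking `while (x > 10) x = Math.Random() * limit;`."""
    count = 0
    while True:
        count += 1
        if random.randrange(limit) <= 10:
            return count
```

Averaging many runs lands near limit/11, matching E(Y) = 1/p.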
Admin
I love the comment next to the standard Java constructor. Yeah, dude, your constructor is TOTALLY there for parallelization and you're not just throwing that in to look badass. Suuure.
Admin
Especially when you call it at random!
Admin
I think it's for their next step.... the Random namespace, which includes the Random class, so they can add it all to their Random app.
Admin
I take it the coder hasn't heard about random.org.
Admin
Why not use a secure random number generator if it matters that much?
Admin
Randomness haiku:
what will be will be
randomness is illusion
perception's limit
Admin
At least your offshore guys write better code than mine. I'd be proud if mine could write that.
Admin
The best part is that it runs in parallel. Parallel streams can end up with similar seeds, meaning high correlation between the parallel random number generators.
Admin
It does not return logarithmically distributed values! Didn't you see the graphs tchize brought: http://img142.imageshack.us/my.php?image=density1ia8.png
???
The probability density function is fractal-shaped, because numbers with fewer digits are more likely to appear, like 0.1, 0.2, ... on the graph.
Even if you only square a continuous uniform random variable, you will not get a logarithmic distribution: the square of a uniform on (0,1) has a 1/sqrt(x)-shaped density.
Admin
Can I borrow an infinite number of monkeys, please?
Admin
Why not fill an array with numbers generated by a PRNG, pick an index into that array with a PRNG, use the number at that index as the seed for your PRNG, and still be exactly as random as your PRNG is?
Admin
This must be a spoof. What offshore code has a comment? Much less one that's well written?
Admin
Actually, it is sorta kinda logarithmic. I used this algorithm as a way to learn some data analysis software: I generated about 100000 samples and plotted a histogram of the results. On a log-log scale, it looks remarkably linear.
However, looking more closely, a pattern emerges that should have been obvious from the code. The numbers are uniformly random within each decade (i.e. uniform between 1 and 10, between 10 and 100, between 100 and 1000, etc.). However, the probability density within any given decade is 10 times higher than in the next decade: the density of numbers between 1 and 10 is ten times the density between 10 and 100, which in turn is ten times the density between 100 and 1000, and so on. As I said, this should have been obvious from the start, but I didn't realize it until I started playing with the analysis software and looking at the details of the distribution.
Note: the distribution is not comparable to squaring a continuous random variable: random*random is not the same as (random)**2. Nor is the distribution fractal-shaped; it's stepwise linear, with the steps logarithmically spaced along both axes.
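The random*random versus (random)**2 distinction is easy to check numerically: for independent uniforms U and V, E[UV] = 1/4, while E[U^2] = 1/3. A quick sketch:

```python
import random

def compare_products(n=100000, seed=99):
    """Empirical means of U*V (two independent uniforms) and U**2."""
    rng = random.Random(seed)
    uv = sum(rng.random() * rng.random() for _ in range(n)) / n
    rng = random.Random(seed)  # restart the stream for a fair comparison
    u2 = sum(rng.random() ** 2 for _ in range(n)) / n
    return uv, u2
```

The two means converge to different values, so the two constructions cannot share a distribution.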
Admin
This is actually not as bad as it seems. It's inefficient as all hell, but assuming you seed your RNG properly (it isn't seeded in the listing, though that can easily be done outside this code), it works.
The problem, of course (again, aside from the inefficiency), is that you get a random amount of entropy from each call (a random amount of randomness, if you will), which can be very little -- as low as a few bits, especially in the first few calls if you haven't seeded your RNG. That is not what you want when you need to be secure in your randomness.
Admin
Nah, they'd just pick up the 3 and 7 and smack them together a bunch.
Admin
hmm...let me guess, that was a Debian Maintainer?