Admin
There is no need for that. You have the Sun tutorials and you have the API. They are thorough and contain all the info you need. You don't require any books for producing good and clean Java code. They are just expensive and are basically a list of implementations that will be outdated in a few years anyway. It only comes down to whether you can program or not, which IMHO is much more than understanding APIs and syntax - it's a matter of experience, talent and understanding of software design itself. Not even all the programming-language books in the world can help you there once you have learned the basics of a given language. I assume the person who wrote that code simply lacked understanding of the topic and didn't read up on it, thinking that XORing a bunch of seemingly "random" values would do the trick. I bet no book would have kept him from producing this wonderful WTF.
Admin
I use the Examplets from the Java Developer's Almanac: http://java.sun.com/developer/codesamples/examplets/index.html
Admin
So like the Perl Cookbook, then.
Admin
As a starting point for getting entropy on a box, it's not that bad. I wouldn't call this a WTF. There are a few things that would come out in a good security review, but it's not nasty. He should be mixing in other real-world environmental things, if possible, but the practical issue may be that his runtime is sandboxed to the point that getting real sources of entropy is hard.
I'd redesign this to have an entropy sink of some kind that other callers could stuff values into (e.g., mouse movements, ticks when keypresses are made, etc.). And to bar more than one call to this during the application's run (since it doesn't really measure stuff that's terribly different, run-to-run -- even the time is pretty predictable after a crash).
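One way to read the "entropy sink" idea, as a minimal sketch only: callers stuff in whatever measurements they have, and a hash mixes them into a seed. The class name `EntropySink` and its methods are hypothetical, and using SHA-256 for the mixing is my assumption, not something from the original code. Mixing does not create entropy; the seed is only as unpredictable as the samples fed in.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical entropy sink: callers feed in samples, the digest mixes them.
final class EntropySink {
    private final MessageDigest digest;

    EntropySink() {
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on the Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Stuff one sample (a mouse coordinate, a keypress tick, ...) into the pool.
    synchronized void add(long sample) {
        digest.update(ByteBuffer.allocate(Long.BYTES).putLong(sample).array());
    }

    // Drain the pool into a 32-byte seed. digest() also resets the pool.
    synchronized byte[] drain() {
        return digest.digest();
    }

    public static void main(String[] args) {
        EntropySink sink = new EntropySink();
        sink.add(System.nanoTime());
        sink.add(System.currentTimeMillis());
        System.out.println(sink.drain().length + " seed bytes");
    }
}
```

Because drain() resets the digest, a second drain without new samples would return the hash of nothing, which is one more reason to seed once and not call it repeatedly.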
Admin
Random can take a seed value, if one wanted to throw the system time or some other (long) value in there...
...likewise I didn't know SecureRandom existed. Have used some pseudo-random number generators though.
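For anyone else who hadn't met the two classes, a small sketch of the difference. The class and method names here (`SeedDemo`, `firstInt`, `secureBytes`) are made up for illustration; only `java.util.Random` and `java.security.SecureRandom` are real.

```java
import java.security.SecureRandom;
import java.util.Random;

final class SeedDemo {
    // First int produced by java.util.Random for a given seed.
    static int firstInt(long seed) {
        return new Random(seed).nextInt();
    }

    // 'count' bytes from SecureRandom, which seeds itself without caller input.
    static byte[] secureBytes(int count) {
        byte[] out = new byte[count];
        new SecureRandom().nextBytes(out);
        return out;
    }

    public static void main(String[] args) {
        // Same seed, same sequence: fine for simulations, bad for secrets.
        System.out.println(firstInt(42L) == firstInt(42L)); // true
        // Seeding with the clock only helps if nobody can guess the clock.
        System.out.println(firstInt(System.currentTimeMillis()));
        System.out.println(secureBytes(32).length); // 32
    }
}
```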
Admin
} // No idea where beginning of the block is. Method's too big.
Admin
Spoken like someone who's never heard of knuth.
Admin
I do the "comment at the end of a long loop" thing as well sometimes. Mostly when writing Python, which has no curlies. And since we require tabs to be just two spaces, unless your font is large, it can sometimes get a little hard to tell where things are.
Admin
I personally don't consider all integer constants to be magic numbers, unless you really need to be able to handle the case where the number of bits in a byte will change.... The important differentiator for me is whether [or not] it is painfully obvious as to what the constant represents.
As for 512, there's probably a constant somewhere that could have been used in its place.
Admin
A number of years ago, a person was able to beat the Keno machines in a Montreal, QC Casino. I think he analyzed the play, and was able to determine which random number generator was being used, and what the seed was.
I don't think the code-snippet is a WTF at all. Byte Magazine, back in 1989, (May, I think), had a really good article about random-number generators, with code to show how to have three generators running at the same time, each with its own seed, and then combining the results.
And of course, to really use random numbers effectively, one should be able to sort them.
Admin
Had he actually created a good seed, he could probably implement a Blum-Blum-Shub PRNG, which cannot be broken nearly as easily as the typical linear congruential generator that's usually used in the rand function of most libraries.
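A toy sketch of the Blum-Blum-Shub recurrence (square the state modulo M = p*q, emit the low bit), for anyone who hasn't seen it. The class name `ToyBlumBlumShub` is made up, and the primes 11 and 23 are demo values chosen only so the numbers fit in your head; a real generator needs large secret primes and a well-chosen seed.

```java
import java.math.BigInteger;

// Toy Blum Blum Shub: x -> x^2 mod M, emitting the low bit of each state.
final class ToyBlumBlumShub {
    private final BigInteger modulus;
    private BigInteger state;

    // p and q should be primes congruent to 3 mod 4; seed should be coprime to p*q.
    ToyBlumBlumShub(BigInteger p, BigInteger q, BigInteger seed) {
        this.modulus = p.multiply(q);
        this.state = seed.mod(modulus);
    }

    int nextBit() {
        state = state.multiply(state).mod(modulus);
        return state.testBit(0) ? 1 : 0;
    }

    public static void main(String[] args) {
        // Demo primes only; far too small for any real use.
        ToyBlumBlumShub bbs = new ToyBlumBlumShub(
                BigInteger.valueOf(11), BigInteger.valueOf(23), BigInteger.valueOf(3));
        StringBuilder bits = new StringBuilder();
        for (int i = 0; i < 6; i++) {
            bits.append(bbs.nextBit());
        }
        System.out.println(bits); // 110010 (states 9, 81, 236, 36, 31, 202)
    }
}
```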
Admin
Although I think that was the intention. I'd have entitled this "Looking for entropy in all the wrong places"
Admin
Not quite: (nBits+7)/8 rounds to the next number of whole bytes; the addition is important to the implied integer rounding.
In 25 years, I've never seen anyone do this for a fractional number of bits, but then, this is TDWTF!
Admin
For the most part, the WTF here, is that he's reinvented the wheel, poorly. SecureRandom, on its own, is just as secure, if not more secure, than this poor imitation of SecureRandom.
Admin
In a previous life I was working on some stored procedures in MSSQL. The stored procedure would create a session ID using a whole bunch of rands one after the other. The problem was that we were getting session ID collisions very frequently considering that the session ID was a 32 byte character string. Turned out the code went something like
seed (time)
id=some_manipulation_of rand()
id+= some_manipulation_of rand()
id+= some_manipulation_of rand()
id+= some_manipulation_of rand()
id+= some_manipulation_of rand()
The issue is that rand() on mssql server is/was 16 (actually 15) bit (32768 values). I wrote a user procedure that called the stdlib version of rand (32 bit) to generate the ID but was told to take it out because there was nothing wrong with the original code. I couldn't convince my boss that because rand() was deterministic, the above manipulations would never produce more than what would be obtainable from a single rand() (i.e. 16 bit) and could actually make things worse. In the end, I ended up just having to check for collisions and issue a new one if it was already taken. Just as long as we never had more than 32768 customers I guess... Thing is, the guy was not a WTF guy otherwise. I guess he was just emotionally invested in that function...
Rich
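The argument can be checked with a small simulation. This is a sketch under stated assumptions: `java.util.Random` stands in for the SQL rand(), the generator is reseeded for every ID, and only 2^15 seeds are possible. The names `SeedSpaceDemo`, `makeId` and `distinctIds` are made up, and none of this is the original stored procedure.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

final class SeedSpaceDemo {
    // Build an ID from five consecutive outputs of a generator seeded once.
    static String makeId(int seed) {
        Random rand = new Random(seed); // stand-in for "seed (time)" followed by rand()
        StringBuilder id = new StringBuilder();
        for (int i = 0; i < 5; i++) {
            id.append(Integer.toHexString(rand.nextInt(1 << 15)));
        }
        return id.toString();
    }

    // Count the distinct IDs reachable when only 2^15 seeds are possible.
    static int distinctIds() {
        Set<String> ids = new HashSet<>();
        for (int seed = 0; seed < (1 << 15); seed++) {
            ids.add(makeId(seed));
        }
        return ids.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctIds() + " distinct IDs from " + (1 << 15) + " seeds");
    }
}
```

However long the ID string looks, the count can never exceed the number of seeds, because each seed maps to exactly one ID.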
Admin
Seems to be a rather typical case of NIH: http://en.wikipedia.org/wiki/Not_Invented_Here
Admin
This is actually standard practice for controlling the rounding of integer math. If you want to divide a positive integer x by some positive integer n rounded up, you do (x + n - 1) / n. It's the integer equivalent of the floating point ceil(x / n) (if x and n were floating point values).
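A quick sketch of that idiom, with a few values worked through. The helper name `ceilDiv` is made up for illustration, and it assumes x >= 0, n > 0 and that x + n - 1 does not overflow.

```java
final class CeilDiv {
    // Integer ceiling of x / n for x >= 0 and n > 0 (x + n - 1 must not overflow).
    static int ceilDiv(int x, int n) {
        return (x + n - 1) / n;
    }

    public static void main(String[] args) {
        System.out.println(ceilDiv(0, 8));   // 0
        System.out.println(ceilDiv(1, 8));   // 1
        System.out.println(ceilDiv(8, 8));   // 1
        System.out.println(ceilDiv(9, 8));   // 2
        System.out.println(ceilDiv(512, 8)); // 64
    }
}
```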
Admin
And in Ruby, it'll work in one line:
def getRandomBits; Array.new(32) { rand 256 } end
Admin
So emacs and vi are really both less than 10 years old, and not over 30 years old each as I thought they were?
I learn new things every day at this web site. Not correct things, of course, but new ones... ;-)
Admin
Ok, I get it now, I was only thinking of the fact that 7/8 with int division would be 0 in Java, not about the other cases where the addition puts the sum over the next multiple of 8.
Can I have my first post stricken from the record? :)
Admin
Kids today...
Admin
I dunno about you, but I comment code like that whenever I can remember to, and God bless anyone else who does the same... it makes the job of maintaining the code INFINITELY easier...there are MANY wtfs in this code... commenting the end braces isn't one of them.
Admin
The best way to get randomness, I have heard, is in hardware... have the software "listen" to the noise generated by a resistor and use that.
Admin
Preach it, mah brothah. I hate having to read code that uses the aforementioned style of blocks. Brackets should line up, goshdarnit.
Admin
the WTF here is all the commenters who don't realise this code is OK.
This creates a random seed from all the available entropy sources present. It is not meant to be called multiple times. Just at the beginning, to seed the random generator. (maybe re-seeding every hour...)
Admin
J.S. just made his application less secure. Great job!
Admin
You can do it in one line of Java too:
public static byte[] getRandomBits() { byte[] rand = new byte[32]; random.nextBytes(rand); return rand; }
:-)
Admin
Actually, I don't think that is true.
Your contention seems based on the idea that the value returned by one call is used as the seed for the next, so that if 12345 is followed by 23456, then that sequence will always appear in order in the values returned by rand(), with the only effect of the seed being when that sequence appears.
But I doubt that is the case. Having looked at the code, I can say that it's not the case for the .NET Random object. (I don't have the source code for SQL's rand() available, so I can't say for sure)
So, if 12345 may be followed by 23456 given one seed, and 34567 given another, then that code can produce more than 32768 different values --- granted, probably only 32768 * 5 different values, but that's still more.
Also, that assumes that all calls to rand() are done 5 at a time. If there's a single call to rand() elsewhere then that changes the sequence and you get another 32768 * 5.
Admin
At first I was like "Ok, someone reinvented a random function. Maybe they needed data that was more random than what the hardware generator was using or something..." Then I scrolled down the code.... Oh my god. What the hell was this person thinking? Do you get paid by the line? Even if pay was by line this is probably the worst implementation of a random function ever. I won't go into details, nor do I care to even think about them, but the data generated by this isn't really even that random. This is an abomination of code.
Admin
Well, assuming that "random" is a properly seeded static instance of the Random class, then the C# version is identical, except one letter would need to be capitalized.
Admin
All Bracket-Liner-Uppers are going to hell. ;->
Admin
What are all of the available entropy sources in question? He's just hashing the values of the system properties and then using the current time. The values of the System properties rarely, if ever, change during the execution of the VM, and are likely to be the exact same values every time the VM is run. The only source of entropy is the current time, which is a weak source.
Remember, the System properties are like environment variables in the VM. They contain things like the classpath, platform specific values (OS name, file separator char, etc.), and command-line parameters.
Admin
But what you don't seem to be realizing is that most of these "entropy sources" aren't presenting any entropy. Most of what he starts with is the bytes of the ASCII representations of "key + val" for the system properties --- where "val" changes little, and "key" doesn't change at all. He then tosses in the time, but he's using currentTimeMillis, which is the time, in milliseconds, since 1-Jan-1970, which means that an hour from now, most of the bits will still be the same.
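To put a number on the timestamp point, here is a small sketch; the names `TimestampBits` and `sharedHighBits` are made up for illustration. One hour is 3,600,000 ms, which is less than 2^22, so unless a carry ripples upward the change is confined to the low 22 bits of the 64-bit value.

```java
final class TimestampBits {
    // Number of leading bits (out of 64) that two timestamps have in common.
    static int sharedHighBits(long t1, long t2) {
        return Long.numberOfLeadingZeros(t1 ^ t2);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long hourLater = now + 3_600_000L; // one hour in milliseconds
        System.out.println(sharedHighBits(now, hourLater) + " of 64 high bits unchanged");
    }
}
```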
Admin
I like this part of the code. I'm amazed that I didn't notice anybody else to have yet mentioned it:
The bits array is not introduced as a local variable, so there must be a class field "private static byte[] bits;" in this class, which makes the outcome of this method actually quite random in a multithreaded environment.
Admin
I think that might be an incorrect assumption. I think it's far more likely the author of the very long segment heard something about the phenomenon described at http://alife.co.uk/nonrandom/ and took it to heart...
Nonetheless, it's still a WTF. Firstly, the author doesn't appear to have paused to consider whether, nonrandomness aside, Random() is in fact sufficiently random for the application concerned. Secondly, it seems to me that his version will in fact be considerably less random than the "problematic" Sun version...
One more demonstration that a little knowledge is a dangerous thing.
Admin
That being the case, it's still arguably (a) over-engineered, and (b) unlikely to be very entropic.
Admin
It would be very funny if this code is indeed a replacement of the java Random class for security reasons, and the software gets hacked because some maintainer has replaced the improved version with the suggested five-liner. THAT would really make a top scorer on this site.
Admin
It is.
It did. I tested it. I also found that there was a smaller range than the full 32768 values.
It wasn't .net. It wasn't c. It wasn't a cosmic ray detector wired to the serial port. It was MSSQL server. Version 5 if I recall correctly.
Admin
Or a noise diode. I'm honestly amazed that there aren't noise generator / sampler circuits routinely included in machines. I suppose the issue is whether analogue noise is sufficiently random...? (I'm not prepared to accept that cost is significant; we're talking about a latch, a comparator and a diode.)
Admin
If by appropriate you meant "pointless" then I totally agree.
This post was brought to you by the letters I, D and E.
Admin
And in fact, that is typically the case for pseudo-random numbers. Though the number returned may not actually be the seed itself, it will depend on the seed.
If you restrict the range, you may appear to have numbers that aren't repetitive but you will still be caught by the number of bits the seed holds. A 16 bit seed limits you to 65536 values however you slice it. In the case of mssql server, this was actually much less.
Mathematically, you could write it
f(g1(x), g2(x+1), g3(x+2), ...) == j(x)
Since x is limited to 2^16 values and f and g and j are deterministic, it doesn't matter what you do to munge them around, you're still limited.
Now, if you get lucky and through threading or other multitasking some function happens to also call rand() in the middle of your five-in-a-row rand() calls and updates the seed, you might get a new value. But if your system is calling rand() often enough that that would make any real difference, you probably need a redesign (not to mention that that is an action-at-a-distance WTF in itself). Throwing in some entropy is a good bet too.
Rich
Admin
An IDE is not always available or appropriate. Still, why comment anyway, it's going to work first time, right?
Rich
Admin
Absolutely right. The opening-brace-on-the-same-line made sense in the days of printed listing - not any more. (And for consistency, one should put the closing brace at the end of the final line of code, right?)
For Xmas, give yourselves braces that line up, and no closing "this is what this brace does" comments - if it really needs them, the block is too long, so break it down further. And ho ho ho.
Steve
captcha: pizza. yum.
Admin
We still do code reviews here with printed listings. We have some online code review type applications, but they've never taken hold.
So when printing out functions, even if they're small, they'll often go across page breaks. It's very handy to have the end-brace comment. Also, when modifying code, it can be easy for a maintainer to get lost (as with the 5 closing braces in the example).
Count me on the "comments are never bad unless they're wrong or misleading" side of this jihad.
Admin
I know I certainly enjoy only being able to see half the code on the page in flurries of block open/closes. Why, it's like traveling back to DOS days! ... Indentation matters much more for readability than brace position ever does - and I'm not even a python user, it should just be obvious. If you don't indent well, no amount of braces will ever save you, and if you indent well, it'll be obvious at a glance what matches what.
That's because you can do this in one line in every language. Geez.
Admin
Maybe since he'd already used the phrase "to do" twice, he was saying that was enough of them. But then the code contains a "to do" so that can't be it.
I know, since he's provided a 4-line alternative, there's nothing left to do!
Admin
Cryptography, compression, networking, and audio/visual encoding almost always work on bits instead of bytes. (Thus it's called a bitstream, with bitrates, and so on.) It's just another way of seeing the same data, of course, but it gives you far more flexibility when you're trying to squeeze every superfluous bit out of the stream.
Admin
For a stack of closing braces, I agree, comments are a good idea.
Of course, if you have a stack of closing braces, the WTF isn't the commenting.