The Daily WTF: Curious Perversions in Information Technology

2012-05-18

You're being racist. Can you please stop?

2012-05-19

wbrianwhite:
Lurch:
wbrianwhite:
Frank:
snoofle:
Severity One:
Christian:
in my opinion it makes perfect sense to encapsulate something like a random-number generator in your own class, this way you can replace this component with one that returns a "random" number which you can control in your unit-tests.
Would you mind giving an example where a non-random random number makes sense?
If you want to write a test where you need predictable results, then you'll need to know the "random" inputs in advance.
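A minimal sketch of the idea in Python (the product in the article is .NET; the `Dice` class and `FixedSource` stub are hypothetical names used only for illustration). The class takes its random source as a parameter, so a test can hand it a stub whose "random" values are known in advance.

```python
import random


class Dice:
    """Rolls a six-sided die using whatever random source it is given."""

    def __init__(self, source=None):
        # Production code passes nothing and gets a real PRNG;
        # tests pass a stub that offers the same randint() method.
        self._source = source if source is not None else random.Random()

    def roll(self):
        return self._source.randint(1, 6)


class FixedSource:
    """Test stub: replays a known list of values instead of random ones."""

    def __init__(self, values):
        self._values = iter(values)

    def randint(self, low, high):
        return next(self._values)


# In a unit test the "random" inputs are known in advance.
dice = Dice(FixedSource([3, 6, 1]))
rolls = [dice.roll() for _ in range(3)]
```

The production path still uses a real generator; only the test swaps it out.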
Almost got there snoofle.
Everyone should know the answer to this question. If you don't, you're Doing It Wrong. (Or never worked with random numbers.)

Think about your workflow. You write software. You test it. It goes live. Someone finds a bug. You fix it. You test it again.

What is the goal of that last test?

Hmmm?

Are you sure?

If you answered "to confirm the bug is fixed" you lose. Well you get half points, which is 50%, which is an F in most classes.

The answer is "to confirm the bug is fixed, and to ensure that no new bug was introduced and no old bug came back".

How do you do that? You save the results of the "before fix" test and compare them with the "after fix" test. Only one thing should have changed. The badness should now be goodness. No other new badness.

Now, if you have random numbers in the mix, most likely every run will be different. So how can you compare the output? Huh??

See, if you'd ever done proper testing against random numbers, you'd know you have to generate a known repeatable random number (or sequence) during this type of testing.

So 12321232 causes a problem, add a unit test that generates 12321232 every time.
Next month, 098098908 causes a problem, add a unit test that generates 098098908 every time. Repeat ad infinitum. Never have a test suite that runs N times generating actual random numbers every time, because that will only find actual bugs, rather than being repeatable. That doesn't seem right.

You know, every time your browser opens an HTTPS connection it sends a random number to the server to be used in establishing the connection. If you were unit testing the TLS protocol, would you generate the same number every time? It would miss the thrust of what you were trying to test, wouldn't it?

wbrianwhite, you and kattman could conceivably both give me apoplexy. That is, if I EVER had to work with you.

Sure, TLS. There are standard vectors for testing. You could test (real) random vectors for the next hundred years, and still not have any real confidence. Put that strawman away.

And, I suspect that, since you (probably) can't crack the code, you would simply run the encrypted material through your decryptor, match the results and call it good?

Here is where I hit you in the head with a heavy object.

You could test real random vectors, and be testing the actual protocol. Or you could test using the same number every time, and be testing something that was almost TLS, except that it is 100% susceptible to replay attacks which is one of the primary things TLS is designed to protect against. Or you can do both. I would expect to see both done. In the predetermined sequence you know the calculated key you expect to get back using the server's public key and can validate that is correct. In the random sequence you just test that the protocol works and that you establish a shared key, and that at least tests the actual protocol.
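Doing both might look roughly like the sketch below. It is not TLS: it uses a toy Diffie-Hellman exchange with tiny parameters as a stand-in, and every name in it is made up for illustration. The first test pins a key worked out by hand from fixed secrets; the second only checks that both sides agree when the secrets are fresh each run.

```python
import secrets

# Toy Diffie-Hellman parameters; far too small for real use.
P = 23  # prime modulus
G = 5   # generator


def public_value(secret):
    return pow(G, secret, P)


def shared_key(their_public, my_secret):
    return pow(their_public, my_secret, P)


def test_known_vector():
    # Fixed secrets: the expected key can be worked out by hand and pinned.
    a, b = 6, 15
    assert public_value(a) == 8
    assert public_value(b) == 19
    assert shared_key(public_value(b), a) == 2
    assert shared_key(public_value(a), b) == 2


def test_random_round_trip(rounds=100):
    # Fresh secrets each run: only checks that both sides derive the same key.
    for _ in range(rounds):
        a = secrets.randbelow(P - 2) + 1
        b = secrets.randbelow(P - 2) + 1
        assert shared_key(public_value(b), a) == shared_key(public_value(a), b)


test_known_vector()
test_random_round_trip()
```

The random test says nothing about whether the derived key is the right one, only that the two ends match, which is why the pinned vector is still needed.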

I have seen errors caused when using the repository pattern where the unit test for the code passes, but the actual repository doesn't work. So I always like to see the actual code being exercised.

Am I really so annoying that people want to insult me and hit me in the head? If so, please at least state reasons you think I'm wrong.

Given:

The encryption phase is harnessed for testing.
The decryption phase is harnessed for testing.

And, assuming (typical) development:

The product provides both encryption and decryption
You use an RNG also under test harness
You send a message through the entire stack.

Let us look at some failure cases:

The key vector is computed incorrectly. In this case, the entire stack appears to work fine. The data may even appear "random". Of course, an entropy calculation filter on the actual stream would highlight the problem.
Everything has been done as above, except that the decryption component has been replaced by another implementation, with a simple "pass/fail". Again, the RNG or key computation is incorrect. Messages are "brute forcible". (This actually happened with Debian's OpenSSL package, which produced guessable SSH keys.) This can only be detected by proper analysis of the key path.

Once the key path is confirmed, it must be "boxed away" (the Debian fiasco occurred because a packaging dev "opened" the box to get rid of some valgrind warnings).

The testing of a crypto component in particular is difficult. Very, very difficult. It starts with known vectors. Followed by code path and data analysis. "Random" tests are really not at all useful, and mostly just hurt because they instill a false sense of security. Aside from a "smoke test", but that is done more usefully with known vectors as well. Simply because, if the "smoke test" doesn't work, it needs to be replicated anyway.

(FWIW: FIPS 140 validation involves code analysis and known test vectors. I have never needed "random" tests. However, random testing for customer demonstration, along with an entropy analyzer, is useful for sales purposes.)

If random vectors are used in a smoke test, they would need to be stored along with the results (just in case of failure).
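One way to store them, sketched in Python under the assumption that the random inputs come from a seedable PRNG: keep the seed with the result, and a failing run can be replayed exactly. `sort_under_test` is a hypothetical stand-in for whatever is really being tested.

```python
import random
import secrets


def sort_under_test(items):
    # Stand-in for the real code under test.
    return sorted(items)


def smoke_test(seed=None):
    # Draw a fresh seed unless one is supplied for a replay.
    if seed is None:
        seed = secrets.randbits(64)
    rng = random.Random(seed)
    data = [rng.randint(0, 1000) for _ in range(50)]
    result = sort_under_test(data)
    ok = all(x <= y for x, y in zip(result, result[1:]))
    # Keep the seed with the outcome so a failure can be replayed exactly.
    return {"seed": seed, "passed": ok, "input": data}


record = smoke_test()
replay = smoke_test(seed=record["seed"])
```

Storing the seed is enough here because the whole input is derived from it; inputs drawn from a non-reproducible source would have to be stored in full.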

2012-05-19

Joe:
Lurch:
Says Joe:
"You run the test 10 billion times so that you know you're testing every random 32-bit number (to within 5%) that might come up in production.

This also gives the advantage of never finding another problem during your career, and/or inflating the "number of tests passed" metric that you get a bonus on."

10 billion times? But there are only 4 billion 32 bit numbers (float or integer, doesn't matter, we are counting bit patterns)! So, you get OVER 200% coverage.

Learn something about randomness.

With only about 78,000 randomly selected 32-bit numbers, you have over a 50% probability of a duplicate. The probability of getting all 4 billion different values by selecting only 4 billion samples is quite small.

--Joe
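Figures like these can be checked with the usual birthday approximation. The sketch below assumes uniform, independent draws over all 2^32 bit patterns; the function names are made up for illustration.

```python
import math

SPACE = 2 ** 32  # number of distinct 32-bit patterns


def duplicate_probability(samples, space=SPACE):
    # Birthday approximation: P(duplicate) ~= 1 - exp(-n(n-1) / (2N)).
    return 1.0 - math.exp(-samples * (samples - 1) / (2.0 * space))


def expected_coverage(samples, space=SPACE):
    # Expected fraction of distinct values seen after n uniform draws:
    # 1 - (1 - 1/N)^n, which is very close to 1 - exp(-n/N) for large N.
    return 1.0 - math.exp(-samples / space)


for n in (65_000, 78_000, 100_000):
    print(n, round(duplicate_probability(n), 3))

print(round(expected_coverage(SPACE), 3))           # one draw per possible value
print(round(expected_coverage(10_000_000_000), 3))  # the "10 billion" runs
```

The duplicate probability crosses one half at a sample count on the order of the square root of the space (2^16 = 65,536 here), and drawing 2^32 samples is expected to hit only about 63% of the possible values.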

Joe

If someone does 10 billion random tests for 4 billion possible cases -- that's, um... a serious WTF.

Just do 4 billion directed tests and guarantee 100% coverage.

That's all I meant to say.

Scarlet Manuka · 2012-05-20

Severity One:
Would you mind giving an example where a non-random random number makes sense?

I'm sure SpectateSwamp would find a use for it.

2012-05-21

Severity One:
Christian:
in my opinion it makes perfect sense to encapsulate something like a random-number generator in your own class, this way you can replace this component with one that returns a "random" number which you can control in your unit-tests.
Would you mind giving an example where a non-random random number makes sense?

Sure. Sometimes, a randomizer is used with a seed, which will make it return a consistent, rather than a random, result.

Randomizers used in this way are a nice tool in encrypting.

Seeing how the product is called 'legacy', the randomizer may be there to duplicate not the behavior of the .NET randomizer but that of a different one, for instance the VB6 randomizer.
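The seeding behavior is easy to see in a few lines. This sketch uses Python's `random.Random` rather than the .NET or VB6 generators, and the seed value is arbitrary.

```python
import random

# Two generators built from the same seed produce the same sequence.
first = random.Random(2012)
second = random.Random(2012)

seq_a = [first.randint(0, 99) for _ in range(5)]
seq_b = [second.randint(0, 99) for _ in range(5)]

# A generator created without a seed is seeded from the OS or the clock.
unseeded = random.Random()
seq_c = [unseeded.randint(0, 99) for _ in range(5)]

print(seq_a == seq_b)
```

Different runtimes implement different algorithms, so the same seed generally does not give the same sequence across them, which is one reason a legacy port might carry its own randomizer.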

Been there, done that. Burnt the T-shirt.

peter

Captcha: quibus -- strangely enough, that is a (misspelled) Dutch colloquial word for a nutcase (kwibus).

Steve The Cynic · 2012-05-21

DaveK:
Steve The Cynic:
In C / C++ on a 16-bit system, the first is wrong, because it says a day is 20864 seconds long.
(Hint: undecorated integer constants have type int if possible, and int is allowed to be 16 bits in C and C++. The promotion to uint32 is done after the multiplication.)
Well in that case, the second one will also be truncated, won't it?

No, because the "defaults to int" behaviour is only if it can be an int with the value given. ("undecorated integer constants have type int if possible")

So on a 16-bit system:

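32767 is an int
-32768 is the constant 32768 (a long, see the next line) with a unary minus applied, so the expression is a long
32768 is a long (an undecorated decimal constant skips unsigned int; only octal or hex constants such as 0x8000 can be unsigned int)
65535 is a long
65536 is a long.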
Perhaps most relevantly, 86400 is a long.
But 24*60*60 is three ints multiplied together and the rules say that int*int is an int so 24*60*60 is also an int, and clips to 20864...
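The arithmetic can be reproduced with a small simulation. Python integers never overflow, so the hypothetical helpers below wrap results into 16 bits by hand, the way typical two's-complement hardware does. (In real C, signed overflow is undefined behavior; wrapping is merely what usually happens.)

```python
def to_int16(value):
    # Wrap an arbitrary integer into the signed 16-bit range.
    value &= 0xFFFF
    return value - 0x10000 if value >= 0x8000 else value


def mul_int16(a, b):
    # Multiply two 16-bit ints and keep only a 16-bit result.
    return to_int16(a * b)


# 24*60*60 evaluated left to right, every step in 16-bit int.
as_expression = mul_int16(mul_int16(24, 60), 60)

# The literal 86400 does not fit in 16 bits, so it is a long from the start.
as_literal = 86400

print(as_expression, as_literal)
```

The first multiplication (24*60 = 1440) still fits; it is the second one that overflows, leaving 86400 - 65536.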

2012-05-21

"Don't rely on the compiler to do your math." by "Not sure if Fry or just Philip"

That has to be the Dumbest or the Funniest phrase I've ever seen on The Daily WTF! (or both!)

ROFL, Jim

There are 10 types who have programming jobs:

Those who can program,
Those who can't.

2012-05-21

Lurch:
If someone does 10 billion random tests for 4 billion possible cases -- that's, um... a serious WTF.
Just do 4 billion directed tests and guarantee 100% coverage.

That's all I meant to say.

Nope. That assumes that the PRNG is only called once per transaction (or whatever the main loop is doing). If it is called twice you need to square that (thrice and you cube it, etc). And even then you might miss problems caused by leftover values from one iteration affecting the next.

Example (though not of a PRNG): One of my employer's computers had a bug where if a Procedure Exit instruction was preceded by one particular innocuous instruction, the exit would, well, not exit properly. No amount of individual instruction testing would have caught the bug. Some machine state was left weird, and it only affected the Procedure Exit instruction.

A PRNG example: a friend of mine was using a PRNG to initially place stars in a volume of space for a simulation of galaxy evolution. The PRNG was highly thought of, and passed many tests with flying colors (die, dice, poker hands, and other tests). When he took a stereoscopic picture of the resulting array he got what he called a "cosmic potato chip" instead of a spherical structure. It turned out that the PRNG had a known but undocumented weakness: sets of triplet values showed non-random behavior. But it was fine for anything else! (or so they said).
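The comment does not name the generator. As an illustration only, RANDU is a well-known generator with exactly this kind of defect, and the sketch below assumes nothing more than its published recurrence: every consecutive triple of outputs obeys a fixed linear relation, so points built from triples line up on planes.

```python
MODULUS = 2 ** 31
MULTIPLIER = 65539  # 2**16 + 3


def randu(seed, count):
    # RANDU: x[k+1] = 65539 * x[k] mod 2**31, started from an odd seed.
    values = []
    x = seed
    for _ in range(count):
        x = (MULTIPLIER * x) % MODULUS
        values.append(x)
    return values


values = randu(seed=1, count=1000)

# Every consecutive triple satisfies x[k+2] = 6*x[k+1] - 9*x[k] (mod 2**31),
# so 3-D points built from triples fall on a small number of planes.
triples_ok = all(
    (values[k + 2] - 6 * values[k + 1] + 9 * values[k]) % MODULUS == 0
    for k in range(len(values) - 2)
)
print(triples_ok)
```

Tests that look at single values or pairs do not see this relation, which fits the story of a generator that passed many tests and still produced a flat structure in three dimensions.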

An excellent PRNG is hard to find. Bad ones are easy and common.

Jim

and just whose opinions did you think these were?

2012-05-21

Fool:
I do things like 199+6 instead of 205 on occasion.
2 reasons:

I am often doing math where I might look at the code one day and say, why the heck did I put 205 there? But looking around I see the numbers 199 and 6 somewhere else in the code.

199+6 compiles to 205 so it should be equivalent to 205 in the binary file that is generated.

I'll also write "1000 * 60 * 10" to define 10 minutes in milliseconds, rather than "600000".

"199 + 6" could have been a left-over of some test code that ended up being kept, or it could relate to something that Mr. smart-ass Ryan failed to mention.

Ryan, honey, is that seriously the worst piece in that whole codebase that you were able to find, and you absolutely and desperately just had to find something to ridicule? Dumb-ass...

2012-05-23

There's a HUGE difference between a "199 + 6" with nothing other than "Sanity Check" as a comment, and "1000 * 60 * 10" with a comment indicating something like "ten minutes, in ms".

The latter is clear and understandable, the former is not.
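A small sketch of the difference, in Python rather than the article's language, with made-up constant names. It also shows, for CPython at least, that the spelled-out arithmetic is folded into a single constant at compile time.

```python
# Spelled out so the reader can see where the number comes from.
TEN_MINUTES_MS = 1000 * 60 * 10  # ten minutes, in milliseconds

# Same value, but the reader has to reverse-engineer it.
TIMEOUT_MS = 600000

# CPython folds the arithmetic when it compiles the expression,
# so the spelled-out form costs nothing at run time.
folded = compile("1000 * 60 * 10", "<example>", "eval")
print(600000 in folded.co_consts)
```

The folding is an implementation detail of the compiler in use; the readability argument holds either way.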

And there's no need to resort to name calling - While I freely admit to being a smart-ass, I don't particularly appreciate being called "honey" or "dumb-ass".

So lay off, if you will. There was plenty in the code base that was questionable, but that was a single line of code (2 with the comment) that made me bust out laughing in a slightly incredulous "This makes me so sad that I can only laugh to keep my sanity" type of laugh.

Captcha: nisl (National Institute of Silly Lexicon)

Sanity Check
