The Daily WTF: Curious Perversions in Information Technology

QJo · 2012-05-17

Christian:
in my opinion it makes perfect sense to encapsulate something like a random-number generator in your own class, this way you can replace this component with one that returns a "random" number which you can control in your unit-tests.

Maybe not so perfect. So, in your unit tests, and other automatic regression tests, you've got a little flag which tells the app whether it's being run in "test" mode. If it is, then path A is negotiated. If not, then path B.

The predictable happens, and there's a bug in path B. It's your job now to go and explain to the boss why this was not caught by the regression tests.

Not so unusual. This happened once, on my watch, to my own and my team's acute embarrassment. It's a common mistake to make. Only this week a colleague suggested that we embark on a similar strategy. My suggestion is that if you have designed an application which can not be tested, then it needs to be re-engineered so that it can.

Ultimately it's an engineering decision: do we spend extra effort on making it bulletproof, or do we risk something coming burrowing out of the woodwork later to bite us on the ankles? Your call.

2012-05-17

like the System.Environment.Exit calls peppered throughout library code that cause the app to inexplicably exit.

I'm no .NET expert, but exits caused by a function called "Exit" seem perfectly explicable to me...

Matt Westwood · 2012-05-17

Gazzonyx:
TRWTF is that there isn't a comment on the magic numbers there. Not that there is anything wrong with something like this, but it should be documented. For instance, this is one of my commits to Samba:
[...]
/* domain/username%password */

const int max = MAX_DOMAIN_SIZE +
            MAX_USERNAME_SIZE +
            MOUNT_PASSWD_SIZE + 2;
[...]

So what's the fucking 2 for, you stupid prick?
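For what it's worth, the comment above the declaration (domain/username%password) suggests the 2 covers the two separator characters, '/' and '%'. A minimal Java sketch of giving that term a name; the class name and the three size values are hypothetical placeholders, not Samba's real limits:

```java
// Hypothetical sizes for illustration only; not the values Samba uses.
final class CredentialBuffer {
    static final int MAX_DOMAIN_SIZE = 64;
    static final int MAX_USERNAME_SIZE = 32;
    static final int MOUNT_PASSWD_SIZE = 128;

    // One '/' between domain and username, one '%' before the password.
    static final int SEPARATOR_COUNT = 2;

    // Longest possible "domain/username%password" string.
    static final int MAX_CREDENTIAL_LENGTH =
            MAX_DOMAIN_SIZE + MAX_USERNAME_SIZE + MOUNT_PASSWD_SIZE + SEPARATOR_COUNT;

    private CredentialBuffer() {}

    // Builds the string in the layout the original comment describes.
    static String join(String domain, String username, String password) {
        return domain + "/" + username + "%" + password;
    }
}
```

With a named term the arithmetic still lives in the declaration, but nobody has to guess what the 2 stands for.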

2012-05-17

Severity One:
Christian:
in my opinion it makes perfect sense to encapsulate something like a random-number generator in your own class, this way you can replace this component with one that returns a "random" number which you can control in your unit-tests.
Would you mind giving an example where a non-random random number makes sense?

this way you can replace this component with one that returns a "random" number which you can control in your unit-tests

2012-05-17

Tyler:
Looks like a normal thing to do. He has a fixed length string of 199 characters that has 6 chars appended to the end.
As for "within a class" sounds good, better than global, I've even done such things as:

//Tightly-Scope { int bPtr=0,ePtr=0; ... } (No if statement/while block etc. to set off the block).

Yup, not even remotely a WTF.

QJo · 2012-05-17

Not sure if Fry or just Philip:
Since there seems to be a misunderstanding, here's my point: Don't rely on the compiler to do your math. Show what you meant in the comment, and store the calculated result in the const.

Certainly not.

The whole point of using the compiler to do your arithmetic is so that if one of the terms in your calculation changes you just have to change it in that one place.

This is particularly important where one of the numbers is used in more than one place, and there is a multiplicity of constants which depend on its value. And if that is not the case now, then it may possibly be the case in the future.

2012-05-17

Zylon:
Vamp:
I have done that with constants before. For example timeouts or similar:
const int TWO_MINUTE_CONSTANT = 2 * 60;

Makes it easier to read for me later.
So, in case at some point in the future you forget what "TWO MINUTES" means?

No, so that it is clear where the constant came from.

As others have said on here a bazillion times, a constant like TWO_MINUTE_CONSTANT is a bad name, only slightly better than "const int TEN = 10". A more likely name would be, say, "CACHE_TIMEOUT". So suppose I see "CACHE_TIMEOUT=4*60*60". That clearly tells me that we're counting 4 of something that normally comes in packets of 60 which in turn come in packets of 60. I can quickly guess that this likely means 4 hours. But what if I see "CACHE_TIMEOUT=14400"? Where did that number come from? Is it seconds, milliseconds, hours? Or is it not a time at all, maybe it's the maximum number of entries to leave in the cache when the timeout expires. Etc etc. It's little pieces of self-documentation like that that make the difference between easy-to-read code and hard-to-read code.

Matt Westwood · 2012-05-17

Jay:
Zylon:
Vamp:
I have done that with constants before. For example timeouts or similar:
const int TWO_MINUTE_CONSTANT = 2 * 60;

Makes it easier to read for me later.
So, in case at some point in the future you forget what "TWO MINUTES" means?

No, so that it is clear where the constant came from.

As others have said on here a bazillion times, a constant like TWO_MINUTE_CONSTANT is a bad name, only slightly better than "const int TEN = 10". A more likely name would be, say, "CACHE_TIMEOUT". So suppose I see "CACHE_TIMEOUT=4*60*60". That clearly tells me that we're counting 4 of something that normally comes in packets of 60 which in turn come in packets of 60. I can quickly guess that this likely means 4 hours. But what if I see "CACHE_TIMEOUT=14400"? Where did that number come from? Is it seconds, milliseconds, hours? Or is it not a time at all, maybe it's the maximum number of entries to leave in the cache when the timeout expires. Etc etc. It's little pieces of self-documentation like that that make the difference between easy-to-read code and hard-to-read code.

Almost there - CACHE_TIMEOUT_SECONDS would be better. Then the obviousness is even more blinding.
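Putting the two suggestions together (spell out the arithmetic, put the unit in the name) might look like the Java sketch below. CacheConfig is a hypothetical class name, and the java.time.Duration variant is an alternative that nobody in the thread proposed:

```java
import java.time.Duration;

final class CacheConfig {
    // 4 hours, written so the derivation and the unit are both visible.
    static final int CACHE_TIMEOUT_SECONDS = 4 * 60 * 60;

    // Alternative: let the type carry the unit instead of the name.
    static final Duration CACHE_TIMEOUT = Duration.ofHours(4);

    private CacheConfig() {}
}
```

Either way the reader never has to wonder whether 14400 is seconds, milliseconds or a cache size.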

wbrianwhite · 2012-05-17

Matthew:
Repeatability testing.
If I'm testing a simulation program, I may want to run the simulation under the same set of constraints repeatedly for testing, but use random numbers in production.

So you don't want to test your production code, got it.

2012-05-17

Any procedurally generated content. A good example is Perlin Noise. It's best to have a random number generator that's repeatable (so no physical sources such as various forms of radiation) so you can recreate the object, image, etc. at a later date.
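A small Java sketch of the repeatability being described. This is not Perlin noise, just a row of seeded pseudo-random heights; NoiseRow and its parameters are made up for the example. It relies on java.util.Random producing the same sequence for the same seed:

```java
import java.util.Random;

final class NoiseRow {
    private NoiseRow() {}

    // Generates `length` pseudo-random heights in [0, 100) from a seed.
    // The same seed always yields the same row, so storing the seed is
    // enough to recreate the content later.
    static int[] generate(long seed, int length) {
        Random rng = new Random(seed);
        int[] heights = new int[length];
        for (int i = 0; i < length; i++) {
            heights[i] = rng.nextInt(100);
        }
        return heights;
    }
}
```

Calling NoiseRow.generate(42L, 16) twice gives two identical arrays, which a physical entropy source could not promise.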

2012-05-17

"I don't even want to know why a constant is declared inside a class"

Umm ... maybe so that the constant is associated with the class that it relates to? I thought that was one of the key ideas of object-oriented programming: Group related code and data together.

Suppose I have, let us say, an "Integer" object. And I want to have a constant for the maximum value that can be stored in an Integer. Where should I logically put this constant? In some grab-bag place that includes hundreds of other constants used all over the system? Or in the Integer class?

If I put it in the Integer class, then anyone looking at the Integer class has a good chance of finding it. If we adopt this as a general practice, then anyone looking for constants related to Integers knows exactly where to look. Anyone who notices this constant in the Integer class can reasonably expect that it relates to Integers and not to something else, and furthermore that it relates to Integers in general, and not to a particular Integer used in some other class or set of classes.

If I put the constant in a grab-bag of globals, none of the above is true. With all constants in there, there's no easy way to tell what constants relate to what classes. I suppose we could establish a naming convention, like all constants must have names beginning with the class name that they are related to. But you know that people will forget or ignore such a rule. Having unrelated things in one place makes anything harder to find. Etc etc.

Oh, and by the way, in modern languages, putting constants in the class they relate to gives an extra bit of self-documentation. If I see code that says, say, "if (x==READ) ...", I might well wonder if this means that we are presently reading, that a file is readable, that the last action was a read, etc. But if I see "if (x==Permissions.READ) ...", that gives me a much better idea what's going on.
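A minimal Java sketch of that last point. The numeric values and the describe helper are invented for the illustration; Java's own Integer.MAX_VALUE is a real instance of the same pattern:

```java
final class Permissions {
    // Hypothetical values; only the names matter for the example.
    static final int READ = 1;
    static final int WRITE = 2;

    private Permissions() {}

    // The qualified name at the call site says what kind of READ this is.
    static String describe(int x) {
        if (x == Permissions.READ) {
            return "read";
        }
        if (x == Permissions.WRITE) {
            return "write";
        }
        return "unknown";
    }
}
```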

wbrianwhite · 2012-05-17

Frank:
snoofle:
Severity One:
Christian:
in my opinion it makes perfect sense to encapsulate something like a random-number generator in your own class, this way you can replace this component with one that returns a "random" number which you can control in your unit-tests.
Would you mind giving an example where a non-random random number makes sense?
If you want to write a test where you need predictable results, then you'll need to know the "random" inputs in advance.
Almost got there snoofle.
Everyone should know the answer to this question. If you don't, you're Doing It Wrong. (Or never worked with random numbers.)

Think about your workflow. You write software. You test it. It goes live. Someone finds a bug. You fix it. You test it again.

What is the goal of that last test?

Hmmm?

Are you sure?

If you answered "to confirm the bug is fixed" you lose. Well you get half points, which is 50%, which is an F in most classes.

The answer is "to confirm the bug is fixed and to ensure no new bug is introduced, or other old bug came back".

How do you do that? You save the results of the "before fix" test and compare them with the "after fix" test. Only one thing should have changed. The badness should now be goodness. No other new badness.

Now, if you have random numbers in the mix, most likely every run will be different. So how can you compare the output? Huh??

See, if you'd ever done proper testing against random numbers, you'd know you have to generate a known repeatable random number (or sequence) during this type of testing.

So 12321232 causes a problem, add a unit test that generates 12321232 every time.
Next month, 098098908 causes a problem, add a unit test that generates 098098908 every time. Repeat ad infinitum. Never have a test suite that runs N times generating actual random numbers every time, because that will only find actual bugs, rather than being repeatable. That doesn't seem right.

You know, every time your browser opens an HTTPS connection it sends a random number to the server to be used in establishing the connection. If you were unit testing the TLS protocol, would you generate the same number every time? It would miss the thrust of what you were trying to test, wouldn't it?

wbrianwhite · 2012-05-17

Zylon:
Vamp:
I have done that with constants before. For example timeouts or similar:
const int TWO_MINUTE_CONSTANT = 2 * 60;

Makes it easier to read for me later.
So, in case at some point in the future you forget what "TWO MINUTES" means?

No. So that you can redefine two minutes to 1 minute. It makes your application twice as fast of course.

2012-05-17

QJo:
Not sure if Fry or just Philip:
Since there seems to be a misunderstanding, here's my point: Don't rely on the compiler to do your math. Show what you meant in the comment, and store the calculated result in the const.

Certainly not.

The whole point of using the compiler to do your arithmetic is so that if one of the terms in your calculation changes you just have to change it in that one place.

This is particularly important where one of the numbers is used in more than one place, and there is a multiplicity of constants which depend on its value. And if that is not the case now, then it may possibly be the case in the future.

And, I might add, to make it easier to track down errors.

Suppose I see

// max lifetime is 4 hours
public final static int LIFETIME_SECONDS = 4 * 60 * 50;

It's pretty likely that the "50" is a typo and should have been "60".

But suppose instead it says

// max lifetime is 4 hours
public final static int LIFETIME_SECONDS = 12000;

Well, 4 hours is not 12,000 seconds. But is the error that the programmer made an arithmetic mistake? Or that the time limit changed and no one updated the comment?

Sure, just writing "expectedlength = 199 + 6" is cryptic because I have no clue where those numbers came from. But

// length = data + header
public final static int EXPECTED_LENGTH = 199 + 6;

is excellent coding style. I see exactly what's happening, and if one of those numbers changes, you just change the thing that changed and let the compiler re-do the calculation. Sure, with just two numbers it's not a big deal, but I've had times when I've had things more like:

// length = header length + number of packets * packet length + trailer length
public final static int LEN = 22 + 4 * 80 + 40;

Now if we up the number of packets to 5, we just change the 4 to a 5. We don't redo any manual arithmetic and introduce the possibility of an arithmetic error. We don't have to update the comment to keep it synchronized with the code. Lots of good things.

KattMan · 2012-05-17

Why does it not surprise me that so many here have no idea how to really unit test?

A random number generator should be replaced with a repeatable series generator when testing the code that uses its results, but then you do still need to test the randomness of the generator in a separate test.

Now randomness is usually not needed in business applications, but maybe there are a few cases that need it. In those cases I'm sure it is not a case of "What happens when this particular number is generated?" but rather "What happens when the same number is generated back to back?" You can't test this using the generator directly; remember you aren't testing the generator here, but rather the code that uses the results.

In other apps, let's say game simulators, you may do something specific with the random number generated, but then you are usually using a small generation range, like 1-6 or maybe 1-52 for card sets. In this you want a test that passes each value into the test for the code that uses the result.

In neither case are you testing the randomness of the generator; you are testing the code that uses the results. The random number generator for production should be tested separately to make sure it reaches a good level of randomness for your usage.

Why can't people understand this?
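A rough Java sketch of the "repeatable series generator" idea, under the assumption that the code under test takes its generator through a small interface. Every name here (RandomSource, JdkRandomSource, SequenceSource, Dice) is hypothetical:

```java
import java.util.Random;

// The seam: code under test only ever sees this interface.
interface RandomSource {
    int nextInt(int bound); // value in [0, bound)
}

// Production implementation: delegates to java.util.Random.
final class JdkRandomSource implements RandomSource {
    private final Random rng = new Random();

    public int nextInt(int bound) {
        return rng.nextInt(bound);
    }
}

// Test implementation: replays a fixed series, wrapping around at the end.
final class SequenceSource implements RandomSource {
    private final int[] values;
    private int next = 0;

    SequenceSource(int... values) {
        this.values = values.clone();
    }

    public int nextInt(int bound) {
        int value = values[next % values.length];
        next++;
        return value % bound;
    }
}

// The code under test.
final class Dice {
    private final RandomSource source;

    Dice(RandomSource source) {
        this.source = source;
    }

    int roll() {
        return source.nextInt(6) + 1;
    }

    // True when two consecutive rolls show the same face.
    boolean rollDoubles() {
        return roll() == roll();
    }
}
```

A test can then feed new SequenceSource(0, 1, 2, 3, 4, 5) to walk every face from 1 to 6, or new SequenceSource(3, 3) to force the same number back to back, while production wires in JdkRandomSource. Whether the production generator is random enough is a separate test.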

KattMan · 2012-05-17

Jay:
// length = header length + number of packets * packet length + trailer length
public final static int LEN = 22 + 4 * 80 + 40;
Now if we up the number of packets to 5, we just change the 4 to a 5. We don't redo any manual arithmetic and introduce the possibility of an arithmetic error. We don't have to update the comment to keep it synchronized with the code. Lots of good things.

I would argue for more constants:

public final static int header_length = 22;
public final static int number_of_packets = 4;
public final static int packet_length = 80;
public final static int trailer_length = 40;

public final static int LEN = header_length + number_of_packets * packet_length + trailer_length;

better documentation and if any of these values are used elsewhere you only change one place. Also if any are not to be constant the values are easily changed without changing other code.

2012-05-17

KattMan:
Jay:
// length =

I would argue for more constants:

public final static int header_length = 22;
public final static int number_of_packets = 4;
public final static int packet_length = 80;
public final static int trailer_length = 40;

public final static int LEN = header_length + number_of_packets * packet_length + trailer_length;

better documentation and if any of these values are used elsewhere you only change one place. Also if any are not to be constant the values are easily changed without changing other code.

Nope. You've got to serialize all your constants in an XML file.

The tricky part is to reference the file without using any string literals. So we need another XML file that contains the filename of our configuration XML file. Which means we'll need yet another XML file...

2012-05-17

QJo:
Christian:
in my opinion it makes perfect sense to encapsulate something like a random-number generator in your own class, this way you can replace this component with one that returns a "random" number which you can control in your unit-tests.
My suggestion is that if you have designed an application which can not be tested, then it needs to be re-engineered so that it can.

He did that. He made a wrapper for his random number generator that could choose between pseudo-random numbers, or a predetermined sequence. (Why the original comment has very little to do with the article has already been addressed.)

And by the way, there's no reason to call the predetermined sequence "test" mode. Some tests would use it, but some should use the pseudo-random number generator. They just test different things.
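To illustrate "they just test different things", here is a hedged Java sketch built around a Fisher-Yates shuffle, which is my choice of example and not something from the article. Shuffler and AlwaysZero are hypothetical names. The predetermined source pins down one exact output; the pseudo-random source checks a property that must hold for any output:

```java
import java.util.Arrays;
import java.util.Random;

final class Shuffler {
    private Shuffler() {}

    // Fisher-Yates shuffle driven by whatever Random the caller supplies.
    static int[] shuffled(int[] input, Random rng) {
        int[] a = input.clone();
        for (int i = a.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            int tmp = a[i];
            a[i] = a[j];
            a[j] = tmp;
        }
        return a;
    }

    // Property check: the result holds exactly the same elements as the input.
    static boolean isPermutation(int[] original, int[] result) {
        int[] x = original.clone();
        int[] y = result.clone();
        Arrays.sort(x);
        Arrays.sort(y);
        return Arrays.equals(x, y);
    }
}

// Predetermined source (test helper): every pick is index 0.
final class AlwaysZero extends Random {
    @Override
    public int nextInt(int bound) {
        return 0;
    }
}
```

With AlwaysZero the input {1, 2, 3, 4} always comes back as {2, 3, 4, 1}, so a regression shows up as an exact mismatch. With a plain new Random() the exact order is unknown, but isPermutation must still return true on every run.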

KattMan · 2012-05-17

Anna Moose:
The tricky part is to reference the file without using any string literals. So we need another XML file that contains the filename of our configuration XML file. Which means we'll need yet another XML file...

Hell yeah, employed for life and never finishing, sounds like the perfect setup for this economy.

dkf · 2012-05-17

Vamp:
I have done that with constants before. For example timeouts or similar:
const int TWO_MINUTE_CONSTANT = 2 * 60;
Makes it easier to read for me later. Maybe it was something similar for those 2 values :)

Plus all the bonus fun you get when some manager decides that the timeout now has to be 5 minutes, so you update that to read:

const int TWO_MINUTE_CONSTANT = 5 * 60;

The moral is “use the meaning in names, not the value”…

The Great Lobachevsky · 2012-05-17

Speaking of sanity - am I the only one whose page up and page down keys inexplicably aren't working on this site anymore? (granted, I'm forced to use IE7, which causes all sorts of weird things)

2012-05-17

dkf:
Vamp:
I have done that with constants before. For example timeouts or similar:
const int TWO_MINUTE_CONSTANT = 2 * 60;
Makes it easier to read for me later. Maybe it was something similar for those 2 values :)
Plus all the bonus fun you get when some manager decides that the timeout now has to be 5 minutes, so you update that to read:
const int TWO_MINUTE_CONSTANT = 5 * 60;
The moral is “use the meaning in names, not the value”…

Who ever said we were using it directly as a timeout?

const int TWO_MINUTE_CONSTANT = 2 * 60;
const int FIVE_MINUTE_CONSTANT = 5 * 60;
...
int currentTimeout = FIVE_MINUTE_CONSTANT;

2012-05-17

enim is Latin for "truly"

2012-05-17

Dan Minjoc:
Which would you rather have?
const uint32 NUM_SEC_IN_DAY = 60 * 60 * 24;
const uint32 NUM_SEC_IN_DAY = 86400;

Both are correct, but the first one shows exactly where the calculation is coming from.

I agree with you, but this is an unfortunate example. Some days have 23 hours, other days have 25.

That fact has bit me often enough that I've made a point of watching out for it. :/
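A short Java sketch of the 23/25-hour day, using java.time. DayLength is a hypothetical helper; the zone and dates are my choice of example (the 2012 US daylight-saving transitions in America/New_York):

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.ZoneId;

final class DayLength {
    private DayLength() {}

    // Elapsed time between local midnight on `date` and the next local midnight.
    static Duration of(LocalDate date, ZoneId zone) {
        return Duration.between(date.atStartOfDay(zone),
                                date.plusDays(1).atStartOfDay(zone));
    }
}
```

In America/New_York, 2012-03-11 comes out as 23 hours and 2012-11-04 as 25 hours, while any date in UTC is 24. So 60 * 60 * 24 is fine as "seconds in a nominal day" but not as "seconds until the same wall-clock time tomorrow".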

shadowman · 2012-05-17

Jack:
dkf:
Vamp:
I have done that with constants before. For example timeouts or similar:
const int TWO_MINUTE_CONSTANT = 2 * 60;
Makes it easier to read for me later. Maybe it was something similar for those 2 values :)
Plus all the bonus fun you get when some manager decides that the timeout now has to be 5 minutes, so you update that to read:
const int TWO_MINUTE_CONSTANT = 5 * 60;
The moral is “use the meaning in names, not the value”…
Who ever said we were using it directly as a timeout?
const int TWO_MINUTE_CONSTANT = 2 * 60;
const int FIVE_MINUTE_CONSTANT = 5 * 60;
...
int currentTimeout = FIVE_MINUTE_CONSTANT;

I re-factored this for you to make it more readable:

#define TIMES(A,B) A*B

const int ONE = 1;
const int TWO = 2;
.
.
.

const int SECONDS_IN_MINUTE = 60;
const int ONE_MINUTE_CONSTANT = TIMES(ONE,SECONDS_IN_MINUTE);
const int TWO_MINUTE_CONSTANT = TIMES(TWO,SECONDS_IN_MINUTE);

Speakerphone Dude · 2012-05-17

Severity One:
Christian:
in my opinion it makes perfect sense to encapsulate something like a random-number generator in your own class, this way you can replace this component with one that returns a "random" number which you can control in your unit-tests.
Would you mind giving an example where a non-random random number makes sense?

Typically for testing boundary conditions (such as 0).

Speakerphone Dude · 2012-05-17

The Great Lobachevsky:
Speaking of sanity - am I the only one whose page up and page down keys inexplicably aren't working on this site anymore? (granted, I'm forced to use IE7, which causes all sorts of weird things)

Did this problem appear before or after you spilled your coffee on the keyboard?

Speakerphone Dude · 2012-05-17

Anna Moose:
The tricky part is to reference the file without using any string literals. So we need another XML file that contains the filename of our configuration XML file. Which means we'll need yet another XML file...

Duh, this is why God created environment variables

2012-05-17

Speakerphone Dude:
Anna Moose:
The tricky part is to reference the file without using any string literals. So we need another XML file that contains the filename of our configuration XML file. Which means we'll need yet another XML file...

Duh, this is why God created environment variables

I think it's fairly obvious from this reading of The Book of Bullshidia 3:14 that The Lord doesn't believe in your amoral "thinking machines".

And the man did flail his fists
And the keyboard keys did fly
And the Lord did grin
And he said unto his prophet
"Burn all such devices lest they drive your people to madness and damnation"

wbrianwhite · 2012-05-17

KattMan:
Why does it not surprise me that so many here have no idea how to really unit test?
A random number generator should be replaced with a repeatable series generator when testing the code that uses its results, but then you do still need to test the randomness of the generator in a separate test.

Now randomness is usually not needed in business applications, but maybe there are a few cases that need it. In those cases I'm sure it is not a case of "What happens when this particular number is generated?" but rather "What happens when the same number is generated back to back?" You can't test this using the generator directly; remember you aren't testing the generator here, but rather the code that uses the results.

In other apps, let's say game simulators, you may do something specific with the random number generated, but then you are usually using a small generation range, like 1-6 or maybe 1-52 for card sets. In this you want a test that passes each value into the test for the code that uses the result.

In neither case are you testing the randomness of the generator; you are testing the code that uses the results. The random number generator for production should be tested separately to make sure it reaches a good level of randomness for your usage.

Why can't people understand this?

So if you're testing an online poker bot, the best way to test it is with a series of games where the other players always have the exact same hands? Seems like an excellent way to introduce algorithm errors where the algorithm works well in those specific cases but would fail elsewhere. If you want to say that is integration testing's job that is fine, but not many shops have a formal integration testing suite that is actually separate from unit tests. I could write unit tests for my code where my code would work on 4, 9, and 67 and put those in as the test values, but my code could just blow up on 5, 10, and 68 and that test suite would detect no problem.

Also, please define "tested to be sure it generates a good level of randomness". How would you measure this? What is 80% random versus 85% random? If this was actually a quantifiable event, computers wouldn't be stuck with pseudo-random number generators. Generally all you can do is use a PRNG from a crypto library in preference to the universally flawed RAND built-in language functions.

2012-05-17

Jay:
"I don't even want to know why a constant is declared inside a class"
Umm ... maybe so that the constant is associated with the class that it relates to? I thought that was one of the key ideas of object-oriented programming: Group related code and data together.

Suppose I have, let us say, an "Integer" object. And I want to have a constant for the maximum value that can be stored in an Integer. Where should I logically put this constant? In some grab-bag place that includes hundreds of other constants used all over the system? Or in the Integer class?

If I put it in the Integer class, then anyone looking at the Integer class has a good chance of finding it. If we adopt this as a general practice, then anyone looking for constants related to Integers knows exactly where to look. Anyone who notices this constant in the Integer class can reasonably expect that it relates to Integers and not to something else, and furthermore that it relates to Integers in general, and not to a particular Integer used in some other class or set of classes.

If I put the constant in a grab-bag of globals, none of the above is true. With all constants in there, there's no easy way to tell what constants relate to what classes. I suppose we could establish a naming convention, like all constants must have names beginning with the class name that they are related to. But you know that people will forget or ignore such a rule. Having unrelated things in one place makes anything harder to find. Etc etc.

Oh, and by the way, in modern languages, putting constants in the class they relate to gives an extra bit of self-documentation. If I see code that says, say, "if (x==READ) ...", I might well wonder if this means that we are presently reading, that a file is readable, that the last action was a read, etc. But if I see "if (x==Permissions.READ) ...", that gives me a much better idea what's going on.

I'm pretty sure the original post is just poorly worded. Constants in a class are fine and dandy... but the post states that this gem was found inside a class method. So, it would only be in the scope of that one method, not usable from the rest of the class or outside. Important distinction.

ochrist · 2012-05-17

Sanity Clause...

I don't believe in no Sanity Clause!

2012-05-17

Ok... I did read it at least five times and still fail to see what's the WTF here.

It only seems to be a matter of personal preferences of the programmer. Nothing wrong with it.

2012-05-17

Dan Minjoc:
Not sure if Fry or just Philip:
airdrik:
Not sure if Fry or just Philip:
Really guys, use the fucking comments if you want to explain magic numbers.
// length is 199 (because of foo) + 6 (because of bar)
public final static or whatever length = 205;

Even better, use javadoc.

Oh, now that makes perfect sense, thanks!
Since there seems to be a misunderstanding, here's my point: Don't rely on the compiler to do your math. Show what you meant in the comment, and store the calculated result in the const. On the other hand, generic variables like "foo" and "bar" are the most evil variable names ever. While they're perfectly acceptable in short and meaningless examples, they should never, ever appear in production code. Except when they mean something, like spacebar or meter bar.

Disagree. The compiler will automatically evaluate the expressions. Why decrease code readability for no benefit?

Which would you rather have?

const uint32 NUM_SEC_IN_DAY = 60 * 60 * 24;
const uint32 NUM_SEC_IN_DAY = 86400;

Both are correct, but the first one shows exactly where the calculation is coming from.

I agree with this wholeheartedly. It's also easier to catch errors when the math is spelled out like this.

This would probably be caught easily by anyone that's used to seeing the correct number: const uint32 NUM_SEC_IN_DAY = 85400;

this would certainly be caught by anyone with half a brain: const uint32 NUM_SEC_IN_DAY = 60 * 60 * 23;

Write your code so it's easier to spot problems. Your brain can do way more work at once when it doesn't have to concentrate as hard on the simplest things.

2012-05-17

Says Joe:

"You run the test 10 billion times so that you know you're testing every random 32-bit number (to within 5%) that might come up in production.

This also gives the advantage of never finding another problem during your career, and/or inflating the "number of tests passed" metric that you get a bonus on."

10 billion times? But there are only 4 billion 32 bit numbers (float or integer, doesn't matter, we are counting bit patterns)! So, you get OVER 200% coverage.

2012-05-17

wbrianwhite:
KattMan:
Why does it not surprise me that so many here have no idea how to really unit test?
A random number generator should be replaced with a repeatable series generator when testing the code that uses its results, but then you do still need to test the randomness of the generator in a separate test.

Now randomness is usually not needed in business applications, but maybe there are a few cases that need it. In those cases I'm sure it is not a case of "What happens when this particular number is generated?" but rather "What happens when the same number is generated back to back?" You can't test this using the generator directly; remember you aren't testing the generator here, but rather the code that uses the results.

In other apps, let's say game simulators, you may do something specific with the random number generated, but then you are usually using a small generation range, like 1-6 or maybe 1-52 for card sets. In this you want a test that passes each value into the test for the code that uses the result.

In neither case are you testing the randomness of the generator; you are testing the code that uses the results. The random number generator for production should be tested separately to make sure it reaches a good level of randomness for your usage.

Why can't people understand this?

So if you're testing an online poker bot, the best way to test it is with a series of games where the other players always have the exact same hands? Seems like an excellent way to introduce algorithm errors where the algorithm works well in those specific cases but would fail elsewhere. If you want to say that is integration testing's job that is fine, but not many shops have a formal integration testing suite that is actually separate from unit tests. I could write unit tests for my code where my code would work on 4, 9, and 67 and put those in as the test values, but my code could just blow up on 5, 10, and 68 and that test suite would detect no problem.

Also, please define "tested to be sure it generates a good level of randomness". How would you measure this? What is 80% random versus 85% random? If this was actually a quantifiable event, computers wouldn't be stuck with pseudo-random number generators. Generally all you can do is use a PRNG from a crypto library in preference to the universally flawed RAND built-in language functions.

Firstly, there are ways of testing random numbers. Second, you have a very poor understanding of regression testing. Third, I don't think you understand the concept of "universal". And, how would you know the quality of the "PRNG for a crypto library" anyway? (given you don't know how to test random number sequences). Indeed, (and this IS meant to be insulting), your comment is the real WTF here.
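One standard way to put a number on "a good level of randomness" is Pearson's chi-squared test for uniformity, sketched below in Java. It is only a frequency check (it says nothing about correlations between successive values, which fuller test batteries also look at), and UniformityCheck and StuckGenerator are hypothetical names invented for the illustration:

```java
import java.util.Random;

final class UniformityCheck {
    private UniformityCheck() {}

    // Pearson's chi-squared statistic for `rolls` draws from [0, faces).
    // Small values mean the observed counts are close to uniform.
    static double chiSquared(Random rng, int faces, int rolls) {
        long[] counts = new long[faces];
        for (int i = 0; i < rolls; i++) {
            counts[rng.nextInt(faces)]++;
        }
        double expected = (double) rolls / faces;
        double statistic = 0.0;
        for (long observed : counts) {
            double diff = observed - expected;
            statistic += diff * diff / expected;
        }
        return statistic;
    }
}

// Deliberately broken generator, to show what a failure looks like.
final class StuckGenerator extends Random {
    @Override
    public int nextInt(int bound) {
        return 0;
    }
}
```

For a six-sided die there are 5 degrees of freedom, and the usual 5% critical value is about 11.07. A healthy generator should land below that most of the time, while StuckGenerator scores 300000 over 60000 rolls.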

2012-05-17

Say you want to document the number of seconds in a day:

#define SECONDS_IN_MINUTE 60
#define MINUTES_IN_HOUR = 60
#define HOURS_IN_DAY 12

#define DAY_AS_SECONDS SECONDS_IN_MINUTE * MINUTES_IN_HOUR * HOURS_IN_DAY

2012-05-17 Reply Admin

wbrianwhite:
Frank:
snoofle:
Severity One:
Christian:
in my opinion it makes perfect sense to encapsulate something like a random-number generator in your own class, this way you can replace this component with one that returns a "random" number which you can control in your unit-tests.
Would you mind giving an example where a non-random random number makes sense?
If you want to write a test where you need predictable results, then you'll need to know the "random" inputs in advance.
Almost got there snoofle.
Everyone should know the answer to this question. If you don't, you're Doing It Wrong. (Or never worked with random numbers.)

Think about your workflow. You write software. You test it. It goes live. Someone finds a bug. You fix it. You test it again.

What is the goal of that last test?

Hmmm?

Are you sure?

If you answered "to confirm the bug is fixed" you lose. Well you get half points, which is 50%, which is an F in most classes.

The answer is "to confirm the bug is fixed and to ensure no new bug is introduced, or other old bug came back".

How do you do that? You save the results of the "before fix" test and compare them with the "after fix" test. Only one thing should have changed. The badness should now be goodness. No other new badness.

Now, if you have random numbers in the mix, most likely every run will be different. So how can you compare the output? Huh??

See, if you'd ever done proper testing against random numbers, you'd know you have to generate a known repeatable random number (or sequence) during this type of testing.

So 12321232 causes a problem, add a unit test that generates 12321232 every time.
Next month, 098098908 causes a problem, add a unit test that generates 098098908 every time. Repeat ad infinitum. Never have a test suite that runs N times generating actual random numbers every time, because that will only find actual bugs, rather than being repeatable. That doesn't seem right.

You know, every time your browser opens an HTTPS connection you send a random number to the server to be used in establishing the connection. If you were unit testing the TLS protocol, would you generate the same number every time? It would miss the thrust of what you were trying to test, wouldn't it?

wbrianwhite, you and kattman could conceivably both give me apoplexy. That is, if I EVER had to work with you.

Sure, TLS. There are standard vectors for testing. You could test (real) random vectors for the next hundred years, and still not have any real confidence. Put that strawman away.

And, I suspect that, since you (probably) can't crack the code, you would simply run the encrypted material through your decryptor, match the results and call it good?

Here is where I hit you in the head with a heavy object.
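For what it's worth, the "replace the generator with a repeatable sequence" idea being argued over above can be sketched in C roughly like this. All names here (`rng`, `scripted_rng`, `roll_die`) are invented for illustration; nobody in the thread posted this code, and a real project may well prefer a different seam.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical seam: code under test asks this interface for numbers
   instead of calling rand() directly. */
typedef struct rng {
    int (*next)(struct rng *self);
} rng;

/* Production flavour: thin wrapper over the C library's rand(). */
int libc_next(rng *self) { (void)self; return rand(); }

rng make_libc_rng(void)
{
    rng r = { libc_next };
    return r;
}

/* Test flavour: replays a fixed, known sequence and wraps around. */
typedef struct {
    rng base;            /* first member, so an rng* can be cast back */
    const int *values;   /* assumed non-negative, like rand() output */
    size_t count;
    size_t pos;
} scripted_rng;

int scripted_next(rng *self)
{
    scripted_rng *s = (scripted_rng *)self;
    assert(s->count > 0);
    int v = s->values[s->pos];
    s->pos = (s->pos + 1) % s->count;
    return v;
}

/* Code under test: maps a raw number onto a die face 1..6. */
int roll_die(rng *r)
{
    return r->next(r) % 6 + 1;
}
```

A regression test would build a `scripted_rng` around the value that caused the bug (say 12321232) and hand `&s.base` to `roll_die`; production code would pass the result of `make_libc_rng()` instead. This only pins down the code that consumes the numbers, not the quality of the generator.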

2012-05-17 Reply Admin

wbrianwhite:
KattMan:
Why does it not surprise me that so many here have no idea how to really unit test?
A random number generator should be replaced with a repeatable series generator when testing the code that uses its results, but then you do still need to test the randomness of the generator in a separate test.

Now randomness is usually not needed in business applications, but maybe there are a few cases that need it. In those cases I'm sure it is not a case of "What happens when this particular number is generated?" but rather "What happens when the same number is generated back to back?" You can't test this using the generator directly; remember, you aren't testing the generator here, but rather the code that uses the results.

In other apps, let's say game simulators, you may do something specific with the random number generated, but then you are usually using a small generation range, like 1-6 or maybe 1-52 for card sets. In this you want a test that passes each value into the test for the code that uses the result.

In neither case are you testing the randomness of the generator, you are testing the code that uses the results. The random number generator for production should be tested separately to make sure it reaches a good level of randomness for your usage.

Why can't people understand this?

So if you're testing an online poker bot, the best way to test it is with a series of games where the other players always have the exact same hands? Seems like an excellent way to introduce algorithm errors where the algorithm works well in those specific cases but would fail elsewhere.

I was about to say, "that's only really an issue if developers are deliberately coding by kludge to pass tests" but then I realized that part of our team has recently declared that they're going to do exactly that. :-(

2012-05-17 Reply Admin

Kattman

I misread a quote sequence, and I declared that it would give me apoplexy to work with you. I am sorry, that comment should not have been directed to you.

Lurch

Yazeran · 2012-05-17 Reply Admin

Rohan:
Dan Minjoc:
Which would you rather have?
const uint32 NUM_SEC_IN_DAY = 60 * 60 * 24;
const uint32 NUM_SEC_IN_DAY = 86400;

Both are correct, but the first one shows exactly where the calculation is coming from.

I agree with you, but this is an unfortunate example. Some days have 23 hours, other days have 25.

That fact has bit me often enough that I've made a point of watching out for it. :/

Yea, I too hate daylight savings time.....

I have tried for years to make management accept that all servers used for data logging (large government/university lab) run on only UTC, but the response has always been: 'But when I see that X happened at 12.30 on the graph, it should be at 12.30 on my watch too..'

I have given up and now have 2 timestamps in each data line, one unixtime (for me to work with in data processing) and one localtime (with all the hilarity that entails in spring and fall) for data display...
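A two-timestamp data line like that could be produced along these lines. This is only a sketch: the function name, the field order and the `%Y-%m-%d %H:%M:%S` layout are assumptions, not the actual logging format.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Writes "<unixtime> <local wall-clock time> <value>" into buf.
   Returns the number of characters written, or -1 if formatting
   fails or the line does not fit. */
int format_log_line(char *buf, size_t size, time_t when, double value)
{
    assert(buf != NULL && size > 0);

    char local[32];
    struct tm *tm_local = localtime(&when);  /* not thread-safe; fine for a sketch */
    if (tm_local == NULL)
        return -1;
    if (strftime(local, sizeof local, "%Y-%m-%d %H:%M:%S", tm_local) == 0)
        return -1;

    int n = snprintf(buf, size, "%lld %s %.3f", (long long)when, local, value);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Only the first field is safe to sort or subtract on. The local field repeats an hour in the fall and skips one in the spring, so it is kept purely for display.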

Yazeran

Plan: To go to Mars one day with a hammer

Gazzonyx · 2012-05-17 Reply Admin

matt westwood:
gazzonyx:
TRWTF is that there isn't a comment on the magic numbers there. Not that there is anything wrong with something like this, but it should be documented. For instance, this is one of my commits to Samba :
[...]
/* domain/username%password */
const int max = MAX_DOMAIN_SIZE +
            MAX_USERNAME_SIZE +
            MOUNT_PASSWD_SIZE + 2;
[...]

So what's the fucking 2 for, you stupid prick?

The 2 is for the '/' and '%' in the formatted string.
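To make the arithmetic concrete, here is a small sketch of how such a bound might be used. The three limits below are made-up values, not Samba's, and the sketch assumes they count characters without the terminating NUL; `format_credentials` and `CRED_MAX` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical limits, in characters, NOT counting the terminating NUL. */
#define MAX_DOMAIN_SIZE   64
#define MAX_USERNAME_SIZE 32
#define MOUNT_PASSWD_SIZE 128

/* The "+ 2" covers the literal '/' and '%' separators. */
#define CRED_MAX (MAX_DOMAIN_SIZE + MAX_USERNAME_SIZE + MOUNT_PASSWD_SIZE + 2)

/* Formats domain/username%password into out, which must hold
   CRED_MAX + 1 bytes. Returns 0 on success, -1 if a field is too long. */
int format_credentials(char out[CRED_MAX + 1], const char *domain,
                       const char *user, const char *password)
{
    assert(out != NULL && domain != NULL && user != NULL && password != NULL);

    if (strlen(domain) > MAX_DOMAIN_SIZE ||
        strlen(user) > MAX_USERNAME_SIZE ||
        strlen(password) > MOUNT_PASSWD_SIZE)
        return -1;

    /* "%%" emits a single literal '%'. */
    int n = snprintf(out, CRED_MAX + 1, "%s/%s%%%s", domain, user, password);
    return (n < 0 || n > CRED_MAX) ? -1 : 0;
}
```

Whether the real constants already include room for a NUL is not stated in the quoted commit, which is one more reason the comment on the magic number is worth having.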

2012-05-17 Reply Admin

Lurch:
wbrianwhite:
KattMan:
Why does it not surprise me that so many here have no idea how to really unit test?
A random number generator should be replaced with a repeatable series generator when testing the code that uses its results, but then you do still need to test the randomness of the generator in a separate test.

Now randomness is usually not needed in business applications, but maybe there are a few cases that need it. In those cases I'm sure it is not a case of "What happens when this particular number is generated?" but rather "What happens when the same number is generated back to back?" You can't test this using the generator directly; remember, you aren't testing the generator here, but rather the code that uses the results.

In other apps, let's say game simulators, you may do something specific with the random number generated, but then you are usually using a small generation range, like 1-6 or maybe 1-52 for card sets. In this you want a test that passes each value into the test for the code that uses the result.

In neither case are you testing the randomness of the generator, you are testing the code that uses the results. The random number generator for production should be tested separately to make sure it reaches a good level of randomness for your usage.

Why can't people understand this?

So if you're testing an online poker bot, the best way to test it is with a series of games where the other players always have the exact same hands? Seems like an excellent way to introduce algorithm errors where the algorithm works well in those specific cases but would fail elsewhere. If you want to say that is integration testing's job, that is fine, but not many shops have a formal integration testing suite that is actually separate from unit tests. I could write unit tests for my code where my code would work on 4, 9, and 67 and put those in as the test values, but my code could just blow up on 5, 10, and 68 and that test suite would detect no problem.

Also, please define "tested to be sure it generates a good level of randomness". How would you measure this? What is 80% random versus 85% random? If this was actually a quantifiable event, computers wouldn't be stuck with pseudo-random number generators. Generally all you can do is use a PRNG from a crypto library in preference to the universally flawed RAND built in language functions.

Firstly, there are ways of testing random numbers. Second, you have a very poor understanding of regression testing. Third, I don't think you understand the concept of "universal". And, how would you know the quality of the "PRNG for a crypto library" anyway? (given you don't know how to test random number sequences). Indeed, (and this IS meant to be insulting), your comment is the real WTF here.

And the reason computers are "stuck with pseudo-random number generators" is because computers are, by design, deterministic, and not random.

But that only means you're stuck with pseudo-RNGs as long as you're using a purely software solution. There are devices that can generate true random numbers.

http://www.idquantique.com/true-random-number-generator/products-overview.html

KattMan · 2012-05-17 Reply Admin

Jack:
And the reason computers are "stuck with pseudo-random number generators" is because computers are, by design, deterministic, and not random.
But that only means you're stuck with pseudo-RNGs as long as you're using a purely software solution. There are devices that can generate true random numbers.

http://www.idquantique.com/true-random-number-generator/products-overview.html

Exactly, you know this, I know this, Lurch knows this, Brian seems to miss it. You are expecting a reasonable facsimile of randomness. How do you measure it? Well, your specs should tell you how. Let's say you have a generator to give you a number between 1 and 100: if you get 100 different numbers you aren't random, you are picking from a pool and that pool is getting smaller each time. If you get 100 6's, once again you aren't random; if you get the same 10 numbers, you aren't random. Should you expect 85% of the numbers with some duplication? Hell, I don't know what is "good enough" for the situation at hand; there are people better than me that write those things so I don't try to write my own, but if I had to I would expect something along the lines of 85% with duplication and the duplicated numbers perhaps 90% of those are different, or something like that.
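For reference, the "how many different numbers should I see" question has a textbook answer if the generator is assumed ideal (independent, uniform draws). A given value is missed by all n draws with probability (1 - 1/n)^n, so by linearity of expectation the number D of distinct values seen in n draws from 1..n is:

```latex
\mathbb{E}[D] = n\left(1 - \left(1 - \frac{1}{n}\right)^{n}\right),
\qquad
n = 100:\quad \mathbb{E}[D] \approx 100\,(1 - 0.366) \approx 63.4
```

So roughly 63 distinct values out of 100 draws would be typical under that assumption, somewhat fewer than the 85% guessed at above.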

As for the online poker bot, no, you don't need to check for a fully random sequence. You look at the hands possible (4 of a kind, 3 of a kind, etc.) and compare that to other possible hands, with only 13 variances in some of those hands. You check variances against each other, not full deterministic sets, because the algorithm isn't going for "if he has X I need to have Y"; if you are, why even build an algorithm beyond that? Let's say high card: every other hand beats it so no further checks, high card is one of 13 values, easy variance to test against here, and your algorithm test can basically see if it has anything more than a single high card it is winning; only when it has a single high card does the algorithm need to work out what to toss and how many cards to get. Quick test: pass in not a random number, but high card pairs to check itself against.
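With only 13 ranks, the high-card case is small enough to walk exhaustively rather than sample. A rough sketch, with an invented rank encoding and a trivial stand-in for the real comparison logic:

```c
#include <assert.h>

enum { RANK_COUNT = 13 };  /* hypothetical encoding: 0 = deuce ... 12 = ace */

/* Stand-in for the code under test: <0, 0 or >0 like strcmp,
   comparing two high-card hands by their top rank. */
int compare_high_card(int mine, int theirs)
{
    assert(mine >= 0 && mine < RANK_COUNT);
    assert(theirs >= 0 && theirs < RANK_COUNT);
    return mine - theirs;
}

/* Checks every one of the 13 x 13 rank pairs and returns how many
   of them violate the expected ordering or its symmetry. */
int count_ordering_violations(void)
{
    int bad = 0;
    for (int a = 0; a < RANK_COUNT; a++) {
        for (int b = 0; b < RANK_COUNT; b++) {
            int got = compare_high_card(a, b);
            int ok = (a > b) ? got > 0 : (a < b) ? got < 0 : got == 0;
            /* Swapping the two hands must flip the sign of the result. */
            int flipped = compare_high_card(b, a);
            if (!ok || (got > 0) != (flipped < 0))
                bad++;
        }
    }
    return bad;
}
```

All 169 pairs run on every build and always in the same order, so a failure is repeatable without any random input at all.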

wbrianwhite · 2012-05-17 Reply Admin

Lurch:
wbrianwhite:
KattMan:
Why does it not surprise me that so many here have no idea how to really unit test?
A random number generator should be replaced with a repeatable series generator when testing the code that uses its results, but then you do still need to test the randomness of the generator in a separate test.

Now randomness is usually not needed in business applications, but maybe there are a few cases that need it. In those cases I'm sure it is not a case of "What happens when this particular number is generated?" but rather "What happens when the same number is generated back to back?" You can't test this using the generator directly; remember, you aren't testing the generator here, but rather the code that uses the results.

In other apps, let's say game simulators, you may do something specific with the random number generated, but then you are usually using a small generation range, like 1-6 or maybe 1-52 for card sets. In this you want a test that passes each value into the test for the code that uses the result.

In neither case are you testing the randomness of the generator, you are testing the code that uses the results. The random number generator for production should be tested separately to make sure it reaches a good level of randomness for your usage.

Why can't people understand this?

So if you're testing an online poker bot, the best way to test it is with a series of games where the other players always have the exact same hands? Seems like an excellent way to introduce algorithm errors where the algorithm works well in those specific cases but would fail elsewhere. If you want to say that is integration testing's job, that is fine, but not many shops have a formal integration testing suite that is actually separate from unit tests. I could write unit tests for my code where my code would work on 4, 9, and 67 and put those in as the test values, but my code could just blow up on 5, 10, and 68 and that test suite would detect no problem.

Also, please define "tested to be sure it generates a good level of randomness". How would you measure this? What is 80% random versus 85% random? If this was actually a quantifiable event, computers wouldn't be stuck with pseudo-random number generators. Generally all you can do is use a PRNG from a crypto library in preference to the universally flawed RAND built in language functions.

Firstly, there are ways of testing random numbers. Second, you have a very poor understanding of regression testing. Third, I don't think you understand the concept of "universal". And, how would you know the quality of the "PRNG for a crypto library" anyway? (given you don't know how to test random number sequences). Indeed, (and this IS meant to be insulting), your comment is the real WTF here.

Are we talking unit testing, regression testing, or integration testing? All different. All potentially handled by entirely different toolsets. But I think most devs would create a unit test that actually ran N iterations of the code using real random numbers, since that provides a measure of testing what your code is actually supposed to do.

"And, how would you know the quality of the "PRNG for a crypto library" anyway?" - You're right, I'm relying on computer science teachers who tell me that they are better than RAND. I have not actually sat down and run 10 billion iterations and then looked for patterns myself. Please tell me the ways of testing random numbers. And then tell me why I would want to do so, rather than simply using the PRNG from whatever crypto library I am using? To me the real WTF is some unit test trying to test that a crypto PRNG is "random enough". Like you need to test that every month to make sure that the quality of the randomness hasn't declined? And what if you RANDOMLY got a series of numbers that looked non-random? You would raise a red flag on the status of the product? That's absurd.

wbrianwhite · 2012-05-17 Reply Admin

KattMan:
Jack:
And the reason computers are "stuck with pseudo-random number generators" is because computers are, by design, deterministic, and not random.
But that only means you're stuck with pseudo-RNGs as long as you're using a purely software solution. There are devices that can generate true random numbers.

http://www.idquantique.com/true-random-number-generator/products-overview.html

Exactly, you know this, I know this, Lurch knows this, Brian seems to miss it. You are expecting a reasonable facsimile of randomness. How do you measure it? Well, your specs should tell you how. Let's say you have a generator to give you a number between 1 and 100: if you get 100 different numbers you aren't random, you are picking from a pool and that pool is getting smaller each time. If you get 100 6's, once again you aren't random; if you get the same 10 numbers, you aren't random. Should you expect 85% of the numbers with some duplication? Hell, I don't know what is "good enough" for the situation at hand; there are people better than me that write those things so I don't try to write my own, but if I had to I would expect something along the lines of 85% with duplication and the duplicated numbers perhaps 90% of those are different, or something like that.

As for the online poker bot, no, you don't need to check for a fully random sequence. You look at the hands possible (4 of a kind, 3 of a kind, etc.) and compare that to other possible hands, with only 13 variances in some of those hands. You check variances against each other, not full deterministic sets, because the algorithm isn't going for "if he has X I need to have Y"; if you are, why even build an algorithm beyond that? Let's say high card: every other hand beats it so no further checks, high card is one of 13 values, easy variance to test against here, and your algorithm test can basically see if it has anything more than a single high card it is winning; only when it has a single high card does the algorithm need to work out what to toss and how many cards to get. Quick test: pass in not a random number, but high card pairs to check itself against.

99.9% of software uses PRNGs. Yes there are random number generators out there. My favorite being a web cam that takes pictures of a lava lamp. But I have never heard of them being used.

"but if I had to I would expect something along the lines of 85% with duplicagtion and the duplicated numbers perhaps 90% of those are different, or something like that. " Yes, you would expect that but humans are terrible at random. Teachers often give their students an assignment of "flip a coin x times and write the results". The students who make it up are easily distinguished because they try to make it random, and end up making very clear patterns.

wbrianwhite · 2012-05-17 Reply Admin

Lurch:
wbrianwhite:
Frank:
snoofle:
Severity One:
Christian:
in my opinion it makes perfect sense to encapsulate something like a random-number generator in your own class, this way you can replace this component with one that returns a "random" number which you can control in your unit-tests.
Would you mind giving an example where a non-random random number makes sense?
If you want to write a test where you need predictable results, then you'll need to know the "random" inputs in advance.
Almost got there snoofle.
Everyone should know the answer to this question. If you don't, you're Doing It Wrong. (Or never worked with random numbers.)

Think about your workflow. You write software. You test it. It goes live. Someone finds a bug. You fix it. You test it again.

What is the goal of that last test?

Hmmm?

Are you sure?

If you answered "to confirm the bug is fixed" you lose. Well you get half points, which is 50%, which is an F in most classes.

The answer is "to confirm the bug is fixed and to ensure no new bug is introduced, or other old bug came back".

How do you do that? You save the results of the "before fix" test and compare them with the "after fix" test. Only one thing should have changed. The badness should now be goodness. No other new badness.

Now, if you have random numbers in the mix, most likely every run will be different. So how can you compare the output? Huh??

See, if you'd ever done proper testing against random numbers, you'd know you have to generate a known repeatable random number (or sequence) during this type of testing.

So 12321232 causes a problem, add a unit test that generates 12321232 every time.
Next month, 098098908 causes a problem, add a unit test that generates 098098908 every time. Repeat ad infinitum. Never have a test suite that runs N times generating actual random numbers every time, because that will only find actual bugs, rather than being repeatable. That doesn't seem right.

You know, every time your browser opens an HTTPS connection you send a random number to the server to be used in establishing the connection. If you were unit testing the TLS protocol, would you generate the same number every time? It would miss the thrust of what you were trying to test, wouldn't it?

wbrianwhite, you and kattman could conceivably both give me apoplexy. That is, if I EVER had to work with you.

Sure, TLS. There are standard vectors for testing. You could test (real) random vectors for the next hundred years, and still not have any real confidence. Put that strawman away.

And, I suspect that, since you (probably) can't crack the code, you would simply run the encrypted material through your decryptor, match the results and call it good?

Here is where I hit you in the head with a heavy object.

You could test real random vectors, and be testing the actual protocol. Or you could test using the same number every time, and be testing something that was almost TLS, except that it is 100% susceptible to replay attacks which is one of the primary things TLS is designed to protect against. Or you can do both. I would expect to see both done. In the predetermined sequence you know the calculated key you expect to get back using the server's public key and can validate that is correct. In the random sequence you just test that the protocol works and that you establish a shared key, and that at least tests the actual protocol.
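The "do both" approach can be sketched on something much smaller than TLS. Below, `clamp` is just a stand-in for whatever is really under test, and all function names are invented. The one detail that addresses the repeatability objection is that the random run takes an explicit seed and reports it on failure, so a failing run can be replayed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for real code under test: clamps x into [lo, hi]. */
int clamp(int x, int lo, int hi)
{
    return x < lo ? lo : x > hi ? hi : x;
}

/* Known-answer checks: identical inputs on every build, so the results
   of a "before fix" and an "after fix" run can be compared directly. */
int fixed_vectors_pass(void)
{
    return clamp(-5, 0, 10) == 0 &&
           clamp(5, 0, 10) == 5 &&
           clamp(50, 0, 10) == 10;
}

/* Randomised check driven by an explicit seed. Returns the index of the
   first failing iteration, or -1 if every iteration passes. */
int random_run_first_failure(unsigned seed, int iterations)
{
    assert(iterations >= 0);
    srand(seed);
    for (int i = 0; i < iterations; i++) {
        int x = rand() - RAND_MAX / 2;   /* mix of negative and positive inputs */
        int y = clamp(x, 0, 10);
        if (y < 0 || y > 10 || (x >= 0 && x <= 10 && y != x)) {
            /* Report the seed so the exact run can be reproduced later. */
            fprintf(stderr, "seed %u failed at iteration %d (x=%d)\n", seed, i, x);
            return i;
        }
    }
    return -1;
}
```

A seed that turns up a failure can then be promoted into the fixed set, which gives the regression suite its repeatable case while the random runs keep exploring.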

I have seen errors caused when using the repository pattern where the unit test for the code passes, but the actual repository doesn't work. So I always like to see the actual code being exercised.

Am I really so annoying that people want to insult me and hit me in the head? If so, please at least state reasons you think I'm wrong.

2012-05-17 Reply Admin

It's a code obfuscation technique....makes it more difficult for someone who is specifically searching for 205 in a binary...not much more difficult, perhaps, but more difficult all the same.

wbrianwhite · 2012-05-17 Reply Admin

http://en.wikipedia.org/wiki/Randomness_tests Ah yes, good old Kolmogorov complexity. The shortest program to generate the sequence of numbers should look like print('...the sequence of numbers'). I forgot about that. I don't know how you would implement that in a unit test. The http://en.wikipedia.org/wiki/Diehard_tests look like you could definitely unit test. But it still seems like unit testing that opening a database connection returns a database connection. It's plumbing that you need to use but is generally trusted. And if you did test it it would be a one time test, not a unit test you ran every build.
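As a rough idea of what the simplest of these checks looks like in code, here is a Pearson chi-square frequency statistic over bucket counts. It is a sketch of one basic test only, not the Diehard suite, and the function name is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Pearson chi-square statistic for observed bucket counts against a
   uniform expectation: the sum over buckets of (observed - expected)^2 / expected.
   Returns 0.0 when there is nothing to measure. */
double chi_square_uniform(const unsigned long *counts, size_t buckets)
{
    assert(counts != NULL);

    unsigned long total = 0;
    for (size_t i = 0; i < buckets; i++)
        total += counts[i];
    if (buckets == 0 || total == 0)
        return 0.0;

    double expected = (double)total / (double)buckets;
    double stat = 0.0;
    for (size_t i = 0; i < buckets; i++) {
        double d = (double)counts[i] - expected;
        stat += d * d / expected;
    }
    return stat;
}
```

The statistic is then compared against a threshold from a chi-square table for (buckets - 1) degrees of freedom. A genuinely random source will still exceed any such threshold some fraction of the time, which is the "what if you RANDOMLY got a series that looked non-random" problem raised earlier, and a reason to treat this as a one-time qualification rather than an every-build unit test.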

2012-05-17 Reply Admin

Severity One:
Christian:
in my opinion it makes perfect sense to encapsulate something like a random-number generator in your own class, this way you can replace this component with one that returns a "random" number which you can control in your unit-tests.
Would you mind giving an example where a non-random random number makes sense?

He just did - when you're testing and you want to test how something relying on the generator acts for a specific value (that is, when you're testing code that relies on the random number, not the generation of the random number itself).

Sanity Check
