Admin
From GRG, the twelve problems with the last line ...
(1) Floating point numbers can't give an exact result when the numbers involved are basically transcendental and can't be represented exactly in a finite number of bits, so using the log function is basically foobar.
(2) FORTRAN has a double-precision function "dlog" which will give better results.
(3) "range" is a real number, which on most computers can't represent the full range of an integer.
(4) Adding a fudge factor of "1" is a kludge, which helps a bit in some cases but also overestimates the answer by one in other cases.
(5) That should be "1.0" anyway.
(6) Coercing "range + 1" to real is unnecessary.
(7) It should be "1.0D0" to ensure a double-precision result.
(8) In a few instances the argument to log() isn't constrained to be positive. Taking the log of a negative number or zero is undefined and results in a run-time error. Kaboom.
(9) Dividing two inexact numbers will result in even more inexactitude.
(10) Converting the result to log base two by dividing by log(2) is correct, but evaluating log base e of two will give an inexact result, as that number is transcendental.
(11) And that should be "2.0" or better yet "2.0D0", or better yet, a variable holding the computed value.
(12) Taking the int() of the real or double result is WRONG, as the result will sometimes be a few bits shy of the correct value. For example, if you plug in "256" the answer is about 7.99981672, which when you int() it becomes seven, which is off by one.
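To see the damage for yourself, here's a quick harness (a sketch in C, with logf standing in for single-precision alog) that compares the one-liner against a plain shift count; how many values it flags depends on your math library:

#include <math.h>
#include <stdio.h>

int main(void) {
    unsigned bad = 0;
    for (unsigned n = 1; n <= (1u << 20); n++) {
        int formula = (int)(logf((float)n) / logf(2.0f)) + 1;
        int actual = 0;                 /* true bit count of n */
        for (unsigned v = n; v; v >>= 1) actual++;
        if (formula != actual && bad++ < 10)
            printf("n = %u: formula says %d, needs %d\n", n, formula, actual);
    }
    printf("%u mismatches out of 2^20 values\n", bad);
    return 0;
}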
Admin
(5) That should be "1.0" anyway. (7) It should be "1.0D0" to ensure a double precision result.
I don’t know Fortran, but ... isn’t that cheating?
Admin
(13) The number of bits it takes to represent a positive integer n >= 1 is floor(log_2(n)) + 1, not floor(log_2(n+1)). So the function is wrong even without floating-point rounding errors.
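For example, n = 256: floor(log_2(256)) + 1 = 8 + 1 = 9 bits (100000000 in binary), whereas floor(log_2(256 + 1)) = floor(8.0056...) = 8.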
Admin
You shouldn't use floats at all. What is wrong with ordinary shifting to count bits?
Expressed in C since I don't know fortran:
Admin
Gah, obviously it should have been:
nbits = 0;
Also, signedness has to be taken into account.
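Put together, the idea might look like this (a sketch, not the lost original; casting to unsigned sidesteps the sign-bit problem, and negating as unsigned is well-defined even for INT_MIN):

int bit_count(int n) {
    /* work on the magnitude as an unsigned value */
    unsigned u = (n < 0) ? 0u - (unsigned)n : (unsigned)n;
    int nbits = 0;
    while (u) {
        u >>= 1;
        nbits++;
    }
    return nbits;   /* 0 when n == 0 */
}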
Admin
That's probably why they started saying log(abs(...)+1). Wrong solution, anyone?
Admin
For someone who confuses the numbers '2022' and '2048', and seems to shift back and forth between expressing percentages as decimals and whole numbers, the original reporter sure is picky about math.
(I kid, I kid.)
Admin
ugh, don't make me think.
Admin
Alex didn't get problem #12 right. To represent 256 you need 9 bits, not eight, so the answer is off by two.
Something like this would fix it.
// Probably not fortran but you catch the idea.
if (range == 0) {
    nbits = 0;
} else {
    nbits = int( alog( real( range ) ) / alog( 2 ) ) + 1;
}
Admin
Bithacks, people!
http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog
CAPTCHA : xevious - like a xox!
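The lookup-table variant on that page boils down to something like this (a condensed sketch, not a verbatim copy; the table is built at startup instead of being hard-coded):

#include <stdint.h>

static uint8_t log_table[256];   /* log_table[i] = bits needed for i */

static void init_log_table(void) {
    log_table[0] = 0;            /* 0 needs no bits, by convention */
    for (int i = 1; i < 256; i++)
        log_table[i] = (uint8_t)(1 + log_table[i / 2]);
}

static unsigned bit_count32(uint32_t v) {
    if (v >> 24) return 24 + log_table[v >> 24];
    if (v >> 16) return 16 + log_table[v >> 16];
    if (v >> 8)  return  8 + log_table[v >> 8];
    return log_table[v];
}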
Admin
I didn't even have a clue about the original problem. I got 2022 needing 11 bits (2^11 = 2048), but I had no clue how to crank my 11 backwards into 2022, instead of any other number between 1024 and 2047.
Now, getting that 'magic' to happen on one line would be impressive. You've gotta have faith.
Admin
The real WTF is the 12 complaints.
(1) Yes, floating point numbers can't exactly represent all real numbers. That doesn't make them "foobar".
(2) When you're rounding the result to an integer, who needs double precision?
(3) Reals can't exactly represent the full range of an integer, that is true. However, for the purposes here, they are more than adequate.
(4) The real kludge is ignorance of the ceil() function.
(5) What compiler today isn't smart enough to figure this out?
(6) Duh.
(7) Again the obsession with needless precision.
(8) This comment is actually not a WTF.
(9) So what should the programmer do instead? Multiply?
(10) By this logic, it's a wonder that computers even have a log() function at all. Since they can't exactly represent a transcendental number, what's the point?
(11) Pedantic obsession with precision again.
(12) Yep, it's called round(). Or probably better in this application would be ceil().
Admin
(13) The function can be rewritten with two assembly language instructions. Assuming an x86 machine and that the input value is in %eax:
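Presumably something along these lines (a reconstruction, since the snippet itself didn't survive; bsr finds the index of the highest set bit, and incrementing turns that index into a bit count):

bsrl %eax, %eax
incl %eax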
Admin
188 separate FORTRAN and C files? Gigabytes of really, really important data? I am going to take a wild stab and guess that this is GEMPAK?
Admin
Nothing, except FORTRAN doesn't have bit-shift operators, as far as I know...
The first implies REAL, the second DOUBLE PRECISION: the same distinction as between float and double in C.
Bunch of quiche-eaters :p
Admin
I don't understand 8... The argument to log is always the sum of the absolute value of some integer and one... the smallest the absolute value could be is zero... even if zero ends up getting rounded to -0.000000001, 1 + -0.000000001 will never be <= zero.
Someone's [captcha] kungfu is off today.
Admin
I don't see how the range can ever be negative. abs() will return a value between 0 (inclusive) and infinity (okay, okay, MAX_INT). And the addition of one is a way of avoiding taking log(0). Admittedly, it's not a good way.
Admin
You could get the same effect by successively multiplying by 2 until the result is > the number, and then counting the number of mults.
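In C, a sketch of that (the running product is kept wider than the input so it can pass the largest values without overflowing):

int count_by_doubling(unsigned n) {
    unsigned long long p = 1;
    int mults = 0;
    while (p <= n) {
        p *= 2;
        mults++;
    }
    return mults;   /* 0 when n == 0 */
}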
Admin
Could have been worse...could have been:
if (x == 1) {
    nbits = 1;
} else {
    if (x == 2 || x == 3) {
        nbits = 2;
    } else {
        if (x >= 4 && x < 8) {
            . . .
Admin
Then you simply divide by two instead. Fortran must have division.
Admin
Don't all 12 problems boil down to problem 1: never send a float to do a long's job?
Admin
Fortran 77 didn't, but US MIL-STD-1753 specified bitwise operations, so most flavours of Fortran 77 ended up supporting them. Fortran 90 does have bitwise operations as standard; ISHFT should do the job.
Admin
I never programmed math-related stuff... so this post has passed 5 miles above me....
I'm ashamed :(
Captcha: Pointer..... all the way To ignorance!
Admin
Wow, even if range is 100000000, then nbits is only 1. Seems you have found a truly brillant compression algorithm.
This comment is another WTF altogether: the + 1 is not a "fudge factor" but a fencepost adjustment. Not adding it means underestimating by one bit whenever range is 2^n - 1. And an underestimate is obviously a lot worse than an overestimate.
Admin
Doesn't that fail if range is negative?

// While range is not 0, shift right and increment the counter.
nbits = 0;
while (range >>= 1) { ++nbits; }
return nbits;
Admin
Yup. It looks like whoever wrote this copied the formula out of a maths book, and (rather incompetently) converted that to FORTRAN.
The application I've been working with (particle physics event simulation) takes the opposite tack. It comes as one 75,000-line file that you compile along with your code. At least I never have to look in there...
Admin
I believe the correct solution is actually to use C, and to do a binary search for the first non-zero byte before scanning that to find the high bit.
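Roughly like this (a sketch: halve on 16 and then 8 bits to isolate the highest non-zero byte, then scan its bits):

#include <stdint.h>

unsigned bit_count_search(uint32_t v) {
    unsigned base = 0;
    if (v >> 16) { base += 16; v >>= 16; }  /* answer is in the top half */
    if (v >> 8)  { base += 8;  v >>= 8;  }  /* ...in the top byte of that */
    unsigned bits = 0;
    while (v) { v >>= 1; bits++; }          /* linear scan of the last byte */
    return base + bits;
}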
Admin
You could just divide by 2. If the compiler's smart enough it'll compile that to a (most likely arithmetic(*)) bitshift instruction instead of a division.
(* = Which preserves the sign, as is appropriate for division, but rounds toward negative infinity instead of toward zero: -1 >> 1 == -1. So you'd be advised to abs() first. Or use an unsigned type, which will make the compiler use a logical shift instead.)
Admin
I don't think you understand the problem with floats for this application. Who needs double precision? This application needs more than double precision. For most cases it doesn't matter, but for cases close to a boundary (e.g. 7.9999...) you could need arbitrarily many digits of precision to get the correct result (8 vs 7.999...). Because the result is rounded, it is very important that it falls on the correct side of the boundary. The approach used in this code is fundamentally broken because of this.
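A two-line probe makes the point; whether the single-precision quotient for 256 lands above or below 8.0 depends on your math library, which is exactly the problem:

#include <math.h>
#include <stdio.h>

int main(void) {
    /* Print the boundary case from the article; output is platform-dependent. */
    printf("float : %.9f\n",  (double)(logf(256.0f) / logf(2.0f)));
    printf("double: %.17f\n", log(256.0) / log(2.0));
    return 0;
}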
Admin
FORTRAN? You need some XML:
Admin
Don't feel bad Rafael, you say you've never programmed math . . . I'm only 24 and have never touched FORTRAN, and don't do much programming, so it's totally out of my range too.
Admin
I mean, if you are to store a positive number along with a separate sign bit, you'd have to add at least one bit.
However, if you want to know how many bits it takes to keep the binary representation of the integer intact, it will always be the number of bits in the used integer type (e.g. 16, 32, 64 or whatever), since the highest bit will always be 1 if it is negative.
Captcha: kungfu... yeah!
Admin
In C without bitshifts... Kind of depends on whether you think only negative numbers should account for the sign bit though, otherwise simply set the original bits value to 1...
int BitSize(int Number) {
    int bits = 0;
    int workingNumber = Number;

    if (workingNumber < 0) {
        bits++;                          /* count the sign bit */
        workingNumber = -workingNumber;  /* beware: overflows for INT_MIN */
    }
    while (workingNumber > 0) {
        workingNumber /= 2;
        bits++;
    }
    return bits;
}
Admin
(4) isn't a kludge, it doesn't overestimate, and it isn't a heuristic: it is correct. The number of integers in [a, b] is b - a + 1. The log2 of that number correctly represents the number of bits required for that range. When a == b, b - a + 1 equals 1, and therefore requires 0 bits. When b and a differ by 1, the number of bits needed is log2(1 + 1) == 1; when a and b differ by 3, it is log2(3 + 1) == 2, etc.
The only WTF is that there is no guard against imprecision in the floating point computations. And the consequences of that can be truly awful in this case.
Admin
I hate people whining about floating point being "inexact." Even though it is technically true, an IEEE 754 float or double is the closest possible approximation in the given number of bits. On modern hardware, the error for a double is never greater than 1 part in 2**53.
That error can of course grow dramatically (say, if you have a loop that runs 100 times, and every iteration does an operation that increases the error by a factor of 10, your end result will be nonsense).
But in this case there's a couple of logs and a divide; the imprecision simply isn't a problem here.
Admin
Eating my own words immediately: the tiny imprecision can mean the difference between 7 and 8 here, because it's rounding to a precise int that we need. I am stupid.
Admin
Any sort of computer science justification at this point is silly. What next? Remarks about how optimized this is using big O?
Admin
You don't see how range could ever be negative? Since you brought it up, what is the result of adding 1 to MAX_INT?
Admin
Right, so I learned fortran for this :-)
Admin
You can represent a negative number in fewer bits than the integer size. -1 can be represented as 3 using 2 bits (assuming you know there are 2 bits). Two's complement doesn't need a size barrier.
Admin
Hahahah, oh wow.
Please explain how the value 1 can be represented in 0 bits. I could use a laugh today.
Captcha = gotcha. HOW DOES IT KNOW?
Admin
On 32-bit PowerPC processors, the CNTLZW opcode counts the leading zero bits in the given number. Subtract that from 32 and you have the answer.
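The same trick in C, as a sketch using the GCC/Clang builtin (which compiles down to cntlzw, bsr, or whatever the target offers; __builtin_clz is undefined for 0, hence the guard, and a 32-bit int is assumed):

#include <stdint.h>

unsigned bit_count_clz(uint32_t v) {
    return v ? 32u - (unsigned)__builtin_clz(v) : 0u;
}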
Admin
Really, all you need is:

int x = 1;
while (number > 1) {
    x = x + 1;
    number = number / 2;
}
return x;

Also, to comment 1 about floating point accuracy: any numbering system used in a computer is limited in which numbers it can represent exactly and which ones it cannot. Floating point representations can represent some numbers exactly; for the standard IEEE floating point representation, those are the numbers of the form "integer times 2 raised to some positive or negative integral power" (within certain limits). All the other numbers must be represented by the closest number among those that can be represented.
But this is true for any representation scheme. If you try to represent .8 using integers you have the same issue.
Admin
Sounds like AIPS. The source download is in the region of 90 MiB, it's written in FORTRAN and C, it doesn't use Makefiles (some horrible mess with a gazillion shell scripts does the linking and compilation), it uses DOS 8.3 file names, and the 'versioning' is based on release dates (always 31DECYY, where YY is the two-digit year). Wooh!
http://en.wikipedia.org/wiki/AIPS
I bet there are some WTFs in there.
Admin
However, you need to fix this code to e.g.:
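Presumably something along these lines (a reconstruction, not the lost snippet; forcing the low bit on leaves every non-zero result unchanged and makes the zero case safe):

orl  $1, %eax      # ensure at least one bit is set before bsr
bsrl %eax, %eax    # index of the highest set bit
incl %eax          # turn the index into a bit count; input 0 yields 1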
because bsr gives undefined results if the source operand is zero (this code returns 1 if input == 0, and can easily be changed to return 0 instead).
Admin
"closest possible" means "inexact" when it's not possible to be exact.
Remember, in binary, no .1 for you!
Admin
That's silly. The answer is always going to be in a limited range, so just write a static lookup table with all the values pre-computed. You will get much better performance, it is easy to translate to other languages, and it is trivial to bench test.