- Feature Articles
- CodeSOD
-
Error'd
- Most Recent Articles
- Secret Horror
- Not Impossible
- Monkeys
- Killing Time
- Hypersensitive
- Infallabella
- Doubled Daniel
- It Figures
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
I think your method will behave somewhat differently on negative inputs than the original.
Admin
I've never actually used them, but I've seen them used a long time ago. Since we're usually using bitfields to store a bunch of flags, here, I haven't come across a situation where it's really worth it to have the full flexibility of specific-precision integers, and I like being able to refer to bits by a subscript (so I can loop over them, etc.) but thanks for reminding me this is out there. C++ is such a huge language.</knumfoobitfields>
D'oh! That sounds -really- promising, though, I'd like to see how you do that. :)
Admin
I bet you didn't try that, right?
The compiler will complain because i and j are not constants. Template Parameters need to be Compiler-Time Constants<i ,j="">
Admin
I think the WTF here begins with the fact that this is an optimization for multiplying small numbers. Even if the compiler didn't do the constant multiplication for you, WTF is so hard about the coder doing it? Something like:
i = 20; \ hey, dumbass, this is 4*5
????????
Admin
I'd normally agree with you, except for the fact that the fortran code was WTF. Really, I timed both versions of the program to make sure. There's this array of distance data, let's call it Distances. The fortran code, instead of predicting, just looped around the array until it reached distance X to obtain the right index. Given that this distance might be at index 10,000, it was highly inefficient. I re-wrote it using a distance predicition routine that attempted to close in on the index in 5 iterations (example: target distance is 1000, difference between i and i+1 is 1. new i is predicted to be 1000). If it failed to reach local minima in 5 iterations (if it bounced around the minima instead of reaching it), it finishes off the function with a regular linear search. Oh and the Fortran version didn't even start from the last known point. Now tell me, which code would be faster?
Admin
People people calm down! It's only an example! No there aren't supposed to be 13 months in a year, no there aren't supposed to be 32 days in a month, and yes, it would be a bad idea only to store 100 years. You guys sure are jumpy today! Obviously, you would optimized the number of bits to whatever application you have. And yes you need to use 5 bits, if your months start at one instead of zero.
Admin
Just email me dude and I'll send it to you! (hint: look at the last sentence above what you wrote :D)
Admin
magic numbers, that's why
#define FOO_SIZE 8
#define FOO_COUNT 42
And somewhere else, using
i = FOO_SIZE * FOO_COUNT;
is better than
i = 336; // 8*42
Admin
Bingo... but I still don't see any advantage to
over
Surely if multiplication by adding numbers is so much faster, the compiler should be able to do that itself rather than being forced like this.
Admin
Oh yeah. Duh. Forgot about the precompiler.
Maybe they figured the compiler lost the abilitiy to optimize constant arithmetic operations. You know, to make room for new, cooler features like recursive template instantiation.
Admin
OMG! TEH QUOTE WORKED!
Admin
Huh? Which C++ compiler are you using that allows templates to be instantiated at run-time?<i ,j="">
<i ,j="">
Admin
That's an implementation-specific limit. 17, IIRC, according to a comp.lang.c++.moderated post. A little research here indicates the answer is likely "not strictly <a href='http://en.wikipedia.org/wiki/Turing_complete'>Turing complete". </a>
Admin
Aside from being slower and more obscure.. it's a lousy adding algorithm... it's O(n), here's one that's O(log(n)):
Admin
I was wondering why half the posts use escaped HTML.. it's because the forum escapes it, but the preview function doesn't.... So the Preview looks awful if you don't use HTML, and the post looks awful if you do. Somebody should fix this.
The code is:
Aside from being slower and more obscure.. it's a lousy adding algorithm... it's O(n), here's one that's O(log(n)):
// Stupid implementation of a shift-and-add // multiplier template<int firstval, int secondval> struct QuickMultiply { static const int value = QuickMultiply<(firstval << 1), (secondval >> 1)>::value + (secondval&0x1)!=0?firstval:0; };
template<int firstval> struct QuickMultiply<firstval,0> { static const int value = 0; };
This would take QuickMultiply<4,5> and would go through
= QuickMultiply<8,2> + 4 = QuickMultiply<16,1> + 0 + 4 = QuickMultiply<32,0> + 16 + 0 + 4 = 0 + 16 + 0 + 4 = 20
Admin
Nah, I'll only be impressed if you use a simulated hardware multiplier using inline VHDL. ;) There IS a way to get it to nearly-constant time. This excersise is left up to the reader.
But here's a hint...
http://www.doulos.com/knowhow/vhdl_designers_guide/models/carry_look_ahead_blocks/
</firstval></int></int>
Admin
If you're trying to optimize multiplies by adding -- you should be using powers of 2 anyway...
so:
x * 6 = ((x+x)+(x+x))+(x+x) -- 3 adds not 5.
or better yet:
x * 6 = (x << 2) + (x << 1) + x;
So any claim that the original code is in any way shape or form a competent optimization is wrong (even assuming it could multiply variables).
As for the claim that integer multiplies are alreadt better than multiple additions -- well that's also wrong. From an article on optimizing x86 assembler:
In other words, the hardware integer multiple is NOT faster than bitshifts + adds if fewer than 8 bits of the value you're multiplying are set (probably most of the time, and certainly true the you're multiplying by less than 256).
For the full article, see:
http://www.geocities.com/SiliconValley/2151/opts.html
Admin
Quick clarification:
a = x + x;
b = a + a;
c = b + c;
6 * x == b + c; // actually 4 adds vs. 5
The better option is (correctly):
6 * x == x << 2 + x << 1; // 2 bitshifts, 1 add.
Admin
gack.... you're right. I misread the original post... the template parameters do have to be determined at compile time...
I guess I'd just never imagined that someone would use templates in such a WTF way! I was thinking of something like QuickMultiply<int ,int=""><int,int> and then calling <int ,int="">multiply(4,5) (or whatever that infernel syntax is that I still have to look up every time I use it 10 years later...) as a somewhat less WTF way of doing it!
(and WTF is it with this board... All I want to do is include angle brackets in my post!!! A plain text board with no HTML wysiNwg editor would be a damned site easier!!)
</int></int>
Admin
Dude, I was calm. I was truly interested. And I still am. Why do I need 5 bits to make the number 12?
1 bit = 0..1
2 bits = 0..3
3 bits = 0..7
4 bits = 0..15 // enough for 0-11 or 1-12
5 bits = 0..31
Or has my knowledge of bitcalculation degraded that much after using vb.net for 3 years?
Drak
Admin
Okay, so their CPU has a really slow multipy. (at least for small values) I'll belive that.
Their compiler is appearently gcc or something like that, but they wrote the backend in house because one didn't exist for that CPU. The way they wrote the backend broke both the optimizer, and inline assembly. Recursive templates still work though.
Funtion calls turn out be surprisingly expensive - more expensize than multipy. Inline functions are an optimization that doesn't work, of course.
Profiling revealed this program is spending 80% of its time multitpling small constants. And the program was too slow.
They tried:
#define FOO 4
#define BAR 5
#define FOOTIMEDBAR 15
but as you can see when FOO changed from 3 to 4 they kept forgetting to update FOOTIMESBAR.
It is easier to write templates to do this, than to fix the compiler, so that is what they did.
I just knew by sleeping on this I could eventially come up with an explination. I just wish you would tell me where this is, because I don't want to work in a company that faces thost limits.
Admin
You're absolutely right. I must have had a brainfart :-/
Admin
Did you even realize that the WTF is NOT taken from Boost?
Admin
You are sure using an interesting compiler in case it lets you get through with this, because, well, any standard-compliant C++ compiler won't.
(In case you are reading this and are the person who wrote the forum software: Take a beginners course in gwbasic and try to work yourself up from that. Thank you.)
Admin
Oh, an anti-C++ troll. How original. I once wrote a lexer generator in C++ that emitted C++ code and, using a simple input grammar, the very first version produced a lexer, exception-handling and all, that was substantially faster than the flex-generated one.
As far as FORTRAN is concerned: Oops. You have been blitzed ...
See also Stroustrup94, The Design and Evolution of C++ for a discussion of the speed sacrifices Stroustrup was willing to accept for new features (hint: there aren't many ...).
Admin
Actually not, because Kernighan and Richie had some very interesting ideas about operator precedence. So the compiler looks at this and interprets it as:
(x << (2 + x)) << 1
Which is not equal to 6 * x most of the time (except when x == 0).
Admin
That just might be the reason why I *never*, if I can avoid it, put structs or classes and data exchanges too close together. It's simply not a good idea. Memory layout and endian issues always get in the way.
The correct solution is to write serialize methods that feeds the data to the target stream, and is capable of reading versioned data back in. Java's object serialization was broken from the very beginning for long-term storage, and the Swing developers even acknoweldge this. It's only useful for transferring data between two places that are guaranteed to run the same version of the software. Which is a rather rare case.
Admin
I think he was actually bashing FOTRAN, because it's supposedly as easy to use as VB, only "it's faster than anything else out there"(TM), which is why it's still popular in the scientific community. But, what's really dumb is that there are BETTER mathematical programming languages out there such as Matlab, which kicks FORTRAN's ass in terms of being closer to how math is represented and super-optimized for ginormous arrays and mathematical equations. Our company does analysis for pavement testing software, and all I can say is that most of it is written in C++, with only the most generic mathematic stuff being linked to fortran DLL's (such as multi-dimensional regressions). But I'd trade it for a link to a MATLAB runtime any day. Since Blitz++ has been brought up, I might mention it next time we make analysis software... hell it would probably be useful for a gaming engine as well.
Admin
I think it's a job security thing :) Failed attempt though :) Funny though.
Admin
The place I used to work did this all the time. They insisted that deleting and then inserting a row into a SQL Server table was faster than doing an update. They had done "extensive testing" before they started the app to verify this. I whipped up a test case that showed update was faster in every case, no matter the number of records, indexes/no indexes, etc. They didn't really have any way to counter the results of my "extensive testing". These sorts of things were all over the company and deeply embedded in many apps.
Admin
I think I found a more efficient and no less obscure way of doing it:
Admin
Aaaarrgggg!!!! My eyes! This scream "class" all-round! The template (ab)use is bad enough, now he's using c's cass-work-a-like. (i don't have anything against structs, but they are better at storing/passing large amount of vars to funcs/arrays) Captcha minim yes, he should minimalize the struct-as-class use! (i do have an account here, just can't be bothered to login :P)