- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
Not only is putting a variably-sized array at the end of a struct a common trick in C, the gcc compiler has an extension (not used here) for the sole purpose of making that trick easier: zero-length arrays. There's a slight issue in that the programmer is using C idioms in a C++ project, and he's probably in for constructor-related troubles, but that doesn't qualify as a wtf.
Admin
The main (only) difference between a struct and a class in C++ is that a struct's members default public and classes members default private.
Admin
In addition to avoiding overhead of allocating memory external to the struct (As has been mentioned), you can also avoid copying data around unnecessarily. Think of dealing with data that has some kind of fixed-width header followed by a body whose size is indicated in the header, like a file record, or a network packet.
You can cast a pointer to the raw network packet to a struct like this, access the required fields, and manipulate the body without allocating separate space for the header and payload, and copying stuff around. This is especially handy if your packet is layered -- something like TCP in IP in an Ethernet frame. Once you know your ethernet frame contains an IP packet, you can simply cast the pointer to the body to a pointer of type "struct my_ip_packet", etc., and go from there without copying anything.
Admin
Quite true. Of course just because you can do something doesn't mean you should. You could make all your classes structs (make the members private) and have functions in them too. Same diff, makes your code a little more difficult to understand, but you want to make yourself indispensable right?
Admin
Cryptic struct names+unneeded public access specifier(C++ structs default to public)+linked list= Fun :)
Admin
No, we are always right.
If it seems like we're not right, it's because you forgot to provide a partial template specialization of a member template function of a template class you've never heard of.
Somewhere.
In your code.
;-)
Admin
Aha! So that's why I keep getting nailed by assholes with railguns, while my return fire just burns the walls behind them! Stupid inaccurate sqrt functions, messing up my perfect aim...
;-)
Last!
Admin
Just to clarify on the Quake 3 code, if anyone was still interested:
http://games.slashdot.org/comments.pl?sid=209372&cid=17070964
It's Newtons Method a few times, with a decent first guess.
Admin
My recollections of C (quite dated now) are that declarations of the form "char* x" declare pointers not arrays.
So the programmer is declaring a structure that consists of a pointer followed by an OrdArith::OrdElem (whatever that is).
The implications are that, somewhere, an array of summands is to be created, perhaps like this:
The comments would then seem to indicate that the code might address sa[6], which is beyond the bounds of the array (which would have subscripts from 0..5), because "sa[6]^.summand" would overlay the OrdArith::OrdElem of sa[5] in memory, right?
So, unless someone can explain how I got this wrong, it seems this is WTF:
But the OrdArith::OrdElem isn't initialized (unless the constructor initializes it out of sight of the code snippet) and so the program might be accessing "random" data in sa[6]^.summand.
A OrdArith::OrdElem is allocated in all of sa[0] through sa[4], even though only the one in sa[5] is supposedly needed. It would seem better to leave it out and declare the array with 7 elements.
Even that assumes that there is some logical reason why the programmer can't keep the array subscripts in within bounds.
So did I misunderstand all this somehow?
Admin
The reason C99 added this is that there was not a blessed way to do it beforehand. It's not strictly conforming now, and it wasn't then... To say nothing of it being obviously C++.
Admin
Ok, in C this is almost valid, for the reasons explained above many times. However in C++ it should almost certainly be one of the following:
template<int S> struct summand { static const int size = S; summand *myNext; OrdArith::OrdElem myOrd[S]; };
or
struct summand { summand *myNext; std::vectorOrdArith::OrdElem myOrd; };
Or the above with the typedef, would make it a doubly linked list though.
So the WTF is that it's extremely poor style and probably wrong for C++, and technically just about wrong for C99.
Admin
I would expect compilers to provide a (optional) warning whenever char[1] appears other than at the end of a struct (including use of that struct in an array or derived type) or when the struct is copied by value.
Seriously, who goes out of their way to allocate an array of char with a single element?
Admin
This is not WTF at all -- varariable size structs do make a lot of sense, e.g.
Though they are noz the most beautful construct C/C++ offers...
Admin
The real WTF is what's wrong with the guy. Instead of putting this comment, he would have been better with using sizeof( summand ) and not having to care about elements ordering. What's wrong with (Element*)(((char*)aSummand) + sizeof( summand ))? You can #define it.
Admin
Windows headers even declare
#define ANYSIZE_ARRAY 1
so that you can do this...
typedef struct _THING { int arraysize; int AllMyNumbers[ANYSIZE_ARRAY]; } THING;
captcha: bling (rhythms with THING)
Admin
I don't really understand today's snippet, but I do have one like it:
We have a monster global dummy object (well over 500 simple member variables). If you need to add to the object (and one often does), you must:
This is because the bytes are streamed out for persistence. Straight up bits. Who needs structured data of any kind?
Take that for inflexible.
Admin
Newton's Method.
Admin
I've read the comments, but I'm just not understanding today's snippet and why some say it is "good"?
Is it attempting to set up a situation where one can write to memory after the end of the heap? How can one make sure that that the expected space there even exists?
Would declaring some sort of uninitialized (and unused) array of the maximum expected size after that declaration solve any issues about "this must be last", or would it refuse to compile? Or would the optimizer get rid of it?
Admin
Your primary problem with this is if you tried to do this with a base class in C++, or a class involved in multiple inheritance. You'd be hosed. You can't reliably predict which base is going to be located relative to is peers. This trick is really only reliable for simple inheritance, and only when done at the most derived class, or all child classes don't add any new data members.
Admin
This code is a fairly common way to allow for variable sized data structures. Here's how you use this sort of code (imagine this is some kind of networking API):
This works in C++ too, as long as the struct or class has nothing that couldn't be in a C struct (i.e., no private or protected members, no constructors, no virtual functions, etc).
Admin
Accurate square-root computation can be expensive in hardware, but relatively simple in software. Many RISC computers don't have "fsqrt" since the instruction will take longer to execute than other floating point instructions. Software can converge on the answer in only a few iterations.
Admin
let's not hit below the belt there
Admin
The idea is that each instance of CV could hold a different number of doubles (specifically n_coeffs). You are supposed to allocate sizeof(CV) + sizeof(CV::coefficients) * (number_of_elements_to_store - 1) bytes of memory in the constructor, cast it to a CV pointer, and assign number_of_elements to the n_coeffs member before returning the pointer.
Whenever the user access CV->coefficients[i], it's okay for i > 1... i can go all the way up to CV->n_coeffs, because that's how much extra memory you allocated. The compiler won't issue a warning if i > 1 if it is capable of doing so. Most understand that a 1-length array (which isn't normally useful to define in a struct) is to be treated as an UNBOUNDED array for just this kind of construction. In any case the element is never optimized out because it's not zero sized. And all compilers will attempt to locate it at the end of the struct. But not all compilers will try to align it... that's something you have to handle yourself if that matters.
Incidentally, the C99 standard now allows for you to just use the [] notation, omitting the 1, and it will do the right thing. In that case, the array member has no size contribution when you use sizeof(), but the name of the array and it's type is still kept track of. In such cases it is "optimized out", while retaining the desired behavior. Accessing that array will access the memory location right beyond the struct itself... remember you allocated extra just for that purpose in the constructor/factory. GCC has an extension that recognized the nonsensical "[0]" for the same purpose.
Why is this feature useful? You do this when you have simple types wrapping a variably sized array to avoid fragmenting the heap by allocating both arrays AND nodes to point to them.
Admin
mi.size = sizeof(mi); //seems to be what ms thinks is good form
Admin
I beg to differ -- I cut my teeth on C++ back in ~1997, and never learned "correct" C. If this apparently hackish attempt at fooling the compiler is the "right" way to do variable-sized structs, maybe there shouldn't be such things (in C anyway) as variable-sized structs. In a vacuum, I still consider this example a WTF, simply because any code that does non-obvious things and isn't commented accordingly is a WTF, at least if the reader doesn't know better.
I'm glad this got posted, because from time to time I have to get my hands dirty with somebody else's C code, and where I would previously have recoiled in horror at seeing this abuse of assumptions, now I can understand that at least there's precedent -- maybe even a good reason.
One of the main reasons I read WTF is to learn more about the "right" way of doing things in different languages (usually by watching reaction to the "wrong" way). I want to see the CodeSOD stick around!
CAPTCHA: "ewww" Heh. Apropos!
Admin
As others have said, this is a fairly normal variable-length struct in C. Weird comment, but normal.
But it's a mistake in C++. Maybe not WTF-level, but a mistake.
Why? It should have been a struct, not a class. A C++ class may have a vtable, and that vtable can be stored at the beginning or end of the class in memory. If it's at the end then accesses past [0] will be conflicting with the vtable. Don't think it doesn't happen ... a few years ago I ran into some code that had problems because we switched to a new version of the compiler that relocated the vtables.
This class happens to not get a vtable because it's plain-old-data (POD) and isn't subclassed. But it's implicitly relying on a pretty subtle and fragile/unguaranteed behavior -- that nobody is using a subclassed version of it, that there either isn't a vtable, and if there is a vtable that it's kept at the start of the class.
Admin
Unless I'm very much mistaken, static breaks it totally. And this code is doing something else entirely.
Admin
That is a perfectly fine imeplementation idiom. The WTF was that the guy had to comment on it -- does he work with people who don't know the langauge or something?
No, I'm serious! There are many cases where the number of heap blocks (or the number of cache misses getting to your data) matters significantly.
Admin
Look again, it's a struct.
Admin
In C++, you allocate objects like this one on the stack, using reference counting (and move semantics in the copy ctor / assignment operator), which allows simple and fast handling of the objects (e.g. using in STL containers by value). The object then allocates its buffer on the heap which doesn't fragment more than in the WTF solution.
Admin
It's also a problem if you put more than one of them into an array.
Admin
I know several people replied to this, but...
An array is not synonymous with a pointer. Although you can treat them the same in code, in reality they aren't the same.
When you do char *p (NOOOO the star doesn't belong on the char!) the compiler will allocate (on a 32 bit machine) 4 bytes for a pointer. When you do char p[1] the compiler (compiler independed) will reserve 1 byte (while a char is 7 bits, it will reserve 1 byte, possibly 4) for your character.
In both cases you can refer to your character as *p, but you can see they are implemented differently.
As other people pointed out, if you do char *p, you now have to point your pointer at something. Either malloc(new, whatever) up some more memory, or point it to some that already exist. But with a char p[1] the memory has already been reserved.
Now one common use of this pattern is to read in a chunk of memory, and stick a header on the front, ala BITMAPINFO.
Admin
Laff. The WTF is people who think this is a WTF.
Admin
Yeah, you missed the boat.
It appears that summand is supposed to be a linked list, so he would not do "summand sa[6]" But if he did, then he accessed "summand[6]" he would be access memory outside is block (period. It wouldn't overlay OrdElem, because no extra memory was given for OrdElem)
This setup was to create large blocks of memory, and link them together through a "summand."
It would go something like this:
pNewnode = (summand*)malloc(sizeof(summand)+sizeof(OrdArith::OrdElem)999)/ some large number*/) pNewnode->myNext = prevSummand; for (t = 0 to 1000) summand->myOrd[t] = fooOrd[t]; // Or whatever copy mechanism you want to use
I have to say, if this struct ONLY contains a pointer, and a chunk of memory, there are other better ways to do this. If this struct has other metadata other than a next pointer, then its ok.
Admin
OK, I think you guys need a new rule: no more Java programmers submitting C/C++ WTFs. Placing a variable-length array at the end of a structure is a long-standing technique in C. It avoids the space overhead of storing a pointer to an extra heap block, and likewise avoids the inherent problems with static array sizes. But syntactically, it can be accessed just like a member array.
It is actually very robust. ISO C99 has even blessed the (already mentioned by jimrandomh above) gcc extension allowing a "zero size" member at the end of the array.
Really guys. The criteria for a WTF has to be a real problem, not just "OMG WTF duz this c0d du!!!"
Admin
I thought this was crazy too, the first time I saw it in C. It's used in the PostgreSQL "VARYING" data types. The "text" datatype (i.e. VARCHAR(N)) is defined this way:
typedef struct { unisigned int varlen; char data[1]; } text;
Apparently, "data" couldn't be a (char *) type; that stores a pointer in the struct. the "data[1]" makes it store a byte, and "data" references it's first part as an array.
PostgreSQL requires the "data" to be allocated with the struct, and the real size set in "varlen". For an ASCII VARCHAR(30), add 30 bytes:
/* Store "Hello World" in STR VARCHAR(30); */ text str; str = malloc(sizeof(text) + 30sizeof(char)); memcpy(str->data, "Hello World", 11); str->varlen = 11;
That said, C++ should never use this technique! Use a constructor and the new() operator. Even a cross-compiler to a PDA or embedded-chip makes tight code from this.
Admin
Normal for C. Not for C++.
It is a struct. The public keyword is redundant (though I've been known to start class declarations with private: even though it's default. Readability, people, seriously). As for the comment next to the public, well, let me just say that I'd rather use friends than make everything public. (If needed, and the functions would've been global scope, you could just put the functions as static members of a singleton class. Then friend the class.) Then again, you shouldn't need friend anyway except for operator overloading. Two words: Accessor methods.
Again, it's a struct, not a class. But in C++ the keywords are equivalent for all purposes except default visibility (struct: public, class: private).
As you said, this POD will have no vtable. IIRC, only the declaration of a virtual method or destructor introduces a vtable. If it is inherited by a class that introduces virtuals, that subclass will also introduce the vtable (and thus put the vtable at the start/end of that class - nothing else would make sense really). Of course, any sort of inheriting, vtable or no, will mess up the variable-length feature of this structure. You could actively prevent this by abusing the struct==class equivalency, a'la private constructor + public static factory method (that takes the size paramter) (this works because inheriting basically requires at least one usable constructor). But having never used the trick myself, I don't honestly know if having a private constructor wrecks the trick. Alternatively, you could actively prevent it by not inheriting your struct, put a comment saying DON'T INHERIT OR ELSE, and rely on other coders using your struct to not be idiots with it. Alternatively, you could use more C++-friendly approaches (See below).
Now, as for why this is a WTF. No matter what, it's a WTF because the trick needs a size field, else you can't know how far "greater than 1" you can index "with impunity". sizeof won't tell you (it'll just give the size of the pointer, or the compile-time known size of the struct (with the array being [1])). I guess you could negative-index into the malloc header or something, but then you're no longer portable. Otherwise, we'd need to know what OrdArith::OrdElem is. If it's anything but a typedef to char, then the program could just as easily use std::vectorOrdArith::OrdElem (or std::vector<const OrdArith::OrdElem*> if it's a big class/struct so you don't want massive-copy-fun). If it's a typedef to char, then there's two possibilities:
Admin
There's plenty of context to scoff at that. Even in 1995, you would not do something this ugly to save 8 bytes. Typical 1995 machines had over 1mb of RAM. There's no excuse for this type of code, anywhere, except maybe on some embedded system from the early 80s, and even then, it's poor quality work.
In fact, I can say that that code is unacceptable at any point in time since the existence of C++, which is what that code is.
Captcha: bathe: what I feel like I need to do after reading that code
Admin
The real WTF is (and I always hated the people that say that): How lowly he thinks of his peers.
captcha: cognac - no thanks, I had enough
Admin
I also can lurk no longer. More quality less quantity. I am going back to reading the old articles I have missed and hope things are fixed by the time I return.
Things are worse than fucked.
To try and also make a suggestion: You took the daily out of the name and now there are articles pretty much every day? I dont see an rss feed, just let us know when there is something new and we don't have to read low standard articles.
How about a section like - you might think this is a WTF but its actually quite clever.
Anyone remeber lints bug of the month? http://www.gimpel.com/html/bugs.htm
Admin
This is even standardized (at least in C99, but perhaps even before). It's called a "flexible array member" and the standard says you can declare that array as:
struct s { int i; char a[]; };
But for compatibility with older GCCs, you'd often declare it as a[0]. And for compatibility with compilers that don't support that either, there's always a[1].
Admin
Your template<int S> solution only works if the size of the array is known at compile time.
And you don't need a size field if OrdElem contains some value that tells you you're at the end of the list.
And if you're mapping to a buffer (a very common case where this trick is used), you can't use a std::vector without a lot of wasted work/memory.
Face it people, this isn't enough to call this a real WTF unless there's more context we aren't seeing.
Admin
I know everybody's pointed out already that this is an old, well-known abuse of a dark corner of C. For the story, Dennis Ritchie himself used it and attached a funny comment to it: "unwarranted chumminess with the compiler" (some sources list it as ""unwarranted chumminess with the implementation"). FWIW
Admin
Thanks Andrew,
This is the first comment on this thread that actually makes sense of the technique used.
Admin
Perfectly sensible C that either inexplicably survived a C -> C++ port or was written by a C programmer who didn't actually know C++.
Admin
This illustrates very well why some languages provide a "final" keyword. Thanks.
Captcha: So who or what is "vern" without the trailing "e", and what does he/it have to do with programming, WTFs, etc.? Second consecutive appearance.
Admin
Now this is a WTF because it's technically bad code. One look at that by an unseasoned programmer would ilicit a "WTF" especially since there's no comments explaining the use of the function. No mention of fact that you'd need to allocate this struct with malloc with extra size added on (especially important to mention since we're in a C++ environment). Just because it's a well used trick doesn't mean it's not abusing anything and doesn't magically make it not a WTF (especially since it's being used in a C++ environment).
Admin
The real WTF is people thinking such a thing as "C/C++" exists. There is C, and there is C++. The fact that most C is valid C++ is not an excuse. The SOD might have been a C clever hack to circumvent the limitations of the language, but is completely unportable "bad behaviour" if you are writing in C++.
Admin
Admin
While I must admit I didn't know about the trick, it seems to be part of the language, and indeed std::string is also implemented in a similar way. However, the WTF worthy thing is the comment about making everything public to avoid friends. This gives some idea about what the rest of the code might look like. So, instead of keeping the (necessary and useful) ugliness all hidden in one place and providing an easy and straightforward interface, can we assume that every bit of code using this struct needs to know about its memory layout? I wonder, why std::string looks so nice and why I never even knew how it is laid out in memory...