• Rafael Larios (unregistered)

    I'm suffering form mental diarrhea.... this has completely obliterated my mind.

    captcha: alarm... sure!

  • Petli (unregistered)

    But that's a very nice technique. In C, in constrained memory situations, where you can't afford to waste eight bytes or so by allocating two chunks of memory and having to store an extra pointer. I think I used it last, oh, 1995 coding C on an Amiga.

    In C++ and the typical environments where that is used, it's a bit more ridiculous, but there's not enough context here to safely scoff at it.

  • The Frinton Mafia (unregistered)

    'summand'?

    I guess this guy was too clever to just create a generally useful linked list node when he has a collection of things to add up...

  • Faxmachinen (cs)

    This post was supposed to read "THIS POST MUST BE FIRST", but then Rafael Larios had to go and break it. From now on, Worse Than Failure will only be serving recipies involving ketchup.

  • CWF (unregistered)

    I wish I could say this is not common practice, but it's made it into the dirent structure in the POSIX(?) spec.

  • JazzScheme (unregistered)

    Ok.

    I don't see what is wrong. Isn't that a typical variable-sized structure?

    The only wtf is those poor souls who can't understand the concept.

    Captcha: onomatopopepia... or something like that

  • John (unregistered)

    Yeah, that's not too bad. It can be nice to just have an array inside the structure, so you just have to malloc or free one thing at a time (and increase space efficiency too, although I doubt that that would matter normally). Of course, then you have to start using "malloc" instead of "new" in C++ code, and worry about lower-level things than you otherwise would have to. In any case, I know that the linux kernel uses this technique.

  • Volmarias (cs)

    This line must be last!

    Ah fuck.

  • nobody (unregistered)

    I've seen cases where this can be useful, but it certainly is an invitation to read random memory and for memory corruption. But the size is nowhere in the structure, meaning it is even easier for these problems to occur. (I don't usually post captchas, but ewww was too good)

  • Will (unregistered)

    This ABSOLUTELY MUST BE THE LAST COMMENT. Post another and everything could come crashing down!

  • asuffield (cs)

    There is nothing wrong with this - it is not only standard practice, but it is explicitly blessed by the C specification itself as the correct method for creating variable-size arrays before C99.

    In C99, you use [] instead of [1] and the compiler will check for you that it is the last member in the struct.

  • JustSomeone (unregistered)

    In C code, there's nothing wrong with this technique. C is a low-level language (consider things like offsetof), and there's nothing wrong with using it as such, as long as you know what you're doing.

    In C++, it is more likely to be bad form, although if it's a POD class, and there is reason to use something like that, it isn't really a WTF - using something like this in a non-POD class would be a WTF.

    It might also be a slight WTF if the particular program didn't need the memory-efficiency or low-level control - but that would only be the usual case of premature optimization.

    I'm sure everybody can agree that when, say, using C to implement the runtime for a high-level language, a reasonable string representation might be something like:

    struct string { size_t length; char data[1]; };

  • Hallvard (unregistered)

    So? This is just the "struct hack", which has been used in C since the dawn of time.

    It does depends on how he uses it and how hard such a hack was needed, of course. And compilers have been getting too smart for some undefined code lately.

    Strictly speaking record->myOrd[3] is undefined even if allocated, since the compiler is allowed to know that myOrd only has 1 element. The safe way to use it (when enough is allocated) is, er, this, I think:

    ((OrdArith::OrdElem *) ((char *)record + offsetof(struct summand, myOrd[0])))[3]

    ...at least if C++ works like C in that respect.

    And the hack is now officially supported in C99 if you remove the index size in the declaration, but not in C++98.

  • the real wtf is all too often a meta-wtf these days (unregistered)

    How is portability a "myth"? The mythical aspects of the software I'm using to write this post seems to carry with them pretty tangible, real benefits.

  • UnwiredBen (unregistered)

    While valid C, I certainly hope no one ever derives a class from that... those new class members gotta go somewhere, and it's certainly not BEFORE the old ones in more implementations.

  • mav (unregistered)
    Comment held for moderation.
  • mav (unregistered) in reply to mav
    Comment held for moderation.
  • dave (unregistered)

    The real WTF is "why didn't the poster know about this standard technique?".

    It's perfectly reasonable in C and is, for compatibility reasons, acceptable in C++ as long as the struct is just "plain old data". (It may be considered 'poor style' in C++ by std::string afficionados, but I fart in their general direction - actually, they're usually right, but not always).

    I use it myself for storing highly-replicated data. Consider a cache of filenames, for example. They're up to 64K bytes long in Windows, so you don't want to allocate a maximally-sized array. If you're going to store tens of thousands of the buggers, there might be some benefit in having the name contiguous with the rest of the structure rather than indirected, thus requiring a separate heap allocation.

  • leeg (unregistered) in reply to asuffield
    asuffield:
    There is nothing wrong with this - it is not only standard practice, but it is explicitly blessed by the C specification itself as the correct method for creating variable-size arrays before C99.

    In C99, you use [] instead of [1] and the compiler will check for you that it is the last member in the struct.

    I was going to say exactly this. I will now go away and come crashing down.

  • bobday (cs) in reply to the real wtf is all too often a meta-wtf these days
    the real wtf is all too often a meta-wtf these days:
    How is portability a "myth"? The mythical aspects of the software I'm using to write this post seems to carry with them pretty tangible, real benefits.
    Get your head in the game
  • Fred (unregistered)

    This is standard way to perform unbound allocation in C. It is supported by the standard. It seems to me that quite a few WTF recently have been the submitter not understanding the code.

    What is next? Putting some of the inner of a device driver for embedded hardware and complaining that it access memory with fixed addresses ? Showing this code:

    float SquareRootFloat(float number) { long i; float x, y; const float f = 1.5F;

    x = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;
    i  = 0x5f3759df - ( i >> 1 );
    y  = * ( float * ) &i;
    y  = y * ( f - ( x * y * y ) );
    y  = y * ( f - ( x * y * y ) );
    return number * y;
    

    }

    as a WTF ? (For the uninformed, this is the square root implementation of quake 3. And yes, it is faster that the fsqrt assembly instruction).

  • Anonymous Coward (unregistered)

    Am I the only one that noticed that the 'public' keyword was applied to a 'struct', where (unless the rules have changed) all members of a struct in C++ are public by default? Things that make you go hrmmmm.....

  • newt0311 (unregistered) in reply to Petli

    its actually a very common technique in C when you need to have variable length structures and dealing with libraries. For example, thats how PostgreSQL handles variable length data types. First a 4-byte signed integer to indicate length in bytes and then a binary blob to hold the actual data. I still can't think of why this would be useful an any language with variable length data types.

  • Former Jr. Programmer (unregistered)

    Last Comment!

  • jlevin (cs)

    I think that code looks fine. The daily WTF really has been getting worse lately. The name change was only one step in that direction, the entries follow the same trend.

    I have seen several well designed APIs that use that convention. The only example I can find right now is Microsoft's DRIVE_LAYOUT_INFORMATION_EX ( http://www.osronline.com/ddkx/storage/k306_6cvm.htm ), but there are many more.

  • CB (unregistered) in reply to Former Jr. Programmer
    Former Jr. Programmer:
    Last Comment!
    With sinking heart, I sense a new WTF meme....
  • Thief^ (cs) in reply to Fred
    Fred:
    This is standard way to perform unbound allocation in C. It is supported by the standard. It seems to me that quite a few WTF recently have been the submitter not understanding the code.

    What is next? Putting some of the inner of a device driver for embedded hardware and complaining that it access memory with fixed addresses ? Showing this code:

    float SquareRootFloat(float number) { long i; float x, y; const float f = 1.5F;

    x = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;
    i  = 0x5f3759df - ( i >> 1 );
    y  = * ( float * ) &i;
    y  = y * ( f - ( x * y * y ) );
    y  = y * ( f - ( x * y * y ) );
    return number * y;
    

    }

    as a WTF ? (For the uninformed, this is the square root implementation of quake 3. And yes, it is faster that the fsqrt assembly instruction).

    I read that as "this code accesses memory directly", which it doesn't.

    It is a wtf that that code is faster than the assembly square-root instruction though.

    EDIT: I also can't figure out how it works :-S

  • DaveAronson (cs) in reply to UnwiredBen
    UnwiredBen:
    While valid C, I certainly hope no one ever derives a class from that...
    My C++ is kinda rusty, but note that it said "struct" not "class". Can you derive a class from a struct? I wouldn't think so, but C++ has a lot of things I don't think make sense. :-)

    On the other claw, you are right if you rephrase it as "ever includes it directly in another struct or class". Doing so could be OK so long as it is, again, the very last thing. However, this severely limits reusability since you can't use both it and another element that needs to be last!

  • snoofle (cs) in reply to jlevin

    I once inherited an implementation of a pseudo random number generator built with this technique. They put a single struct like this on the base of the stack at the start of the program, and then indexed up the stack to get some 'random' values. Supposedly, it took them a bit of trial and error to get the typical stack depth, so they knew how far to index.

    The numbers weren't too random (for some <sarcasm>inexplicable reason</sarcasm>, the code behaved in a predictable manner and the numbers on the stack were surprisingly consistent). I wound up replacing it with something a bit more standard...

    Last!

  • iwpg (cs) in reply to Thief^
    Anonymous Coward:
    Am I the only one that noticed that the 'public' keyword was applied to a 'struct', where (unless the rules have changed) all members of a struct in C++ are public by default? Things that make you go hrmmmm.....
    STFU. That's like complaining that someone wrote braces round a single-statement loop body.
    Thief^:
    It is a wtf that that code is faster than the assembly square-root instruction though.
    It's faster because it's approximate, but good enough for a game. The assembly instruction is general purpose, and has to take more time to get an accurate answer.
  • Fred (unregistered) in reply to Thief^
    Comment held for moderation.
  • EmmanuelD (unregistered) in reply to DaveAronson
    DaveAronson:
    UnwiredBen:
    While valid C, I certainly hope no one ever derives a class from that...
    My C++ is kinda rusty, but note that it said "struct" not "class". Can you derive a class from a struct? I wouldn't think so, but C++ has a lot of things I don't think make sense. :-)
    Sure you can (and sure, C++ has a lot of stuff that don't make much sense).

    On the other hand, this evil programmer may probably need to get a small bullet in the head. To be sure he won't add anything after his "this must be the last" comment.

  • Tidus (unregistered)

    And suddently I feel sick ... Cough how can one possybly writhe ... that ... Yirk !

  • Anonymous Coward (unregistered) in reply to Fred
    Fred:
    as a WTF ? (For the uninformed, this is the square root implementation of quake 3. And yes, it is faster that the fsqrt assembly instruction).
    Only on the hardware of the time. On modern processors, the fsqrt instruction is faster (somebody benchmarked this, but I don't have a link handy).
  • jeremyh (unregistered)

    The REAL wtf is that the programmer added a comment to explain what he was doing and why. Everyone knows REAL programmers don't use comments, especially when they do anything "clever". :-)

  • EmmanuelD (unregistered) in reply to Fred
    Fred:
    This is standard way to perform unbound allocation in C. It is supported by the standard. It seems to me that quite a few WTF recently have been the submitter not understanding the code.

    What is next? Putting some of the inner of a device driver for embedded hardware and complaining that it access memory with fixed addresses ? Showing this code:

    float SquareRootFloat(float number) { long i; float x, y; const float f = 1.5F;

    x = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;
    i  = 0x5f3759df - ( i >> 1 );
    y  = * ( float * ) &i;
    y  = y * ( f - ( x * y * y ) );
    y  = y * ( f - ( x * y * y ) );
    return number * y;
    

    }

    as a WTF ? (For the uninformed, this is the square root implementation of quake 3. And yes, it is faster that the fsqrt assembly instruction).

    Last time I checked, C didn't have the "public" keyword. Maybe that's a hint, something like "maybe the used language is C++?".

    And, for the "uninformed", this implementation is faster than the CPU's fsqrt instruction, but slower than the inverse square root instruction you'll find in the multimedia oriented SIMD instruction sets. Especially since you don't have to tackle with int-to-float conversions.

    And it works because of the careful choice of the magic constant. A quick search on Carmacks inverse square root (because that's what the algorithm computes, as the "return number * y" statement clearly says) will give you the reasonning behind this ugly-but-clever piece of code (not that the code is from Carmack himself. He borrowed it from who knows who, last time I checked).

    That's all I had to say. Yummy.

  • savar (cs) in reply to JustSomeone
    JustSomeone:
    I'm sure everybody can agree that when, say, using C to implement the runtime for a high-level language, a reasonable string representation might be something like:

    struct string { size_t length; char data[1]; };

    Why char[1] instead of char*?

  • Dark Shikari (unregistered) in reply to Thief^
    Thief^:
    Fred:
    This is standard way to perform unbound allocation in C. It is supported by the standard. It seems to me that quite a few WTF recently have been the submitter not understanding the code.

    What is next? Putting some of the inner of a device driver for embedded hardware and complaining that it access memory with fixed addresses ? Showing this code:

    float SquareRootFloat(float number) { long i; float x, y; const float f = 1.5F;

    x = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;
    i  = 0x5f3759df - ( i >> 1 );
    y  = * ( float * ) &i;
    y  = y * ( f - ( x * y * y ) );
    y  = y * ( f - ( x * y * y ) );
    return number * y;
    

    }

    as a WTF ? (For the uninformed, this is the square root implementation of quake 3. And yes, it is faster that the fsqrt assembly instruction).

    I read that as "this code accesses memory directly", which it doesn't.

    It is a wtf that that code is faster than the assembly square-root instruction though.

    EDIT: I also can't figure out how it works :-S

    Google the hex number and you'll find some articles explaining how it works.

    It is NOT faster, as far as I know, on modern CPUs. It only provides a tangible benefit when:

    a) You don't need exact accuracy (its about 1.7% off) b) You're on a processor where floating point operations are slow.

    I've tested it on modern CPUs and it is slightly slower than the built-in sqrt function.

  • mav (unregistered) in reply to savar
    savar:
    JustSomeone:
    I'm sure everybody can agree that when, say, using C to implement the runtime for a high-level language, a reasonable string representation might be something like:

    struct string { size_t length; char data[1]; };

    Why char[1] instead of char*?

    Because char* requires a separate allocation, while char data[1] will be contiguous memory. You do, after all, only want to malloc once. :-)

  • Thief^ (cs) in reply to savar
    savar:
    JustSomeone:
    I'm sure everybody can agree that when, say, using C to implement the runtime for a high-level language, a reasonable string representation might be something like:

    struct string { size_t length; char data[1]; };

    Why char[1] instead of char*?

    Because char* would mean that you'd have to allocate a separate external-to-the-structure array. The idea is that you do (struct string*)malloc(sizeof(size_t)+length*sizeof(char)), and then you can just access the elements of the array as if it was char[length] instead of char[1].

    Makes passing the struct to functions a pain though, because pass-by-value would only pass the first char, because that's all the struct definition says it contains.

  • Pete Kirkham (unregistered) in reply to Anonymous Coward

    There's a clue for you in the non-C pubic keyword, meaning it has to be compiled with a C++ compiler, and will be treated as a class with all members public. Unlike C, in C++ there is no requirement that members of classes are contiguous (which is how VS2003 can get away with extending managed classes in C++), though usually they are, and are allocated in the order presented in the source file (otherwise it might have to be first).

    With a compiler where you know the layout, using a reinterpret cast or operator new on some memory you've allocated yourself to create a pointer to this type is safe. But not portable.

    I'd imagine the length of p->myOrd is given by (reinterpret_cast<void*>(p->myNext)-reinterpret_cast<void*>(p->myOrd)-sizeof(summand*))/sizeof(OrdArith::OrdElem ), or if the sizes of the types are the same: (p-p->next-1).

    So maybe the WTF is that a C programmer has got his hands on a C++ compiler and not noticed.

  • cparker (cs)
    OrdArith::OrdElem myOrd[1]; // myOrd
    Ah, I see. Thanks for clarifying that for me. I wouldn't have guessed "myOrd" from "myOrd[1]". You're a real pal.
  • mav (unregistered) in reply to Pete Kirkham

    This is totally off topic, but I was listening to random music on my fairly large mp3 drive when I realized that no matter how happy I am, I'm only 1 Emmylou Harris song away from depressed.

    CAPTCHA: (smile) I would, if it wasn't for that damned Emmylou.

  • Hallvard (unregistered) in reply to Pete Kirkham
    Pete Kirkham:
    Unlike C, in C++ there is no requirement that members of classes are contiguous (which is how VS2003 can get away with extending managed classes in C++), though usually they are, (...)
    Nonsense. In both languages, members are required to have proper alignment just like other data. Which means the standard can _not_ require them to be contiguous.
    With a compiler where you know the layout, using a reinterpret cast or operator new on some memory you've allocated yourself to create a pointer to this type is safe. But not portable.
    Safe and portable, as long as you don't depend on more layout than the standard requires, and use code which doesn't allow the compiler to "know" that there is just one member.
    I'd imagine the length of p->myOrd is given by (reinterpret_cast<void*>(p->myNext)-reinterpret_cast<void*>(p->myOrd)-sizeof(summand*))/sizeof(OrdArith::OrdElem ), or if the sizes of the types are the same: (p-p->next-1).
    No. The compiler - and sizeof()- doesn't know that the programmer uses another size than the structure definition imples. The programmer has to take care of the size himself.
  • drfloyd5 (cs)

    I can lurk no longer. I made an account just to make this post in person.

    The CodeSOD now officially blows. I will use this 1 WTF to prove it.

    1. The WTF isn't.
    2. There are 30 posts explaining in non-unique ways why #1 is true.

    The signal to noise ratio of the CodeSOD WTFs is pretty crappy. And the comments s/n ratio is even worse. So all in all the whole CodeSOD is a useless read.

    Now, believing that one should not critize without offering a suggestion: Simple, Change the CodeSOD to the CodeSOW. Just one a week. That would releive the pressure to release sub-standard CodeS just to fill todays slot. Complex, Create a Which Code is the Bigger Failure Challange. Present two snippits of code, user vote determines wich ones suck more. Winers go on to fight last weeks winners, and so forth.

    -DF5

  • Anonymous Coward (unregistered) in reply to iwpg
    iwpg:
    Anonymous Coward:
    Am I the only one that noticed that the 'public' keyword was applied to a 'struct', where (unless the rules have changed) all members of a struct in C++ are public by default? Things that make you go hrmmmm.....
    STFU. That's like complaining that someone wrote braces round a single-statement loop body.

    Hey now! I actually like putting braces around single-statement loop bodies. You never know if or when you're gonna want to add other stuff later.

    I just think the effort of adding the public specifier to a struct, along with the adorning comments is just dumb.

  • sibtrag (cs)

    I am reminded of the first bug I investigated during my limited career as an OS developer as an undergraduate intern at a long-gone computer company in the early 80s.

    I was left pretty much on my own to figure out how to build the OS after my code finally went through the assembler. Having done so, I waited for my time slot to use the machine for testing. This was a big iron machine, so the little development office I worked in only had one...we used it for coding, building & OS testing. Thus, testing came in time slots allocated before & after the normal business day.

    My code quickly hit an illegal instruction trap & dropped me into the rudimentary debugger. When I looked around, all my code was gone....replaced by 0s. My mentor & I looked around to try to figure out what happened to the code when he noticed that the "OSEND" symbol was not at the end of the OS.

    It was then revealed to me that I had broken the one rule about building that OS: The OSEND module must be at the end of the list of modules on the link list. The memory initializer started clearing memory at that symbol and thus wiped out much of the OS.

    Happily, that could be disabled with a simple patch so that I could find my second bug.

  • rdez (unregistered)

    Well it was clear to me that it wasn't a WTF once I saw "impunity" in the comments.

    Not having done C in a while, I must say that I miss hacks like that..

  • PseudoNoise (unregistered)

    The real WTF is he's not declaring it:

    typedef std::list< std::vector< OrdArith::OrdElem > > summand;
    

    (unless there's more to the class that has been commented-out)

    Also, thanks to the poster above that mentioned C99 codifies the variable struct trick with elem[] instead of elem[1]. That's quite cool.

  • eraserhd (unregistered) in reply to rdez

    This isn't a WTF. In fact, look at the implementation of any STL std::string class in C++. It is what makes an efficient implementation versus one that fragments the hell out of your heap by requiring you to allocate 4-byte slots for pointers to pointers to data.

    From GNU std::string:

          struct _Rep_base
          {
        size_type       _M_length;
        size_type       _M_capacity;
        _Atomic_word        _M_refcount;
          };
    
          struct _Rep : _Rep_base
          {
        // Types:
        typedef typename _Alloc::template rebind<char>::other _Raw_bytes_alloc;
    
        // (Public) Data members:
    
        // The maximum number of individual char_type elements of an
        // individual string is determined by _S_max_size. This is the
        // value that will be returned by max_size().  (Whereas npos
        // is the maximum number of bytes the allocator can allocate.)
        // If one was to divvy up the theoretical largest size string,
        // with a terminating character and m _CharT elements, it'd
        // look like this:
        // npos = sizeof(_Rep) + (m * sizeof(_CharT)) + sizeof(_CharT)
        // Solving for m:
        // m = ((npos - sizeof(_Rep))/sizeof(CharT)) - 1
        // In addition, this implementation quarters this amount.
        static const size_type  _S_max_size;
        static const _CharT _S_terminal;
    
        // The following storage is init'd to 0 by the linker, resulting
            // (carefully) in an empty string with one reference.
            static size_type _S_empty_rep_storage[];
    
            static _Rep&
            _S_empty_rep()
            {
          void* __p = reinterpret_cast<void*>(&_S_empty_rep_storage);
          return *reinterpret_cast<_Rep*>(__p);
        }
    

Leave a comment on “Must Be Last”

Log In or post as a guest

Replying to comment #:

« Return to Article