• King (unregistered)

    So, a sleep inside matrix_se would have solved it?...

  • P (unregistered)

    Isn't TRWTF using i as row number and j as column number? It's contrary to the convention in almost every other language.

    Also, we're already assuming the matrices are always square?

  • MiserableOldGit (unregistered)

    Reading this one made me nod off.

  • David-T (unregistered)

    The real WTF is the author not knowing the difference between the logical and bitwise operators.

  • (nodebb)
    No matter how you slice it, you need to && or || a value with the current value of the byte containing the bit.

    Just a little quibble. && and || are logical operators; they look at the overall values of the arguments, with short-circuiting, and return a value: 0 or 1. This would destroy pretty much any bit-wise information you might be trying to preserve, unless you hit an exceptionally lucky edge case.

    & and | are the bit-wise operators that compare the bit patterns of the arguments and return the composite value, depending on the operator. Short-circuiting does not happen with these guys.
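    To make the difference concrete, here is a minimal sketch. The function names are made up for illustration and are not from the article's code; the second function is wrong on purpose.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise OR: sets one bit and leaves the other seven untouched. */
uint8_t set_bit_bitwise(uint8_t byte, unsigned bit)
{
    assert(bit < 8);
    return (uint8_t)(byte | (1u << bit));
}

/* Logical OR (deliberately wrong): the result collapses to 0 or 1,
   so whatever else was stored in the byte is lost. */
uint8_t set_bit_logical(uint8_t byte, unsigned bit)
{
    assert(bit < 8);
    return (uint8_t)(byte || (1u << bit));
}
```

    With a byte of 0xA0 and bit 1, the bitwise version yields 0xA2, while the logical version yields plain 1.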

    Otherwise, the point stands. I find it interesting that this is self-reported and pretty quickly. I can just imagine the conversation the next morning. "Hey, do you remember that code we checked in last night?" "Barely. I was just so tired, I don't remember much. Did we miss something?" "Well, come over here and look at this." "What were we thinking?" "More like 'Were we thinking?' Clearly not." "We got way too tired and made way too many bad decisions last night. This piece of junk is the kind of thing on Daily WTF." "Yeah, let's submit this mess to them, then we roll back this mess and try again while we can think straight." "Which will be tomorrow. I still need more coffee."

  • Drak (unregistered)

    Relys? Relies...

  • WTFGuy (unregistered)

    @P I think you have that backwards. Generally you process stuff in row-major order. And generally you use i as the outer index, with j as the inner index. And k as the inner-inner index in a 3D case, etc. (Warning: attempted markdown ahead...)

    for (i = 0; i < rowMax; i++) {     // process each row
        for (j = 0; j < columnMax; j++) {    // process each column in the row
          DoSomething(someArray[i,j]);     // process each element in the column
        }
    }
    

    Using row major processing improves locality of reference for the typical internal memory organization of arrays (even sparse ones). If I ever saw a reference to someArray[j,i] I'd immediately start getting suspicious that was a mistake and have to carefully verify the usage.
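    As a rough C sketch of why that order is cache-friendly, assume the matrix is stored flat in row-major order. The helper names `row_major_offset` and `sum_row_major` are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (row, col) in a row-major array with `cols` columns. */
size_t row_major_offset(size_t row, size_t col, size_t cols)
{
    assert(col < cols);
    return row * cols + col;
}

/* Sum a rows x cols matrix stored flat. With i outside and j inside,
   the offsets visited are 0, 1, 2, ... so memory is walked front to back. */
long sum_row_major(const int *m, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t i = 0; i < rows; i++) {        /* outer index: row */
        for (size_t j = 0; j < cols; j++) {    /* inner index: column */
            total += m[row_major_offset(i, j, cols)];
        }
    }
    return total;
}
```

    Swapping the two loops would visit the same elements but jump `cols` entries at a time, which is where the locality is lost.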

  • (nodebb)

    The thing about UB, or at least reads-from-uninitialised-memory, extends beyond merely bitbanging operations.

    Consider this structure:

    struct bitfields_galore
    {
        unsigned int a_bit:1;
        unsigned int another_bit:1;
        unsigned int the_third_bit:1;
    };
    

    Now declare a (local non-static) variable of this type and:

    void some_func(void)    /* parameters omitted */
    {
        struct bitfields_galore bitfields;
    
        bitfields.a_bit = 1;
    
        /* etc. */
    }
    

    In my experience, tools like Valgrind or Purify will report a read of uninitialised memory for the assignment.
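    One common way to sidestep this, sketched here as a suggestion rather than anything taken from the article, is to initialise the whole structure before touching a single bit-field. The function name `make_bitfields` is made up for the example.

```c
#include <assert.h>

struct bitfields_galore
{
    unsigned int a_bit:1;
    unsigned int another_bit:1;
    unsigned int the_third_bit:1;
};

struct bitfields_galore make_bitfields(void)
{
    /* Zero every named member first, so the read-modify-write the
       compiler emits for the assignment below starts from known bits. */
    struct bitfields_galore bitfields = {0};

    bitfields.a_bit = 1;

    /* Holds because of the initialiser, not by luck. */
    assert(bitfields.another_bit == 0);
    return bitfields;
}
```

    Whether a given tool then stays quiet depends on the tool and the compiler; the initialiser only guarantees the named members, not any padding bits.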

  • (nodebb) in reply to WTFGuy

    If I ever saw a reference to someArray[j,i]

    I'd suspect idiocy or incompetence if I saw that in C code. It should be [j][i].

  • Officer Johnny Holzkopf (unregistered)

    Terminology, my dear Watson: As this is C, calloc() is not a method, it's a function; methods belong to OOP ("instance methods", "class methods" and such), and C technically does not belong there - it uses functions and their derivatives. The calloc() and other memory allocation functions' manual pages confirm the correct use of "function", which is consistent with C standards documentation.

  • (nodebb)

    This is a good reminder that even in a language like C which is very fast, terrible code can be written which slows down some operations by 20x.

  • RLB (unregistered)

    Whether this is UB or not depends on the underlying integer type. If it's unsigned, the value may be unspecified, but never a trap representation.

    (I do believe that the value remains technically unspecified, even after all bits have been zeroed using this function, but that's still not the same thing as undefined.)

  • Brian (unregistered) in reply to WTFGuy

    When I'm dealing with row/column stuff, I usually prefer to name my iterators "row" and "col". You know, because that actually describes their purpose rather than some random meaningless letters. Or possibly x and y (and z) if it's something that could reasonably be viewed as a Cartesian layout.

    'i' is reasonably ok for simple one-dimensional loops, but even then I'd go with 'index' or something for clarity. It's not like those extra letters cost anything...

  • (nodebb)

    Maybe I'm TRWTF but I don't see how the first read of each byte could possibly lead to "undefined behaviour". It's reading a byte from allocated memory. All it can do is set or clear a bit in that byte, and write it back.

    The value it reads may be undefined, but that's not important here.

  • Carl Witthoft (google)

    Hey listen all you guys arguing about i, j vs. j, i: don't use either! Granted, in nice low-level languages like C & its brethren, it doesn't matter, but in higher level languages like R, python, MATLAB, etc. "i" and "j" are preset to the complex value sqrt(-1). Those constants get lost the moment you overload with the loop index.

  • MB (unregistered) in reply to Carl Witthoft

    @Carl: In Python, it's "1j" for sqrt(-1); it's a suffix.

  • (nodebb)

    Sounds familiar. I recently wrote a driver in C for a monochrome OLED display for an embedded system, 1 bit per pixel, and for bonus brain bending the display is organized such that writing one byte writes to a column of 8 pixels within a 'page' of 8 rows. Writing a driver that just twiddled individual bits would have been easy but inefficient, especially for displaying characters and symbols which are stored in their own bit matrices. Allowing a symbol of arbitrary size to be displayed at an arbitrary position means handling a bunch of different alignment cases--full or partial first row, full or partial last row, some number of full rows in between. Getting all of the bit shifting and masking correct across all cases was a nice few days of brain melt.
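    For anyone who hasn't met this layout, the "easy but inefficient" single-pixel case can be sketched roughly as below. The 128x64 panel size, the framebuffer, the function names, and the choice of bit 0 as the top pixel of each page are all assumptions for illustration, not details of the driver described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OLED_WIDTH = 128, OLED_HEIGHT = 64 };   /* assumed panel size */

/* One byte per column per page; a page is 8 pixel rows tall. */
static uint8_t framebuffer[OLED_WIDTH * OLED_HEIGHT / 8];

void oled_set_pixel(unsigned x, unsigned y, int on)
{
    assert(x < OLED_WIDTH && y < OLED_HEIGHT);
    size_t index = (size_t)(y / 8) * OLED_WIDTH + x;  /* page * width + column */
    uint8_t mask = (uint8_t)(1u << (y % 8));          /* assumed: bit 0 = top row */

    if (on)
        framebuffer[index] |= mask;
    else
        framebuffer[index] &= (uint8_t)~mask;
}

/* Read back the byte that covers column x of the given page. */
uint8_t oled_byte_at(unsigned x, unsigned page)
{
    assert(x < OLED_WIDTH && page < OLED_HEIGHT / 8);
    return framebuffer[(size_t)page * OLED_WIDTH + x];
}
```

    Drawing a glyph at an arbitrary y then means shifting each glyph column across up to two pages, which is where the alignment cases come from.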

  • DanK (unregistered) in reply to gordonjcp

    The values in the memory are indeterminate after the malloc() call, which means that it can be an unspecified value (I think that's what you mean by undefined) or a trap representation. Accessing a trap representation is undefined, unless it is accessed as a char. So, unless set_bit uses a char pointer to access the memory, this is undefined behavior.

    In practice, this is unlikely to be checked by machines for integer types, but one could be designed to check for this. A related example: IA-64 (Itanium) has an extra "Not a Thing" bit for its registers. When you allocate a new register, it is set. If you try to read the register before writing to it, it will cause an exception. This was used for tracking speculative memory accesses, but it also has the side effect of catching some accesses to uninitialized local variables.
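    A minimal sketch of a bit array that avoids the whole question: it accesses the storage as unsigned char and allocates with calloc() so no indeterminate value is ever read. The names `bits_alloc`, `bits_set` and `bits_get` are hypothetical and are not the functions from the article.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Allocate a zero-filled bit array able to hold nbits bits.
   Returns NULL on failure; the caller releases it with free(). */
unsigned char *bits_alloc(size_t nbits)
{
    return calloc((nbits + CHAR_BIT - 1) / CHAR_BIT, 1);
}

/* Set (value != 0) or clear (value == 0) bit n. */
void bits_set(unsigned char *bits, size_t n, int value)
{
    assert(bits != NULL);
    unsigned char mask = (unsigned char)(1u << (n % CHAR_BIT));

    if (value)
        bits[n / CHAR_BIT] |= mask;
    else
        bits[n / CHAR_BIT] &= (unsigned char)~mask;
}

/* Return bit n as 0 or 1. */
int bits_get(const unsigned char *bits, size_t n)
{
    assert(bits != NULL);
    return (bits[n / CHAR_BIT] >> (n % CHAR_BIT)) & 1;
}
```

    Because the memory starts out zeroed, there is no need to clear every bit one at a time before use.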

  • JJ (unregistered)

    Oh, come on, we've all been Elya. I know I have, dead tired, but hopped up on espresso.

    (Upon which I've written code I couldn't understand the next morning. It's exceedingly likely a code review would break the WTFPM meter, but at least it worked :-))

  • Somebody Somewhere (unregistered) in reply to alphajbravo

    For what it's worth, you're not alone. Abstracting the workings of those small LCD/OLED screens into something the average developer would consider to be a sane API requires a lot of special-case bit mangling.

  • Loko8765 (unregistered)

    Undefined behavior... it probably isn't, since the matrix certainly is not defined as register: https://stackoverflow.com/questions/11962457/why-is-using-an-uninitialized-variable-undefined-behavior/11965368 However, possible undefined behavior is certainly not the only bad thing in that code base -- shudder.

  • löchleindeluxe (unregistered)

    Eh, at least this won't break if you switch to a sparse matrix implementation, right?

  • Martijn Lievaart (unregistered)

    @Loko8765, as per that StackOverflow thread, it's not UB only if the type is unsigned char, which is guaranteed not to have a trap value. The register thingy is a bit of a red herring here, as the matrix cannot be in a register, as you already noted. But anything other than unsigned char can theoretically have trap values, even in memory.

    As unsigned char is the only type that makes sense as the underlying data representation, it probably isn't UB, but until we know which type implements data elements, we must assume that it is UB.

  • (nodebb)

    Most definitely undefined behavior - if you disagree, show me the reference in the C/C++ specifications that defines it.

  • (nodebb)

    How does reading uninitialized memory and setting bits cause undefined behavior? Sure, if you try to read a bit that was never set, its value might be unpredictable. But so what, you're not going to do that, because you're initializing every bit. And it's not like ANDing or ORing is going to get an overflow. So ... even if we assume that however you allocated the memory does not initialize it, so okay, you start out with a random string of bits. Then one by one you set them. Umm ... so what?

  • DanK (unregistered) in reply to saneperson

    Trap values... Read the C standard.

    There are two important things to consider here:

    1. Just because you get away with it in your environment that does not mean it’s defined behavior. In this case, the behavior is only defined if you are using a char pointer.
    2. Compilers can (and do) assume that your program does not invoke undefined behavior. If your program relies on undefined behavior, the compiler could erase your hard drive or make demons fly out of your nose... or, more likely, perform optimizations that will change the behavior of your code and break it.

    To really learn and understand C, you have to read the standards. (Though it is obviously not the first step.) Code that works in your development environment could break with a different compiler or on a different machine or using different compiler flags if you invoke undefined behavior. Probably my favorite undefined behavior: signed integer overflow. A lot of code uses signed integers for no good reason. Often, there are checks for overflows (after the overflow happened) that I am sure the developer tested in a debug build at some point... too bad that compiler optimizations may remove those checks because signed overflow is undefined behavior, so the code breaks in weird and wonderful ways in production.
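    To illustrate the signed overflow point, a check has to happen before the addition, not after it. This is only a sketch; `checked_add` is a made-up name, not a standard function.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Stores a + b in *out and returns true, or returns false (leaving
   *out untouched) if the mathematical sum does not fit in an int.
   The test never performs the overflowing addition itself, so the
   compiler has no undefined behavior to reason away. */
bool checked_add(int a, int b, int *out)
{
    assert(out != NULL);

    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return false;

    *out = a + b;
    return true;
}
```

    The tempting alternative, computing a + b first and then testing whether the sum came out smaller than a, is exactly the kind of after-the-fact check an optimizer is allowed to delete.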

  • Martijn Lievaart (unregistered) in reply to saneperson

    "How does reading uninitialized memory and setting bits cause undefined behavior?"

    Trap values. That the machines you use don't have them doesn't mean that no machine does. I didn't know that either, and I consider myself rather knowledgeable.

  • (nodebb) in reply to gordonjcp

    "Undefined behaviour" just means the C standard leaves it up to the implementation to decide what happens. In most cases - every implementation I have ever used - you just get back what was in the block of memory when it was allocated. However, I could imagine a hardware/software combination that caused a trap if you attempted to read from an address which hadn't previously been written. The C standard allows such a combination to exist without the malloc implementation having to care.

Leave a comment on “Just a Bit Bad”
