• (disco)

    It must be me, because BY DEFINITION the value of chars[strlen(chars)] is '\0'. That is how it is calculated.

    I refer people to K&R 2 page 103 where an example implementation exists.

    This is like saying "if (zero == 0) zero = 0;". In other words: why bother?
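
    For anyone who wants to see it spelled out, here is a minimal sketch in the spirit of the K&R version (not a copy of it). The name my_strlen is made up so it doesn't clash with the library function. The loop only stops once it is sitting on the terminator, so the element at the returned index has to be '\0'.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical stand-in for strlen: walks forward until the first '\0'. */
    static size_t my_strlen(const char *s)
    {
        const char *p = s;
        while (*p != '\0')
            p++;
        return (size_t)(p - s);
    }

    int main(void)
    {
        char chars[] = "nasal";
        size_t n = my_strlen(chars);

        /* The loop ended at index n precisely because chars[n] is '\0'. */
        assert(chars[n] == '\0');
        printf("length = %zu\n", n);
        return 0;
    }
    ```

    So assigning '\0' to that slot afterwards can't change anything for a properly terminated string.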

  • (disco) in reply to accalia

    I made a Windows-like shell using TI-Basic on the 83+ once. The API could draw windows, buttons, text, and even had a working mouse! (No right-click, though.) This of course was back when I accepted that a screen refresh would naturally take 3-4 seconds, and moving the cursor took about half a second if the screen did not need updating. Screen buffering FTW!

  • (disco) in reply to Zylon
    Zylon:
    A freaking 2560 x 1700, 773K JPG.
    Considering the [Source File][1] is a lovely 5616x3744 at 2.5 MB, it's not _extravagantly_ bad.
  • (disco)

    Note how even a single NUL byte write into heap can actually be exploitable for local privilege escalation against a suid binary: http://googleprojectzero.blogspot.fi/2014/08/the-poisoned-nul-byte-2014-edition.html

    Scary stuff.

  • (disco) in reply to Kuro
    Kuro:
    C++ added a [string class](http://www.cplusplus.com/reference/string/string/?kw=string). So there is that.

    Ah yes, C++, that stellar example of clear syntax that will not, under any circumstances whatsoever, allow you to shoot yourself in the foot.

    Other than that, I fail to see how the code in the article could ever overwrite memory it wasn't supposed to access.

  • (disco) in reply to Severity_One

    Theoretically, the buffer could end up being next to a read-only page, and the assignment could write to read-only memory. It wouldn't do anything useful for an attacker unless it was a long-running server that didn't automatically start back up or whatever, but it's a problem they should fix. By deleting the code.

  • (disco) in reply to Severity_One
    Severity_One:
    Ah yes, C++, that stellar example of clear syntax that will not, under any circumstances whatsoever, allow you to shoot yourself in the foot.

    "C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off." (Bjarne Stroustrup)

  • (disco) in reply to VinDuv

    When I started programming with C++ I was truly afraid (I was 16 or so) that I would destroy my computer or the Palm Pilot. Maybe that's why I moved to VM languages :ferris_wheel:

  • (disco) in reply to antiquarian
    antiquarian:
    Hmmm... Well, now I have read that thread, and there are no references to pendantry in there.

    Oh yes, and by the way, yesterday I got one of those stupid pink-orange popups from Discurse about posting style. It can fuck off.

  • (disco) in reply to Steve_The_Cynic
    Steve_The_Cynic:
    posting style

    What's that? I never heard of that one and so far I've only seen the multireply and the frequency toasters.

  • (disco) in reply to aliceif
    aliceif:
    I've only seen the multireply

    i'm guessing it was that one....

  • (disco) in reply to Steve_The_Cynic
    Steve_The_Cynic:
    posting style

    Body is invalid; try to be a little more descriptive

    :question:

  • (disco) in reply to aliceif

    What

  • (disco) in reply to Gaska

    multi

  • (disco) in reply to Gaska

    reply?

  • (disco) in reply to Gaska

    Oh, that one.

  • (disco) in reply to Steve_The_Cynic
    Steve_The_Cynic:
    Hmmm... Well, now I have read that thread, and there are no references to pendantry in there.

    Fixed.

  • (disco) in reply to accalia
    accalia:
    i'm guessing it was that one....
    Yup, that's the one. It can fuck off.
  • (disco) in reply to Steve_The_Cynic
    Steve_The_Cynic:
    Yup, that's the one. It can fuck off.

    what sort of implement would you like to borrow to help it do so? i have a variety to choose from, fresh from being sterilized in the autoclave.

  • (disco) in reply to accalia
    accalia:
    what sort of implement would you like to borrow to help it do so? i have a variety to choose from, fresh from being sterilized in the autoclave.
    If the choice were left up to me, I'd probably target it with my memeic GAU-8, but that's just me.

    My code-excising weapon of choice depends on the environment. On my personal Windows machine, that'd be Visual Studio 2010, while on a *NIX machine, I'd generally put my holy war stake on the side of Emacs.

  • (disco) in reply to Steve_The_Cynic

    i should probably nominate for a woosh...

    that's not the sort of implement that one uses for FXXK nor one that would require sterilization in an autoclave...

    want to try again? ;-)

  • (disco) in reply to aliceif
    aliceif:
    What's that? I never heard of that one and so far I've only seen the multireply and the frequency toasters.
    multireply. I call it "posting style" because it seems to complain about it as if it is a question of posting style in a broad sense.

    And because I don't bother reading it properly because it is being self-righteous and pompous.

    So it can fuck off.

  • (disco) in reply to accalia
    accalia:
    i should probably nominate for a woosh...

    that's not the sort of implement that one uses for FXXK nor one that would require sterilization in an autoclave...

    want to try again? ;-)

    Minor whoosh I suppose. Besides, I had no intention of (...)ing it. I merely invited it to go forth and multiply.

    Shrug, whatever.

  • (disco) in reply to Steve_The_Cynic
    Steve_The_Cynic:
    I merely invited it to go forth and multiply.

    oooooh. my misunderstanding then.

    have fun with the GAU-8

    if Ivan asks for COD tell him that you know it was already paid for, but tip him a 750ml beverage anyway (make sure it's at least 90 proof)

  • (disco) in reply to ben_lubar
    ben_lubar:
    Theoretically, the buffer could end up being next to a read-only page, and the assignment could write to read-only memory. It wouldn't do anything useful for an attacker unless it was a long-running server that didn't automatically start back up or whatever, but it's a problem they should fix. By deleting the code.

    That doesn't make sense. The NUL character is part of the buffer. If you want to store "Hello" in a C string, you need a char array with 6 elements: one for each letter, and one for the NUL.
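
    A quick sketch of the arithmetic, in case it helps. The helper storage_needed is a made-up name for illustration, not a library function.

    ```c
    #include <assert.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical helper: bytes a C string occupies, terminator included. */
    static size_t storage_needed(const char *s)
    {
        return strlen(s) + 1;
    }

    int main(void)
    {
        char hello[] = "Hello";   /* the compiler sizes this array to 6 chars */

        /* Five letters plus the NUL: the array and the helper agree. */
        assert(sizeof hello == storage_needed(hello));
        printf("strlen = %zu, sizeof = %zu\n", strlen(hello), sizeof hello);
        return 0;
    }
    ```

    The NUL lives at index 5 of that 6-element array, which is inside the buffer, not next to it.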

  • (disco) in reply to Jaloopa

    My favorite "shoot yourself in the foot" joke is not from programming but in reference to the old Infocom text adventures. ;-) It went something like this:

    > shoot foot
    I don't see any foot here.
    > shoot self in foot
    You don't have the gun.
    > get gun
    You get the gun.
    > put bullets
    You put the bullets in the gun.
    Your lamp has gone out.
    You are eaten by grues.
    ***YOU HAVE DIED***
    
  • (disco)

    Of course, if the string weren’t null terminated, we’ve entered nasal demon territory: the behavior of strlen is undefined for non-null terminated strings. If you present a C compiler an undefined construct, it is allowed to do anything it likes, including make demons fly out of your nose.

    The behavior of strlen is not undefined. strlen is a function written in the C programming language, and the C standards define the C programming language, not specific functions written using the C programming language. strlen returns when it encounters a '\0' in the string argument. If there is no '\0' in the string, it will just keep going through memory, which may or may not trigger a segmentation fault.

    For example:

    #include <stdio.h>
    #include <string.h>

    int main() {
        char str[4];
        int str_end = 0;
        char *str_test = "test";
        memcpy(str, str_test, 4);  /* copies 4 bytes, no terminating '\0' */
        printf("strlen(str) = %zu\n", strlen(str));
        return 0;
    }

    Compiled with gcc (on Intel architecture), this correctly reports 4 as the string length because, even though str is not null terminated, the strlen function hits the first byte of the integer str_end on the stack. Since the first byte of str_end is 0, and ('\0' == 0), strlen returns a value that happens to be the correct length of the string str.

    But even on null terminated strings, this code is dangerous. Since arrays in C, like any sane language, are zero indexed, this code may attempt to access memory beyond the end of the array, overwriting whatever’s there with a null terminator.

    This statement doesn't make sense. Indexing has nothing to do with reading beyond an array's boundary; they are two entirely independent concepts. The worst that could happen is that strlen finds a 0-byte, then overwrites it with a 0. In the example above, the result is that the first byte of str_end, which is 0, will be replaced with the same value, 0:

    #include <string.h>

    int main() {
        char str[4];
        int str_end = 0;
        char *str_test = "test";
        memcpy(str, str_test, 4);
        str[strlen(str)] = '\0';
        /* memory is written outside the bounds of the str array, but it is overwritten with the same value the memory held before being overwritten */
        return 0;
    }

    Getting back to the bigger picture, array boundaries in C only exist in the programmer's head. This is an important concept when programming in C.

    This is the sort of code that happens when Java programmers attempt to write C, without understanding how they’ll shoot themselves in the foot.

    This sort of article happens when Java programmers attempt to write an article about C, without understanding C. Sorry, you kind of set yourself up for it ;)

  • (disco) in reply to accalia
    accalia:
    oooooh. my **misunderstanding** then.

    I read that word COMPLETELY wrong

  • (disco) in reply to chubertdev

    what did you read it as?

    hmm?

  • (disco) in reply to accalia
    accalia:
    what did you read it as?

    hmm?

    I'd have to respond in the Palm Pilot thread

  • (disco) in reply to chubertdev
    chubertdev:
    I'd have to respond in the Palm Pilot thread

    :blush: oh.

    OOH!

    :-D

  • (disco) in reply to boomslang
    boomslang:
    The behavior of strlen is not undefined.

    This invocation of strlen() causes it to read outside the allocated buffer, which is undefined. If your code example is compiled with optimization, the stack layout may change, and the returned length may not be 4. Segmentation faults may or may not happen.

    The following is also undefined:

    char * p = malloc(20);
    p += 30;
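
    For the strlen case, one way to stay on defined ground is to never scan past the size you actually own. A rough sketch, assuming you know the buffer's capacity; bounded_len is a made-up helper name, built on the standard memchr:

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical helper: length of the string in buf, or cap if no '\0'
       appears within the first cap bytes. Never reads past buf + cap. */
    static size_t bounded_len(const char *buf, size_t cap)
    {
        const char *nul = memchr(buf, '\0', cap);
        return nul ? (size_t)(nul - buf) : cap;
    }

    int main(void)
    {
        char str[4];
        memcpy(str, "test", sizeof str);   /* fills all 4 bytes, no terminator */

        /* No '\0' inside the array, so the scan stops at the capacity. */
        assert(bounded_len(str, sizeof str) == 4);
        printf("bounded_len = %zu\n", bounded_len(str, sizeof str));
        return 0;
    }
    ```

    Here the 4 comes from the bound, not from whatever happens to sit next to str on the stack.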
    
  • (disco) in reply to Steve_The_Cynic
    Steve_The_Cynic:
    I merely invited it to go forth and multiply.

    Ah.. the 'sex and travel' option...

  • (disco)

    http://youtu.be/PmakuC7OgIk?t=26s

  • (disco) in reply to mott555

    I am not watching that video.

  • (disco) in reply to mott555

    DO NOT WANT! [image]

  • (disco) in reply to accalia

    I think the other picture you use more adequately captures the sentiment than this one does.

  • (disco) in reply to FrostCat

    which other one? i use a lot.

  • (disco) in reply to accalia
    accalia:
    which other one? i use a lot.

    The golden shepherd or whatever it is. It's got a much more horrified look.

  • (disco) in reply to boomslang
    boomslang:
    If there is no '\0' in the string, it will just keep going through memory, which may or may not trigger a segmentation fault.
    Or it won't, because the behavior in that case is undefined. It could go for a while, give up, and return a random value. Or do even stranger things.
    The header <string.h> declares one type and several functions, and defines one macro useful for manipulating arrays of character type and other objects treated as arrays of character type. The type is size_t and the macro is NULL (both described in 7.17). Various methods are used for determining the lengths of the arrays, but in all cases a char * or void * argument points to the initial (lowest addressed) character of the array. *If an array is accessed beyond the end of an object, the behavior is undefined*.
    7.21.1(1) in [this][1] C draft standard, but I guarantee you that the ratified ones from C89 onward will have similar language. (Emphasis mine.)

    I could quote other sections that actually define UB, but I don't have the energy right now.

    boomslang:
    /* memory is written outside the bounds of the str array, but it is overwritten with the same value the memory held before being overwritten */
    Or maybe it isn't and your program crashed before then instead. Or produced wrong answers, because different live ranges of a not-address-taken variable got placed at different addresses, so `&x` in one location was `0x1234` and at another was `0x1232`, so your assumption that the value didn't change in the intervening time didn't hold. (I don't know that compilers actually do this, but I wouldn't be remotely surprised -- you'd almost have to specifically arrange to _not_ do that if you're going from SSA form -- but I'm pretty sure it would be totally legal in any case.)
    boomslang:
    This sort of article happens when Java programmers attempt to write an article about C, without understanding C. Sorry, you kind of set yourself up for it
    They're not the one who set themself up for it.
  • (disco) in reply to boomslang
    boomslang:
    This sort of article happens when Java programmers attempt to write an article about C, without understanding C. Sorry, you kind of set yourself up for it
    I will, however, link to the following blog posts. Read at least one of the series, then come back.

    What Every Programmer Should Know About Undefined Behavior: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html

    A Guide To Undefined Behavior in C and C++ http://blog.regehr.org/archives/213 http://blog.regehr.org/archives/226 http://blog.regehr.org/archives/232

  • (disco) in reply to Zylon
    Zylon:
    There isn't nearly enough making fun of Remy here for making the article's image-- 1. Hotlinked from wikimedia.org 2. A freaking 2560 x 1700, 773K JPG.

    Considering that the hotlinked image is, itself, a thumbnail of the original, it could have been worse.

  • (disco) in reply to PleegWat
    PleegWat:
    The following is also undefined:
    char * p = malloc(20);
    p += 30;
    

    No, since you are just modifying the pointer. It would still be perfectly defined if you then did:

    p[-15] = 123;
    

    You are accessing at a point inside the allocated memory block; that the pointer points somewhere else doesn't matter.

    It would matter if you did:

    char * p = malloc(20);
    p += 30;
    p[0] = 123;
    

    Don't do that unless you like nasal demons.

  • (disco) in reply to dkf
    dkf:
    No, since you are just modifying the pointer.
    Technically, PleegWat is correct.

    C draft standard, 6.5.6(8):

    When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. **If both the pointer operand and the result point to elements of the same array object**, or one past the last element of the array object, the evaluation shall not produce an overflow; **otherwise, the behavior is undefined.** If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
    (Emphasis and deemphasis mine.)

    There are a number of reasons for this; one is that any object could potentially be set at the end of a memory space and that addition could overflow, and they left overflow behavior for pointers undefined (like signed integers but unlike unsigned integers, where overflow behavior is defined). It also makes legal implementations that check for valid pointer operations at the pointer arithmetic step instead of at the dereference step, though I don't know of any that take advantage of this behavior.

    Practically speaking of course, the operation in question is quite unlikely to produce any adverse effects.
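
    If you want to stay strictly within what that paragraph allows, check the offset before doing the arithmetic rather than after. A minimal sketch; advance_checked is a hypothetical helper of my own, not anything from the standard library:

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical helper: forms base + offset only when the result stays
       inside the object or exactly one past its end; otherwise returns NULL
       without performing the out-of-range addition. */
    static char *advance_checked(char *base, size_t size, size_t offset)
    {
        return offset <= size ? base + offset : NULL;
    }

    int main(void)
    {
        size_t size = 20;
        char *p = malloc(size);
        if (p == NULL)
            return 1;

        /* One past the end is a valid pointer value (but not dereferenceable). */
        assert(advance_checked(p, size, 20) == p + 20);
        /* 30 is out of range, so the addition is never attempted. */
        assert(advance_checked(p, size, 30) == NULL);

        printf("offset 30 rejected\n");
        free(p);
        return 0;
    }
    ```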

  • (disco) in reply to EvanED

    Yes, except that the compiler will usually start by optimising the code to (the equivalent of):

    char *p = malloc(20);
    *(p + (30-15)) = 123;
    

    Assuming that p has no other uses later. ;)

    Anyone building a compiler where this is not the case in any memory addressing mode will get beaten up by an angry mob of programmers. We're talking true pitchforks-at-dead-of-night territory here.

  • (disco) in reply to dkf

    I think this has more to do with hardware than compilers. On x86, there is a uniform memory model, in which every bit pattern of the correct width is a potential correct pointer value, and all memory can be seen as one contiguous unit.

    On different platforms, different bit patterns may point to different types of memory, or may not be valid at all.

    In @dkf's example, on some platform, the +30 may be done as a normal integer addition, and result in a bit pattern pointing to a different section of memory. The subsequent array indexing may use a special array indexing operation, which may use different semantics.

    p = malloc(20);   // p = 0xFEE8
    p += 30;          // p = 0xFF06
    p[-15] = 123;     // (p - 15) = 0xFFF7
    

    This would be possible on hardware where an array index calculation only works on the last 8 bits of the pointer value.

  • (disco) in reply to PleegWat
    PleegWat:
    This would be possible on hardware where an array index calculation only works on the last 8 bits of the pointer value.

    I think that would not be a conforming C implementation. Suppose one had done p = malloc(275); instead? All that's changed is an argument to a function. (It's a built-in function, but it's still just a function.) Pointer arithmetic has got to work the way it is defined to, and that permits some fairly strong simplifications.

    If you're going to say that malloc can't allocate buffers that size, we're back to the pitchforks-at-midnight. Watch your back…

  • (disco) in reply to dkf

    Or that malloc would return a different kind of memory. Or the compiler would only use that array indexing instruction if it could verify it was safe in that case - that would not exclude my example because undefined behaviour was in play there.

  • (disco) in reply to dkf

    malloc(20) certainly could put the object 20 bytes (well, 21 bytes, depending on what assumptions you make) from the last address in its segment. That doesn't mean that malloc(275) couldn't work; it just means that it would have to put that object at least 275/276 bytes from the end.
