• (disco) in reply to boomzilla

    Googling "dreamweaver page site:thedailywtf.com" should find what you're thinking of.

  • (disco) in reply to chubertdev

    Which I tried before my frist post about this, but it didn't seem to. Or, the bug may have been buried in a bigger discussion, but I didn't wade in very deeply, so maybe I missed it in there.

  • (disco) in reply to EvanED
    EvanED:
    Patrick_Schluter:
    Adding one byte to the size of the mmap transforms your "random data block" to a nul terminated string.
    Until you read a file containing NUL.

    Yes, and? The scan will stop at that position, nothing more. At least it won't segfault. It is furthermore irrelevant if the file is memory mapped or accessed with I/O commands. If the file is wrong, it is wrong and the program may behave sanely or be completely insane. The problem of the 8K bug is that it crashes the program with a file that is completely OK. In the article a css file, in my case it was sgml files.
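
    A rough sketch of why only some file sizes hit this, assuming POSIX mmap behaviour: the unused tail of the last mapped page reads as zeros, so a scan for NUL happens to stop there, but a file that ends exactly on a page boundary has no such tail. The helper name `mapping_has_zero_tail` is made up for illustration, and the page size is whatever the caller passes in (for example the value of sysconf(_SC_PAGESIZE)).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the article): reports whether a file of
 * file_size bytes, mapped from offset 0, leaves zero-filled slack after
 * its last byte in the final mapped page.
 */
bool mapping_has_zero_tail(size_t file_size, size_t page_size)
{
    assert(page_size != 0);
    /* An empty file has no mapped page at all, so there is no slack. */
    if (file_size == 0)
        return false;
    /* Slack exists only when the file does not end on a page boundary. */
    return file_size % page_size != 0;
}
```

    With an assumed 4096-byte page, an 8191-byte file has a zero tail and an 8192-byte file does not, which would be consistent with a perfectly valid file crashing the scan.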

  • (disco) in reply to EvanED
    EvanED:
    Doesn't matter.
    Yes it does. Worrying about text strings with nulls inside is like worrying about 1+1 not being equal to 2.
    EvanED:
    Your user can provide one, so you better behave correctly
    If your user can then you fucked up the frontend.
    EvanED:
    Operating as if the stuff after the NUL isn't there is a WTF (and in many cases a potential security vulnerability
    There are potential security vulnerabilities EVERYWHERE.
    EvanED:
    and in others provides a very easy way to delete a bunch of the user's data
    It's recoverable.
  • (disco) in reply to anonymous234
    anonymous234:
    Or in other words, a length-prefixed string. Which are so obviously superior to null-terminated strings it's not even funny. But I guess the designers of C couldn't afford the extra 4/8 bytes per string.

    You do realize that when C was created, 8K was a fairly typical size for the entirety of RAM, right? So yes, an extra 4/8 bytes per string was significant. At that time the length field probably would have been another 8-bit integer (as it was in Pascal), meaning strings longer than 255 bytes would not be supported. So no, length-prefixed was not "obviously" better in the context in which the language was designed.

    Someday in the far future I'm sure TDWTF will host discussions of people claiming that UTF-128 is so "obviously" superior to ASCII that 20th century designers must have been complete morons not to have implemented it first and avoided the whole ridiculous mess of encoding schemes. And it goes without saying that the length prefix needs to be at least 128 bits so that each volume of the entire glorious Vogon collection of epic poetry can be represented in a single string.

  • (disco) in reply to narbat
    narbat:
    You do realize that when C was created, 8K was a fairly typical size for the entirety of RAM, right? So yes, an extra 4/8 bytes per string was significant. At that time the length field probably would have been another 8-bit integer (as it was in Pascal), meaning strings longer than 255 bytes would not be supported. So no, length-prefixed was not "obviously" better in the context in which the language was designed.

    There may be a "Someone didn't get the joke" badge in your future.

  • (disco) in reply to Gaska
    Patrick_Schluter:
    Yes, and? The scan will stop at that position, nothing more. At least it won't segfault.
    Depends what you're doing to it. Lots of programs read in data, put it into some internal representation, let the user edit it, then pretty print it when saving. If you read in only half of a file because there's a NUL in the middle and do that, congrats, you've just deleted half the user's file.

    That's much worse than a segfault IMO.

    I'm not saying that programs that do something like that necessarily have the above bug -- that's why I said that reading in a file and immediately starting to do crap with C string functions is a smell and not an outright error.

    Gaska:
    EvanED: Your user can provide one, so you better behave correctly

    If your user can then you fucked up the frontend.

    If you're saying that the UI code should check for NULs, then that's great; I agree that's a fine solution. I'm just saying you can't just *mmap* a file and pretend it's a string without looking first. (Or that if you do, it's a big smell.)
  • (disco)

    CROMEMCO really was one of the coolest tech company names ever.

  • (disco) in reply to antiquarian
    antiquarian:
    There may be a "Someone didn't get the joke" badge in your future.

    Possibly, but not "obviously". :-)

  • (disco) in reply to Gaska
    Gaska:
    anonymous234: Regardless of what you want to support, you ALWAYS need to make sure your program won't break (in an insecure way) if someone maliciously injects a \000 in its input. So that's yet another thing you have to keep in mind when using C functions.

    Usually when this happens, you end up with truncated string. That's far from crash.

    But still pretty bad. At best you might be missing some user's data. At worst you could have a gaping security hole. One SSL library (can't remember which) once had a bug where it didn't handle embedded NULs in certificate subject names properly, so if a website presented a certificate saying it was "google.com\000evilhax0rsite.ru", the SSL library would pass validation for google.com, and suddenly the user is susceptible to a MITM attack.
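
    To make the failure mode concrete, here is a minimal sketch and not the code of any real SSL library: both function names are hypothetical, and `cert_len` stands for the length the certificate itself records for the name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Naive check: strcmp stops at the first NUL, so anything after it is ignored. */
bool naive_name_matches(const char *cert_name, const char *expected)
{
    return strcmp(cert_name, expected) == 0;
}

/*
 * Length-aware check: cert_len is the length recorded in the certificate,
 * which is larger than strlen(cert_name) when a NUL is embedded.
 */
bool name_matches(const char *cert_name, size_t cert_len, const char *expected)
{
    assert(cert_name != NULL && expected != NULL);
    size_t expected_len = strlen(expected);
    return cert_len == expected_len
        && memcmp(cert_name, expected, cert_len) == 0;
}
```

    Given the name "google.com\000evilhax0rsite.ru" and the expected host "google.com", the naive version reports a match while the length-aware one does not.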

  • (disco) in reply to anonymous234
    anonymous234:
    Or in other words, a length-prefixed string. Which are so obviously superior to null-terminated strings it's not even funny. But I guess the designers of C couldn't afford the extra 4/8 bytes per string.

    It's not even an extra 4/8 bytes. It makes substring an O(1) operation that allocates exactly two words of memory.
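
    A minimal sketch of that idea, assuming the string is represented as a pointer plus a length (two words on typical platforms) and not as a count stored in front of the bytes. `struct str_view` and `str_slice` are illustrative names, not an existing API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical string view: a pointer and a length, nothing else. */
struct str_view {
    const char *ptr;
    size_t len;
};

/*
 * Returns the half-open range [start, end) of s. No bytes are copied or
 * scanned; the result shares storage with s.
 */
struct str_view str_slice(struct str_view s, size_t start, size_t end)
{
    assert(start <= end && end <= s.len);
    struct str_view out = { s.ptr + start, end - start };
    return out;
}
```

    The cost is the same for any substring length, since only the two fields are computed. The catch is that the slice is not NUL-terminated, so it cannot be handed to str*() functions as-is.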

  • (disco) in reply to ben_lubar
    ben_lubar:
    It's not even an extra 4/8 bytes. It makes substring an O(1) operation that allocates exactly two words of memory.

    makes string concatenation more interesting though.

    not unsolveably so, but interesting is interesting.

  • (disco) in reply to EvanED
    EvanED:
    I'm just saying you can't just mmap a file and pretend it's a string without looking first.
    I was saying the exact same thing. Except my rationale was different - it's not because it would yield wrong result, but because mmap is conceptually *not a string*.
    accalia:
    makes string concatenation more interesting though.

    not unsolveably so, but interesting is interesting.

    Not much more interesting than counter-sized string concatenation. You just have to do two strlen()'s, that's all.
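
    Reading this as the NUL-terminated case, the two strlen() calls might look like the sketch below. `concat` is a made-up helper, and overflow of the length sum is ignored for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns a newly malloc'ed string holding a followed
 * by b, or NULL if allocation fails. The caller frees the result.
 */
char *concat(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    size_t alen = strlen(a);            /* first scan */
    size_t blen = strlen(b);            /* second scan */
    char *out = malloc(alen + blen + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, a, alen);
    memcpy(out + alen, b, blen + 1);    /* +1 copies b's terminating NUL */
    return out;
}
```

    With counted strings the two scans go away because both lengths are already known; the allocation and the two copies stay the same.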
  • (disco) in reply to Gaska
    Gaska:
    Also, if you are a programmer and don't use any kind of VCS, you should learn it ASAP. For your own good.
    Yes, exactly. That's probably the most important aspect of programming that they never taught me a single thing about in college.
  • (disco) in reply to ben_lubar
    Mason_Wheeler:
    That's probably the most important aspect of programming that they never taught me a single thing about in college.
    I think it's bonkers that it's not part of CS 101 classes. I've taught an upper-division class on compilers a couple of times, and in my experience:
    • If you ask how many people have used VCS somewhere around half will raise their hands;
    • If you ask how many have had folders called project.1, project.2, project.working, etc. basically everyone will raise their hand;
    • If you ask how many have had something working, then broke it, and had a hard time going back to working, almost everyone will raise their hand.

    I spent about 20 minutes talking about and giving a demo of it in an early class and wrote up a guide that tries to cover SVN/Git/Hg in one shot for someone who hasn't used any VCS, which I haven't really seen elsewhere (though probably there is one somewhere). I've considered requiring it but have shied away from it for a couple of reasons...

  • (disco) in reply to EvanED

    My college didn't, but we had VMS, which has a crude system-wide pseudo-VCS, in that it automatically versions every single file whenever you edit it, appending a ;n to the file name, where n is a sequentially-numbered integer.

    then the problem was you had to learn how to purge old versions due to the tiny disk quota we were allocated.

  • (disco) in reply to EvanED

    In the course I am currently in (literally, I am in class right now), the assignments are bare git repositories with Eclipse projects in them that we have to clone over ssh. When class starts on the day the assignment is due, we lose write access to our repositories.

  • (disco)

    TRWTF is requiring to use Dreamweaver ... shittest editor ever.

  • (disco) in reply to ben_lubar

    That's great! I would go a step further and work only with pull requests and code reviews by teacher, students or assistants.

  • (disco) in reply to EvanED

    Source control is a fine art and I think is more important than some of the other subjects taught. Also "build systems" aren't ever taught. There have been weeks wasted in the company by people trying and failing to follow the "setup doc".

    I don't have a project on my machine that isn't in source control anymore.

  • (disco) in reply to lucas
    lucas:
    Source control is a fine art and I think is more important than some of the other subjects taught. Also "build systems" aren't ever taught. There have been weeks wasted in the company by people trying and failing to follow the "setup doc".

    I, at least, had basic Makefiles in my 101 classes -- had I had my dibs, Mercurial would be in there as well, since it's well-suited for individual use and will go with you wherever you go.

  • (disco) in reply to tarunik

    I think the idea of a "build system", "automated testing" and source control relate very close together. I wasn't exposed to these things via University and only because I have mucked around with things like ports and git did the picture start fitting together for me.

  • (disco) in reply to EvanED
    Patrick_Schluter:
    Adding one byte to the size of the mmap transforms your "random data block" to a nul terminated string.

    Except if someone else opens the same file with some extra length and writes garbage into the trailing data. I've seen this happen on Linux: if you mmap() an empty file and write to it, then mmap() it in another process and read, you see that data. Probably not guaranteed, but at least it is possible.

    Just don't use str*() functions on anything if you're not sure about nulls.

    EvanED:
    and that means that calling string functions on it is a big smell, at least to me. (Aside from strlen to compare to the actual buffer size so that you can throw an error if it's not there.)

    NO. Don't do this. Ever. If it's not there your strlen() will read past the end of the buffer, potentially you know the drill.

    Use memchr() instead.
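
    Something along these lines, as a sketch: `terminated_length` is a hypothetical name, and `size` is assumed to be the number of bytes you are actually allowed to read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns the length of the string stored in buf if a
 * NUL terminator exists within the first size bytes, or -1 if it does not.
 * Unlike strlen, it never reads past buf + size.
 */
ptrdiff_t terminated_length(const char *buf, size_t size)
{
    assert(buf != NULL || size == 0);
    if (size == 0)
        return -1;
    const char *nul = memchr(buf, '\0', size);
    return nul != NULL ? nul - buf : -1;
}
```

    A return of -1 is the "throw an error" case; any other value is a length you can safely hand to the str*() functions.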

  • (disco) in reply to Gaska
    Gaska:
    it's stupid and counter-productive to unit-test 3rd-party software

    Unit testing is for increasing confidence, and that's just as valid for 3rd party software as it is for 1st party. Unit test anywhere the increased confidence provides more value than the cost of writing the tests.

    Unit tests are also useful in bug reports both as an executable specification and as a clear and unambiguous statement of intent.

  • (disco) in reply to lucas
    lucas:
    I think the idea of a "build system", "automated testing" and source control relate very close together. I wasn't exposed to these things via University and only because I have mucked around with things like ports and git did the picture start fitting together for me.

    In general...you'll have build systems almost no matter where you go (unless you rely on an IDE to do all that for you, but you'll still have one then), source control in many places (TRWTF is the place that lacks it), and automated tests in places that actually care + have code that's testable -- many legacy codebases are simply too tangled for unit testing.

  • (disco) in reply to lucas
    lucas:
    Also "build systems" aren't ever taught. There have been weeks wasted in the company by people trying and failing to follow the "setup doc".
    I feel mostly differently about build systems, for a couple reasons.
    • You don't need it nearly as early, if you are either encouraging IDE use (it's still there but it is hidden, and that's OK) or working in a non-compiled language (in which case there's nothing to build). You could pretty reasonably get through a whole course -- or at least to almost the end of it -- before going beyond that; but I think you could make a somewhat-compelling case for making the first assignment in CS 101 be to commit some stuff to a Hg repository even before coding. (I don't think I'd actually advocate that, but you could make an argument for it.)
    • There are way more build systems than VCS, and way more differences in common build systems (IMO) than in common VCSs, and build systems are often language-specific. You could teach someone Hg or Git or SVN and they could use that single tool for every class project in college and it would fit in pretty well; but I don't even think there's a build system that is comfortable (or what I'd call comfortable anyway) with both Java and C, let alone going to the world of Python or something. This makes teaching build systems feel far more specialized and less appropriate for a University setting to me than version control.
  • (disco) in reply to ben_lubar

    that....makes a lot of sense actually....

  • (disco) in reply to EvanED
    EvanED:
    There are way more build systems than VCS, and way more differences in common build systems (IMO) than in common VCSs, and build systems are often language-specific. You could teach someone Hg or Git or SVN and they could use that single tool for every class project in college and it would fit in pretty well; but I don't even think there's a build system that is comfortable (or what I'd call comfortable anyway) with both Java and C, let alone going to the world of Python or something. This makes teaching build systems feel far more specialized and less appropriate for a University setting to me than version control.

    It's odd, though, because you can tailor it for build systems too precisely. I've used C# with Git, Subversion, TFS, VSS, SourceOffSite, and a couple of other VCS systems, but I've never used Java.

  • (disco) in reply to lucas
    lucas:
    There have been weeks wasted in the company by people trying and failing to follow the "setup doc".

    Ah hell. A new machine setup here can only be accomplished via sneaker-net. Because not everything is in git. And working on Windows in a Mac-shop sucks. (network == googledocs)

  • (disco) in reply to another_sam
    another_sam:
    Unit testing is for increasing confidence, and that's just as valid for 3rd party software as it is for 1st party. Unit test anywhere the increased confidence provides more value than the cost of writing the tests.
    Unit testing means testing single unit of a program (one class, for example) in a controlled environment so that for the same code it will always result in the same result. With 3rd party, you can't divide the program to units, you can't run it in fully controlled environment without spawning VM, and you can't fix the bug even if you discover it.
    another_sam:
    Unit tests are also useful in bug reports both as an executable specification and as a clear and unambiguous statement of intent.
    1. So is a list of steps.
    2. In my company, I heard of one case of filesystem driver vendor refusing to acknowledge a bug DESPITE GETTING A SHORT BASH SCRIPT (what could be called a unit test) THAT SUCCESSFULLY REPRODUCED THE BUG EACH AND EVERY TIME IT WAS RUN ON ANY CONFIGURATION WITH THIS DRIVER.
  • (disco) in reply to tarunik

    The attitude that I have encountered in most places I have worked is one of "there is this box and if it is configured just right, you build onto it and then you kinda tweak the settings until it works" and other braindead methods.

  • (disco) in reply to lucas
    lucas:
    The attitude that I have encountered in most places I have worked is one of "there is this box and if it is configured just right, you build onto it and then you kinda tweak the settings until it works" and other braindead methods.

    That sums up what the $vendor I deal with does...

  • (disco) in reply to FrostCat
    FrostCat:
    My college didn't, but we had VMS, which has a crude system-wide pseudo-VCS, in that it automatically versions every single file whenever you edit it, appending a ;n to the file name, where n is a sequentially-numbered integer.

    then the problem was you had to learn how to purge old versions due to the tiny disk quota we were allocated.

    So you went to college in the 80s... we actually still use the VAX for many of our projects.

  • (disco) in reply to Gaska
    Gaska:
    Eldelshell: And commas and periods. Good God, that was like reading my mothers WhatsApps.

    Well, it was perfectly correct sentence. Lack of commas comes from cleverly connecting dependent and independent clauses in the way that's absolutely unambiguous, mitigating the need for commas.

    Reality is somewhere between these two extremes. There are only two commas missing, one required and one optional but helpful:

    CarrieVS:
    All I know about Dreamweaver is the name, and I haven't encountered source control, since I'd not entirely got to grips with it at the end of the very hurried crash course the contracting company that owns me put me through before hiring me out to where I'm working now the best part of a year ago
    The first one is required; without it, the sentence is a run-on sentence. The second one separates a somewhat parenthetical remark (explaining why you haven't encountered source control) from the statement it is explaining. It's grammatically optional, sort of, but it helps break the complicated sentence into smaller, more digestible pieces.
  • (disco) in reply to Gaska
    Gaska:
    Unit testing means testing single unit of a program (one class, for example) in a controlled environment so that for the same code it will always result in the same result

    You can set initial conditions and run a specific test. Also I wasn't necessarily thinking 3rd party == program, I was also thinking 3rd party == library.

    Gaska:
    you can't run it in fully controlled environment without spawning VM

    Surely you know what resources it uses? Filesystem, network, etc?

    Gaska:
    you can't fix the bug even if you discover it.

    But you can report it.

    Gaska:
    So is a list of steps.

    A list of steps isn't executable (by a computer). If it is executable it's code and a unit test.

    Gaska:
    In my company, I heard of one case of filesystem driver vendor refusing to acknowledge a bug

    That vendor is a dick. Dump them. The unit test (bash script) is the documentation for the reason to dump them. You didn't create the bug by unit testing.

  • (disco) in reply to EvanED

    When I say taught, some developers don't seem to ever grok that having a repeatable process that works by firing off a script of some sort is really essential. Most of my "build" scripts are nothing more than a git pull then running a tool like npm/nuget/composer etc and then setting the right config values.

    I have worked too many places now where nobody understands how the build really works and things have to be set and installed manually on the deploy target before anything works at all.

  • (disco) in reply to another_sam

    Posting from Windows Phone where selecting text on Dicksource is virtually impossible, so numbers instead of quotes.

    1. Unit testing third party library is as pointless and impossible as third party app.

    2. No, you don't know what resources it uses and in what way. You don't have the source code. You can't know if it doesn't, for example, check the 185736th byte of /dev/sda for some magic value.

    3. You can bug report with list of steps too. It's also far more common.

    4. Bug reports should be human-readable, not machine-readable.

    5. It's not up to me to decide.

  • (disco) in reply to lucas
    lucas:
    When I say taught, some developers don't seem to ever grok that having a repeatable process that works by firing off a script of some sort is really essential.
    That's reasonable. And I meant to mention, and then forgot, that I did require that for my projects; they had to turn it in with a shell script called `build.sh` or something that would go off and build it. Presumably by calling Ant or whatever they picked, but I wasn't picky.
  • (disco) in reply to boomzilla
    boomzilla:
    I could have sworn this bug had been on TDWTF before, but now I can't find it.

    That thread crashed because of an 8192 byte comment in it.

  • (disco) in reply to Gaska

    I'm feeling lazy, so:

    1. Wrong.
    2. How do you use the software if you don't know what it's doing?
    3. Less efficient.
    4. Both is best.
    5. Sucks to be you.
  • (disco) in reply to another_sam
    another_sam:
    How do you use the software if you don't know what it's doing?
    I know *some* of the things it's doing - and it's sufficient to do my task at hand. You don't have to know the tool inside out to use it - but to unit test it, you do.
    another_sam:
    Less efficient.
    Quite the opposite. Writing a list of steps to reproduce is much quicker than writing a test exhibiting the faulty behavior. And it takes much more time for a human to read a test written in some programming language than to read a list of steps.
    another_sam:
    Both is best.
    Yes, but human-readability takes priority. Machine-readable bug report is just one of million things that are nice to have but not necessary.
    another_sam:
    Sucks to be you.
    Actually, none of this is my problem - I'm two "layers" of "abstraction" above the project affected.
  • (disco) in reply to Gaska
    Gaska:
    Posting from Windows

    That's all I could get in my attempted quote just now. It's a really annoying bug, especially when the devs have decided that the tiny "quote full post" button takes up too much space to appear in the mobile view.

    With that and the random jumps up the page on load, using the forum on WP8 is a real chore.

  • (disco) in reply to Keith

    It's Discodiscoverable, but if your problem is not seeing the "quote reply" button, you can select text and just click the main reply button to the post. If your problem is that you can't reliably select text on WinPhone, you're buggered.

  • (disco) in reply to Jaloopa

    Yeah, text selection is broken. Only on Discourse though (of course).

  • (disco) in reply to Keith

    I have no idea how they'd manage that, but I'm not surprised

  • (disco) in reply to Jaloopa

    In Android + Discourse, when trying to paste something, instead of placing the cursor and showing the button (standard), it makes a random selection and shows the button. So you have to make sure to leave some white space to allow for the pasting to work.

    Only with Discourse I've seen this behavior.

    [image]
  • (disco) in reply to Jaloopa
    Jaloopa:
    I have no idea how they'd manage that, but I'm not surprised

    I've documented the problem here now.

  • (disco) in reply to Gaska
    Gaska:
    I know some of the things it's doing - and it's sufficient to do my task at hand. You don't have to know the tool inside out to use it - but to unit test it, you do.

    That doesn't sound right to me. You at least have an expectation of inputs and outputs. When it has to be upgraded due to some security thing or other dependency run around, at least you have something that will confirm that your expectations are correct.

    There are two reasons that a test like this is valuable to me:

    1. It helps me confirm my expected behavior.
    2. It notifies me if the behavior changes.

    The black boxiness of the third party code doesn't change any of that.
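
    As a sketch of what such a test can look like, with the C library's qsort standing in for the third-party code (an assumption purely for illustration; `sort_meets_expectation` and `cmp_int` are made-up names). It only feeds an input and checks the output, without looking inside.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort: ascending order of ints. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/*
 * Pins one expectation about a black box: a known input must produce a
 * known output. Returns false if the observed behavior ever differs.
 */
bool sort_meets_expectation(void)
{
    int data[] = { 3, 1, 2 };
    const int want[] = { 1, 2, 3 };
    assert(sizeof data == sizeof want);
    qsort(data, 3, sizeof data[0], cmp_int);
    for (size_t i = 0; i < 3; i++) {
        if (data[i] != want[i])
            return false;
    }
    return true;
}
```

    Run after every upgrade of the dependency, a check like this covers both reasons: it confirms the expected behavior and flags a change.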

  • (disco) in reply to boomzilla

    For me, automated software testing has the following goals:

    • ensures that the implementation is correct so if it's not, you can correct it quickly
    • if there is a bug, helps detecting what part of product you have to fix
    • checks if any of subsequent changes introduce old bugs so you can easily fix them (assuming you know what the original bug was)

    None of the above is possible if you can't modify the product.

  • (disco) in reply to Gaska
    Gaska:
    For me, automated software testing has the following goals:
    • Sleep at night
    • Leave home at time
