• Dave (unregistered)

    It's nice to see a real coding WTF for a change.

  • nmclean (unregistered) in reply to DrPepper
    DrPepper:
    Who's to say that this is a WTF?

    Without knowing the actual requirements, who knows. For all we know, perhaps the requirements are that "it's bullet proof as far as crash/restart; and that performance is irrelevant."

    Sure, there are other ways to do it -- for example, put each line in a DB, and mark the record after it's been processed; and when the job restarts, only process records that are not marked.

    But suppose the task was assigned like this: As quickly as possible, write a processing loop that is guaranteed to send a line from the file to the processor exactly one time. Performance is not an issue, but I've allocated less than one hour of time for this task.

    How else would you do it? Not everything needs to be "enterprise quality"; and not every task has the luxury of a week of design/programming time.

    Oh how I hope this is a joke and lament the likelihood that it's not.

    First of all, your hypothetical requirements conflict with one another: "bullet proof" and "less than one hour". Nope, sorry, doing it right takes time. This solution is still far too easy to break.

    Second, having a time limit doesn't change the fact that the code is a huge WTF. If we really needed a half-assed solution to this problem written as quickly as possible, a competent programmer could have done so without resorting to such convoluted file operations, and in less time, by simply storing the progress and seeking to it at the beginning of an attempt. Even if there were a bizarre requirement that the lines need to be removed from the file, that operation could also run once at the beginning instead of every iteration. This code is not defensible on any level.
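    The "store the progress and seek to it" approach mentioned above can be sketched in a few lines. This is a hypothetical Python sketch (the sidecar progress file and the function names are assumptions, not from the article); note that a crash between processing a line and writing the checkpoint can still replay that one line, so true exactly-once delivery would need the processing step and the checkpoint to be atomic.

    ```python
    import os

    def process_lines(path, progress_path, processor):
        """Send each line of `path` to `processor`, resuming after a crash.

        A sidecar file holds the byte offset of the last fully processed
        line, so a restart seeks past the work already done instead of
        rewriting the input file on every iteration.
        """
        # Resume from the saved byte offset, if a previous run left one.
        offset = 0
        if os.path.exists(progress_path):
            with open(progress_path) as p:
                offset = int(p.read().strip() or 0)

        # Binary mode so tell() reports exact byte positions per line.
        with open(path, "rb") as f:
            f.seek(offset)
            for line in f:
                processor(line.decode().rstrip("\n"))
                offset = f.tell()
                # Checkpoint after each line; on a crash, at most the
                # line in flight is re-sent on the next run.
                with open(progress_path, "w") as p:
                    p.write(str(offset))
    ```

    Compared with the article's rewrite-the-file loop, this does one small write per line and never touches the input file at all.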

  • Hannes (unregistered)

    This reminds me of a program I changed just recently. Before, it ran for 4 to 6 hours (depending on how busy the servers were with other stuff). Now it only takes about 1 hour.

    Before, there was stuff like: "Look at one line in the table. Look for a specific value. Is this value bigger than X? If so, update another column. Look at the next line in the table..." Of course, the DB connection was opened and closed for every single row. The table in question contains over 120 million records! It usually took about 2 hours to complete. Now I simply load everything into a DataTable (yes, C# is TRWTF), do a foreach over every row and copy everything back with Bulk Copy. Takes a few seconds.

    There was other WTF-y stuff in there, like populating a table and then deleting rows that aren't needed (instead of simply excluding them while populating the table) and then doing an update on the table to fill a different column (instead of simply doing this calculation while populating the table).

  • Meep (unregistered) in reply to Dan
    Dan:
    DrPepper:
    ip-guru:
    This is just defensive programming. Should the process fail, it will resume where it left off instead of having to process the whole file again ;-)

    Plus one. Sure, it's slow, and sure it does a lot of operations (hint -- isn't that what computers are for?) AND if you power off the computer in the middle of the job, it will just pick up where it left off.

    Who's to say that this is a WTF?

    Without knowing the actual requirements, who knows. For all we know, perhaps the requirements are that "it's bullet proof as far as crash/restart; and that performance is irrelevant."

    Sure, there are other ways to do it -- for example, put each line in a DB, and mark the record after it's been processed; and when the job restarts, only process records that are not marked.

    But suppose the task was assigned like this: As quickly as possible, write a processing loop that is guaranteed to send a line from the file to the processor exactly one time. Performance is not an issue, but I've allocated less than one hour of time for this task.

    How else would you do it? Not everything needs to be "enterprise quality"; and not every task has the luxury of a week of design/programming time.

    You HAVE to be trolling!
    And you had to be trolled.

  • (cs) in reply to Hannes
    Hannes:
    Now I simply load everything into a DataTable (yes, C# is TRWTF), do a foreach over every row and copy everything back with Bulk Copy. Takes a few seconds.
    Teach yourself about the SQL UPDATE statement! It can be used for multiple rows at once!
    Meep:
    And you had to be trolled.
    In case IHTBT by Hannes: IHL! HAND.
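    The set-based UPDATE being suggested can be sketched as follows (a minimal illustration with hypothetical table and column names, using in-memory SQLite in place of the real server): one statement with a WHERE clause marks every qualifying row, replacing the per-row read/compare/update round trips.

    ```python
    import sqlite3

    # Hypothetical miniature of the scenario: a `value` column is compared
    # against a threshold and a `flag` column is set accordingly.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY, value INTEGER, flag INTEGER DEFAULT 0)"
    )
    conn.executemany("INSERT INTO t (value) VALUES (?)", [(5,), (15,), (25,)])

    THRESHOLD = 10
    # One round trip updates all matching rows at once.
    conn.execute("UPDATE t SET flag = 1 WHERE value > ?", (THRESHOLD,))

    print(conn.execute("SELECT id, flag FROM t ORDER BY id").fetchall())
    # → [(1, 0), (2, 1), (3, 1)]
    ```

    On a table with 120 million rows, the database can do this in a single scan, with no client round trips per row and no bulk copy back.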

Leave a comment on “Line by Line”
