• (cs) in reply to ThomsonsPier
    ThomsonsPier:
    Someone You Know:
    Gorfblot:
    A diamond in the rough is something that looks to be of little value, but is actually worth quite a lot once some polishing has been done. IE, You talked with one of the Tier 1 helpdesk people and realized they had some talent and were quick to learn. You get them some further training, move them to a junior development position, and in a short time they become a major contributor to success- That's a diamond in the rough.

    I don't think "IE" means what you think it means.

    Internet Explorer? If that's the case, I make sure to pronounce it, 'AIEEEEEE!'

    Agreed. But using "i.e." when one means "e.g." is almost as annoying as IE.

  • Mover And Copier (unregistered) in reply to SomeCoder
    SomeCoder:
    mv may be CLOSE to atomic but it's definitely not guaranteed to always be atomic. And if we suddenly have to change directories across partitions then it damn well is NOT atomic.

    I think Jeremy H should get a better interview question.

    We shouldn't get hung up on exactly what "atomic" means. Just focus on the question whether or not there is any chance the Watcher might process a file too soon.

    With the temp directory solution, the file has already been written completely to disk, and all the mv has to do is to create a new directory entry for it in the new directory. So what might happen if the mv is not atomic? The directory entry might not be complete, and the Watcher only sees half the file name and therefore can't find the file? It'll just have to try again later.

    Obviously, when setting up the solution you would have to make sure the temp directory is on the right file system. /var/tmp might indeed not cut it.
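The "right file system" check mentioned above can be sketched in a few lines. This is a minimal sketch, assuming a POSIX `df` and mount points without spaces in their names; `same_filesystem` is a hypothetical helper name, not something from the article.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if two directories sit on the same
# mounted file system, so that mv between them is a rename and not a
# copy followed by a delete.
same_filesystem() {
    # df -P prints a header line, then one line per operand.
    # The last field of that line is the mount point.
    # Assumption: the mount point contains no spaces.
    fs_a=$(df -P "$1" | awk 'NR == 2 { print $NF }')
    fs_b=$(df -P "$2" | awk 'NR == 2 { print $NF }')
    [ -n "$fs_a" ] && [ "$fs_a" = "$fs_b" ]
}
```

A downloader could call `same_filesystem tmp ready || exit 1` once at startup and refuse to run if the temp directory has ended up somewhere like /var/tmp on another partition.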

    The interview QUESTION is good enough. The problem might be that when I'm told I may not modify the Watcher, how the heck am I supposed to know that I might be allowed to modify the Downloader?

  • (cs)

    We have 2 open positions on our team, due to high turnover, so I interview probably about 5 or 6 people a week and have gotten really good at giving technical interviews. It usually involves giving them a real problem from some code we have, and seeing if they solve it the right way, and then I explain to them how it should be done and make sure they agree.

    It is hard to find people who have the right mix of skills and personality. Some people realize halfway through my technical interview that they lack the required knowledge and simply cut it short and walk out of the room, I assume in embarrassment.

  • (cs) in reply to klenow
    klenow:
    Is it just me, or does Jeremy sound like a bit of a jerk? He simply wouldn't let the guy use any solution that wasn't his pet solution. Seems the simplest solution is to have the Downloader activate the Watcher when it's done.

    Yes, it is just you. Actually, I would say that he should rephrase the question, something like this maybe:

    Users are running on a version of Linux which they install and upgrade themselves. We have written a downloader application to download some important data for their business. They are using a 3rd party processing app which watches a specific directory for new downloads. As soon as the processing application sees a new file in the directory, it begins processing it. However, processing happens faster than downloading, and the watcher will produce an error if it processes an incomplete file. How can we modify the downloader to make sure this doesn't happen?

    Now it's clear that you can't modify the OS because you don't control it, and you can't modify the watcher because you don't control it. You've also made it clear that you want a solution which modifies the downloader program.

    In any event, I don't think Jeremy was being a jerk at all. He was just having trouble describing a real-world problem accurately. It's very common for inexperienced people to become flummoxed at real-world questions like these because they don't have the experience to know that these problems will arise. It just requires a little forethought when phrasing the question.

  • September ain't over yet (unregistered) in reply to SomeCoder

    Every file system command? No.

    Rename? Yes, it's guaranteed to be atomic, by the relevant standards, at least in the way we care about.

    Seriously, it links the inode to the new directory, and unlinks it from the old one (link + unlink is another way to do rename, BTW). In no sane implementation can that result in seeing a partial file.

    And before you mention different filesystems, rename (and link) generally do not support that (they return errors), and, if they do support it, they are STILL required to be atomic.

    RTFS — http://www.unix.org/single_unix_specification/

  • uxor (unregistered)

    What is the point of an async watcher/downloader when the files need to be synchronized? I'd just make the downloader start the watcher to process the file it finished downloading.

    Though if they were to keep the original design, on Linux and with the same assumptions as in the question, I'd have the downloader create a second file to mark download completion for each file downloaded. This avoids moving gigs of data...

    The second file would hold an MD5 read from the server, an MD5 computed on the downloaded data, and other metadata about the file, for verification purposes...
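A rough sketch of the completion-marker idea, with two loud assumptions: POSIX `cksum` stands in for MD5 (a plain swap, not what the comment proposes), and the server-supplied checksum is not modelled at all. `write_marker` and `marker_matches` are hypothetical names, and a watcher would have to be changed to call the second one.

```shell
#!/bin/sh
# Hypothetical downloader-side helper: once the download has finished,
# write "<file>.done" holding the checksum and size of the data file.
write_marker() {
    file=$1
    # Reading from stdin makes cksum print "<crc> <size>" with no file name.
    # The marker is written under a temporary name and renamed into place,
    # so a reader never sees a half-written marker.
    cksum < "$file" > "$file.done.tmp" &&
        mv -f "$file.done.tmp" "$file.done"
}

# Hypothetical watcher-side check: succeed only if the marker exists
# and still matches the data file.
marker_matches() {
    file=$1
    [ -f "$file.done" ] && [ "$(cksum < "$file")" = "$(cat "$file.done")" ]
}
```

The marker costs one extra pass over the data to compute the checksum, but nothing large is ever moved or copied.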

  • E (unregistered)

    Instead of mv, why not ln?

  • (cs)

    I think I know what happened with Jeremy's interviewee, because I tripped over exactly the same thing. When Jeremy said "you can't change the Watcher", I, and I presume the interviewee, interpreted that as "you can't change the Watcher or the Downloader". Without pausing to check the constraint, the solution proposed is not unreasonable; without restating the constraints back to the interviewer, the wrong assumption will go unnoticed (and produce an apparently bizarre solution).

    So I don't think this is a WTF, just a misunderstanding. I hope this question wasn't the only reason the guy didn't get the job, because he's exactly the kind of person you'd want to have around when things really are that intractable.

  • September ain't over yet (unregistered) in reply to Paul
    #!/bin/sh
    # Assume that the watcher is already running, and is watching
    # for new files in /var/local/whatever/ready using inotify
    # or whatever
    
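    cd /var/local/whatever || exit 1
    mkdir -p tmp ready || exit 1
    
    # Download under the shell's PID as a temporary name, then publish it
    # with a same-filesystem rename only if the download succeeded.
    wget -O "tmp/$$" "http://www.whatever.com/file" &&
        mv -f "tmp/$$" "ready/$$"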

    So even if it was not possible to modify the downloader, this is STILL easy to do.

  • David (unregistered)

    "margin: auto" Thanks for the tip.

  • fert (unregistered)

    Everybody is worried about mv being atomic or not, but if you can't assume that you have access to an additional temporary directory on the same disk, couldn't you download to some temp directory you do have access to, and then toss a link into the directory being polled? Cleanup would be a bit more of an issue: the watcher would take care of the links, but the original file(s) might require a bit of trickery. But this would certainly guarantee atomic operation: download file, create link, ..., clean up file. Just random ideas.
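The link idea might look roughly like this, assuming a symbolic link is meant (a hard link made with plain `ln` has the same single-filesystem restriction as a rename). `publish_link` is a hypothetical helper name, and whether a given watcher follows symlinks is also an assumption.

```shell
#!/bin/sh
# Hypothetical helper: expose a finished download to the watched
# directory through a symbolic link.
#   $1 = fully downloaded file (may be on another file system)
#   $2 = directory the watcher polls
publish_link() {
    src=$1
    watched=$2
    # Make the target absolute so the link resolves from the watched directory.
    case $src in
        /*) ;;
        *) src=$(pwd)/$src ;;
    esac
    ln -s "$src" "$watched/$(basename "$src")"
}
```

The link only appears after the download is complete, so the watcher never sees a partial file. Cleaning up the original after processing is left open, as in the comment.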

  • Gorfblot (unregistered) in reply to Someone You Know
    Someone You Know:
    I don't think "IE" means what you think it means.

    It's certainly possible. I think it means id est, and loosely translates into "That is to say".

    I used it between two attempts at explaining the metaphor- A literal one, and one where I attempted to show a situation where the description might be more valid.

    What do you think it means?

  • JamesQMurphy (unregistered) in reply to TopCod3r
    TopCod3r:
    We have 2 open positions on our team, due to high turnover, so I interview probably about 5 or 6 people a week and have gotten really good at giving technical interviews. It usually involves giving them a real problem from some code we have, and seeing if they solve it the right way, and then I explain to them how it should be done and make sure they agree.

    It is hard to find people who have the right mix of skills and personality. Some people realize halfway through my technical interview that they lack the required knowledge and simply cut it short and walk out of the room, I assume in embarrassment.

    Have you ever asked why you have high turnover?

  • adsfg (unregistered) in reply to Branan
    Branan:
    A file move is different from a file copy in Linux. A move involves changing one pointer in the filesystem, no information is actually "moved". So it's not actually a problem.
    What if you move from one physical drive to another?
  • (cs) in reply to adsfg
    adsfg:
    Branan:
    A file move is different from a file copy in Linux. A move involves changing one pointer in the filesystem, no information is actually "moved". So it's not actually a problem.
    What if you move from one physical drive to another?
    Don't.

    Problem solved.

    I am totally amazed at how many 'over complicators' there are posting messages here. One would think they would be too busy over complicating things and not have any free time to post to TDWTF.

  • Jimmy Jones (unregistered)

    Surely a better solution to the download problem is to add ".unfinished" to the file name then rename it when it's complete.

    This avoids the problem of some smartass putting the temporary folder on another disk and then you get expensive copy operations, run out of disk space when you try to move the file, etc.
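A minimal sketch of the suffix approach. It assumes, loudly, that the watcher ignores names ending in `.unfinished`, which the problem statement does not promise. `finish_download` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: strip the ".unfinished" suffix once the download
# is complete. Both names are in the same directory, so the rename can
# never cross a file system boundary.
finish_download() {
    partial=$1
    case $partial in
        *.unfinished)
            mv -f "$partial" "${partial%.unfinished}"
            ;;
        *)
            echo "not an .unfinished file: $partial" >&2
            return 1
            ;;
    esac
}
```

The downloader would write to something like `ready/data.bin.unfinished` and call `finish_download` on it as its last step.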

  • RBoy (unregistered) in reply to AMerrickanGirl
    AMerrickanGirl:
    He was stuck in traffic, with a cell phone that worked perfectly, since they got through on the first try, but he hadn't bothered to call his office to tell them to make nice to the 9 am interview that he was going to be very late for.

    Ah, but what if his cell phone didn't work?

  • ChiefCrazyTalk (unregistered) in reply to Azeroth
    Azeroth:
    There is another simple solution with the downloader/watcher problem - downloader should open the file exclusively while it's being downloaded, this way watcher won't be able to access it until it's closed. This way it's not even required to move anything anywhere.
    Another solution I've seen - download a 0-byte semaphore AFTER the file is downloaded, and check for the presence of that file before starting your processing of the main file.
  • Pat (unregistered) in reply to Gorfblot
    Gorfblot:
    Someone You Know:
    I don't think "IE" means what you think it means.

    It's certainly possible. I think it means id est, and loosely translates into "That is to say".

    I used it between two attempts at explaining the metaphor- A literal one, and one where I attempted to show a situation where the description might be more valid.

    What do you think it means?

    Well, I think it means nothing, because it was written as 'IE', not i.e.. That was kind of his point to begin with.

  • Reader X (unregistered) in reply to ThomasP
    ThomasP:
    I think you missed the part where they were talking about file moves in Linux, not Windows.

    That part isn't in the original problem statement. It's only included at the very end, where the interviewer says, "so you’re saying, to solve the problem of the Watcher processing files that are not done downloading, you would modify the Linux kernel?"

    Original Problem:
    Every night, a Downloader program will retrieve a handful of several-gigabyte files from a remote server and save them to a certainly directory on disk. A Watcher program monitors this directory and immediately processes whichever files show up. However, because downloading takes significantly longer than processing, the Watcher program will crash if it reads a file that has not been fully downloaded. How would you prevent this from occurring?
  • (cs)

    So. Why did the interviewee for a PHP developer position get asked about CSS?

  • (cs) in reply to ThomasP
    ThomasP:
    I think you missed the part where they were talking about file moves in Linux, not Windows.

    Move within the same filesystem ("rename") is atomic and doesn't involve copying/moving the file bits. In any reasonable OS, which Windows is, too, no matter how you would object. As soon as the file appears in the target directory, it's there instantly, and "Watcher" can read it without any problem.

  • adsfg (unregistered) in reply to Rick
    Rick:
    adsfg:
    Branan:
    A file move is different from a file copy in Linux. A move involves changing one pointer in the filesystem, no information is actually "moved". So it's not actually a problem.
    What if you move from one physical drive to another?
    Don't.

    Problem solved.

    I am totally amazed at how many 'over complicators' there are posting messages here. One would think they would be too busy over complicating things and not have any free time to post to TDWTF.

    Actually, I was just asking what Linux does in this case. I'm not asking in relation to the problem.

  • SoonerMatt (unregistered) in reply to RBoy
    RBoy:
    AMerrickanGirl:
    He was stuck in traffic, with a cell phone that worked perfectly, since they got through on the first try, but he hadn't bothered to call his office to tell them to make nice to the 9 am interview that he was going to be very late for.

    Ah, but what if his cell phone didn't work?

    Seriously?!? They said the office called him and his cell phone worked perfectly.

  • Gorfblot (unregistered) in reply to Pat
    Pat:
    Well, I think it means nothing, because it was written as 'IE', not i.e.. That was kind of his point to begin with.

    Touché.

  • (cs) in reply to adsfg
    adsfg:
    Rick:
    adsfg:
    Branan:
    A file move is different from a file copy in Linux. A move involves changing one pointer in the filesystem, no information is actually "moved". So it's not actually a problem.
    What if you move from one physical drive to another?
    Don't.

    Problem solved.

    I am totally amazed at how many 'over complicators' there are posting messages here. One would think they would be too busy over complicating things and not have any free time to post to TDWTF.

    Actually, I was just asking what Linux does in this case. I'm not asking in relation to the problem.

    I was referring to a myriad of over complicators posting. However, if you are asking for general knowledge, rather than to answer the interview question properly...

    Physical drives are irrelevant in Linux. Moves across file systems are not atomic, but are implemented as copy and remove. Traditionally file systems did not span physical drives, but today they can in various ways.
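Since a cross-file-system move is a copy plus a remove, one way to keep the slow part out of sight is to copy into a staging directory on the destination file system first and rename from there. This is only a sketch under that assumption; `move_via_staging` and its three arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: move a file into a watched directory without the
# watcher ever seeing a partial copy.
#   $1 = source file (possibly on another file system)
#   $2 = staging directory, assumed to be on the same file system as $3
#   $3 = directory the watcher polls
move_via_staging() {
    src=$1
    staging=$2
    dest=$3
    name=$(basename "$src")
    # Slow step: the copy may cross file systems and is not atomic.
    cp -p "$src" "$staging/$name" || return 1
    # Fast step: a rename within one file system.
    mv -f "$staging/$name" "$dest/$name" || return 1
    # Only now remove the original.
    rm -f "$src"
}
```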

  • duplicity (unregistered) in reply to JamesQMurphy

    No, don't make it blue! It's just our favorite troll, TopCod3r.

  • (cs) in reply to mauhiz
    mauhiz:
    I don't know what I am missing there, why not just wait for the Downloader to finish its job?

    a script would look like this :

    Downloader && Watcher

    I don't think having a daemon poll a directory is a good practice anyways...

    A situation similar to this comes up at my office on a daily basis. We virtually never have control over both the watcher and the downloader. Often they don't even run on the same machine and simply interact through a shared folder or something similar.

    Saying "just have a script wait till the downloader finishes to start the watcher" assumes you have complete control over the system which may or may not be true. The solution of a tmp name/directory (or something like a poll timer checking for file size if you're the watcher) is the most universally accepted as it requires only modifying and/or controlling one of the applications
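For the case where you control only the watcher, the file-size poll could be sketched as below. It is a heuristic, not a guarantee: a stalled download looks exactly like a finished one. `wait_until_stable` and its default interval of 5 seconds are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical watcher-side helper: return success once the file's size
# is unchanged between two consecutive polls.
#   $1 = file to watch
#   $2 = seconds between polls (default 5)
wait_until_stable() {
    file=$1
    interval=${2:-5}
    last=-1
    while [ -f "$file" ]; do
        # Arithmetic expansion strips any padding that wc may print.
        size=$(( $(wc -c < "$file") ))
        if [ "$size" -eq "$last" ]; then
            return 0
        fi
        last=$size
        sleep "$interval"
    done
    # The file disappeared (or never existed).
    return 1
}
```

A longer interval makes a false "finished" less likely at the cost of added latency before processing starts.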

  • fortyrunner (unregistered)

    Another way to process large files is to ensure that the processor does not start processing a file until it sees a small semaphore file.

    E.g. a 100MB Movie.MOV file won't be processed until a 1byte Movie.MOV.GO file is in the same directory. I've been using this for years.
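The flag-file scheme can be sketched from both sides. The helper names `signal_ready` and `ready_files` are hypothetical, the flag here is an empty file rather than a 1-byte one, and the watcher side assumes you are allowed to change the watcher.

```shell
#!/bin/sh
# Downloader side (hypothetical helper): create the flag file after the
# data file has been written completely.
signal_ready() {
    : > "$1.GO"
}

# Watcher side (hypothetical helper): print every data file in
# directory $1 that has a matching .GO flag next to it.
ready_files() {
    for flag in "$1"/*.GO; do
        # If the glob matched nothing, the literal pattern is left behind.
        [ -e "$flag" ] || continue
        data=${flag%.GO}
        if [ -f "$data" ]; then
            printf '%s\n' "$data"
        fi
    done
    return 0
}
```

Because the flag is created last, its presence implies the data file is complete, whichever file system either file lives on.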

  • Chris (unregistered)

    Even more fun, ask for a SECOND simple solution.

    The temporary directory is a good solution, but it has problems. E.g., in Unix you can rename atomically, but I can imagine situations where you would have to manually read/write the entire file. Is there another standard approach?

    It turns out there is --- and it's actually more correct in some ways. Have the downloader get a 'write' lock on the file. Have the watcher get a 'read' lock on the file. (Or its own 'write' lock if it deletes the file as the final step in the process.) You're fine as long as everyone uses locks and the watcher program is smart enough to keep retrying to get a lock. (I assume it's not so stupid that it will wait indefinitely for that lock.)

  • KM (unregistered) in reply to Pat
    Pat:
    Gorfblot:
    Someone You Know:
    I don't think "IE" means what you think it means.

    It's certainly possible. I think it means id est, and loosely translates into "That is to say".

    I used it between two attempts at explaining the metaphor- A literal one, and one where I attempted to show a situation where the description might be more valid.

    What do you think it means?

    Well, I think it means nothing, because it was written as 'IE', not i.e.. That was kind of his point to begin with.

    I'm pretty sure the actual point was that i.e. != e.g. http://www.wsu.edu/~brians/errors/e.g.html

  • Steve Burnap (unregistered) in reply to mauhiz

    The problem statement said that the downloader downloads multiple files. Presumably you'd want the watcher to process file 1 while the downloader was downloading file 2.

  • (cs) in reply to KM
    KM:
    Pat:
    Gorfblot:
    Someone You Know:
    I don't think "IE" means what you think it means.

    It's certainly possible. I think it means id est, and loosely translates into "That is to say".

    I used it between two attempts at explaining the metaphor- A literal one, and one where I attempted to show a situation where the description might be more valid.

    What do you think it means?

    Well, I think it means nothing, because it was written as 'IE', not i.e.. That was kind of his point to begin with.

    I'm pretty sure the actual point was that i.e. != e.g. http://www.wsu.edu/~brians/errors/e.g.html

    I still think e.g. was what was needed here. I believe e.g. would mean "here is one of many examples."

    **EDIT - adding a link that shows my point http://ancienthistory.about.com/od/abbreviations/f/ievseg.htm

  • LEGO (unregistered) in reply to Someone You Know
    Someone You Know:
    Gorfblot:
    A diamond in the rough is something that looks to be of little value, but is actually worth quite a lot once some polishing has been done. IE, You talked with one of the Tier 1 helpdesk people and realized they had some talent and were quick to learn. You get them some further training, move them to a junior development position, and in a short time they become a major contributor to success- That's a diamond in the rough.

    I don't think "IE" means what you think it means.

    No, I think the usage is correct here IE = "Id Est" to denote clarification or further explanation.

    captcha: dolor. ie lorem ipsum dolor est...

  • Mizchief (unregistered) in reply to JamesQMurphy
    JamesQMurphy:
    TopCod3r:
    We have 2 open positions on our team, due to high turnover, so I interview probably about 5 or 6 people a week and have gotten really good at giving technical interviews. It usually involves giving them a real problem from some code we have, and seeing if they solve it the right way, and then I explain to them how it should be done and make sure they agree.

    It is hard to find people who have the right mix of skills and personality. Some people realize halfway through my technical interview that they lack the required knowledge and simply cut it short and walk out of the room, I assume in embarrassment.

    Have you ever asked why you have high turnover?

    Yeah, I'm guessing that you are the problem. People don't just walk out of interviews out of technical embarrassment; if they don't know something, they usually say something like "If I had Google in front of me I could look it up in 2 seconds!" They walk out in situations where they find the interview to be asinine, or decide the company sucks considering the events that led up to that point.

  • Max (unregistered)

    So the real WTF is me for misreading move vs copy...

    Or everyone else for assuming that your average developer has control over where the SAN administrator chooses to store various file paths?

    Or are you all assuming that this is a small/medium shop where a developer also controls the infrastructure? That just isn't true in enterprise organizations (at least, not all of them).

    I admit my original comment was short-sighted, but come on people... if you need a caveat on your response, then your response is flawed, too.

  • (cs) in reply to David Emery
    David Emery:
    The "open exclusive" is the -right way*- to do this.
    As always, TRWTF is in the comments. You're like the 10th person now to suggest something involving locking or permissions. If the Watcher program is going to crash if it tries to process an incomplete file, what the hell makes you think it won't crash when it tries to open a file that's locked or privileged? You're making all sorts of unstated assumptions there that just aren't necessary because there's a much simpler solution that works just as well.

    I think these interview questions must hit a raw nerve with programmers who know they would have failed them. The level of both ignorance and hostility in the comments is phenomenal.

  • duplicity (unregistered) in reply to Mizchief
    Mizchief:
    JamesQMurphy:
    TopCod3r:
    We have 2 open positions on our team, due to high turnover, so I interview probably about 5 or 6 people a week and have gotten really good at giving technical interviews. It usually involves giving them a real problem from some code we have, and seeing if they solve it the right way, and then I explain to them how it should be done and make sure they agree.

    It is hard to find people who have the right mix of skills and personality. Some people realize halfway through my technical interview that they lack the required knowledge and simply cut it short and walk out of the room, I assume in embarrassment.

    Have you ever asked why you have high turnover?

    Yeah, I'm guessing that you are the problem. People don't just walk out of interviews out of technical embarrassment; if they don't know something, they usually say something like "If I had Google in front of me I could look it up in 2 seconds!" They walk out in situations where they find the interview to be asinine, or decide the company sucks considering the events that led up to that point.

    Don't feed the trolls

  • (cs)

    I've got a "downloader/watcher" set of apps I wrote. There are a few ways to solve the problem. In fact, moving files around would be my last choice.

    My watcher and downloader are in the same app. Each thread knows what to pass along so that the files can be processed correctly. An old version had two separate apps, and used temporary file names. It was much slower, though.

  • yellowstuff (unregistered) in reply to jpaull

    Recruiters are not always your friend. He may have been sending you just to get some information about the position, or to make the next guy look better.

  • yellowstuff (unregistered) in reply to LEGO

    No. "Id est" means "that is", or "in other words." It does not mean "for example." http://en.wikipedia.org/wiki/Inter_alia#I

  • Gerrit (unregistered) in reply to Aaron
    Aaron:
    If it's in Linux, the lsof command works quite well at telling you a file is still open, without having to modify the kernel or anything.

    lsof | grep filename || start whatever you wanted

    Try inotifywait, it can report file names when they are closed. But I would still go for the temporary location. Suppose the download gets interrupted and is restarted. That could lead to incomplete files being processed.

  • Bush (unregistered) in reply to Spectre
    Spectre:
    So. Why did the interviewee for a PHP developer position get asked about CSS?

    umm... because web development involves css? Even if you have a dedicated design team that gives you the static html, you will still likely need to modify it as you build out the site, and that requires knowing html and css.

  • Mizchief (unregistered)

    It would really depend on what the "Watcher" was "Processing". If we are talking about multi-GB files, I would consider coding the "Watcher" so that it could start processing before you had to download the entire file and/or move it to its proper directory. And what exactly is this data and how is it used? The "simpler" solution may be to set up a one-method web service to access the bits of data as needed by the user vs. constantly keeping two data sets in sync and eating up bandwidth.

    I suppose attempting to solve the functional problem and not a technical problem is what separates Engineers from Programmers (the men from the boys).

    If I were to ask a question like this in an interview, I would expect to hear several questions from the candidate regarding what the goal of this feature was and what his restrictions were, but hopefully modifying the kernel is assumed to be outside his domain of control.

  • Kozz (unregistered) in reply to JamesQMurphy

    JQM: Please do keep up. TopCod3r really is tdwtf's resident troll. ;) Just see other posts by him and you may detect a pattern.

  • Schnapple (unregistered) in reply to AMerrickanGirl
    AMerrickanGirl:
    The first story about the interviewer going out to lunch reminded me of a somewhat similar experience.

    I was contacted by a recruiter who asserted that he had a couple of jobs that were perfect for me. But first, he insisted on meeting me in person.

    His office was 40 miles away, but I agreed to be there at 9 am.

    I arrived there at 9 am. He wasn't there yet. They put me in a small windowless room and asked me to wait.

    Twenty minutes later I wandered out to find someone and ask what was going on. They didn't know why Recruiter Guy wasn't in yet, but they sent out his cube mate, another recruiter, to talk to me. Problem was, she didn't have any of my information available to her so it was kind of a waste of time.

    Another twenty minutes went by and someone finally called Recruiter Guy. He was stuck in traffic, with a cell phone that worked perfectly, since they got through on the first try, but he hadn't bothered to call his office to tell them to make nice to the 9 am interview that he was going to be very late for.

    They had the nerve to ask me if I would wait another 40 minutes for him to arrive. I said no thanks and walked out. If he had made the effort to call his office and apologize to me for being late, I would have waited. But he could have cared less, so he wasn't getting any commission off my back.

    I had a similar thing happen wherein the guy wanted me to come meet him, but the first opportunity I had was after I got done with work on Friday (was looking for another job but I wasn't about to endanger my current one). So there I was driving over there after 5:30 PM on a Friday, to go somewhere that takes at least an hour to drive to in normal traffic.

    Recruiter calls me when I'm halfway there and offers to meet on Monday instead. I said I'm halfway there and we can just go ahead and do this today.

    When I finally get there, the recruiter is gone. He said f*** this and left - apparently I was crimping his Friday night plans (in all fairness meeting on Friday after work was his idea). I was pissed since I had driven this whole way for nothing.

    On the upside they did have someone else to talk to me - a blonde former cheerleader who had been in the workforce all of three weeks and apparently had not had the "don't dress provocatively" speech with her boss yet.

  • PG (unregistered) in reply to Chris
    Chris:
    It turns out there is --- and it's actually more correct in some ways. Have the downloader get a 'write' lock on the file. Have the watcher get a 'read' lock on the file. (Or its own 'write' lock if it deletes the file as the final step in the process.) You're fine as long as everyone uses locks and the watcher program is smart enough to keep retrying to get a lock. (I assume it's not so stupid that it will wait indefinitely for that lock.)

    BUZZZZZ

    Thanks for playing. You can't change the Watcher, and it would need to take out a lock; *NIX doesn't do automatic locking as part of the filesystem.

  • (cs) in reply to Azeroth
    Azeroth:
    There is another simple solution with the downloader/watcher problem - downloader should open the file exclusively while it's being downloaded, this way watcher won't be able to access it until it's closed. This way it's not even required to move anything anywhere.

    What if the downloader crashes or the network goes away?

    anyway, the solution I've seen work (and work well) is this:

    download file. when done, create file.DOWNLOADED

    file muncher notices a file ending in .DOWNLOADED, creates a .PROCESSING file in the same manner and writes its pid to it.

    when file muncher finishes, write file.DONE, delete file.PROCESSING

    you can fill in the error checks pretty easily, and using ls will tell you what's going on.
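The marker sequence described above might be sketched like this. `mark_downloaded` and `process_file` are hypothetical names, the handler command is whatever the muncher actually does, and the behaviour on handler failure (no .DONE file is written) is an assumption the comment leaves open.

```shell
#!/bin/sh
# Downloader side (hypothetical helper): flag a completed download.
mark_downloaded() {
    : > "$1.DOWNLOADED"
}

# Muncher side (hypothetical helper).
#   $1      = data file
#   $2 ...  = handler command; the data file is appended as its last argument
process_file() {
    file=$1
    shift
    # Refuse to touch anything the downloader has not flagged.
    [ -e "$file.DOWNLOADED" ] || return 1
    # Record which process claimed the file.
    echo "$$" > "$file.PROCESSING"
    if "$@" "$file"; then
        : > "$file.DONE"
        status=0
    else
        status=1
    fi
    rm -f "$file.PROCESSING"
    return "$status"
}
```

A leftover .PROCESSING file whose pid is no longer running points at a muncher that died mid-file, which is the kind of thing ls makes visible at a glance.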

  • (cs) in reply to imMute
    imMute:

    File copies would still be a problem, but file moves (also known as renames) are extremely quick as long as the src/dst are on the same volume: the FS only moves the inode data. Hell, even windows handles this as well as *nix.

    But the way Linux works, the entire filesystem is represented as being contiguous, even when the physical storage isn't.

    All it would take is one "clever" sysadmin to put the temp directory on a separate partition and all of a sudden you've reintroduced the same race condition you had before -- except this time you're not even aware of it.

    The best idea is to use some sort of locking. Either using built-in operating system locking or just something as simple as dropping a lock file in the directory that both the Downloader and Watcher respect.

    Yes, I realize the Watcher isn't supposed to be modified... but stating that you can't modify the Watcher does make this an absurd scenario to begin with.
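For what the cooperative lock-file variant would involve, here is a minimal sketch built on `mkdir`, which either creates the directory or fails in one step. It only works if both programs call these helpers, so it does require changing the Watcher. `acquire_lock` and `release_lock` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: try to take the lock for a file. Fails
# immediately if another process already holds it.
acquire_lock() {
    mkdir "$1.lock" 2>/dev/null
}

# Hypothetical helper: give the lock back.
release_lock() {
    rmdir "$1.lock"
}
```

The downloader would hold the lock for the whole download, and the watcher would skip or retry any file whose lock it cannot acquire. A lock left behind by a crashed downloader has to be cleaned up by hand in this sketch.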

  • m&m (unregistered)

    "...remote server and save them to a certainly directory on disk. A Watcher program monitors this directory..."

    and then:

    “What about if the Downloader just wrote files to a temporary directory, and then moved the file to the appropriate directory when the download was complete.”

    I think the first door he closed was that leading to his 'brilliant' solution... And as always, if you know how the conjurer does his tricks, they are so easy you could have thought of them yourself.

Leave a comment on “A Problem at the Personal Level & More”
