Admin
So did Jeremy ever find a candidate that could read minds? I'm sure the real point is to see what kind of solution an interviewee comes up with, not to see how long it takes for them to guess your 'right' solution. Why not try 'hot and cold' or 'Password'? 'Twenty questions' would be even more fair. I spy, with my little eye, something ending in .avi..
Admin
Holy crap! People are still discussing the atomicity of renames(), etc. in this thread? After a day, and 6 pages of responses? Just download to some alternate directory, regardless of whether it is on the same filesystem or not, and create a symlink when the file is downloaded. Easy. The (one and only, imo) correct answer to the interview question for any modern OS, and most non-modern. Otherwise, use a tmpname() function to create a directory on the same filesystem for downloading, then use an atomic rename(). And yes, atomic renames within a filesystem are the rule, not the exception.
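A minimal Python sketch of the symlink idea above. The function name and directory layout are hypothetical, and it assumes the watcher follows symlinks, which the thread does not confirm.

```python
import os


def publish_by_symlink(download_path, watched_dir):
    """Expose a fully downloaded file in watched_dir via a symlink.

    The link is created in one directory operation, so the name only
    appears in watched_dir once the download has already finished.
    """
    link_path = os.path.join(watched_dir, os.path.basename(download_path))
    # The target may live on a different filesystem; only the link itself
    # is created inside the watched directory.
    os.symlink(os.path.abspath(download_path), link_path)
    return link_path
```

Call it only after the download has been closed, for example `publish_by_symlink("/data/staging/video.avi", "/data/incoming")` (both paths are made up for illustration).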
Admin
As for "shutting up" (mind yourself) about non-atomic moves, I have faith that (maybe) with a few more years experience, you will understand the difference between robust and "just works". In order to design "robust", you need to know all existing assumptions and not make any new ones.
OK - another precondition/assumption of the exercise = do not use /tmp. We should get a list somewhere...
Admin
You make a good point about how (in general) atomic renames (or moves) are the rule and not the exception. That point has been made before, too. My point, at least, is that too many assumptions go into the whole "one right answer" type of problem. Although many seem to suffer from a severe case of "just don't get it" when it comes to that... :)
Admin
Maybe when you graduate college and have to actually work with people for more than 3 months at a time you'll learn to listen.
Yeah, it's called "jtwine adding assumptions of his own and not owning up to them when called out". I never said /tmp, I said "A temp directory" right next to the part where I said "on the same filesystem" shortly before someone decided to interpret filesystem as the whole file tree. Either discuss the solution or don't. Setting up strawmen is just juvenile.
Admin
As to why you are failing to realize this, it might have to do with the fact that I was not logged in when I wrote the "James R. Twine" posts...
Shhhh... You are confusing me with someone that did not actually demonstrate their experience with that topic. My point has not yet been (intelligently) contested. Shhhh... Again, you are confused. And also severely lacking in awareness of my position and experience. Given the above three issues, there is no need to address the remainder of your post.
Admin
You know, we have open positions, but if I had the time to sit with 5 or 6 interviewees a week and explain a solution to even half of them, I'd be divorced because I'd have to spend that time plus some at home just to keep up with my project schedules.
Admin
Well I guess you told me. Not like I have anything to judge you on other than your harping on /tmp incessantly.
Admin
Seriously, I believe that the tone of our exchanges was due to a misunderstanding and I am man enough to admit that the misunderstanding may have been my fault and that I may have gotten more aggressive than was necessary.
Peace!
Admin
Hey, I can be nice and bury hatchets and all that. Now if you'll excuse me, I'm needed in the bureaucracy thread.
Admin
Like it or not, Yes.
If, because you're a douchebag, you choose not to hire a qualified person, and that person happens to be female, black, hispanic, gay[*], or crippled, then that person has the option of suing your ass into the ground.
If, because you're a douchebag, you choose not to hire a qualified person, and that person happens to be a straight white or asian male, then that person has the option of interviewing with a non-douchebag employer.
[*] in some states
Admin
Everyone is missing what is clearly the correct solution. It doesn't involve modifying the watcher, the kernel, having a temp directory, or even moving or copying files.
Modify the downloader to be a FUSE file system that makes the desired file appear to be local (this is technically not modifying the kernel because it is user space). When a process tries to open and read a file it will initiate the download and the filesystem will block the read until the download is complete. Just configure the watcher to watch whatever directory the new filesystem is mounted as and you're done.
Admin
I pose interviewees problems from our current projects, and if the solution sounds good I use it. Free work.
Admin
Indeed, I see this every week with a little backup process I have scheduled, that "moves" a 1 GB file from a samba mounted unix file system to a windows workstation drive.
During the "move", the windows directory listing shows the file present on both systems, in its full size. When the "move" is complete, the source file disappears. This is an example of a move being done as a copy/delete. I haven't tried accessing the files during the "move", so I can't tell you if locking occurs. This takes about 5 minutes typically.
I notice this in contrast to what happens when moving similar files between unix filesystems. The file is present on both systems during the move, but the new file has a size that grows while the file is being moved. The original file disappears when the new file reaches full-size. Between different unix file systems, move is done as a copy delete, and takes about a minute per gig.
Of course, if I move the file to different directories on the same file system, it happens apparently instantaneously, as the move is not a copy/delete, but just a relink.
Folks, you're all right. If you can count on being able to use a temporary file on the same filesystem, use the move (mv) method to supply a complete file to the watcher. You can count on this, as there will need to be space in the watcher's pickup area for typical downloaded files, so the staging area will have that much space available anyway, being on the same file system.
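A rough Python sketch of the "stage on the same filesystem, then mv" approach summarised above. The function names, the `staging_root` parameter and the device-id check are illustrative assumptions, not anything taken from the original exercise.

```python
import os
import tempfile


def same_filesystem(path_a, path_b):
    """True when both paths report the same device id (same mounted filesystem)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def deliver(chunks, watched_dir, filename, staging_root):
    """Write chunks to a staging directory, then rename the file into watched_dir."""
    if not same_filesystem(staging_root, watched_dir):
        raise ValueError("staging area must share a filesystem with the pickup area")
    # Private staging directory, outside the directory the watcher scans.
    staging_dir = tempfile.mkdtemp(dir=staging_root)
    staged = os.path.join(staging_dir, filename)
    with open(staged, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
        out.flush()
        os.fsync(out.fileno())  # push the data to disk before publishing
    final = os.path.join(watched_dir, filename)
    os.rename(staged, final)  # a relink within one filesystem; no data is copied
    os.rmdir(staging_dir)
    return final
```

The watcher only ever sees the final name, and by then the file is complete. If the two directories turn out to be on different filesystems, the sketch refuses to run rather than quietly degrading to a copy/delete.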
If the staging area cannot be on the same filesystem as the watcher's pickup area (for whatever reason--policy, security, because other processes depend on that staging area), then you need to think a bit harder, considering all the suggestions above: serialising download/pickup, rearranging filesystems, getting the watcher fixed, etc.
It was an interview hypothetical, with a simplified scenario. Otherwise, in a real shop, with a network of different platforms, cross-mounted filesystems and legacy systems, I think people are right to ask, "Whoa. Can we assume there won't be some race condition in this other method? Can we assume we have the space we need for a staging area? What else might be using the pickup area? Can we rearrange our filesystems without breaking something else? Can we work around the limitations of the components? Can we replace the braindead components?"
I sometimes suspect that people who say, "What's the big deal? Use this simple solution, bam, sorted." have not got a lot of experience in a larger shop with multiple interlocking constraints. Solutions that work fine on a single linux desktop in your bedroom might not be appropriate in a 20+ mixed platform shop with high throughput and availability requirements. One poster whose response to one what-if was "don't do it that way" overlooks the case where it "has to be done that way" because firewall policy says so, and their higher-ups have refused requests to change the policy, or because some fragile legacy system requires it that way, and it's too hard to change.
And you know, I reckon there's some back story we're not hearing too. Jeremy seems to be what, zinging someone in retrospect? "Nyah nyah, see how simple my solution is?"
In the real world, it's not always that simple. Sometimes it is, sometimes it ain't.
Admin
Well, if that's the case then the watcher program isn't my problem. Let them deal with a possible incomplete file.
Admin
Oooh look at you!! Perhaps you should go onto Experts Exchange rather than read the DailyWTF!!!
Admin
And what if the file move also takes time?
Admin
Yes, it is. But until the interviewer answers, "Well, ok, it can be," any answers that require it be modified are not going to result in a job offer. In fact, they're WTF.
Admin
PHP should have been the hint for the third. I remember a phone interview for such a PHP role where the most advanced the questions got was OOP definitions. I ended up having to raise the level of competency in the discussion myself, with two short discussions on IDE and framework choices and reasons, and a slightly extended discussion on feasibility and scalability and how PHP and the LAMP stack fit in.
Admin
Here we go again...
Admin
No. "Only one right answer" questions need to make the conditions open to discovery. Jeremy did that.
In the real world, you don't get your scenarios handed to you on a silver platter. You need to discover them. It could be that Watcher is owned by a different user, and you don't have permission, and the owner refuses to change it. It could be that it's a legacy app, with no source code, and as yet no rewrite. It could be that it's a third party app. It could be that it's deemed business critical, and they're not willing to let junior programmers touch it.
It could just be that it's a complete mess, and all the current programmers refuse to touch it because they're afraid of breaking it. (I've been given a few of those to maintain. No fun. On the other hand, I've never handed one off to someone else - I've always cleaned them up first.)
However, when you're in an interview question, your best bet is to take the interviewer at their word - if they say don't touch, don't touch.
Obviously, you've not seen what happens to a Solaris box when /tmp fills, given their default config. (/tmp is a separate partition there, and it's not pretty. It's also a DoS.)
Actually, /tmp is for temporary files, generally those whose deletion on reboot would not matter. It is not a temporary holding place for more permanent files, or those which may need to be retained between system boots. As such, unless Watcher is looking in a /tmp directory, /tmp is not appropriate for this sort of file.
For what it's worth, I've known people who've managed to answer questions with "one right answer" with a different answer, and still get hired - it usually just needs to actually handle the situation at least as well as the "one right answer". And, none of the other answers given here have.
I've been a system administrator for over 10 years. I know how complicated the real world is. I also see time and time again people applying a complicated fix to a simple situation, because they're too focused on the complicated. Most of the time, it works, and I don't complain too much. Some of the time, it creates (or would create - sometimes I can veto) a WTF.
I've seen one bit of code that answers the above problem by writing the file to a temporary directory (and with a leading '.', even), locking the file with flock, locking it with fcntl, using a semaphore, and a '.lock' file, then renaming it, and undoing all of the locks. Not in the reverse order, for what it's worth. The coder defended it with, "But I didn't know the target OS would remain static, nor what the underlying filesystem would be."
Note that this complicated solution had several failure points (program names changed to match this example):
It applied flock and fcntl on the same file. On most OSes, this would deadlock with itself. Of all the possible target OSes at that shop, it only worked on the target OS the program was initially written for.
It applied flock and fcntl to the temporary file name - not the new name. (It couldn't hit the new name, because it used rename(2), and so the move completed before it could possibly target the new name - which should've been a clue.)
Its .lock file could have caused the same problem that the whole process was made to circumvent - watcher would've crashed on it had watcher ever seen that file, just like watcher would on any other 'incomplete' file.
watcher didn't honor any of those locks. Putting them in downloader didn't do anything useful.
rename(2) is required to fail if it's not atomic - so even if the underlying filesystem is NFS, and remotely the two directories are actually on separate partitions, it either works atomically or fails.
He didn't check for rename(2) failures.
Since he also put a lock around his semaphore (just in case it ran on a system that didn't do semaphores correctly, but did atomic file creates correctly), and he mangled his unwrap just right, he had a potential deadlock on unlocking.
His group owned the server on which watcher ran, and were authoritative for the filesystem layout. The responsibility for making sure watcher ran properly went along with that authority, so whoever owned watcher would be able to ensure that constraint continued to be enforced. And, since downloader died on rename(2) failures (he didn't check it, but that doesn't mean his code handled it correctly), if they failed to enforce that constraint, they would have been alerted to that fact immediately, and it would have been associated with a change that they had performed, thus blame would naturally fall to them.
He wasn't fired for it. Not even though he claimed to have fixed the problem, but the problem was still there. (Hint: downloader wasn't the only program writing files to that directory - but it was the only one he updated.)
Disclaimer: it wasn't actually the same situation, because 'downloader' wasn't downloading from the net, but rather pulling data from a database. 'watcher' didn't crash on incomplete files, it just corrupted its own database, and continued merrily running. Until, of course, it corrupted it so bad it got a null pointer where it wasn't expecting it. And it wasn't that 'watcher' couldn't be modified, it's just that nobody was willing to until I came along. The situation was made much simpler when I modified a 'strcmp(file, ".")' to 'strncmp(file, ".", 1)'. Although, not as much simpler as when I replaced 12k lines of C code (no comments) with 150 lines of perl code (with comments).
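One reading of the strcmp-to-strncmp change mentioned above is that the watcher went from skipping only the "." entry to skipping every name that starts with a dot, so in-progress files written with a leading '.' are never picked up. A small Python sketch of that filter follows; the function name is hypothetical and this is not the original C or perl code.

```python
import os


def files_to_process(watched_dir):
    """List regular files in watched_dir, skipping every dot-prefixed name."""
    names = []
    for entry in os.scandir(watched_dir):
        # Same effect as strncmp(name, ".", 1) == 0: skip any name starting
        # with a dot, not only the "." entry itself.
        if entry.name.startswith("."):
            continue
        if entry.is_file():
            names.append(entry.name)
    return sorted(names)
```

With this in place, a writer can create `.video.avi`, fill it, and rename it to `video.avi` when done.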
Admin
Cutting out the WTF, I was talking Windows - because it's the same situation. If the partition is the same between src and dst, rename(2) is atomic. If the partition is not the same between src and dst, rename(2) is atomic - but probably an error.
The WTF changed the case he was talking about, despite my having quoted enough of his prior statements that his switcheroo showed.
We weren't simply assuming it wasn't going across partitions; we were requiring that the rename(2) did not go across partitions.
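To illustrate "requiring" rather than "assuming", here is a hedged Python sketch that treats a cross-filesystem rename as a hard error instead of falling back to copy/delete. The `publish` name, the exception class and the injectable `rename` parameter are all assumptions made for the example.

```python
import errno
import os


class CrossFilesystemError(RuntimeError):
    """Raised when source and destination are on different filesystems."""


def publish(src, dst, rename=os.rename):
    """Rename src to dst, refusing to fall back to copy-and-delete."""
    try:
        rename(src, dst)
    except OSError as exc:
        if exc.errno == errno.EXDEV:
            # On POSIX systems EXDEV is the "different filesystem" error.
            raise CrossFilesystemError(
                f"{src} and {dst} are on different filesystems"
            ) from exc
        raise  # any other failure is reported unchanged
```

Because the failure is raised rather than swallowed, a filesystem layout change that breaks the constraint shows up on the first download after the change.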
Admin
Actually, we don't. We know the interviewee indicated that a Linux kernel patch was a possibility. But, since we know that the interviewee indicated that a Linux kernel patch was a possibility, we can easily surmise that the interviewee was not a good source of information on stuff.
I mean, honestly - a kernel patch to make a downloader/watcher scenario work? That's even worse than suggesting FUSE when people don't like the trivial rename from another directory on the same partition answer.
(Hint: FUSE may already be implemented in the kernel, but it's still a sledgehammer. A big one, at that. Not to mention, the watcher program could be timing sensitive, and could possibly crash or corrupt data if it's suspended for too long. Yes, I'm reaching.)
Admin
I'm glad I only read the first story.
Admin
I like the simplicity, but that wouldn't solve the problem, would it? Wouldn't that essentially be the same problem, because a multi-gig file would be incomplete during a move, and the Watcher would crash while processing the incomplete file? Instead of moving a file slowly piece by piece from the internet, you're just moving a file faster piece by piece from another folder.
Admin
Wait wait wait wait. How can it not be atomic? Remove the directory entry from that place, and create one right here in its place? How could this /not/ be atomic? Seems a directory entry either exists, or does not exist, excluding of course bi-state SchrödingerVerzeichnisse and bi-coastal FlyoverDirectories.
Admin
I've had to do that exact thing. My solution was to have the watcher watch a separate log directory and, on completion of uploading of orders, the downloader would dump a log of the downloads and paths to the log directory. This ensured all transfers were successful before the log was written and the watcher acted. It also helped with journaling the download as a separate, transaction-based log.
Admin
For the downloader problem, I would have suggested the script touch another file in a different directory, indicating the file is done.
However, I guess modifying the Linux Kernel works too...
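A short Python sketch of the "touch another file" idea. The `.done` suffix and both function names are invented for illustration, and the approach only helps if whatever consumes the files checks for the marker, which the original exercise ruled out for the watcher itself.

```python
import os


def mark_done(filename, marker_dir):
    """Create an empty marker named after the finished download."""
    marker = os.path.join(marker_dir, filename + ".done")
    # Mode "x" fails if the marker already exists, so a download cannot
    # be announced twice by accident.
    with open(marker, "x"):
        pass
    return marker


def is_done(filename, marker_dir):
    """True once the downloader has announced that filename is complete."""
    return os.path.exists(os.path.join(marker_dir, filename + ".done"))
```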
Admin
Holy Crap!X2 ... Did I really read through 7 pages before someone came up with what I thought was obvious on the 1st?
And has an added bonus of not mucking up the real directory in case of a crash.
Admin
Has anyone ever heard of Scheduling Software, maybe like Control-M? I swear, the real WTF is NO ONE around here has ever worked for an IT shop with more than a handful of people.
Admin
Why not just let the Watcher know which files are done through some file like finished.log. You could save the date/time of the events and keep track of all the downloaded files. You would also have better control on how the watcher works and I think it's more efficient for it to read one file instead of checking the whole directory for new files.
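A possible shape for the finished.log idea, sketched in Python. The tab-separated line format and the function names are assumptions, and like the other "tell the watcher" suggestions it presumes the reader of the log can be changed.

```python
import os
from datetime import datetime, timezone


def record_finished(log_path, file_path):
    """Append one 'timestamp<TAB>path' line once a download is complete."""
    stamp = datetime.now(timezone.utc).isoformat()
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{stamp}\t{file_path}\n")


def finished_files(log_path):
    """Return the paths recorded so far, in the order they were logged."""
    if not os.path.exists(log_path):
        return []
    with open(log_path, encoding="utf-8") as log:
        return [line.rstrip("\n").split("\t", 1)[1] for line in log if "\t" in line]
```

The reader then processes only paths listed in the log instead of scanning the whole directory.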
Admin
Oddly enough, as a person who doesn't have a job programming, that was the first thought I had.
Let the last function of the Downloader be to start the Watcher. Why complicate matters having something run all the time when it doesn't need to?
Admin
Sorreeee! That's what happens when your blog is in Hebrew, and you send a trackback with a link to this post... sigh
Admin
If the Downloader program downloads files to a directory and the Watcher program processes them immediately, you couldn't use a temp dir -- as it would become the "main" directory which is processed by the Watcher program.
I'd say there's a slight contradiction between the task description and the "correct" solution.
Admin
Meh.
I've been a technical recruiter. Being a technical recruiter requires no technical experience. All it requires is the ability to sell. You have to be able to sell the candidate on your ability to place them (and then get the candidate and manager leads from them that will eventually get you more business), and the client on your ability to find qualified candidates. You can do those two things? Great, you too can make $100K+ a year working 9-5, as long as you don't mind a phone being surgically attached to your head and making 100+ phone calls a day.
Recruiters don't have the technical knowledge to know, based on your resume, whether you're qualified for the job. They're just going off of buzzwords and requirements that the client has given them. Even the best recruiters will make mistakes half the time. Their bread and butter are the repeat superstar candidates who they can place multiple times, combined with exclusive clients who have become fed up with poor results from various agencies.
I did quite well as a recruiter, but I didn't enjoy the work. It might be one of the few positions that are less honorable than trial lawyers. To be really good, you will have to be able to work people with half-truths until it becomes a natural way of dealing with people.
Admin
Too funny!
I know it's just a troll, but I can't resist giving my "what he said/what he meant" translation...
Because I'm such an a** to work with, people on our team jump ship as soon as they find something even tentatively viable, as a result we currently have 2 open positions. I interview about 5 or 6 people a week and have memorized my technical interview questions, based on problems that I perceive in our code. I see if they can immediately see how I would solve the problem or, lacking that, whether they can be convinced to agree with my solution.
It is hard to find people who either think exactly like I do or are good enough suck-ups to agree with my opinion no matter what. Some people realize halfway through my technical interview that I am so thoroughly obnoxious that they simply cut it short and walk out of the room, I assume in embarrassment (because if I acknowledged the real reason it would lower my sense of self esteem).
Admin
+2 for Marc's suggestion. .. I'd suggest checking that the Estate of Heath Robinson wouldn't have any claims over any profits you make from marketing this solution -- or at least negotiate a fee/royalty up-front.
Do not feed the lawyers!
Admin
I like that line of thinking. First I would ask exactly what the downloader and watcher do though and who owns the code. Then I would most likely be misled by the phrasing of the question, as most candidates apparently were, and propose one or two "complicated" answers. Then I would ask them to read the problem definition again. Then I would suggest getting the code for whatever the watcher does and adding it to the downloader program so that the watcher's function is called on the file that has been downloaded as soon as the downloading is complete. Then I would probably fight against the irrational insistence that neither program be modified (probably this was an unnecessary design flaw grandfathered in from the beginning) for a minute or two and eventually give up on that track and finally suggest downloading to a temp file first or using a rename (mv).
By the way, the real WTF here really is the fact that there are 300 plus comments on these mundane item(s) (although maybe what makes it so popular is that they are common situations that we can all relate to). Or maybe we are all just common.
Admin
in fact, at thedailywtf.com, it's a requirement.
Admin
SIGSTOP? SIGCONT?
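For what the two signals would look like in practice, here is a minimal Python sketch that pauses a process for the duration of a block. It is POSIX-only, the `suspended` helper is a made-up name, and it does nothing about a watcher that already opened the file before being stopped.

```python
import os
import signal
from contextlib import contextmanager


@contextmanager
def suspended(pid):
    """Stop the process with SIGSTOP for the duration of the block, then resume it."""
    os.kill(pid, signal.SIGSTOP)
    try:
        yield
    finally:
        # Always resume, even if the download inside the block fails.
        os.kill(pid, signal.SIGCONT)
```

Usage would be along the lines of `with suspended(watcher_pid): download(...)`, where `watcher_pid` and `download` are placeholders.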
Admin
From the article: "However, because downloading takes significantly longer than processing, the Watcher program will crash if it reads a file that has not been fully downloaded."
So this is clearly not related to the problem that the watcher starts processing before the file is completely downloaded, the problem is that the watcher is processing faster than the download happens, and reaches the end of the download file while the download is still in progress. It surely requires some clarification with the interviewer, but it is quite likely that copying the file would be much faster than processing, and that starting the processing before the copying is finished is no problem, as long as the copying is faster.
Admin
Yes, but if the interviewee had come up with that in the first place, the interviewer probably would not have resorted to "You can't modify the watcher, now what do you do?"
Admin
It's true for any unix filesystem because it is required by POSIX and nobody is going to design a unix filesystem that doesn't provide for this.
It's true for NTFS - I don't know if it's atomic as implemented on windows (but it probably is), but NTFS's support for hardlinks means the operation can be broken down into "create new hardlink, delete old hardlink" which, while two operations, can be done quickly enough that it can be done without the kernel returning control to the application (and the rename() function [used for moving when possible] is guaranteed atomic - it will fail if it can't be done atomically, so you can check this in the new code in the downloader)
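The "create new hardlink, delete old hardlink" breakdown can be sketched in Python as below. This is only an illustration of the two steps (the function name is hypothetical), not a claim about how any kernel implements rename, and unlike rename() it fails if the destination already exists.

```python
import os


def move_via_hardlink(src, dst):
    """Move a file within one filesystem as link-then-unlink."""
    os.link(src, dst)   # dst now names the same file; raises if dst exists
    os.unlink(src)      # drop the old name; the data itself is untouched
```

Between the two calls the complete file is visible under both names, but at no point does a partial file exist under either.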
I don't know about FAT. Why would you be using FAT?
Admin
What we have here is a failure to communicate.
In this situation, /mine and /mine/tmp are no longer "in the same filesystem" as the terms are used by people who know unix.
When savar said "But the way Linux works, the entire filesystem is represented as being contiguous, even when the physical storage isn't.", Franz_Kafka assumed he was talking about RAID or LVM or something, not "mounting two partitions in subdirectories" (since those are traditionally still called two filesystems).