Admin
All depends on whether you're giving or receiving.
Admin
OK, a FAQ about the different "solutions" offered
Q. What if it's on a different partition? A. Then make a temporary directory inside the same directory. You're guaranteed it will be on the same partition.
Q. Locking? A. You can't control the watcher.
Q. Rename instead of move? A. It's the same damn thing. I mean the system call, not the system utility "mv" (which does the same thing when source and destination are on the same partition).
Q. Activate the watcher after the downloader? A1. You are downloading multiple files... the watcher could process any "unfinished" file in this case. A2. There is probably a difference between the "watcher" and the file "processor" (in functionality, though it sounds like they are the same program in this case). You probably want to start the processor on the downloaded file, but crap, you can't modify it.
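For anyone who wants the first answer spelled out, here is a minimal Python sketch of "temporary directory inside the same directory, then rename". The `.incoming` directory name and the `publish_atomically` function are made up for illustration, and the sketch assumes the Watcher ignores subdirectories, which the original story does not say.

```python
import os
import tempfile


def publish_atomically(chunks, watched_dir, filename):
    """Write chunks into a scratch dir inside watched_dir, then rename."""
    # Hypothetical scratch directory. It lives inside the watched
    # directory, so it is on the same partition as the final location.
    scratch = os.path.join(watched_dir, ".incoming")
    os.makedirs(scratch, exist_ok=True)

    fd, tmp_path = tempfile.mkstemp(dir=scratch)
    try:
        with os.fdopen(fd, "wb") as out:
            for chunk in chunks:
                out.write(chunk)
            out.flush()
            os.fsync(out.fileno())  # make sure the data is on disk first
        final_path = os.path.join(watched_dir, filename)
        # rename(2): only the directory entry changes, no data is copied.
        os.rename(tmp_path, final_path)
    except BaseException:
        # Do not leave a half-written scratch file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

The slow part (the download) happens entirely under the scratch name; the watched name only ever refers to a complete file.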
Admin
The post didn't say anything about a retail application. How did you know it was Point Of Sale? oh... I get it...
never mind
Admin
The candidate could have established that line immediately by just asking a couple of basic questions to determine the parameters of the problem. Instead he followed a progressively more convoluted trail to answer a problem he was told has a simple solution - culminating in modifying the Linux kernel itself.
The interviewer did not say it was the best solution. He said it was simple. If I get someone in front of me who can't see the forest for the trees, I'm going to pass on him too.
Admin
It holds for every single file on any POSIX OS, regardless of whether that file is on a native filesystem or not. This is because rename(2) is atomic, according to POSIX, so if you want to be POSIX, rename better be atomic. And it's incredibly easy to have rename(2) be sufficiently atomic for this purpose. (Note: even if rename was implemented by doing a link(2) followed by an unlink(2) of the old name, it would still be fast enough, unless Watcher is so fragile that it dies if it ever sees a file with multiple links. For it to do that, it'd pretty much have to be coded to do that, however - and it would still be such a tiny window that it's virtually inconceivable that it would happen regularly.)
Btw, this means Windows, too. POSIX is a big thing.
(Note: there are many unix OSes which have optional POSIX support. However, their non-POSIX mode is still loosely POSIX, and still does this. It's too necessary to not do it, and it's too useful to write the code to do it and then not always use it.)
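To illustrate the link-then-unlink variant mentioned in the note above, a rough Python sketch follows. `rename_via_link` is a hypothetical helper, not something from the story; it assumes a filesystem that supports hard links, and unlike rename(2) it fails if the destination name already exists.

```python
import os


def rename_via_link(src, dst):
    """Give the file a second name, then remove the first one."""
    os.link(src, dst)                     # the file now has two names
    links_during = os.stat(dst).st_nlink  # link count inside the window
    os.unlink(src)                        # drop the old name
    return links_during, os.stat(dst).st_nlink
```

At no point does either name refer to a partial file; the only oddity a reader could observe is the link count of 2 during the short window between the two calls.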
Admin
The Watcher program runs as a different user. suid doesn't work. Since both programs run as non-privileged users, one cannot spawn the other.
Oh, and Watcher's a fragile piece of crap; it'll freeze if it hits a file it can't read, it requires a kill -9 to restart, and that will corrupt its database. So don't even go there.
I'm sorry about that - I didn't write Watcher; it's 25 years old, and it's 25M of hand-written assembly - no comments. All of the developers retired; most of them are dead, but there's one guy left who's living in an insane asylum. Some say it was from working with that code for five years after all of the other developers had retired.
For what it's worth, it's not 'changing the scenario' if you're revealing more information about the situation which wasn't previously known. If the new information is inconsistent with the old, that's a different story.
Admin
I call troll. Nobody could possibly think having the 'downloading' application running on a different machine would improve the IPC (inter-process communication, for the clueless who are obviously reading (and posting) here).
Admin
No one is saying that you have to hire someone with a "problem on the personal level." What (I think) the Duke is saying is that at the very least...
a) being an abusive jerk isn't a defense in court. b) "Shari" would have a legitimate case against which you would have to defend yourself, and who wants to go through that hassle?
You don't have to hire anyone who doesn't fit in... you have my permission to be an abusive jerk... But when you decide, based on a piece of paper and two spoken sentences, that a job applicant has "problems at the personal level", you had better be prepared to explain what those problems are and why they make her someone you wouldn't hire.
Admin
In fact, I like to give interview questions which require one to understand that "not being able to change entity #1 does not mean one cannot change entity #2." It's such a fundamental bit of logic, which has come up many times in my career.
And, in the many times I've been called to clean up someone else's royal mess, a failure to apply that bit of logic has almost always been the root cause of the initial mess, as well as the cause of many of the additional messes added on top of that initial mess to raise it from a simple mess through complete mess to royal mess.
So, while I feel sympathy for those people who cannot understand it, there is no way I want to sponsor employment for such people in any 'thinking' job.
Admin
Dave never wanted to give you the job, whoever you turned out to be. He just wanted the interview to go badly so he could get back to other things.
Admin
I see a common thread on these interview response discussions I'd like to mention.
Specifically, the interviewer sets up a scenario. He or she then refines the scenario by adding complications into the picture.
Then, the interviewee and the various forum members start adding their own complications into the picture. "What if there's no other directory on the partition you can write to?", for example.
Please stop this. It is the interviewer's place to add clarification. (Yes, I realize I put a post above with a clarification claiming that the Watcher was excessively fragile, and I spawned some interviewer-ish explanation why. But I wasn't explaining why the interviewer's answer wouldn't work; I was providing justification for the conditions the interviewer gave arbitrarily.)
As an interviewee, you should be looking for possibilities that the interviewer has left open, rather than trying to close the remaining holes.
Then, later, as a new employee, you should be looking for possibilities that the various PHBs and clueless gits who have preceded you have not managed to block off. Don't complete the barricade, but figure out what path remains open and take it.
Admin
The first story reminds me of one of the many interviews I had when I was first looking for an IT job, many years ago.
I arrived 25 minutes early, but I brought a book to read. I chatted briefly with the receptionist, and indicated that I was intentionally early, as that way I couldn't possibly lose track of time - she'd alert me when he was ready to interview me.
About 5 minutes later, a man hurried out of the office area. He told the receptionist he was running late and going to grab a quick bite - he might be late for his 1:00 interview. She indicated I was already there. He looked over, and said, "Hey, want to grab lunch?"
The interview didn't go well; he didn't get the employee. But, on the bright side, he did get a few pointers on where he was going wrong on some of his projects, and he was able to get reimbursed for the meal. Years later, I encountered the guy he did end up hiring; my suggestions saved him several months of effort.
(Note: technically, he turned me down as 'over-qualified'. I got that a lot, as I'd blown off my classes, but only to mess around with computers and learn more about them.)
As someone who has given interviews to quite a few people (my first job was at a rapidly growing contracting company; it was continuously hiring for several years, during which time I gave dozens if not hundreds of interviews), I'd say Shari appears to have an excellent case. It's possible that discovery would turn up a consistent behavior pattern for the defendant, but it's also possible her alma mater isn't the only thing in the story which is brown, in which case that may not matter so much. (Same goes for any other racial minority, although to a lesser extent.) I'm not 100% certain, because it has been quite some time since I've given an interview, but I think Dave broke virtually every rule which we were explicitly given for interview behavior. Since it doesn't say anything about how he looked at her, I would rather just assume he broke that rule, too, to keep things simple. (Note: they didn't explicitly forbid any overt sexual actions, except by stating that the employee behavior guidelines still applied, since interviewers were getting paid for their time, so I'm not including those. I'm just betting that he didn't keep his gaze "neutral.")
Admin
You need a better operating system if mv is not atomic.
Admin
And the temporary directory solution will pose the same problem: what if the copy function is not finished yet and the Watcher gets more CPU cycles? The only solution in this case is to modify the protocol (or rather introduce a very crude one) with a separate flag file which contains a value indicating that either the provider is downloading, the provider is ready, the watcher is reading or the watcher is ready. From these 4 states both parties can derive what the other is or has been doing (or not) and whether it is safe to do its stuff. If there is contention for this status file you could add a second locking file (just keep it locked, nothing more) and create a true double semaphore.
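A crude version of that four-state flag file could look like the Python sketch below. The state names, the flag file layout and the helper functions are all invented for illustration, the second lock file is left out, and it only helps if the Watcher can be changed to consult the flag, which the interviewer's scenario rules out.

```python
import os
from enum import Enum


class State(Enum):
    PROVIDER_DOWNLOADING = "provider-downloading"
    PROVIDER_READY = "provider-ready"
    WATCHER_READING = "watcher-reading"
    WATCHER_READY = "watcher-ready"


def write_state(flag_path, state):
    """Replace the flag file in one step so readers never see half a value."""
    tmp = flag_path + ".new"
    with open(tmp, "w") as f:
        f.write(state.value)
    os.replace(tmp, flag_path)


def read_state(flag_path):
    with open(flag_path) as f:
        return State(f.read().strip())


def watcher_may_read(flag_path):
    # The watcher should only touch the data once the provider is done.
    return read_state(flag_path) is State.PROVIDER_READY


def provider_may_download(flag_path):
    # The provider should only overwrite data the watcher has finished with.
    return read_state(flag_path) is State.WATCHER_READY
```

Note that the flag file itself is updated with a rename, so even this protocol ends up leaning on the same primitive it was meant to avoid.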
Admin
I haven't read more than page 1 and 2, but about copying not being atomic - i.e. that you can see partial files in a directory while copying...
I would assume that the OS doesn't put the filename into a directory before it has finished copying the data. If that is the case, you'd never get partial files in a directory from a copy.
I mean, this is how i thought copying worked:
But it does it the other way around?
Admin
Hmm... just tried copying a large file (on windows). It does show the file in the destination directory while copying (in explorer and cmd)... So it looks like it does insert the file entry first.. Never mind :)
(but, well.. this is also a somewhat contrived example)
Admin
This reminds me of one of my jobs (well known software company) - they don't have a process for vetting that you are allowed to interview people, they just toss you in with a vague idea of how to proceed. I didn't hear about us being sued when I was there, but I sort of expected to.
For the last time, this is a copy and not a move. move takes almost no time, and is atomic. Just forget about it.
Admin
You're missing the fact that the first quote refers to the candidate and the second refers to the interviewer.
Admin
Where would you store the data you were copying if there wasn't a file to put it in? Just how do you think files are stored? (hint: lots of stuff on google to satisfy your curiosity).
Admin
1-The watcher is activated by the downloader
2-The watcher and the downloader are one and the same application
3-While being downloaded, the file is named "tmp_[filename]" and then renamed "[filename]" and the watcher knows not to open files starting with "tmp_"
4-Downloader is scheduled to work at the top of the hour, Watcher scheduled to work at the bottom of the hour, and if you have a T1, what kind of a file would take more than 30 minutes to download?!?
5-Watcher is coded properly with TRY/CATCH clauses and you go from there...
Why would you have a "watcher" spend its time hogging processor time and memory 24/7 anyway?
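Option 3 from the list above is the easiest one to sketch. The Python below is only an illustration: `download_with_prefix` and `files_ready_for_watcher` are hypothetical names, and the second function assumes a Watcher that can be taught to skip the "tmp_" prefix, which is exactly the part the original scenario disallows.

```python
import os

TMP_PREFIX = "tmp_"


def download_with_prefix(chunks, directory, filename):
    """Write under tmp_<filename>, then rename to <filename> when complete."""
    tmp_path = os.path.join(directory, TMP_PREFIX + filename)
    with open(tmp_path, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
    final_path = os.path.join(directory, filename)
    os.rename(tmp_path, final_path)  # same directory, so no data is copied
    return final_path


def files_ready_for_watcher(directory):
    """What a cooperative watcher would pick up: everything without tmp_."""
    return sorted(
        name for name in os.listdir(directory)
        if not name.startswith(TMP_PREFIX)
    )
```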
Admin
The above is what my previous comment would be placed under if I had any clue what I was doing at a time of day when I haven't slept yet and am supposed to get up in five hours to go to work.
Admin
Long time since i've been looking at real industrial strength filesystems, but in the old CBM DOS V2.0 a file is 1) an entry in the directory, 2) the data itself.
The entry in the directory contains the filename and a pointer to the first track/sector of the data that the file contains.
The data itself is of course stored in the tracks/sectors of the disk.
So, it is easily possible to save the data first, and insert the entry in the directory (which essentially is just a pointer to the data + a name) afterwards. (at least in CBM DOS V2)
It seems like you suggest that the data for a file is stored inside the directory entry itself..? (if that is the case, i don't see how hardlinking in unix could work - that would have to copy all the data for each hardlink..)
Or am i missing something? (maybe it's all different in modern filesystems)
Admin
My above post should have been a reply to:
Admin
It's all different in modern filesystems, at least unix ones.
A common implementation for unix filesystems is as follows: each file is stored in an inode, which stores permission and ownership info along with pointers to the blocks that contain the data. Directory entries contain a name and an inode and are themselves files.
Generally, inodes contain enough block pointers (that point to the data blocks) to fill up leftover space in a data block, minus some entries at the end that do various levels of indirection for big files. The actual data is stored contiguously (mostly) when possible, and fses like ext2 do things like allocating 8 blocks at a time to speed up access.
Meanwhile, directories can be indexed by name, so 10,000 files in a dir is fast as hell.
So you see, there's no way to store data without creating a file to put it in, but since you can just write the file somewhere, then create a file entry in some other dir, it's no big deal.
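The "write the file somewhere, then create a file entry in some other dir" part can be tried out with hard links. This is a small Python sketch with made-up helper names; it assumes both directories are on the same filesystem and that the filesystem supports hard links.

```python
import os


def add_directory_entry(existing_path, new_dir, new_name):
    """Create a second directory entry that points at the same inode."""
    new_path = os.path.join(new_dir, new_name)
    os.link(existing_path, new_path)  # no data blocks are copied
    return new_path


def same_inode(path_a, path_b):
    """True if both names refer to the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
```

Both names report the same inode number and a link count of 2, which is the "directory entry is just a name plus an inode" model in action.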
Admin
Ok, thanks for the clarification. Not sure i get the point about why the directory entry can't be made afterwards, though. But i'm tired, so that is probably why. ;)
(I just pulled out my old tanenbaum book (modern operating systems), took a quick look at the section on the UNIX V7 file system. It says that a directory entry, is in fact just a filename + a pointer to an i-node. So it does seem to me that it is possible to create the i-node (etc) before creating the directory entry.)
Anyway, i'm tired, and you're right - it's getting a bit off topic. :)
Admin
Real people. Personally, while reading the problem I right away came up with a solution - Have 'Watcher' look for the next file before processing the present file.
Example, If you expect to see File1, File2, File3 .... FileX then process File1 when File2 appears, process File2 when File3 appears ... etc ... FileX I would process after a reasonable delay or add a dummy End_Of_Files file.
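The "process File N once File N+1 shows up" rule fits in a few lines. The Python sketch below uses a hypothetical function name, assumes the file names are known in arrival order, and covers only the dummy End_Of_Files variant, not the "reasonable delay" one.

```python
def files_safe_to_process(seen, sentinel="End_Of_Files"):
    """Return the files that already have a successor.

    seen is the list of file names in arrival order. The newest data
    file is held back unless the sentinel file has arrived.
    """
    data_files = [name for name in seen if name != sentinel]
    if sentinel in seen:
        return data_files      # the sentinel releases the last file too
    return data_files[:-1]     # the newest file may still be downloading
```

For example, with File1, File2 and File3 present, only File1 and File2 are released; File3 waits for File4 or for End_Of_Files.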
Admin
Seconded.
Admin
Typical CS graduate solution.
In the real world, the watcher finds the file, tries to open it, fails, dumps about 347 error messages which trash /var/log and your patience, refuses to process anything further (thus overflowing the directory, causing a week of outage), sacrifices your first-born to Steve, and then it will get real nasty.
Admin
And there we are again: the beginning of an enterprisy solution.
Admin
Woosh!
Admin
We solved this by deleting the file from the source directory after transfer and have The Watcher (we could modify The Watcher) have a look into this directory. Once a file is in the working directory but not in the source directory anymore it was assumed that the file was transferred completely (and successfully). Not fool-proof but worked quite well, though.
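Roughly, the check described above boils down to a set difference between two directory listings. This Python sketch is an assumption about how such a check might look, not the code that was actually used; `completed_transfers` is an invented name.

```python
import os


def completed_transfers(source_dir, working_dir):
    """Files present in working_dir whose source copy is already deleted."""
    still_at_source = set(os.listdir(source_dir))
    return sorted(
        name for name in os.listdir(working_dir)
        if name not in still_at_source
    )
```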
Admin
http://thedailywtf.com/Articles/I-Didn%e2%80%99t-Know-You-Could-Do-That!.aspx
Admin
I missed the point by posting here.
Apologies to ContraCorners.
I'll have to find another case of political correctness to flame....
Admin
Jeremy H. is a tool. He said that you can't modify the Watcher but you can modify the Downloader?
I've had tools like this ask me questions where they really want one golden answer. He is not looking at how you approach the problem, rather whether you reach his solution.
The temporary directory solution is a poor one - it creates a race condition. It assumes that a copy operation from temp to the real dir is going to be atomic, and it allows the watcher to pick up half-written files. Sure it will crash less - but it will still crash.
Communication between the downloader and watcher makes sense to me. Sounds like the guy was just getting annoyed that Jeremy is a tool.
Admin
Was that first interviewer Steve Jobs? I've heard anecdotes about that kind of thing...
Admin
“What about if the Downloader just wrote files to a temporary directory, and then moved the file to the appropriate directory when the download was complete.” Hang on a minute - surely the watcher program will try to process the file while the downloader program is moving it from the temporary directory (which would take a while due to its size), which would produce precisely the same result - the Watcher program crashes as it processes an incomplete file. Surely the answer is for the Watcher program to check the last edited timestamp of the file - once that is, say, 10 minutes old, then it's likely the file has finished downloading.
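The timestamp idea from the comment above, as a Python sketch. `looks_finished` is a hypothetical helper, the 10 minute default comes from the comment, and the check is only a heuristic: a download that stalls for longer than the quiet period would be misjudged as complete. It also requires a Watcher that can be modified.

```python
import os
import time


def looks_finished(path, quiet_seconds=600, now=None):
    """Guess that a file is complete if nobody has written to it recently."""
    if now is None:
        now = time.time()
    age = now - os.stat(path).st_mtime  # seconds since last modification
    return age >= quiet_seconds
```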
Admin
We're looooooping here...
Seriously guys, the real solution is to mount the remote location (there are plenty of FUSE modules for every imaginable protocol and we already know these guys are using Linux from the context), have the downloader just add a symbolic link to the remote file (that should be as atomic as it gets), and then the watcher will act on the file pointed to by the symbolic link.
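The symbolic link variant, sketched in Python. `announce_remote_file` is an invented name and `remote_path` stands for wherever the remote location is assumed to be mounted. The link appears in the watched directory in a single step, but anything reading through it still depends on how the mount behaves, which this sketch does not address.

```python
import os


def announce_remote_file(remote_path, watched_dir, name):
    """Drop a symbolic link to the remote file into the watched directory."""
    link_path = os.path.join(watched_dir, name)
    os.symlink(remote_path, link_path)  # the link shows up all at once
    return link_path
```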
Sheesh.
Admin
Nope, sorry.
Watcher tries to open the file, gets a "cannot open file" error and abends. -904 resource unavailable
The proper way to do this is to have the downloader trigger the watcher program when done. Has no one heard the term "batch" or "job schedule" before?
It's a good thing you *nix kiddies haven't tried to reinvent the car, as I suspect it would not have wheels.
Admin
You are the tool for insulting others. The "simple" solution is a move, as POSIX guarantees atomicity. This is discussed at length in the first 4 pages of comments, where the differences between copy and move have also been discussed.
As you are a tool, you wouldn't get the job for unspecified "personal problems", but we'll all really know it's because you are a tool.
Admin
Anyway, modifying the Watcher was not an option, apparently.
Admin
Sorry but that temporary files copying question is a crock of shit. You still have the same problem. If the file is large, and the disks are slow/physically separate from each other, the operating system will copy it piece by piece from the source directory to the target; it's not an atomic operation, so you still can't know when it's fully complete.
Admin
So how will the Watcher tell the difference between downloading a file from the internet to a directory and copying a file from another directory to the one it is watching?
Admin
AAAaaaaaarghhh!!! Enough!!! Someone say something DIFFERENT... PLEASE!!