Admin
Read what I and others have pointed out. mv is atomic If-and-ONLY-IF you're on the same partition.
Before we got O_EXCL or advisory locking in Unix, the common trick was to use the fact that the kernel would execute a hardlink (ln) atomically to create a lockfile, e.g. ln filename.lock. The application would -poll for the absence- of filename.lock to determine that the file was unlocked and that the download was completed.
The advantage of opening a file either exclusive or using advisory locking, is that it doesn't have to be a polling operation. It can be a blocking operation where the block/unblock is all managed by the OS (kernel and supporting library.)
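To make the difference concrete, here is a minimal Python sketch of both approaches, assuming a Unix-like system. The function names are hypothetical: the first creates a lockfile atomically with O_EXCL (other processes would still have to poll for it), the second blocks on an advisory lock and lets the kernel do the waiting.

```python
import fcntl
import os


def try_create_lockfile(lock_path):
    """Return True if we created the lockfile, False if it already exists."""
    try:
        # O_CREAT | O_EXCL makes "check that it is absent" and "create it"
        # a single atomic step; it fails if the file is already there.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def with_blocking_lock(path, action):
    """Run action(file) while holding an exclusive advisory lock on path."""
    with open(path, "a+") as f:
        # Blocks here until no other process holds the lock; no polling loop.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            return action(f)
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Advisory locks only help if every process that touches the file also asks for the lock, which is the catch when one side cannot be modified.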
If you don't understand why polling in this situation is A Bad Thing, I don't want to be on your systems engineering team.*
dave
*sorry, but the original posting was (a) a bit insulting because it was (b) a bit uninformed and (c) failed to RTFPosting; and (d) I'm feeling cranky today.
Admin
if the two directories are in the same filesystem, it doesn't matter if they're in different disks. In fact, i can guarantee that they don't. Also, you have to assume that the system (the part you control) is correctly configured, so the file is downloaded to a staging dir, then moved. Allowing for malicious config makes the problem impossible.
I already pounded your locking scheme, so never mind about that. Anyway, NFS locking isn't reliable.
Admin
Polling isn't a bad thing, depending on the interval - poll once per minute (files are downloaded a few times per day) and the fast fail case makes for almost no system load. Also, it's simple and predictable.
Admin
So you're given a problem with 2 entities. The person who gives you the problem says "Solve this problem without changing entity #1." You assume that you can't do anything to entity #2?
The only rational thing I can assume is that changing entity #2 is the ONLY way to solve it or else this is a bullshit trick question.
Admin
Yes, I also passed many interviews with flying colors, after answering very basic questions.
This comes from the fact that our industry attracts many wannabes who just don't have what it takes, or who spent years slacking in college and somehow managed to pass their exams by cramming, but have little understanding of what it is we IT guys do.
I even told an interviewer once that, really, these questions were very simple, to which he agreed and said that I'd be surprised at how bad most applicants are.
One piece of advice I always give is this: when you do not know, say you do not know; do not make anything up. Better: say that you do not know YET.
Admin
You use e.g. when you want to give specific examples, and i.e. when you want to elaborate on something.
Admin
“What about if the Downloader just wrote files to a temporary directory, and then moved the file to the appropriate directory when the download was complete?”
No experience with coding on *nix, but that could still fault on Windows.
If you write a service/application that monitors a folder for filewrites you might try to process the file before it's fully copied there. Thread a loop for exclusive access on the file to prevent locks/errors.
Admin
I too could find FIVE different solutions, just as simple
1- The watcher is activated by the downloader
2- The watcher and the downloader are one and the same application
3- While being downloaded, the file is named "tmp_[filename]" and then renamed "[filename]", and the watcher knows not to open files starting with "tmp_"
4- Downloader is scheduled to work at the top of the hour, Watcher scheduled to work at the bottom of the hour, and if you have a T1, what kind of a file would take more than 30 minutes to download?!?
5- Watcher is coded properly with TRY/CATCH clauses and you go from there...
Why would you have a "watcher" spend its time hogging processor time and memory 24/7 anyway?
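For what it's worth, the watcher side of option 3 could be sketched in Python like this. The "tmp_" prefix comes from the list above; the function name is made up, and the whole thing assumes the watcher could be changed to follow that naming rule.

```python
import os

TMP_PREFIX = "tmp_"  # naming convention from option 3 above


def ready_files(directory):
    """Return the names of regular files that are not marked as in progress."""
    return sorted(
        name
        for name in os.listdir(directory)
        if not name.startswith(TMP_PREFIX)
        and os.path.isfile(os.path.join(directory, name))
    )
```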
In my world, my SSIS packages fire on schedule and do their tasks sequentially.
If the interviewer does not accept any solution other than his, this means only one thing: control freak. Take your stuff and say bye bye.
Admin
Seems to me that this is a 3rd party application (the watcher) as it's not something we can change, yet it crashes on a pretty obvious scenario. I think the solution is to contact the vendor and tell them to fix their buggy POS software and send me a patch.
Admin
That's why you move the file.
Admin
That is to say, you talked with one of the Tier 1 helpdesk people and realized they had some talent and were quick to learn…
Naw, still sounds allright too me. Wereas if you replaced teh "That is" width "For Example", it'd sound awkward w/o some miner changes- at very least put the hole thing in present tense.
Its ok, I can handle bean rong. Guess Im a diamond in the rogue.
Admin
4: no locking - that's insane. 5: try/catch doesn't work when the process dies
sitting around watching a dir takes no appreciable cpu, and likely not much memory either.
Admin
The best solution would not need a change such as a temporary directory. The Watcher would just wait until there are two files. Then it processes one and waits until the Downloader begins the next. When the Downloader stops, the Watcher would process the last file.
Admin
A specific example like "You talked with one of the Tier 1 helpdesk people and realized they had some talent and were quick to learn. You get them some further training, move them to a junior development position, and in a short time they become a major contributor to success?"
I.e. was wrong here. E.g. would have been the correct choice. Deal with it.
Admin
The first move is not atomic but it doesn't matter because the watcher watches over the other directory. Problem solved anyway.
Admin
Ok - so the majority here believe that mv is an atomic operation if the src and dst are on the same filesystem (or partition?)...
Does this hold true for ALL filesystems running under Linux/Unix? It might be that the directory is not running on ext2/ext3... What if it was a mounted FAT32 partition, or ReiserFS, or UFS, or an SMB share?
That is the problem with questions that have "only one right answer" IMHO -- they either omit important assumptions (i.e. the underlying filesystem supports atomic moves and the src/dst are on the same filesystem/partition), or are poorly conceived and disregard important implementation details like that, which the really smart people think about. :)
Admin
Not sure what you meant to say here. It does matter whether they are on the same disk or partition. I don't know what you can guarantee.
Here's what they are talking about:
Now imagine the filesystem was set up one of these ways:
#1 is going to be effectively atomic. The other two are not.
It's not malicious config, it's a very common type of config in Unix-esque systems... and one which is normally hidden and inconsequential but could cause unstable behavior due to this shortcut. It doesn't necessarily make the problem impossible.
The mv method works and I've used it. However, there are other methods which add functionality that might well be needed in the real world so his characterization of it as the "only real solution" was questionable either on the "only" or "real" front.
Admin
Of course, there's no real reason given to not have the downloader kick off the "watcher" (or processor in this case). Which could easily be set up so that watcher never starts before the downloaded file is ready.
Admin
Jeremy isn't very good at guiding a candidate to the correct answer. His guidance was repeatedly 'you can't do that'. Rather than say "you can’t modify the Watcher" why not say "you can only modify the Downloader". This immediately gets the candidate out of the incorrect mindset of modifying the processor and into the correct mindset of modifying the downloader.
Admin
I was going to suggest that but it has some of its own ugliness.
The Watcher doesn't necessarily have a way of knowing which of the two files is fully downloaded and which is in progress. This is easy to work around if the files are named sequentially.
If the Watcher takes a long time to process a certain file and the next file is the "runt" of the bunch, you could end up with 3 (or more) files in the directory at the same time. Again, if the files are named sequentially... not a big problem.
I don't think the Downloader has a way to notify that it has "stopped", and you can process the last file now. Otherwise it would probably have a way to notify that it has finished each file and we wouldn't need to worry about anything.
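A tiny Python sketch of the "sequential names" workaround, under the assumption that names are zero-padded so a plain sort matches download order (the function name is hypothetical). Everything except the newest file is treated as complete, which also covers the three-or-more-files case. It does nothing for the last file, which is exactly the notification problem described above.

```python
def files_safe_to_process(names):
    """Given sequentially named files, return all but the newest one.

    Assumes zero-padded names (e.g. file_0001.dat) so that sorting the
    names gives download order, and that only the newest file can still
    be in progress.
    """
    ordered = sorted(names)
    return ordered[:-1]
```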
Admin
That being said, trying to start up some sort of IPC link for this was a bit overkill either way. Envisioning a problem when trying to put the temporary directory on different drives might be premature as well, but then that's the problem with these types of recruiting questions, you're working off of a set of assumptions and you never really know what they are.
Admin
That was my thought. Lock the file until fully downloaded and then have the downloader invoke the processing program.
No mucking about with polling a file for its lock status. No goofing around. No worries some idiot will see an empty directory and remove it in a "clean up" effort.
Admin
You're right, it is. And stop calling me Shirley.
Admin
“Where the hell are you getting these people then? Any amateur could answer those questions.”
I actually said something like that once, after being told I was the only person who answered a simple SQL question correctly. After three interviews and flying halfway across the USA (on their dime) for a fourth interview, I'm pretty sure that comment is why I'm not living in Phoenix right now.
Also, from reading the comments, I'm wondering if anyone really has any practical experience with the download/pickup routine. Having worked for possibly the largest printing company in the world, I can tell you this directory-watching thing is pretty common, and on every system I've worked on, which includes Unix, Windows, and AS400 - the "move file when fully downloaded" procedure is the simplest and most reliable, and most other schemes are prone to problems you can't even imagine - things that I never thought could happen, such as whole file systems failing because of race conditions between processes. The interviewer has this one right - however, it's not the only simple solution, and it does require some experience to know that renaming schemes and such can lead to problems. If the intent is to see if the person can come up with simple ideas for simple problems, he could probably go with a simpler question that isn't so reliant on practical experience - like the one about the weight of a 747.
Admin
Yeah... kinda like Firefox!
Admin
No, it only matters that they are on the same filesystem. If you don't know what that means, go read a book.
#2 and #3 are not on the same filesystem.
Setting up the move so it goes across filesystems is a serious config error and part of what I was referring to as malicious behavior. The mv method works across filesystems now, but it didn't always work like that. Used to be, it would fail if the filesystems differed.
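If you want the old fail-loudly behavior instead of a silent copy-and-delete, a rough Python sketch follows. os.rename refuses to cross filesystems and raises an OSError with errno EXDEV; the function names here are made up for illustration, and comparing st_dev is only a reasonable first check, not a guarantee (bind mounts, for one, can confuse it).

```python
import errno
import os


def same_filesystem(path_a, path_b):
    """Compare device IDs of two existing paths."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def atomic_move(src, dst):
    """Rename src to dst, refusing to fall back to copy-and-delete."""
    try:
        os.rename(src, dst)
    except OSError as exc:
        if exc.errno == errno.EXDEV:
            # Staging and final directories are on different filesystems,
            # so the move cannot be a single rename.
            raise RuntimeError("staging and final dirs differ in filesystem") from exc
        raise
```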
Admin
It holds true for all sensible choices for a server filesystem. I assume sensible configuration and enforce it when possible, because allowing stupid things like fat32 on a server just encourages idiots.
Admin
I have experience with the whole download/pickup routine, and the way it worked was this:
This process is easy to understand and troubleshoot, and also to repair when it burps, which is handy when dealing with the one part of the IF that can't be easily modified remotely and must not break, even at 2am.
Admin
Shari would do well to rat out this company and employee by name somewhere (not necessarily here), because what's described is a rather blatant case of illegal employment discrimination.
Admin
ftp the file to /somedir/.filename.dat and rename it to /somedir/filename.dat after you are finished writing it.
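The same idea for a downloader you control might look like this in Python. This is a sketch, not anyone's actual downloader: the function name and the chunks argument are invented, and it assumes the hidden name and the final name are in the same directory so the rename stays on one filesystem.

```python
import os


def write_then_publish(directory, filename, chunks):
    """Write to .filename first, then rename to filename when complete."""
    hidden = os.path.join(directory, "." + filename)
    final = os.path.join(directory, filename)
    with open(hidden, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # make sure the data is on disk before publishing
    # Same directory, so this is a single rename; the final name only ever
    # refers to a complete file.
    os.replace(hidden, final)
    return final
```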
Admin
Who watches the watcher...?
Admin
Since when is being a Brown alum a protected class?
Admin
You could give the watcher a low priority and starve CPU cycles indefinitely until you know the file has been downloaded.
This is accomplished with two monkeys trained in Morse code. A monkey on the server side taps the file size to a monkey on the client side through a sophisticated can/string setup. When the file sizes are equal, the client side monkey can remove a banana from a bin placed on a balance scale. The downward pressure from the other side of the scale causes fluid motion in a hydraulic system. A lever arm connected to the hydraulic presses a button with four independent contacts. If 2 out of 4 contacts are detected, the priority of the watcher task is temporarily increased. The hydraulic system will include a pressure operated release valve to ensure the button is unpressed in a timely manner.
The client side must constantly display a security application which logs file access. A video camera pointed at the display uses normalized cross correlation and unsupervised learning algorithms to detect additional file access in the security log. This results in the watcher being downgraded to the starvation priority until the redundant switch contacts are again activated.
Periodically, both the client side monkey and the server side monkey must finish a game of Tic Tac Toe with a trained chicken to verify they are still paying attention/not dead. The game must end in a tie. If the game is not completed or the monkey loses, an alarm is sounded and the monkey is replaced. If the chicken loses, the chicken is replaced.
Admin
host monitoring watches the dir and alerts when files sit around too long.
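A check like that can be very small. Here is a hedged Python sketch of the "sat around too long" test; the function name and the threshold parameter are made up, and how you alert on the result is up to whatever monitoring you already run.

```python
import os
import time


def stale_files(directory, max_age_seconds, now=None):
    """Return names of regular files whose mtime is older than the threshold."""
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and now - os.path.getmtime(path) > max_age_seconds:
            stale.append(name)
    return stale
```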
Admin
Depends on how long you want to keep the file around. If it's download -> process -> delete then the following can be applied:
Assume: /final is partition A, /tmp is partition B.
Download the file 'whatever' in /tmp
Create a symbolic link in /final/ to /tmp/whatever
Process the file
Remove symbolic link
Remove real file
Done.
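Sketched in Python, with a hypothetical function name and a process callback standing in for whatever the watcher does, that sequence is roughly:

```python
import os


def process_via_symlink(tmp_path, final_dir, process):
    """Expose a finished download through a symlink, process it, clean up."""
    target = os.path.abspath(tmp_path)
    link = os.path.join(final_dir, os.path.basename(target))
    # Creating the link is one step, so the name shows up in final_dir
    # only after the download in the other directory has finished.
    os.symlink(target, link)
    try:
        return process(link)
    finally:
        os.unlink(link)    # remove symbolic link
        os.remove(target)  # remove real file
```

This sidesteps the cross-partition copy, at the price of the file never really living in the final directory.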
Admin
So she's a woman and it's automatically sex discrimination? He took a long lunch and blew off an appointment because he's a self-centered schmuck and mostly looked down on her for going to Brown. The personality fit is a valid test, but in this case, the personality he wanted was 'sycophant'.
Admin
What if the temporary directory is located on a different storage volume? You'll end up moving a several-gigabyte file from one volume to another, which is not that fast. Isn't that why you explicitly mentioned "several gigabytes", so that this approach should be ignored?
What if the volume with the temporary directory is smaller than the work dir? The download just fails if done into the temporary directory, while it would work if done into the work dir.
What if the temporary directory is on a ramdisk? A nice side effect of your approach would be a complete system lock-up.
What if your work directory is located on a different filesystem, which doesn't support several-GB-long files? FAT-16 with its 4 GB limit?
Or what if the FS on your work directory doesn't support all the possible names which are allowed in the work dir? CP-1250 limited, while you need to download a file with Chinese hieroglyphs in the name? Or supports 8.3 names only?
Finally, what if you don't have a temporary directory AT ALL? If it is an embedded solution: an ATM, a hardware switch or something? Have you heard of ADSL switches which have built-in BitTorrent clients?
Too many "what if"s. Do you know, Jeremy, what they mean? That you didn't mention any of these in your problem preconditions, while you should have. Instead, you were changing the rules during the game, again and again. The proper software architect plays by the rules given by the customer (i.e. YOU) and tries his best to cover all cases which are not defined explicitly. If he doesn't know whether he is working with a Windows-based one-HDD PC or a memory-limited Linux/RISC-based ADSL router with no embedded HDD at all (but an SMB client), he assumes the worst. Mark it, Jeremy: nowhere in your questions did you ever mention the conditions which make your own "solution" valid, while ALL of the solutions from your candidate are perfectly valid. His solutions are complex and valid. Your solution is simple and invalid. Sorry to say, Jeremy. You failed your own interview.
Admin
I'm amazed nobody has answered "chmod" yet, as "I Guess That Would Work, Too"
Admin
I suspect they already know exactly why they have high turnover, and the reason is something beyond their control. (I know it is where I worked recently...)
Admin
What's your fixation on /tmp? you download to one dir, then move it over. your questions are the equivalent of asking 'what if you deleted all your source code and set yourself on fire?'. completely irrelevant.
That's stupid. Don't do it.
Well if you can't download the file at all, then you're SOL.
his solutions were pretty bad - hack the kernel for the sake of a file downloader?
Admin
This interviewer did several things that HR people specifically tell you in interview training never to do, because they are difficult to explain in court as something other than discrimination. If the company has an EEO policy, he violated it, regardless of his reasons.
Admin
One time I was interviewing people for this job. One of the candidates was completely unqualified. I mean, he went to BROWN. ALL of our other candidates came from top notch schools, like Harvard or MIT. So about 30 minutes after the interview was supposed to start, I went to lunch.
When I got back, I made him wait another 15 minutes. By then he seemed a little pissed, so I asked him why. He had just got mad and left! Can you believe it? He should be giving me sexual favors! I have a PHD from Harvard!
Admin
He interpreted her objection to getting blown off for nearly an hour as a personality problem. If I were that guy (I'm not) and I got sued, I would stand up in court and say "Sorry your Honor, but I'm a complete asshole. I treat men and women equally poorly".