The Daily WTF: Curious Perversions in Information Technology

2008-09-24

Jules Winnfield:
brodie:
You use e.g. when you want to give specific examples, and i.e. when you want to elaborate on something.
I use IE when I want to install malware.

Really? How odd. I use Apache, myself.

Franz_Kafka · 2008-09-24

moz:
Jules Winnfield:
brodie:
You use e.g. when you want to give specific examples, and i.e. when you want to elaborate on something.
I use IE when I want to install malware.
Really? How odd. I use Apache, myself.

All depends on whether you're giving or receiving.

gero · 2008-09-24

OK, a FAQ about the different "solutions" offered

Q. What if it's on a different partition? A. Then make a temporary directory inside the same directory. You're guaranteed it will be on the same partition.

Q. Locking? A. You can't control the watcher

Q. Rename instead of move? A. It's the same damn thing. The system call, not the system utility "mv" (which is the same thing when on the same partition).

Q. Activate the watcher after the downloader. A1. You are downloading multiple files... the watcher could process any "unfinished" file in this case. A2. There is probably a difference between the "watcher" and the file "processor" (as in functionality, though it sounds they are the same program in this case). You probably want to start the processor on the downloaded file, but crap, you can't modify it.

2008-09-24

blindio:
Seems to me that this is a 3rd party application (the watcher) as it's not something we can change, yet it crashes on a pretty obvious scenario. I think the solution is to contact the vendor and tell them to fix their buggy POS software and send me a patch.

The post didn't say anything about a retail application. How did you know it was Point Of Sale? oh... I get it...

never mind

2008-09-24

Say that to a customer? Was the candidate explicitly permitted to apply his own restrictions - to be able to say to an imaginary customer something like "well, we'll do the simple solution, but it won't work if you mount the temporary directory on a different disk, etc, etc"?

What if's are great and a valuable tool to any software engineer. However, if you follow the thread for too long, you never get anything done. There is an infinite chain of "what if's" that must be accounted for in even the most trivial of exercises. The difference between someone good at asking questions and someone good at finding solutions is knowing when to draw the line.

The candidate could have established that line immediately by just asking a couple of basic questions to determine the parameters of the problem. Instead he follows a progressively more convoluted trail to answer a problem he was told has a simple solution - culminating at modifying the linux kernel itself.

The interviewer did not say it was the best solution. He said it was simple. If I get someone in front of me who can't see the forest for the trees, I'm going to pass on him too.

tgape · 2008-09-24

James R. Twine:
Ok - so the majority here believe that mv is an atomic operation if the src and dst are on the same filesystem (or partition?)...
Does this hold true for ALL filesystems running under Linux/Unix? It might be that the directory is not running on ext2/ext3... What if it was a mounted FAT32 partition, or ReiserFS, or UFS, or an SMB share?

It holds for every single file on any POSIX OS, regardless of whether that file is on a native filesystem or not. This is because rename(2) is atomic, according to POSIX, so if you want to be POSIX, rename better be atomic. And it's incredibly easy to have rename(2) be sufficiently atomic for this purpose. (Note: even if rename was implemented by doing a link(2) followed by an unlink(2) of the old name, it would still be fast enough, unless Watcher is so fragile that it dies if it ever sees a file with multiple links. For it to do that, it'd pretty much have to be coded to do that, however - and it would still be such a tiny window that it's virtually inconceivable that it would happen regularly.)

Btw, this means Windows, too. POSIX is a big thing.

(Note: there are many unix OSes which have optional POSIX support. However, their non-POSIX mode is still loosely POSIX, and still does this. It's too necessary to not do it, and it's too useful to write the code to do it and then not always use it.)
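For anyone who wants to see the temp-directory-then-rename approach spelled out, here is a minimal Python sketch. The function names (same_filesystem, publish) and the directory names are made up for illustration; the only real calls are os.stat, os.fsync and os.rename, and the sketch assumes a POSIX-like system where the staging directory and the watched directory sit on the same filesystem.

```python
import os
import tempfile


def same_filesystem(path_a, path_b):
    """True when both paths are on the same device, so a rename need not copy data."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def publish(data, staging_dir, watched_dir, name):
    """Write the complete file in staging_dir, then rename it into watched_dir."""
    if not same_filesystem(staging_dir, watched_dir):
        raise OSError("staging and watched directories are on different filesystems")
    staged = os.path.join(staging_dir, name)
    with open(staged, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are written before the name moves
    final = os.path.join(watched_dir, name)
    os.rename(staged, final)  # rename(2): only the directory entry changes
    return final


# Hypothetical demo layout: both directories live under one temporary root.
with tempfile.TemporaryDirectory() as root:
    staging = os.path.join(root, "staging")
    watched = os.path.join(root, "watched")
    os.mkdir(staging)
    os.mkdir(watched)
    publish(b"downloaded bytes", staging, watched, "file1.dat")
    watched_listing = os.listdir(watched)
```

The st_dev check is one way to answer the "what if it's on a different partition?" objection up front: if the two directories are on different devices, the sketch refuses to proceed rather than falling back to a copy.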

tgape · 2008-09-24

klenow:
Is it just me, or does Jeremy sound like a bit of a jerk? He simply wouldn't let the guy use any solution that wasn't his pet solution. Seems the simplest solution is to have the Downloader activate the Watcher when it's done.

The Watcher program runs as a different user. suid doesn't work. Since both programs run as non-privileged users, one cannot spawn the other.

Oh, and Watcher's a fragile piece of crap; it'll freeze if it hits a file it can't read, it requires a kill -9 to restart, and that will corrupt its database. So don't even go there.

I'm sorry about that - I didn't write Watcher; it's 25 years old, and it's 25M of hand-written assembly - no comments. All of the developers retired; most of them are dead, but there's one guy left who's living in an insane asylum. Some say it was from working with that code for five years after all of the other developers had retired.

For what it's worth, it's not 'changing the scenario' if you're revealing more information about the situation which wasn't previously known. If the new information is inconsistent with the old, that's a different story.

tgape · 2008-09-24

James:
No, that wouldn't work. The file record will show up in the watched directory in the midst of copying and your watcher program catches it in mid-copy. You have the same problem as mid-download, just a much narrower range of failure opportunity.
The better solution (assuming you cannot PUSH from the downloading location)

I call troll. Nobody could possibly think having the 'downloading' application running on a different machine would improve the IPC (inter-process communication, for the clueless who are obviously reading (and posting) here).

2008-09-24

Franz_Kafka:
Duke of New York:
Franz_Kafka:
If I were that guy (I'm not) and I got sued, I would stand up in court and say "Sorry your Honor, but I'm a complete asshole. I treat men and women equally poorly".
That wouldn't keep him from having to testify as to the specific nature of the "personal problem" (not "personality problem") that was not related to her professional qualifications for the job. Or from having lawyers sift through his e-mails for evidence of a past pattern of behavior.
Anyone who does interviews and doesn't see how clear-cut this was, is a walking liability and needs to get trained.

That's easy - he's hiring toadies and footstools. In a less pejorative example, I'm allowed to pass on hiring someone because they don't fit in with the team, even if they're professionally qualified.

Being an abusive jerk isn't illegal.

No one is saying that you have to hire someone with a "problem on the personal level." What (I think) the Duke is saying is that at the very least...

a) being an abusive jerk isn't a defense in court. b) "Shari" would have a legitimate case against which you would have to defend yourself, and who wants to go through that hassle?

You don't have to hire anyone who doesn't fit in... you have my permission to be an abusive jerk... But when you decide, based on a piece of paper and two spoken sentences, that a job applicant has "problems at the personal level" you had better be prepared to explain what those problems are and why they make her someone you wouldn't hire.

tgape · 2008-09-24

akatherder:
Mover And Copier:
The problem might be that when I'm told I may not modify the Watcher, how the heck am I supposed to know that I might be allowed to modify the Downloader?

So you're given a problem with 2 entities. The person who gives you the problem says "Solve this problem without changing entity #1." You assume that you can't do anything to entity #2?

The only rational thing I can assume is that changing entity #2 is the ONLY way to solve it or else this is a bullshit trick question.

In fact, I like to give interview questions which require one to understand that "not being able to change entity #1 does not mean one cannot change entity #2." It's such a fundamental bit of logic, which has come up many times in my career.

And, in the many times I've been called to clean up someone else's royal mess, a failure to apply that bit of logic has almost always been the root cause of the initial mess, as well as the cause of numerous of the additional messes added on top of that initial mess to raise it from a simple mess through complete mess to royal mess.

So, while I feel sympathy for those people who cannot understand it, there is no way I want to sponsor employment for such people in any 'thinking' job.

2008-09-24

Dave never wanted to give you the job, whoever you turned out to be. He just wanted the interview to go badly so he could get back to other things.

gero · 2008-09-24

ContraCorners:
you had better be prepared to explain what those problems are and why they make her someone you wouldn't hire.

the receptionist:
And actually, Dave, the guy you’ll be meeting with, is out to lunch.

Being "politically correct" could probably be the right thing to do in certain situations (I personally hate this use of female gender personal pronouns but that's not the point) but in this case, calling Dave "her" would be close to insulting. And because I hate this political "correctness" I pick on it.

tgape · 2008-09-24

I see a common thread on these interview response discussions I'd like to mention.

Specifically, the interviewer sets up a scenario. He or she then refines the scenario by adding complications into the picture.

Then, the interviewee and the various forum members start adding their own complications into the picture. "What if there's no other directory on the partition you can write to?", for example.

Please stop this. It is the interviewer's place to add clarification. (Yes, I realize I put a post above with a clarification claiming that the Watcher was excessively fragile, and I spawned some interviewer-ish explanation why. But I wasn't explaining why the interviewer's answer wouldn't work; I was providing justification for the conditions the interviewer gave arbitrarily.)

As an interviewee, you should be looking for possibilities that the interviewer has left open, rather than trying to close the remaining holes.

Then, later, as a new employee, you should be looking for possibilities that the various PHBs and clueless gits who have preceded you have not managed to block off. Don't complete the barricade, but figure out what path remains open and take it.

tgape · 2008-09-24

The first story reminds me of one of the many interviews I had when I was first looking for an IT job, many years ago.

I arrived 25 minutes early, but I brought a book to read. I chatted briefly with the receptionist, and indicated that I was intentionally early, as that way I couldn't possibly lose track of time - she'd alert me when he was ready to interview me.

About 5 minutes later, a man hurried out of the office area. He told the receptionist he was running late and going to grab a quick bite - he might be late for his 1:00 interview. She indicated I was already there. He looked over, and said, "Hey, want to grab lunch?"

The interview didn't go well; he didn't get the employee. But, on the bright side, he did get a few pointers on where he was going wrong on some of his projects, and he was able to get reimbursed for the meal. Years later, I encountered the guy he did end up hiring; my suggestions saved him several months of effort.

(Note: technically, he turned me down as 'over-qualified'. I got that a lot, as I'd blown off my classes, but to mess around with computers more and learn more about them.)

As someone who has given interviews to quite a few people (my first job was at a rapidly growing contracting company; it was continuously hiring for several years, during which time I gave dozens if not hundreds of interviews), I'd say Shari appears to have an excellent case. It's possible that discovery would turn up a consistent behavior pattern for the defendant, but it's also possible her alma mater isn't the only thing in the story which is brown, in which case that may not matter so much. (Same goes for any other racial minority, although to a lesser extent.) I'm not 100% certain, because it has been quite some time since I've given an interview, but I think Dave broke virtually every rule which we were explicitly given for interview behavior. Since it doesn't say anything about how he looked at her, I would rather just assume he broke that rule, too, to keep things simple. (Note: they didn't explicitly forbid any overt sexual actions, except by stating that the employee behavior guidelines still applied, since interviewers were getting paid for their time, so I'm not including those. I'm just betting that he didn't keep his gaze "neutral.")

2008-09-25

You need a better operating system if mv is not atomic.

2008-09-25

crystal mephistopheles:
“Oh, okay,” the candidate replied. He pondered for a full minute and said “so in that case, I would have the Watcher listen on a TCP/IP port, and have the Downloader tell it when it was done downloading.”
“That seems like a lot of work,” I said....

I don't think it's fair to say network I/O is "a lot of work". Granted, his temporary directory solution is even simpler, but most high level languages (.NET, Java, etc.) have fully defined classes you can use to implement this in only a few lines of code, and I would classify this as an acceptable solution.

And the temporary directory solution will pose the same problem: what if the copy function is not finished yet and the Watcher gets more cpu cycles? The only solution in this case is to modify the protocol (or rather introduce a very crude one) with a separate flag file which contains a value indicating that either the provider is downloading, the provider is ready, the watcher is reading or the watcher is ready. From these 4 states both parties can derive what the other is or has been doing (or not) and whether it is safe to do its stuff. If there is contention for this status file you could add a second locking file (just keep it locked, nothing more) and create a true double semaphore.

2008-09-25

I haven't read more than page 1 and 2, but about copying not being atomic - i.e. that you can see partial files in a directory while copying...

I would assume that the OS doesn't put the filename into a directory before it has finished copying the data. If that is the case, you'd never get partial files in a directory from a copy.

I mean, this is how i thought copying worked:

first copy the data to the destination sectors on the hd
AFTER this is done, place a new file entry into the directory

But it does it the other way around?

2008-09-25

Hmm... just tried copying a large file (on Windows). It does show the file in the destination directory while copying (in Explorer and cmd)... So it looks like it does insert the file entry first... Never mind :)

(but, well.. this is also a somewhat contrived example)
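The same observation can be reproduced with a small Python sketch. This is not how any particular OS copy utility is implemented; copy_in_chunks is a made-up, deliberately naive copy that calls an observer after each chunk, just to show that the destination name exists and the file grows while the copy is still running.

```python
import os
import tempfile


def copy_in_chunks(src, dst, chunk_size, observe):
    """Naive copy that calls observe(dst) after every chunk is written."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            fout.flush()   # push the chunk to the OS so the size is visible
            observe(dst)   # stands in for a watcher peeking at the directory


with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "big.bin")
    dst = os.path.join(root, "copy.bin")
    with open(src, "wb") as f:
        f.write(b"x" * 4096)

    sizes = []
    copy_in_chunks(src, dst, 1024, lambda p: sizes.append(os.path.getsize(p)))
    final_size = os.path.getsize(dst)
```

Every value in sizes except the last is smaller than final_size, which is exactly the window in which a watcher would pick up a partial file.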

2008-09-25

tgape:
I'm not 100% certain, because it has been quite some time since I've given an interview, but I think Dave broke virtually every rule which we were explicitly given for interview behavior. Since it doesn't say anything about how he looked at her, I would rather just assume he broke that rule, too, to keep things simple. (Note: they didn't explicitly forbid any overt sexual actions, except by stating that the employee behavior guidelines still applied, since interviewers were getting paid for their time, so I'm not including those. I'm just betting that he didn't keep his gaze "neutral.")

This reminds me of one of my jobs (well known software company) - they don't have a process for vetting that you are allowed to interview people, they just toss you in with a vague idea of how to proceed. I didn't hear about us being sued when I was there, but I sort of expected to.

Cpt:
And the temporary directory solution will pose the same problem: what if the copy function is not finished yet and the Watcher gets more cpu cycles?

For the last time, this is a copy and not a move. move takes almost no time, and is atomic. Just forget about it.

2008-09-25

You're missing the fact that the first quote refers to the candidate and the second refers to the interviewer.

2008-09-25

Zaippa:
I haven't read more than page 1 and 2, but about copying not being atomic - i.e. that you can see partial files in a directory while copying...
I would assume that the OS doesn't put the filename into a directory before it has finished copying the data. If that is the case, you'd never get partial files in a directory from a copy.

I mean, this is how i thought copying worked:

first copy the data to the destination sectors on the hd

AFTER this is done, place a new file entry into the directory

But it does it the other way around?

Where would you store the data you were copying if there wasn't a file to put it in? Just how do you think files are stored? (hint: lots of stuff on google to satisfy your curiosity).

Tourist · 2008-09-25

Nicolas Verhaeghe:
I too could find FIVE different solutions, just as simple

1-The watcher is activated by the downloader

what's the point with the watcher in this context?

2-The watcher and the downloader are one same application

sounded from the description they were not the same process.

3-While being downloaded, the file is named "tmp_[filename]" and then renamed "[filename]" and the watcher knows not to open files starting with "tmp_"

modifying the watcher which wasn't allowed

4-Downloader is scheduled to work at the top of the hour, Watcher scheduled to work at the bottom of the hour, and if you have a T1, what kind of a file would take more than 30 minutes to download?!?

here you are assuming the size and the download speed are always predictable and that downloader is not downloading say every five minutes. You know what they say about assumptions

5-Watcher is coded properly with TRY/CATCH clauses and you go from there...

;-) the problem is not on this level

Why would you have a "watcher" spend its time hogging processor time and memory 24/7 anyway?

because maybe it is not predictable when files arrive? or the watcher feeds some other program where they want to see the results asap etc.

2008-09-25

gero:
ContraCorners:
you had better be prepared to explain what those problems are and why they make her someone you wouldn't hire.
the receptionist:
And actually, Dave, the guy you’ll be meeting with, is out to lunch.
Being "politically correct" could probably be the right thing to do in certain situations (I personally hate this use of female gender personal pronouns but that's not the point) but in this case, calling Dave "her" would be close to insulting. And because I hate this political "correctness" I pick on it.

The above is what my previous comment would be placed under if I had any clue what I was doing at a time of day when I haven't slept yet and am supposed to get up in five hours to go to work.

2008-09-25

tgape:
my first job was at a rapidly growing contracting company

Sounds like the company I worked for in the 90s. It grew rapidly, then contracted.

Eternal Density · 2008-09-25

Smash King:
You cross a dimensional portal and get stuck in a world where interviewers are unable to use adaptive interviews "IE" those that change to turn whatever answer the interviewee provided automatically wrong. Now what do you do?

I pick up my crowbar and look for zombies to bash. No, not zombie processes, headcrab zombies.

2008-09-25

Long time since i've been looking at real industrial strength filesystems, but in the old CBM DOS V2.0 a file is 1) an entry in the directory, 2) the data itself.

The entry in the directory contains the filename and a pointer to the first track/sector of the data, the file should contain.

The data itself, is of course stored at the track/sectors of the disk.

So, it is easily possible to save the data first, and insert the entry in the directory (which essentially is just a pointer to the data + a name) afterwards. (at least in CBM DOS V2)

It seems like you suggest that the data for a file is stored inside the directory entry itself..? (if that is the case, i don't see how hardlinking in unix could work - that would have to copy all the data for each hardlink..)

Or am i missing something? (maybe it's all different in modern filesystems)

2008-09-25

My above post should have been a reply to:

Franz Kafka:
Zaippa:
I haven't read more than page 1 and 2, but about copying not being atomic - i.e. that you can see partial files in a directory while copying...
I would assume that the OS doesn't put the filename into a directory before it has finished copying the data. If that is the case, you'd never get partial files in a directory from a copy.

I mean, this is how i thought copying worked:

first copy the data to the destination sectors on the hd

AFTER this is done, place a new file entry into the directory

But it does it the other way around?

Where would you store the data you were copying if there wasn't a file to put it in? Just how do you think files are stored? (hint: lots of stuff on google to satisfy your curiosity).

2008-09-25

Zaippa:
Long time since i've been looking at real industrial strength filesystems, but in the old CBM DOS V2.0 a file is 1) an entry in the directory, 2) the data itself.
The entry in the directory contains the filename and a pointer to the first track/sector of the data, the file should contain.

The data itself, is of course stored at the track/sectors of the disk.

So, it is easily possible to save the data first, and insert the entry in the directory (which essentially is just a pointer to the data + a name) afterwards. (at least in CBM DOS V2)

It seems like you suggest that the data for a file is stored inside the directory entry itself..? (if that is the case, i don't see how hardlinking in unix could work - that would have to copy all the data for each hardlink..)

Or am i missing something? (maybe it's all different in modern filesystems)

It's all different in modern filesystems, at least unix ones.

A common implementation for unix filesystems is as follows: each file is described by an inode, which stores permission and ownership info along with pointers to the blocks that contain the data. Directory entries contain a name and an inode number, and directories are themselves files.

Generally, inodes contain enough block pointers (that point to the data blocks) to fill up leftover space in a data block, minus some entries at the end that do various levels of indirection for big files. The actual data is stored contiguously (mostly) when possible, and fses like ext2 do things like allocating 8 blocks at a time to speed up access.

Meanwhile, directories can be indexed by name, so 10,000 files in a dir is fast as hell.

So you see, there's no way to store data without creating a file to put it in, but since you can just write the file somewhere, then create a file entry in some other dir, it's no big deal.
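A quick way to poke at the name-versus-inode split from user space, as a hedged Python sketch: it assumes a unix-like filesystem that supports hard links, and the file and directory names are invented for the demo. os.stat, os.link and os.rename are the real calls.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    staging = os.path.join(root, "staging")
    watched = os.path.join(root, "watched")
    os.mkdir(staging)
    os.mkdir(watched)

    original = os.path.join(staging, "data.bin")
    with open(original, "wb") as f:
        f.write(b"payload")
    inode_before = os.stat(original).st_ino

    # A hard link is a second directory entry for the same inode; no data is copied.
    alias = os.path.join(staging, "alias.bin")
    os.link(original, alias)
    alias_inode = os.stat(alias).st_ino
    link_count = os.stat(original).st_nlink

    # Renaming into another directory on the same filesystem keeps the inode;
    # only the directory entries change.
    moved = os.path.join(watched, "data.bin")
    os.rename(original, moved)
    inode_after = os.stat(moved).st_ino
```

Here alias_inode and inode_after both equal inode_before, and link_count is 2 while both names exist, which is the "write the file somewhere, then create an entry in some other dir" idea in miniature.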

2008-09-25

Franz Kafka:
It's all different in modern filesystems, at least unix ones.
A common implementation for unix filesystems is as follows: each file is described by an inode, which stores permission and ownership info along with pointers to the blocks that contain the data. Directory entries contain a name and an inode number, and directories are themselves files.

Generally, inodes contain enough block pointers (that point to the data blocks) to fill up leftover space in a data block, minus some entries at the end that do various levels of indirection for big files. The actual data is stored contiguously (mostly) when possible, and fses like ext2 do things like allocating 8 blocks at a time to speed up access.

Meanwhile, directories can be indexed by name, so 10,000 files in a dir is fast as hell.

So you see, there's no way to store data without creating a file to put it in, but since you can just write the file somewhere, then create a file entry in some other dir, it's no big deal.

Ok, thanks for the clarification. Not sure i get the point about why the directory entry can't be made afterwards, though. But i'm tired, so that is probably why. ;)

(I just pulled out my old Tanenbaum book (Modern Operating Systems), took a quick look at the section on the UNIX V7 file system. It says that a directory entry is in fact just a filename + a pointer to an i-node. So it does seem to me that it is possible to create the i-node (etc) before creating the directory entry.)

Anyway, i'm tired, and you're right - it's getting a bit off topic. :)

2008-09-25

ST:
Thanks for the interview tales, this is one of my favourite sections. Mind you, I'm pretty shocked at how many of the resident professionals are trying to come up with alternative answers for the problem in the second tale. Obviously you use a temp filename (ignored by the watcher) or a temp directory. What kind of mindset comes up with anything else?

Real people. Personally, while reading the problem I right away came up with a solution - Have 'Watcher' look for the next file before processing the present file.

Example, If you expect to see File1, File2, File3 .... FileX then process File1 when File2 appears, process File2 when File3 appears ... etc ... FileX I would process after a reasonable delay or add a dummy End_Of_Files file.
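Roughly, as a Python sketch. Note that this changes the Watcher's logic, which the original problem ruled out, and it assumes the file names sort in arrival order (e.g. zero-padded sequence numbers). ready_to_process is a made-up helper name.

```python
def ready_to_process(seen_names, already_processed):
    """Return the files that are safe to process: everything except the newest.

    The newest file may still be downloading, so it is held back until a
    later file (or a dummy end-of-files marker) shows up after it.
    """
    ordered = sorted(seen_names)
    return [name for name in ordered[:-1] if name not in already_processed]


# Hypothetical directory listing: File003 is the newest and is held back.
batch = ready_to_process(["File002", "File001", "File003"], {"File001"})
```

With this listing, batch holds only File002: File001 was already processed and File003 waits for a successor.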

2008-09-25

Matt:
TRWTF is that this message board doesn't do threaded replies!!!

Seconded.

2008-09-25

Typical CS graduate solution.

In the real world, the watcher finds the file, tries to open it, fails, dumps about 347 error messages which trash /var/log and your patience, refuses to process anything further (thus overflowing the directory, causing a week of outage), sacrifices your first-born to Steve, and then it will get real nasty.

ClaudeSuck.de · 2008-09-25

crystal mephistopheles:
“Oh, okay,” the candidate replied. He pondered for a full minute and said “so in that case, I would have the Watcher listen on a TCP/IP port, and have the Downloader tell it when it was done downloading.”
“That seems like a lot of work,” I said....

I don't think it's fair to say network I/O is "a lot of work". Granted, his temporary directory solution is even simpler, but most high level languages (.NET, Java, etc.) have fully defined classes you can use to implement this in only a few lines of code, and I would classify this as an acceptable solution.

And there we are again: the beginning of an enterprisy solution.

2008-09-25

JamesQMurphy:
TopCod3r:
We have 2 open positions on our team, due to high turnover, so I interview probably about 5 or 6 people a week and have gotten really good at giving technical interviews. It usually involves giving them a real problem from some code we have, and seeing if they solve it the right way, and then I explain to them how it should be done and make sure they agree.
It is hard to find people who have the right mix of skills and personality. Some people realize halfway through my technical interview that they lack the required knowledge and simply cut it short and walk out of the room, I assume in embarrassment.

Have you ever asked why you have high turnover?

Woosh!

ClaudeSuck.de · 2008-09-25

Max:
Oh, seriously. Each one of these is a job you shouldn't take anyway.

Obvious reasons...

An interviewer who thinks downloads in progress are a problem but file copies in progress are not shows a lack of understanding.

Interviews are two-way -- if the people interviewing you are clueless, the job will suck.

We solved this by deleting the file from the source directory after transfer and having The Watcher (we could modify The Watcher) look in that directory too. Once a file was in the working directory but no longer in the source directory, it was assumed that the file had been transferred completely (and successfully). Not fool-proof, but it worked quite well.
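In Python terms the check boils down to a set difference. This is only a sketch of the idea, not our actual code; completed_transfers is an invented name, and it assumes a file keeps the same name in both directories.

```python
import os
import tempfile


def completed_transfers(source_dir, working_dir):
    """Names present in working_dir whose source copy has already been deleted."""
    return sorted(set(os.listdir(working_dir)) - set(os.listdir(source_dir)))


# Hypothetical demo: a.dat has finished (source deleted), b.dat is still in flight.
with tempfile.TemporaryDirectory() as root:
    source = os.path.join(root, "source")
    working = os.path.join(root, "working")
    os.mkdir(source)
    os.mkdir(working)
    for name in ("a.dat", "b.dat"):
        open(os.path.join(working, name), "wb").close()
    open(os.path.join(source, "b.dat"), "wb").close()
    done = completed_transfers(source, working)
```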

ClaudeSuck.de · 2008-09-25

Azeroth:
There is another simple solution with the downloader/watcher problem - downloader should open the file exclusively while it's being downloaded, this way watcher won't be able to access it until it's closed. This way it's not even required to move anything anywhere.

http://thedailywtf.com/Articles/I-Didn%e2%80%99t-Know-You-Could-Do-That!.aspx

gero · 2008-09-25

I missed the point by posting here.

Apologies to ContraCorners.

I'll have to find another case of political correctness to flame....

2008-09-25

Jeremy H. is a tool. He said that you can't modify the Watcher but you can modify the Downloader?

I've had tools like this ask me questions where they really want one golden answer. He is not looking at how you approach the problem, rather whether you reach his solution.

The temporary directory solution is a poor one - it creates a race condition. It assumes that a copy operation from temp to the real dir is going to be atomic and allows the watcher to pick up half written files. Sure it will crash less - but it will still crash.

Communication between the downloader and watcher makes sense to me. Sounds like the guy was just getting annoyed that Jeremy is a tool.

2008-09-25

Was that first interviewer Steve Jobs? I've heard anecdotes about that kind of thing...

2008-09-25

“What about if the Downloader just wrote files to a temporary directory, and then moved the file to the appropriate directory when the download was complete.” Hang on a minute - surely the watcher program will try to process the file while the downloader program was moving it from the temporary directory (which would take a while due to its size), which would produce precisely the same result - the Watcher program crashes as it processes an incomplete file. Surely the answer is for the Watcher program to check the last edited timestamp of the file - once that is say, 10 minutes old, then it's likely the file has finished downloading.
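A sketch of that timestamp check in Python. It assumes the Watcher can be changed (the original problem said it can't), and looks_finished and the 10-minute default are illustrative only; a stalled download would still fool it.

```python
import os
import tempfile
import time


def looks_finished(path, min_age_seconds=600.0, now=None):
    """Guess that a file is complete if nothing has modified it recently."""
    if now is None:
        now = time.time()
    return (now - os.stat(path).st_mtime) >= min_age_seconds


with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "download.bin")
    with open(path, "wb") as f:
        f.write(b"could be partial, could be complete")
    fresh = looks_finished(path)        # modified a moment ago

    stamp = time.time() - 1200          # pretend the last write was 20 minutes ago
    os.utime(path, (stamp, stamp))
    stale = looks_finished(path)
```

Here fresh is False and stale is True: the heuristic only ever says "probably done", never "definitely done".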

gero · 2008-09-25

Joel H.:
The temporary directory solution is a poor one - it creates a race condition. It assumes that a copy operation from temp to the real dir is going to be atomic and allows the watcher to pick up half written files. Sure it will crash less - but it will still crash.

Chris Leather :
Hang on a minute - surely the watcher program will try to process the file while the downloader program was moving it from the temporary directory (which would take a while due to its size), which would produce precisely the same result - the Watcher program crashes as it processes an incomplete file.

We're looooooping here...

Seriously guys, the real solution is to mount the remote location (there are plenty of FUSE modules for every imaginable protocol and we already know these guys are using Linux from the context), have the downloader just add a symbolic link to the remote file (that should be as atomic as it gets), and then the watcher will act on the file pointed to by the symbolic link.

Sheesh.

The General · 2008-09-25

SoonerMatt:
Marc:
Rename?
Yeah I was thinking that too. Rather than make it move a 3 gb file (which could fail in itself), I would start the transaction as a .tmp file then remove the .tmp when it's completed.

That's just what we do to solve a very similar problem - the "watcher" is a 3rd party app looking for a particular file type. Even though the files are rarely 3Mb let alone 3Gb, and there's no network connection involved in writing them, that damn watcher is just too fast. The files are therefore written as .tmp then renamed when done.
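In outline it looks something like this Python sketch (not our production code; write_then_rename and the file names are placeholders). It relies on the watcher only matching its own file type, so a name ending in .tmp is ignored until the rename.

```python
import os
import tempfile


def write_then_rename(directory, name, chunks):
    """Write under a .tmp name the watcher ignores, then rename to the real name."""
    tmp_path = os.path.join(directory, name + ".tmp")
    final_path = os.path.join(directory, name)
    with open(tmp_path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
    # Same directory, so same filesystem: the rename only swaps the directory entry.
    os.replace(tmp_path, final_path)
    return final_path


with tempfile.TemporaryDirectory() as watched:
    write_then_rename(watched, "report.dat", [b"part1", b"part2"])
    listing = os.listdir(watched)
```

After the call, listing contains only report.dat; the .tmp name never coexists with the finished file.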

2008-09-25 Reply Admin

Azeroth:
There is another simple solution with the downloader/watcher problem - downloader should open the file exclusively while it's being downloaded, this way watcher won't be able to access it until it's closed. This way it's not even required to move anything anywhere.

Nope, sorry.

Watcher tries to open the file, gets a "cannot open file" error, and abends: -904, resource unavailable.

The proper way to do this is to have the downloader trigger the watcher program when done. Has no one heard the term "batch" or "job schedule" before?

It's a good thing you *nix kiddies haven't tried to reinvent the car, as I suspect it would not have wheels.

2008-09-25 Reply Admin

SoonerMatt:
Marc:
Rename?
Yeah I was thinking that too. Rather than make it move a 3 gb file (which could fail in itself), I would start the transaction as a .tmp file then remove the .tmp when it's completed.

Why does everybody seem to think that moving a file involves copying it? Not even MSWindows does it that stupidly, I think. (Or does it?) Anyway this was apparently about Linux.

2008-09-25 Reply Admin

Joel H.:
Jeremy H. is a tool. He said that you can't modify the Watcher but you can modify the Downloader?
I've had tools like this ask me questions where they really want one golden answer. He is not looking at how you approach the problem, rather whether you reach his solution.

The temporary directory solution is a poor one - it creates a race condition. It assumes that a copy operation from temp to the real dir is going to be atomic, and so it allows the watcher to pick up half-written files. Sure it will crash less - but it will still crash.

Communication between the downloader and watcher makes sense to me. Sounds like the guy was just getting annoyed that Jeremy is a tool.

You are the tool for insulting others. The "simple" solution is a move, as POSIX guarantees atomicity. This is discussed at length in the first 4 pages of comments, where the differences between copy and move have also been discussed.

As you are a tool, you wouldn't get the job for unspecified "personal problems", but we'll all really know it's because you are a tool.
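The move-versus-copy distinction argued over in this thread can be made concrete with a small sketch. The helper names `same_filesystem` and `move_atomically` are invented for illustration; the only assumption is that a rename within one filesystem is atomic, as the comment above says of POSIX. The sketch refuses to proceed when a copy would be needed instead of silently falling back to one.

```python
import os


def same_filesystem(path_a, path_b):
    """True when both paths live on the same device (st_dev matches)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def move_atomically(src, dst_dir):
    """Rename src into dst_dir; refuse if that would require a copy."""
    src_dir = os.path.dirname(os.path.abspath(src))
    if not same_filesystem(src_dir, dst_dir):
        raise OSError("source and destination are on different filesystems")
    dst = os.path.join(dst_dir, os.path.basename(src))
    os.rename(src, dst)  # only the directory entry changes; no data is copied
    return dst
```

If the check fails, the staging directory is in the wrong place; putting it on the same partition as the watched directory avoids the piece-by-piece copy entirely.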

NeoMojo · 2008-09-25 Reply Admin

JamesQMurphy:
Have you ever asked why you have high turnover?

It's because of the little one. He's always saying "turn over. turn over." So they all turn over and one falls out.

2008-09-25 Reply Admin

James:
b) I was going to suggest having Watcher poll the filesize and do its scan a fixed time after it stops changing. I think that can work, if you're able to be sure that Downloader is using a protocol with a spec'd timeout. Assuming, of course, that Watcher is interested in failed downloads as well as those that finish...

My idea was to watch the directory, and once a new file appears, the other ones must be finished. (For the last file, your solution or something like it would be needed)

Anyway, modifying the Watcher was not an option, apparently.
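For reference, the "poll the filesize until it stops changing" idea quoted above might look like the sketch below. The name `wait_until_stable` and its default timings are arbitrary choices for illustration. It is a heuristic only: a stalled download is indistinguishable from a finished one, which is why the quoted comment ties it to a protocol with a specified timeout.

```python
import os
import time


def wait_until_stable(path, quiet_seconds=10.0, poll_interval=1.0):
    """Block until the file's size has not changed for quiet_seconds."""
    last_size = os.path.getsize(path)
    last_change = time.monotonic()
    while time.monotonic() - last_change < quiet_seconds:
        time.sleep(poll_interval)
        size = os.path.getsize(path)
        if size != last_size:
            # Still growing (or shrinking): restart the quiet period.
            last_size = size
            last_change = time.monotonic()
    return last_size
```

Since the Watcher itself cannot be changed, a check like this would have to live in some wrapper that decides when to hand a file over.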

2008-09-25 Reply Admin

Sorry but that temporary files copying question is a crock of shit. You still have the same problem. If the file is large, and the disks are slow/physically separate from each other, the operating system will copy it piece by piece from the source directory to the target; it's not an atomic operation, so you still can't know when it's fully complete.

2008-09-25 Reply Admin

So how will the Watcher tell the difference between downloading a file from the internet to a directory and copying a file from another directory to the one it is watching?

2008-09-25 Reply Admin

AAAaaaaaarghhh!!! Enough!!! Someone say something DIFFERENT... PLEASE!!

A Problem at the Personal Level & More
