Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello all,

I've written a daemon that runs in the background checking for files in a directory. If files are found, I immediately fork. The child processes the file and exits once the file has been processed; the parent goes back to checking the directory for new files, and the cycle repeats.

In the true spirit of "you don't want to do it like that, you want to do it like this", I'm wondering if anyone can offer any advice as to what they'd do differently and why.

my $dir = "/path-to/some-dir";

while (1) {
    opendir(DIR, $dir) || do {
        # my error code here...
    };
    for my $filefound (grep /\.foo$/, readdir DIR) {
        # catfile, not catdir: we're building a file path
        my $fullPath = File::Spec->catfile($dir, $filefound);

        # immediately rename the file (anchor to the extension so a
        # "foo" elsewhere in the path isn't touched)
        my $newPath;
        ($newPath = $fullPath) =~ s/\.foo$/.bar/;
        rename($fullPath, $newPath) || do {
            # my error code here...
        };

        my $kidPid;
        unless (defined($kidPid = fork())) {
            # my error code here...
        }
        next if $kidPid;    # parent continues with dir grep

        # I'm the child, I process the file then exit
    }
    closedir DIR;
}

Replies are listed 'Best First'.
Re: Daemon To Check For Files
by perrin (Chancellor) on Nov 04, 2003 at 23:06 UTC
    Won't this pound your CPU and disk by running in a tight loop? I would make it a cron job and have it keep a "last-run" file that it updates each time it runs. Then just check for any files with -M <= -M last-run.
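    A minimal sketch of that last-run idea (note that -M gives age in days relative to script start, so files modified *after* the marker have a *smaller* -M). The temp directory and the "job.foo" filename are just stand-ins for demonstration; a real cron job would use fixed paths:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Demo setup: a scratch dir standing in for the watched directory.
my $dir     = tempdir(CLEANUP => 1);
my $lastrun = File::Spec->catfile($dir, 'last-run');

# Create the last-run marker, then (a moment later) an incoming file.
open my $fh, '>', $lastrun or die "Cannot create $lastrun: $!";
close $fh;
sleep 1;
my $incoming = File::Spec->catfile($dir, 'job.foo');
open $fh, '>', $incoming or die "Cannot create $incoming: $!";
close $fh;

# Pick up anything modified since the marker: smaller or equal -M.
opendir my $dh, $dir or die "Cannot open $dir: $!";
my @new = grep { /\.foo$/ && -M File::Spec->catfile($dir, $_) <= -M $lastrun }
          readdir $dh;
closedir $dh;
print "new: @new\n";    # prints: new: job.foo

# Touch the marker so the next run only sees later files.
open $fh, '>', $lastrun or die "Cannot update $lastrun: $!";
close $fh;
```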
        Then just check for any files with -M <= -M last-run

      I don't think this is a sure way to check that a file is no longer being written to, though; say an FTP process has stalled, or a long-running process hasn't flushed its file handle yet...

      To reliably check for the completion of a file, I think it is necessary to introduce some trigger files, or to have the process that creates the file write to a different extension first and then rename it to the matching extension.
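      A sketch of the writer's side of that rename convention (the temp directory and "job" filenames are illustrative). Because rename() within one filesystem is atomic, the watcher can never see a half-written .foo file:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $dir  = tempdir(CLEANUP => 1);                 # stand-in for the watched dir
my $tmp  = File::Spec->catfile($dir, 'job.tmp');  # scratch extension: ignored by the watcher
my $done = File::Spec->catfile($dir, 'job.foo');  # extension the watcher greps for

# Write the full payload under the scratch extension first.
open my $fh, '>', $tmp or die "Cannot write $tmp: $!";
print $fh "payload\n";
close $fh or die "Cannot close $tmp: $!";

# Only now make it visible under the watched extension, atomically.
rename $tmp, $done or die "Cannot rename $tmp: $!";
print -e $done ? "visible as job.foo\n" : "missing\n";    # prints: visible as job.foo
```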

        It's no worse than what he's already doing. Is there a way to ask the OS if any process has a file open for writing? I don't know of a way to do it off the top of my head. I think you'd need cooperation from the process doing the writing, which this person doesn't have.

      Won't this pound your CPU and disk by running in a tight loop?

      Yes it does. Unfortunately the files have to get processed *immediately* so a cron job won't work here.

      A bit more info would help here. A legacy third party app (over which I have no control) drops a file into a directory and waits for the file to be processed by my Perl program. The Perl program processes the file and appends results to it before renaming the file to something the legacy app is expecting (the file extension is changed to .done). Once the legacy app sees the .done extension it knows processing is complete and it reads the processing results from the file. Behind that legacy app is a patient user waiting for their results, hence the files must be processed immediately.

      Seeing as I'm kinda stuck in legacy app hell, is there anything else my Perl daemon could do to alleviate the CPU & disk pounding here?

        That's pretty nasty. I can't think of anything simple that will solve the essential problem, i.e. polling the directory, but you could probably help things out just by adding a sleep(1) to the end of your loop.
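        The sleep fits at the bottom of the existing loop; a minimal self-contained sketch of the shape (a temp directory stands in for the watched one, and the demo polls three times rather than forever):

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);    # stand-in for the watched directory

# The real daemon would loop forever; three ticks are enough to show the shape.
for my $tick (1 .. 3) {
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @found = grep { /\.foo$/ } readdir $dh;
    closedir $dh;
    # ... fork and process @found here ...
    sleep 1;    # cap the poll rate so an empty directory doesn't spin the CPU
}
print "polled 3 times\n";    # prints: polled 3 times
```

This caps directory scans at one per second, which keeps the worst-case latency for the waiting user at about a second while eliminating the tight-loop pounding.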
Re: Daemon To Check For Files
by Roger (Parson) on Nov 05, 2003 at 01:26 UTC
    Take a look at the module POE::Component::DirWatch on CPAN, I think this might be what you are looking for. It watches a directory for files, and once a filename matching the filter is found, it will kick off a sub-process and invoke the given user callback.

    Sample code pulled straight from the doco:
    use POE::Component::DirWatch;

    POE::Component::DirWatch->spawn(
        Alias        => 'dirwatch',
        Directory    => '/some_dir',
        Filter       => sub { $_[0] =~ /\.gz$/ && -f $_[1] },
        Callback     => \&some_sub,
        PollInterval => 1,
    );
Re: Daemon To Check For Files
by waswas-fng (Curate) on Nov 04, 2003 at 22:51 UTC
    Do you have control over the application generating the files? You seem to have a few race conditions in your logic. What happens if the parent processes the directory as files are created and spawns the child before the file is actually closed and static? (This is an issue depending on the OS and/or what the child actually does -- how does it handle broken/truncated files?) Also you have taken out the error code on the rename function -- I assume you handle duplicate file names in the fullpath/newpath locations well?


    -Waswas

      Do you have control over the application generating the files?

      Unfortunately no. See my reply to Perrin below.

      What happens if the parent processes the directory as files are created and spawns the child before the file is actually closed and static...

      Files placed in the directory are zero bytes in size until the legacy app closes them (i.e. completes the write). I should have included this line of code in my original example. The files themselves are tiny, less than 200 bytes.

      next if (! -s $fullPath);

      I assume you handle duplicate file names in the fullpath/newpath locations well?

      The legacy app creates the file with the date/time/PID in the filename. The timestamp goes down to milliseconds so that combined with the PID of the app should ensure a unique filename.