Ferret has asked for the wisdom of the Perl Monks concerning the following question:

Hi all.

I'm working on a patch download tool for Solaris - an interminable task which is almost as much for the experience as for the end product.

I'm using LWP and a non-threaded Perl, between 5.6.0 and 5.8.0, depending on the system the end product's running on. If it matters, I'm testing on Sol8, but the end product will be on servers from 6 to 9.

What I'm currently wrestling with is a bit of a feature creep. I want to have a semidecent indication of progress (how much has been downloaded), and I want to make sure that I have the whole file or no file at all.

What I've done is create a nice little fork to stat the size of the file on a second-by-second basis. Not as nice as a thread, but better than letting it silently proceed with no idea of success or failure.

Here's my question (finally). If someone breaks (SIGINT), I want to unlink the file I'm currently downloading, so I won't have an incomplete file. Unfortunately, using the code below, I'm getting the below message fairly often when I Ctrl-C:

Couldn't remove incomplete download of 111647-01.zip: No such file or directory
$ ls -la .patchcache/111647-01.zip
-rw-r--r-- 1 ferret staff 108175 Aug 20 16:26 111647-01.zip

The file is there after execution. The location is correct in the code. What obvious problem am I missing? I'd like to correctly remove the file when the signal handler trips.

my $pid;
if ( $pid = fork ) { # parent gets child pid
    local $| = 1;
    my $res;
    {
        sleep 1;
        print "\rUpdating $file...",
          sprintf "%-5dk retrieved", ( -s $localfile || 0 ) / 1024
          unless $opt{q};
        redo unless ( $res = waitpid $pid, WNOHANG );
    }
    print STDERR $res ? " ... done!\n" : " ... failed!\n" unless $opt{q};
    return $res ? $localfile : 0;
}
else {
    $SIG{INT} = sub {
        print STDERR 'Removing incomplete download...';
        unlink $localfile
          or print STDERR "\rCouldn't remove incomplete "
          . "download of $localfile: $!\n"
    };
    exit is_success( getstore( $uri, $localfile ) );
}

Replies are listed 'Best First'.
Re: $SIG{INT} unlink problem
by fokat (Deacon) on Aug 21, 2002 at 01:28 UTC
    I think the answer by sauoq will most likely be the right one.

    I just wanted to chime in with something else. Your code is assuming that fork() always succeeds (a dangerous assumption). Remember that it will return undef upon failure. A failure can be triggered by many events outside of your control, so you should be prepared to deal with them gracefully.

    I offer you this substitute code instead:

    use strict;
    use warnings;

    use constant FORK_WAIT    => 2;
    use constant MAX_ATTEMPTS => 10;

    our $attempt = 0;
    my $pid = undef;

    while (not defined ($pid = fork())) {
        die "Too many failed fork() attempts: $!\n"
            if ++$attempt > MAX_ATTEMPTS;
        warn "fork() failed: $!\n";
        sleep FORK_WAIT;    # FORK_WAIT is a constant, so it takes no $ sigil
    }

    if ($pid) {
        # I'm the father
    }
    else {
        # I'm the son
    }

    Hope this helps,

    Edit: Thanks to wtp for the correction on Perl's fork() behavior.

      Oh, if only it were so! The Perl version of fork returns undef on failure.

      my $pid;
      
      until ( defined ($pid = fork)) {
      ...
      
      When I saw this, I knew that I had seen what ballplayers call "a thing of beauty."

      while (not defined ($pid = fork())) {
          die "Too many failed fork() attempts: $!\n"
              if ++$attempt > MAX_ATTEMPTS;
          warn "fork() failed: $!\n";
          sleep FORK_WAIT;
      }

      Every artifice that is used here is in its finest form and in proper proportion. And I must know, fokat: did your posting originally say that fork() returns -1 on failure? (That is the SVR4 behavior for the system call, I believe. Until last week I had totally forgotten about this, as had the original author of this thread. "Oh," I said, "of course fork can fail! Let's do check for that." While secretly he thinks, Damn, I wonder how much code I should revisit to account for this.)

      ...All the world looks like -well- all the world, when your hammer is Perl.
      ---v

Re: $SIG{INT} unlink problem
by Aristotle (Chancellor) on Aug 21, 2002 at 01:08 UTC
    I have little actual experience with signal handlers, but as I understand it, you should do as little as possible in them and certainly avoid calling anything that might not be reentrant. The following may or may not work better than your current solution. YMMV.

    my $is_break = 0;
    $SIG{INT} = sub { $is_break = 1 };
    my $result = is_success( getstore( $uri, $localfile ) );
    if ($is_break) {
        print STDERR 'Removing incomplete download...';
        -f $localfile and (
            unlink $localfile
                or print STDERR "\rCouldn't remove incompletely downloaded $localfile: $!\n"
        );
    }
    exit $result;

    Update: added file check as per sauoq's post.

    Makeshifts last the longest.

      I agree entirely with Aristotle's suggestion. Additionally, you should probably just check to see if the file exists before trying to remove it. That should take care of your error.
      -sauoq
      "My two cents aren't worth a dime.";
      
      FYI, here is a tidbit on signal handlers, from the 5.8.0 perldelta:

      Safe Signals

      Perl used to be fragile in that signals arriving at inopportune moments could corrupt Perl's internal state. Now Perl postpones handling of signals until it's safe (between opcodes).

      This change may have surprising side effects because signals no longer interrupt Perl instantly. Perl will now first finish whatever it was doing, like finishing an internal operation (like sort()) or an external operation (like an I/O operation), and only then look at any arrived signals (and before starting the next operation). No more corrupt internal state since the current operation is always finished first, but the signal may take more time to get heard. Note that breaking out from potentially blocking operations should still work, though.
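
One way to live with deferred delivery is to let the handler only set a flag and to check that flag at points where stopping is safe. The following is a minimal sketch, not anything from perldelta: run_steps is a hypothetical helper, and the self-sent kill merely simulates a Ctrl-C arriving in the middle of the work.

```perl
use strict;
use warnings;

# Hypothetical helper: runs each step (a coderef) in order, stopping early
# once SIGINT has been seen. Returns (steps completed, interrupted flag).
sub run_steps {
    my @steps = @_;
    my $interrupted = 0;
    local $SIG{INT} = sub { $interrupted = 1 };    # record only, no cleanup here

    my $done = 0;
    for my $step (@steps) {
        last if $interrupted;    # checked at a point where stopping is safe
        $step->();
        $done++;
    }
    return ( $done, $interrupted );
}

# Simulate Ctrl-C arriving during the second of three steps.
my ( $done, $interrupted ) = run_steps(
    sub { },
    sub { kill 'INT', $$ },
    sub { },
);
print "completed $done step(s), interrupted=$interrupted\n";
```

The step that was running when the signal arrived still finishes; only the following step is skipped, which matches the "finish the current operation first" behavior described above.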

Re: $SIG{INT} unlink problem
by sauoq (Abbot) on Aug 21, 2002 at 00:25 UTC
    Your program is probably getting the sigint after it knows the name of the file but before it has actually opened it for writing.
    -sauoq
    "My two cents aren't worth a dime.";
    
Re: $SIG{INT} unlink problem
by fglock (Vicar) on Aug 21, 2002 at 00:55 UTC

    Did you stop the LWP download before unlinking the file? If you didn't, then the file might still be open, and then it can't be unlinked.

    In order to check this, try to unlink the file "by hand", while LWP is running. You might have to kill the LWP process first.

      You can quite well unlink an open file on Unix without closing it. That's often useful for temporary files; the file hangs around as long as it's open, even if there's no directory entry for it anymore. As soon as all the handles on it are closed, it disappears in the bit bucket.
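
This is easy to check for yourself. A small sketch, assuming a Unix-like system and using File::Temp only to get a scratch file; unlink_while_open is a made-up name for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical demo: remove the directory entry of a file that is still
# open, then keep using the handle.
sub unlink_while_open {
    my ( $fh, $path ) = tempfile();    # opened read/write by File::Temp

    unlink $path or die "unlink failed: $!";
    my $gone = !-e $path;              # true once the name no longer exists

    print {$fh} "still writable\n";    # the handle remains usable
    seek $fh, 0, 0 or die "seek failed: $!";
    my $line = <$fh>;
    close $fh;                         # the storage is released only now

    return ( $gone, $line );
}

my ( $gone, $line ) = unlink_while_open();
print "name gone: ", ( $gone ? "yes" : "no" ), "; read back: $line";
```

So a failing unlink in the original code is unlikely to be caused by the file merely being open.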

      Makeshifts last the longest.

      This may be true on Windows. It's not true on Solaris. You can unlink an open file.
      -sauoq
      "My two cents aren't worth a dime.";
      
Re: $SIG{INT} unlink problem
by JSchmitz (Canon) on Aug 21, 2002 at 12:31 UTC
    You may be aware of this already, but Sun has lots of tools that relate to this. There is the older patchtool, and also much newer versions that check all the patches and download updates automatically. Check out patchk.pl on Sun's website, although I believe there is another script you must add to it to actually have it install needed patches.

    cheers

    Jeffery
Re: $SIG{INT} unlink problem
by Ferret (Scribe) on Aug 21, 2002 at 18:13 UTC

    Thank you for the help, all. You've made me realize how little I actually remember of the signal handling portion of Unix.

    sauoq - Your initial comment was one of the theories I had, but it didn't make any sense to me, since the file should either have been created already (after the signal handler's done killing it, the file is indeed there) or not created at all. Do you know what might cause that behavior?

    Aristotle, your handler makes sense on a couple of levels - particularly, I'll only be actually doing cleanup in a location where I have control of the code, so I don't have to wonder at the behavior of LWP. Additionally, I was trying to remove the file and then letting it finish downloading (I forgot to exit), which was none too bright. Even after that change, though, the signal handler's not removing the file - it claims the file doesn't exist yet, and yet there it is after the program terminates.

    So I'll just use your idea and tell it to clean up afterwards if it gets SIGINTed.
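
A variation on the clean-up-afterwards idea, and a different technique from the code posted above, is to download to a temporary name and rename it only on success, so the final name never holds a partial file. A minimal sketch under stated assumptions: store_atomically and the ".part" suffix are made up for illustration, and $fetch is any coderef that writes to the path it is given and returns true on success.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch into "$dest.part", then rename into place.
# With LWP::Simple, $fetch could be something like
#   sub { is_success( getstore( $uri, $_[0] ) ) }
sub store_atomically {
    my ( $fetch, $dest ) = @_;
    my $tmp = "$dest.part";    # assumed naming convention

    my $interrupted = 0;
    local $SIG{INT} = sub { $interrupted = 1 };    # only record the signal

    my $ok = $fetch->($tmp);

    # Publish the file only if the fetch succeeded and nobody hit Ctrl-C.
    if ( $ok && !$interrupted && rename $tmp, $dest ) {
        return 1;
    }

    # Otherwise remove whatever partial file exists, if any.
    unlink $tmp if -e $tmp;
    return 0;
}
```

Because the temporary file lives in the same directory as the target, the rename stays on one filesystem, and the cleanup happens in ordinary code rather than inside the handler.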

    fokat and wtp, thanks for the correct forking method - IIRC, my systems programming teacher had the (poor, but amusing) attitude that if the program's having fork problems, there's probably something worse wrong with the system anyway.

    JSchmitz - I have looked at the Sun tools every once in a while. I admit it's been six months or more since I have, and perhaps they've gotten better. I should try them again and see whether they'll save me the work. As I said, though, this problem's almost as much for the experience as for the end product ;-)

    Thanks, all!

Re: $SIG{INT} unlink problem
by cybear (Monk) on Aug 22, 2002 at 12:56 UTC
    So much good stuff in this thread! If Ferret figures out what's up, he should consider writing a basic tutorial on the signal handler stuff.

    Unless Aristotle or sauoq want to get to it first.

    - cybear

      Thanks for your vote of confidence, cybear...

      Unfortunately, from my inexperienced point of view, basic and signal handling are mutually exclusive - it's pretty complex. I need a few more attempts and a little more time reading the docs before I'd consider writing a tutorial on something that in the Real World often gets enough chapters to make its own book.

      Until then, read perlipc and perlfork, Chapter 16 of the Camel, and the chapter(s) on signal handling in the nearest System Programming for Unix book. I'm going to go through them again (it's been awhile) and see if I can't grok in fullness.

      --Ferret
      I too have to decline the offer. I pick up a lot of hints on various topics mainly by reading about the experiences of fellow monks, and of course also by poking around in the docs, so sometimes I can offer limited help on topics I'm not actually experienced with. As stated in my above post, signal handlers are one such topic. If someone did write a lightweight tutorial on them, I would however definitely be interested in reading it; accumulating knowledge never hurts. :-)

      Makeshifts last the longest.

        accumulating knowledge never hurts
        Unless it's done inside a signal handler lacking enough preallocated memory to store it.... <grin>

        -Blake