PerlMonks  

Self resurrecting perl scripts

by RhetTbull (Curate)
on Apr 11, 2001 at 18:55 UTC ( [id://71682]=perlquestion )

RhetTbull has asked for the wisdom of the Perl Monks concerning the following question:

Synopsis: I have a script (not CGI, *nix environment) that can 'die' for a variety of reasons. I want to restart the script automatically depending on what caused it to die.

Detail: It's a script that does many things, including using Net::FTP to download some files over an extremely unreliable connection. As a result, the Net::FTP session occasionally dies with an error like: "Timeout at /usr/local/lib/perl5/site_perl/5.6.0/Net/FTP.pm line 471". That's ok; I expect it to happen. It could also die for reasons other than a timeout. Also note that the script uses lock files and a bunch of other stuff that gets cleaned up in a signal handler, e.g.

$SIG{__DIE__} = \&handle_die;
So, what I want is for the script to somehow know whether it died because Net::FTP timed out or for some other reason. If it was due to a timeout, I want to restart the script. Possible solutions I've thought of for restarting it: use backticks or system() to execute the script from another script that restarts it, turn the existing script into a sub and call it from another script, etc. What I'd really like is to have the signal handler in the script itself check this condition and restart the script (i.e. a self-resurrecting script). That avoids having an extra script to keep track of. Is this possible?

Another problem is that I don't know how to tell what caused the script to 'die'. I looked at $? and it seems to have the same value no matter what caused the die. Is there a way to find out whether Net::FTP caused it to die or whether it was one of my own 'die' statements? I only want to restart for a timeout condition, not any other error condition.

Any thoughts are welcome! Thanks,

--RT

PS -- yes, I know I can set the Timeout value in Net::FTP but that doesn't solve the problem. When the link goes down, it's down for a while.


The angel said to the women, "Do not be afraid, for I know that you are looking for Jesus, who was crucified. He is not here; he has risen, just as he said.

Matthew 28:5-6, NIV

Replies are listed 'Best First'.
Re: Self resurrecting perl scripts
by physi (Friar) on Apr 11, 2001 at 19:07 UTC
    Ok, why do you want to restart your script, or let it die at all?
    Try to put your Net::FTP part into an eval block:
    eval { CODE WHAT TO DO || die; }; print $@;
    If the code in the eval block fails, only the eval dies, not your main program. Just check the error message in $@.
    If all went well, ok, else doitagain Sam :-)
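
    A minimal sketch of the eval-and-retry idea, for illustration only. `flaky_download` is a hypothetical stand-in for the Net::FTP code (it dies twice and then succeeds, so the example runs without a network), and `$max_tries` is an assumed limit.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the Net::FTP code: fails twice, then succeeds.
my $calls = 0;
sub flaky_download {
    $calls++;
    die "Timeout\n" if $calls < 3;
    return "file contents";
}

my $max_tries = 5;    # assumed limit; pick whatever suits the link
my $result;
for my $try (1 .. $max_tries) {
    # eval returns undef if the block died; $@ then holds the message.
    $result = eval { flaky_download() };
    last if defined $result;
    print "try $try failed: $@";
}
print defined $result ? "got: $result\n" : "giving up\n";
```

    The main program keeps running whatever happens inside the eval, so the decision to retry or give up stays in one place.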

    Just a question about your $SIG{__DIE__}: Is this the same as an END{} block?

    UPDATE: added a ; behind the eval block ;-)

    ----------------------------------- --the good, the bad and the physi-- -----------------------------------
      Just a question about your $SIG{__DIE__}: Is this the same as an END{} block?

      END happens after the die.

      #!/usr/bin/perl -w

      $SIG{__DIE__} = sub { print "Dead\n"; };

      die "Blah";

      END { print "This is the end, my friend.\n"; }

      Outputs:

      Dead
      Blah at ./foo.pl line 5.
      This is the end, my friend.

      But you need to install the $SIG{__DIE__} handler before the die happens. The END block will run regardless. For example:

      #!/usr/bin/perl -w

      die "Blah";

      $SIG{__DIE__} = sub { print "Dead\n"; };

      END { print "This is the end, my friend.\n"; }

      Will output:

      Blah at ./foo.pl line 3.
      This is the end, my friend.

      Cheers,
      KM

      This is what I do and what I usually find. Note, however, that $@ is the only help you get about what failed. eval returns undef if it died. That means you have to parse $@ to find out why it died (useful if it was a system event that killed it), or you can set a package or global variable to tell you why it died. I tend to do the latter whenever possible.

      traveler
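
      A rough sketch of the "set a package variable" approach, under stated assumptions: `$Why_Died`, `transfer` and `cause_of_failure` are hypothetical names, and `transfer` only simulates failures. Note that this only works for die calls you write yourself; for a die raised inside Net::FTP you still have to look at $@.

```perl
use strict;
use warnings;

our $Why_Died = '';   # hypothetical package variable naming the cause

# Stand-in for the real work; $mode only exists to drive the example.
sub transfer {
    my ($mode) = @_;
    if ($mode eq 'timeout') {
        $Why_Died = 'timeout';
        die "connection timed out\n";
    }
    if ($mode eq 'nofile') {
        $Why_Died = 'missing_file';
        die "no such file on server\n";
    }
    return 1;
}

# Returns '' on success, otherwise the recorded cause.
sub cause_of_failure {
    my ($mode) = @_;
    $Why_Died = '';
    return '' if eval { transfer($mode) };
    return $Why_Died || 'unknown';
}

print "cause: ", cause_of_failure($_) || 'none', "\n" for qw(ok timeout nofile);
```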

      Try to put your Net::FTP part into an eval block:
      eval { CODE WHAT TO DO || die; }; print $@;
      Thanks, that's exactly what I needed! The Net::FTP code was actually dying (I didn't call die) so I needed to trap the die and verify that it died because it was timing out. Putting the call to "Net::FTP->get()" in an eval block let me trap that. I didn't know I could do that with eval. Now I just need to look at $@ to see if the Net::FTP Timeout is there. Thanks again!
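
      One way to sketch the "look at $@ for the timeout" step. The pattern is an assumption based on the error text quoted in the question, and `is_timeout` and `try_transfer` are hypothetical helper names. Because a $SIG{__DIE__} handler also fires for a die inside an eval, the sketch localizes it away for the duration of the eval.

```perl
use strict;
use warnings;

# Global handler like the one in the question (here it only counts calls).
my $handler_calls = 0;
$SIG{__DIE__} = sub { $handler_calls++ };

# True if the error text looks like the Net::FTP timeout from the question
# ("Timeout at .../Net/FTP.pm line 471"). The pattern is an assumption.
sub is_timeout {
    my ($err) = @_;
    return $err =~ /^Timeout\b/ ? 1 : 0;
}

# Runs $code; returns 'ok' or 'timeout', and rethrows any other error.
sub try_transfer {
    my ($code) = @_;
    my $ok = eval {
        local $SIG{__DIE__};   # keep the global handler out of the eval
        $code->();
        1;
    };
    return 'ok' if $ok;
    my $err = $@;              # copy before anything can overwrite $@
    return 'timeout' if is_timeout($err);
    die $err;                  # not a timeout: let it propagate
}

print try_transfer(sub { 1 }), "\n";
print try_transfer(sub { die "Timeout at Net/FTP.pm line 471\n" }), "\n";
```

      Only the 'timeout' result would trigger a retry; anything else still reaches the global handler and ends the script as before.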

      As for your question about $SIG{__DIE__}, KM's response basically sums it up. The END block gets executed last, no matter what caused the script to end. However, when the script dies, I want to know so I can do some extra clean up (log the errors, close files, etc) that doesn't need to happen when the script executes normally.

      Update:

      Added a ; after eval block. :-)

Re: Self resurrecting perl scripts
by astanley (Beadle) on Apr 11, 2001 at 18:59 UTC
    Just a thought, but have you looked into the fork() function? I think this would do what you want: you could run the parent once and have it just sit around and wait for the child to either exit successfully or die, in which case you could handle it appropriately. It would take a bit of extra coding but would be better than calling a separate script through backticks or a system call. HTH.

    -Adam Stanley
    Nethosters, Inc.
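
    A small sketch of the fork() idea, assuming the child signals a timeout through its exit code. The value 75 and the `do_work` stub are hypothetical; a real child would run the FTP code and choose its exit code accordingly.

```perl
use strict;
use warnings;

$| = 1;                   # flush output so the child does not repeat it

my $EXIT_TIMEOUT = 75;    # assumed convention: child exits 75 on a timeout

# Hypothetical worker: "times out" on the first two attempts.
sub do_work {
    my ($attempt) = @_;
    return $attempt < 3 ? $EXIT_TIMEOUT : 0;
}

my $attempt = 0;
my $status;
while (1) {
    $attempt++;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        exit do_work($attempt);   # child: the exit code is the verdict
    }
    waitpid($pid, 0);             # parent: wait for this child
    $status = $? >> 8;            # exit code is in the high byte of $?
    last unless $status == $EXIT_TIMEOUT;
    print "attempt $attempt timed out, restarting\n";
}
print "child finished with status $status after $attempt attempts\n";
```

    A child that dies without being trapped exits with a non-zero code as well, so the parent can tell "timeout, try again" apart from "something else went wrong" as long as the child reserves one code for timeouts.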
Re: Self resurrecting perl scripts
by ChOas (Curate) on Apr 11, 2001 at 19:00 UTC
    Hey,

    A non-Perl answer (just because it's an option)

    If you have control of the box, let the script
    be spawned by init... It will keep an eye on it, and
    restart it, if nescesarry (nice Scrabble word, but I
    can't spell it)

    GreetZ!,
      ChOas

    print "profeth still\n" if /bird|devil/;
Re: Self resurrecting perl scripts
by suaveant (Parson) on Apr 11, 2001 at 19:11 UTC
    Probably the best solution is to put in lots of code to catch the various errors; then you know what died and how to fix it in 95 percent of the cases. For example, I am almost sure that Net::FTP does not actually do an exit when it has an error, so if you read the perldocs and figure out what it does do, you should be able to wrap it in if statements and the like and repair the error without letting the program die. For the case where a die is called, there is always wrapping it in an eval and checking the value of $@ for the error that broke it. But there is always more than one way to do it.
                    - Ant
Re: Self resurrecting perl scripts
by perigeeV (Hermit) on Apr 11, 2001 at 20:44 UTC

    I had something similar once. My answer wasn't awfully pretty, but it worked. I wrapped each step into its own function, and had each sub return a unique error code upon failure instead of dying. One sub would return 12, another would return 8, etc. The main code in the program would then be able to handle each error condition as it happened, knowing exactly what blew up.

    Just try to keep the sub calls out of loops if you don't want to eat too many cycles.
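
    A sketch of the return-code approach, assuming made-up codes (0, 8, 12) and stub subs. The `%ACTION` table and `run_step` are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical mapping from a step's return code to what main should do.
my %ACTION = (
    0  => 'continue',
    8  => 'abort',      # e.g. login refused
    12 => 'retry',      # e.g. transfer timed out
);

# Stand-ins for the real steps; each returns a code instead of dying.
sub login_step    { return 0 }
sub download_step { return 12 }

# Runs one step and translates its code; unknown codes are treated as fatal.
sub run_step {
    my ($step) = @_;
    my $code = $step->();
    return exists $ACTION{$code} ? $ACTION{$code} : 'abort';
}

print "login: ",    run_step(\&login_step),    "\n";
print "download: ", run_step(\&download_step), "\n";
```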

Re: Self resurrecting perl scripts
by ftforger (Sexton) on Apr 11, 2001 at 21:54 UTC
    I have not tried this myself, so take this with a grain (handful, or other suitable quantity) of salt. die is "just a function". In your code, I'm assuming you are using something like get_the_file() || die; Replace the "die;" with "get_the_file_failed();" and then write a function/handler for that condition. Extend as necessary to handle your other conditions. You only have to have the function to the left of || return true on success.
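
    For illustration, the "replace die with a handler" idea might look like the sketch below. `get_the_file` and `get_the_file_failed` are the hypothetical names from the post, stubbed out here so the example runs on its own.

```perl
use strict;
use warnings;

my @log;

# Stub: returns true on success, false on failure.
sub get_the_file {
    my ($name) = @_;
    return $name eq 'good.txt' ? 1 : 0;
}

# Called instead of die; records the failure and lets the script carry on.
sub get_the_file_failed {
    my ($name) = @_;
    push @log, "could not fetch $name";
    return 0;
}

for my $name (qw(good.txt bad.txt)) {
    get_the_file($name) || get_the_file_failed($name);
}
print "$_\n" for @log;
```

    This only helps when the function reports failure through its return value; a die raised inside the called code still needs an eval around it.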
Re: Self resurrecting perl scripts
by birdbrane (Chaplain) on Apr 12, 2001 at 18:20 UTC
    Actually, we have similar scripts to this. We have a threshold of so many errors before the script croaks. Basically, the script attempts to ftp to the file. If it fails, it calls a "redo" and keeps trying until there is success or the max number of failures is exceeded. Here are a couple of snippets from the code, to at least let you see how we do it.

    sub get_files {
        # ftp files from remote server
        while (1) {
            $ftp = Net::FTP->new("$From_Host", Debug => 0);
            $ftp->login("$From_User","$From_Pass") || do { &print_ftp_error; redo; };
            $ftp->cwd("$From_Dir");
            (@dir_files = $ftp->dir) || do { &print_ftp_error; redo; };
            $ftp->quit;
            last;
        }
    }

    sub print_ftp_error {
        # Error handling proc
        $save = $ftp->message;   # Net::FTP exposes the server's last reply via message()
        $ftp->quit;
        print STDERR "ftp error: $save\n";
        print_to_log("ftp error: $save\n");
        if (++$ftp_errors{"$save"} >= $Max_ftp_Errors) {
            print STDERR "\n\nExceeded maximum number of ftp errors. Exiting!!!\n";
            print_to_log("\n\nExceeded maximum number of ftp errors. Exiting!!!\n");
            system("$Notify_Proc \"ftpfiles.pl Exceeded maximum number of ftp errors. Please check $Log_File.\" 2>&1 > /dev/null");
            rename "$Log_File", "$Log_File.$year.$mon.$mday\_$hour.$min";
            exit 1;
        }
        else {
            #
            # Give the ftp server a 60 second grace period to get
            # its act together.
            #
            sleep 60;
        }
    }

    birdbrane

      That's very similar to what I needed to do. Thanks for posting the snippet! I think I'll incorporate this idea into my script. I've got a "download_file" sub similar to your get_files so it should be a fairly easy fix. That seems much easier than trapping "die", etc. Although, I must admit that in working on this and reading the replies here I've learned *a lot* about perl that I didn't know two days ago! ;-)

      --RT

Re: Self resurrecting perl scripts
by jwest (Friar) on Apr 11, 2001 at 22:39 UTC
    Apart from the various Perl-based solutions, if you have or can get root access on the box that the script will be running on, you can use 'init' to automagically re-run the script for you should it ever exit.

    This doesn't help with determining why it died in the first place, nor reacting appropriately based on the cause of death. init is fairly simple. If the runlevel is appropriate, and the process isn't running, respawn it.
Re: Self resurrecting perl scripts
by bobtfish (Scribe) on Apr 12, 2001 at 12:32 UTC
    Two suggestions:
    1) You can call exec and re-start your script from within your end block.
    By passing it environment variables and/or command line parameters you can give it all the information it needs.
    2) Override the built-in die function. (See camel, p306 in v3).
    BEGIN {
        # Must be installed at compile time to affect die calls compiled later.
        *CORE::GLOBAL::die = sub {
            print STDERR "In my own die function!";
            CORE::die(@_);
            # Not reached
            exit(1);
        };
    }

    N.B. This is just untested example code.
    When you call die, it prints "In my own die function!" and then calls the builtin die function.
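
    A hedged sketch of suggestion 1, restarting via exec from an END block. `$restart_wanted`, `$max_restarts`, `restart_command` and the RESURRECTED environment variable are all hypothetical; the counter is there so a permanently dead link cannot cause endless restarts.

```perl
use strict;
use warnings;

my $restart_wanted = 0;   # set to 1 at the point where a timeout is detected
my $max_restarts   = 3;   # assumed cap on consecutive restarts

# Command that re-runs this script with the same interpreter and arguments.
sub restart_command {
    return ($^X, $0, @ARGV);
}

END {
    my $count = $ENV{RESURRECTED} || 0;
    if ($restart_wanted && $count < $max_restarts) {
        # Tell the next incarnation how many restarts have happened so far.
        $ENV{RESURRECTED} = $count + 1;
        my @cmd = restart_command();
        exec @cmd or warn "exec failed: $!";
    }
}

print "restart command: @{[ restart_command() ]}\n";
```

    Any cleanup such as removing lock files has to happen before the exec, because exec replaces the running process and nothing after it is executed.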
