I'm trying to implement a "recover" option into a script that may fail at any of a few given points. Currently we need to rerun the whole thing in the event of a failure, but it's a big data warehousing project, and as the size of the data files grows, the importance of recovery will become more and more important.

I prototyped a system that seems to be pretty reliable, but it makes use of the dreaded "goto". I think that's okay, that it is in a very controlled circumstance and won't wreak havoc, but I wanted to pose this to the group at large: what is the best method to insert "checkpoints", that leave behind the ability to resume a task at the point where it failed?

I've attached two scripts: the first is the main script that performs three tasks. Any time one of these tasks fails, a file is left behind, ensuring that the process can be picked up where it left off. The other is a "wrapper" script, which calls either the recovery file or the whole process from the beginning.

NOTE: There are a few housekeeping things that this obviously doesn't do, e.g. checking for a valid code label, etc. I left them out intending to do that when it comes time for real implementation.

The main script, recover.pl

#!/usr/local/bin/perl use Getopt::Std; our ($opt_r); getopts('r:'); my $recover = uc $opt_r; $recover and goto ($recover); my $rec_file_name = 'start.rec'; FIRST: first(); SECOND: second(); THIRD: third(); CleanUp(); # do the first task sub first { # some other stuff may happen here, # so we'll go ahead and write the recovery info write_recovery_file('first'); print STDERR "This is first\n"; sleep(1); # die "died in first"; } sub second { write_recovery_file('second'); print STDERR "This is second\n"; sleep(1); # die "died in second"; } sub third { write_recovery_file('third'); print STDERR "This is third\n"; sleep(1); # die "died in third"; } sub CleanUp { `rm start.rec`; print "Recovery file deleted\n" unless ($?); } sub write_recovery_file { my $str = shift; open RECOVER, ">$rec_file_name"; print RECOVER "$0 -r$str\n"; close RECOVER; }
And here's the wrapper script:
#!/usr/local/bin/perl # This tests the recovery system $recovery_file = check_for_recovery_file(); # execute the script at either the recovery step or the beginning $recovery_file ? recover($recovery_file) : recover(); sub recover { my $file_name = shift; my $cmd_line = 'recover.pl'; if ($file_name) { open INFILE, "$file_name"; # assumes the recovery file contains # no more than one line of text chomp($cmd_line = <INFILE>); print STDERR "Resuming failed process: '$cmd_line'"; } `$cmd_line`; die "Could not execute $file_name: $!" if ($?); print STDERR "'$cmd_line' successful\n"; } sub check_for_recovery_file { # won't hard-code this in real life $_ = 'start.rec'; (-s) ? return $_ : return 0 }

Any thoughts?

MM


In reply to Best method for failure recovery? by Maestro_007

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.