It might be better to record your progress after you've finished with each file. The advantage is that even if your script dies unexpectedly, you can continue processing from wherever you were. For example, your script can record the list of files it has fully processed in a separate progress file; when you re-run the script, it reads that file and skips the files that are already processed.
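Here is a minimal sketch of that idea, written in Python only for illustration; the same pattern carries over to any scripting language. The names `load_done`, `process_all`, `progress_path` and the `process_file` callback are hypothetical, and I assume one file name per line with no newlines inside the names.

```python
import os

def load_done(progress_path):
    """Return the set of file names already recorded as fully processed."""
    if not os.path.exists(progress_path):
        return set()
    with open(progress_path, encoding="utf-8") as fh:
        return {line.rstrip("\n") for line in fh if line.strip()}

def process_all(files, progress_path, process_file):
    """Process each file once, recording its name only after it completes."""
    done = load_done(progress_path)
    # Append mode keeps the entries written by earlier runs.
    with open(progress_path, "a", encoding="utf-8") as progress:
        for name in files:
            if name in done:
                continue  # finished in an earlier run, skip it
            process_file(name)  # hypothetical callback doing the real work
            progress.write(name + "\n")
            # Push the entry out of the buffers right away, so it
            # survives if the script dies on the next file.
            progress.flush()
            os.fsync(progress.fileno())
            done.add(name)
```

The name is written only after `process_file` returns, so a file that was interrupted halfway is never marked as done and gets processed again on the next run.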

Once you implement this, if processing a file is idempotent, you can simply kill the script at any time, whatever it's doing. Otherwise, you may want to implement a cleaner way to shut down, like the one you mention in your question: e.g. periodically check for the existence of a file, and if it does not exist (because you've deleted it), quit the script. Even then, it's worth saving the progress regularly, such as by writing to the progress file after each file, to avoid having to redo all the computation if the script dies for some unexpected reason, be it unexpected input, a bug in your script, a power failure, running out of memory, or something else.
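The clean shutdown check could look roughly like the sketch below (Python, for illustration only). The function name `run_until_stopped`, the flag file path and the `handle_item` callback are all made up for this example. The check happens between items, so an item is never abandoned halfway.

```python
import os

def run_until_stopped(items, run_flag_path, handle_item):
    """Handle items in order, stopping cleanly once the flag file is gone.

    Returns the list of items that were not handled.
    """
    items = list(items)
    for index, item in enumerate(items):
        # Deleting the flag file is the signal to stop. It is only
        # checked before starting an item, never in the middle of one.
        if not os.path.exists(run_flag_path):
            return items[index:]
        handle_item(item)  # hypothetical callback doing the real work
    return []
```

To use it, create the flag file before starting the script and delete it when you want the script to stop after the item it is currently working on.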

Let me point to an example that may help you. The script "wgetas - download many small files by HTTP, saving to filename of your choice" has two measures for continuing after an interruption. First, to avoid repeating successful downloads, the script does not attempt to download a file if the output file it would save to already exists locally. This works only because the script creates the output file atomically, so the output file cannot exist if the script was interrupted during the download of that same file. Second, to avoid retrying downloads that have failed in a permanent way (such as the file not existing on the remote server), if the script is invoked with the -e option, it writes a progress file with the names of downloads already processed. (It's important that output to the progress file is flushed after writing each filename.)
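One common way to create an output file atomically is to write to a temporary file in the same directory and rename it into place once it is complete. The sketch below shows that technique in Python; it is not necessarily how wgetas does it, and `save_atomically`, `fetch_if_missing` and the `fetch` callback (assumed to return the content as bytes) are hypothetical names.

```python
import os
import tempfile

def save_atomically(final_path, data):
    """Write bytes to final_path so that it only ever appears fully written."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # The temporary file lives in the same directory so that the rename
    # below stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".partial-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # The rename is the step that makes the output visible.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the partial file; final_path was never created.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

def fetch_if_missing(final_path, fetch):
    """Call fetch() and save its result unless final_path already exists."""
    if os.path.exists(final_path):
        return False  # an earlier run completed this one
    save_atomically(final_path, fetch())
    return True
```

With this in place, the existence of the output file is enough to tell that the download finished, because an interrupted run leaves at most a `.partial-` temporary file behind.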


In reply to Re: Gracefully exiting and restarting at a later time. by ambrus
in thread Gracefully exiting and restarting at a later time. by Largins
