Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I have a script that connects to a remote database, extracts one DNA sequence, runs it through a BLAST program on a different remote machine, and then updates the database with the new information. Rinse and repeat, for every sequence. I want this to be automated, such that if something goes horribly askew, my script can detect this and attempt to pick up where it left off, until it succeeds in doing so.

For example, let's say the remote BLAST server goes down and my BLAST system call hangs. In this case, I want to pound the server with a "let me in?", say once every five minutes, until it connects, and then restart from the last successful sequence, which is stored in a tempfile. I also want any interruption to be printed to a log file. I think that both of these tasks could involve altering the SIG handlers, though I am not sure whether this is the best way, or which SIG handler(s) would need attention.

I suppose a crontab is an option, but I would like to keep everything as contained in my script as possible, rather than having little files flying around here and there, to make it more portable. But if cron is the best way, so be it. So, can anyone enlighten me as to what is involved in having a script "listen" to a process, and facilitate its continuity in the event of interruptions? Many thanks.

Re: monitor process, sig handlers(?)
by tall_man (Parson) on Apr 01, 2003 at 19:01 UTC
    It might make sense for you to have a parent process that forks off a child process to do the work. Then it can catch SIGCHLD signals when things go wrong. Check out IPC::Run for a convenient wrapper for this, including a time-out option for the child process.
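    For instance, here is a minimal sketch of that approach using IPC::Run with a timeout; the command line, the sample query, and the five-minute retry interval are placeholders, not anything from the original post:

        use strict;
        use warnings;
        use IPC::Run qw(run timeout);

        # Hypothetical command; adjust for the real remote BLAST call.
        my @cmd = ('ssh', 'myhost', '/path/to/blast', '-p', 'blastn', '-d', 'nt');
        my $in  = ">seq1\nACGTACGT\n";    # sample FASTA-formatted query
        my ($out, $err) = ('', '');

        # run() returns true only when the child exits with status 0;
        # timeout() throws an exception after 300 seconds, which the
        # eval turns into a false return value.
        until (eval { run \@cmd, \$in, \$out, \$err, timeout(300) }) {
            print STDERR "BLAST attempt failed: ", $@ || "non-zero exit", "\n";
            ($out, $err) = ('', '');      # clear captured output before retrying
            sleep 300;                    # wait five minutes, then try again
        }
        # $out now holds the BLAST report.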
Re: monitor process, sig handlers(?)
by Aragorn (Curate) on Apr 01, 2003 at 19:10 UTC
    I think that in this case, you don't need to worry about signal handlers (not for the reasons you think, anyway). Your script does two things:
    1. Maintain a connection to a remote database
    2. Call a remote BLAST server
    I don't have a clue what a BLAST server is, but I imagine it does something interesting with a DNA sequence.

    To detect problems with the database, you can either reconnect for every query (so that when the database is down, you get an error message right away), or keep a permanent connection and create a timeout handler using signals and the alarm system call.
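    For example, here is the usual alarm idiom as a minimal sketch; $dbh and run_query() are hypothetical stand-ins for the real database handle and query:

        # The "\n" on the die message matters: it stops Perl from
        # appending "at script line ..." so the string compare below works.
        my $result = eval {
            local $SIG{ALRM} = sub { die "timeout\n" };
            alarm 60;                    # give the query 60 seconds
            my $r = run_query($dbh);     # hypothetical long-running call
            alarm 0;                     # cancel the pending alarm
            $r;
        };
        alarm 0;                         # make sure the alarm is off either way
        if ($@) {
            die $@ unless $@ eq "timeout\n";   # re-raise unrelated errors
            # the query hung: reconnect, log the interruption, retry...
        }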

    You'll have to do something similar with the requests to the BLAST server. I don't know what a call to the BLAST server looks like, but I assume it can be guarded in much the same way as the database query.

    Using the temp file for intermediate results is a wise thing: when a connection to one of the servers is down, the script can detect it, wait for the server to come back up (using sleep, for example), possibly notify you, and then go on its merry way. This way, you can do all error handling within the script, and there's no need to monitor external processes.
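    Something along these lines, where next_sequence(), run_blast(), and the LOG filehandle are hypothetical stand-ins for the real database fetch, BLAST call, and log file:

        use strict;
        use warnings;

        my $checkpoint = "$ENV{HOME}/temp/last_seq.tmp";

        # Resume from the last successfully processed sequence id, if any.
        my $last = 0;
        if (open my $ck, '<', $checkpoint) {
            chomp($last = <$ck>);
            close $ck;
        }

        # next_sequence() is assumed to return an empty list when done.
        while (my ($id, $seq) = next_sequence($last)) {
            until (run_blast($id, $seq)) {
                print LOG scalar(localtime),
                          ": BLAST failed for $id, retrying in 5 minutes\n";
                sleep 300;
            }
            # Record progress only after this sequence fully succeeded, so
            # a crash never skips a sequence; at worst it repeats one.
            open my $ck, '>', $checkpoint or die "Can't write checkpoint: $!";
            print $ck "$id\n";
            close $ck;
            $last = $id;
        }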

    Arjen

Re: monitor process, sig handlers(?)
by Anonymous Monk on Apr 01, 2003 at 23:37 UTC
    I'm the guy who started this thread. I wanted to follow up with some code I wrote and ask the other monks whether it looks OK. Again, thanks in advance.
    sub blast {
        my $data = shift;
        my $temp = "$ENV{HOME}/temp/seq.tmp";
        my $fifo = "ssh myhost /path/to/blast";
        my @args = ('-p blastn', '-d nt', '-e 0.0000001', '-m 7');

        ### load input file with fasta-formatted queryid/sequence ###
        open SEQ, ">$temp" or die "Could not open for write: $!";
        print SEQ ">", $data->[0], "\n", $data->[1];
        close SEQ;

        ### open a handle to BLAST output, retry if unsuccessful ###
        {
            if (!open BLAST, "$fifo @args <$temp|") {
                # print LOG "BLAST connection refused, retrying...\n";
                # sleep(300);
                redo;
            }
        }

        ### process the report ###
        parse(*BLAST);
    }
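    One caveat worth flagging (an observation, not from the original post): opening a pipe with open fails only if the fork itself fails, so a refused ssh connection won't trigger the retry branch above; it shows up instead as a non-zero exit status when the handle is closed. Here is a sketch of a variant that retries on the exit status, reusing $fifo, @args, $temp, and LOG from the post, and assuming parse() can take the report lines instead of a glob:

        my @report;
        {
            unless (open BLAST, "$fifo @args <$temp|") {
                print LOG "could not fork for BLAST: $!\n";
                sleep(300);
                redo;            # a bare block is a loop, so redo restarts it
            }
            @report = <BLAST>;   # slurp the whole report first
            unless (close BLAST) {   # close() is false if the command failed
                print LOG "BLAST exited with status $?, retrying...\n";
                sleep(300);
                redo;
            }
        }
        parse(@report);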