How to break up a long running process

markh has asked for the wisdom of the Perl Monks concerning the following question:

I have a perl app that has a couple of functions which take a very long time to complete. I would like to be able to essentially throw these functions into the background (maybe not literally, but figuratively) and still be able to give periodic updates to another process (through a socket that I have open). This sounds kind of convoluted, so I'll give a simplified example. (Note, this isn't the actual app, so don't worry about specifics, typos or if the socket connection fails).

#!/usr/bin/perl -w

use IO::Socket;
use strict;

#sock is my connection where I report what is going on
my $sock = IO::Socket::INET->new(PeerAddr => 'hostname.org',
                                        PeerPort => '1234',
                                        Proto => 'tcp'
                                        );

my $return_value = long_running_function();

print $sock "$return_value\n";
close $sock;

Now, I'd like to be able to have the long_running_function run, while I still have the ability to communicate on the socket. I'm wanting to send an "I'm still here" to the other end of the socket every 2-3 minutes, while I'm waiting for the long_running_function to finish processing (which can take up to an hour to complete). I've been racking my brain on this, and cannot really figure out a clean way to do this. I thought about using fork, but I'm not sure there isn't a simpler way I'm just overlooking.

This app is running on the win32 platform, though the other end of the socket is FreeBSD (not that the other end should matter).

Any good ideas?


Re: How to break up a long running process by madbombX (Hermit) on Oct 25, 2006 at 01:24 UTC
You are not limited to threads. You can also fork off processes (which I know you considered) and let them run in parallel, assuming no process depends on another's completion before it can start. This method worked well for me for transactions that could take up to 10 minutes apiece. You can do this with Parallel::ForkManager using the following code:

use Parallel::ForkManager;

# Begin ForkManager
my $_max_procs = 5;
my $_pm = Parallel::ForkManager->new($_max_procs);

# Log at process fork
$_pm->run_on_start(
    sub {
        my ($pid, $func) = @_;
        print "Forking process PID: $pid\n";
    }
);

# Log at process completion
$_pm->run_on_finish(
    sub {
        my ($pid, $exit_code, $func) = @_;
        print "Finishing up process PID: $pid\n";
    }
);

# This run_on_wait is currently set to print every 5 sec
# It can easily be modified to write to a socket every X sec
$_pm->run_on_wait(
    sub { print "Waiting for children to finish.\n" },
    5.0
);

foreach my $func (@func_list) {
    # Fork off the children and get going on the queries
    my $pid = $_pm->start($func) and next;

    # foo here

    # Closing the forked process
    $_pm->finish;
}

# Ensure all children have finished
$_pm->wait_all_children;
exit(0);

I am assuming that this module is available on the win32 platform since I just got it off of CPAN.
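For the single-function case in the original question, the same idea can be sketched with core Perl only: fork one child, poll it with a non-blocking waitpid, and send a heartbeat while it runs. The name `fork_with_heartbeat` and its parameters are made up for illustration. The sketch assumes the result is a short string that fits in the pipe buffer (a large result would need to be read while the child is still running), and it has not been checked against Win32's emulated fork, so see perlfork before relying on it there.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);
use Time::HiRes qw(sleep time);

# Hypothetical helper: run $work in a forked child and call $heartbeat
# roughly every $interval seconds until the child exits. $poll is how
# often (in seconds) the parent checks whether the child has finished.
sub fork_with_heartbeat {
    my ($work, $heartbeat, $interval, $poll) = @_;
    $poll = 1 unless defined $poll;

    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: do the work and send the (small) result to the parent.
        close $reader;
        my $value = $work->();
        print {$writer} defined $value ? $value : '';
        close $writer;
        exit 0;
    }

    # Parent: keep only the read end and poll without blocking.
    close $writer;
    my $next_beat = time + $interval;
    while (waitpid($pid, WNOHANG) == 0) {
        sleep $poll;
        if (time >= $next_beat) {
            $heartbeat->();
            $next_beat = time + $interval;
        }
    }
    my $status = $?;

    local $/;                       # slurp whatever the child wrote
    my $result = <$reader>;
    close $reader;
    die "child exited with status $status\n" if $status != 0;
    return $result;
}
```

In the original script the heartbeat callback would be something like `sub { print $sock "I'm still here\n" }` with an interval of 180.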
Re: How to break up a long running process by BrowserUk (Patriarch) on Oct 25, 2006 at 00:48 UTC
T'is easy.

#!/usr/bin/perl -w

use IO::Socket;
use strict;
use threads;
use threads::shared;

#sock is my connection where I report what is going on
my $sock = IO::Socket::INET->new(
    PeerAddr => 'hostname.org',
    PeerPort => '1234',
    Proto    => 'tcp'
);

my $done :shared = 0;

# Return the function's result from the thread so that join() can fetch it
my $thread = async {
    my $result = long_running_function();
    $done = 1;
    $result;
};

print $sock "I'm still here\n" and sleep 180 until $done;

my $return_value = $thread->join;

print $sock "$return_value\n";
close $sock;

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.
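One variation on the threads approach, offered as a sketch rather than as BrowserUk's method: poll the shared flag at a short interval so the main thread notices completion quickly instead of sleeping out a full heartbeat period, and wrap the work in eval so that a die inside the thread is reported back to the caller. The helper name `run_with_heartbeat` and its parameters are hypothetical, and a perl built with ithreads is assumed.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Time::HiRes qw(sleep time);

# Hypothetical helper: run $work in a worker thread and call $heartbeat
# roughly every $interval seconds until it finishes. $poll is how often
# (in seconds) the main thread checks the shared "done" flag.
sub run_with_heartbeat {
    my ($work, $heartbeat, $interval, $poll) = @_;
    $poll = 1 unless defined $poll;

    my $done :shared = 0;
    my $thread = threads->create(sub {
        my $result = eval { $work->() };
        my $error  = $@;            # empty string if $work did not die
        $done = 1;
        return [ $result, $error ];
    });

    my $next_beat = time + $interval;
    until ($done) {
        sleep $poll;
        if (!$done && time >= $next_beat) {
            $heartbeat->();
            $next_beat = time + $interval;
        }
    }

    # join() hands back a copy of the array ref the thread returned.
    my ($result, $error) = @{ $thread->join };
    die $error if $error;
    return $result;
}
```

With the socket from the original script, a call might look like `run_with_heartbeat(\&long_running_function, sub { print $sock "I'm still here\n" }, 180)`.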
Re: How to break up a long running process by GrandFather (Saint) on Oct 25, 2006 at 00:15 UTC
Ultimately the answer, in some form or another, is threads. The tutorials section contains Threads: why locking is required when using shared variables and Things you need to know before programming Perl ithreads, neither of which is a tutorial that introduces the concepts you require or paints in the big yellow and black warning borders around the tricky areas. The Perl documentation contains perlthrtut, which may be a good start.

Another way to go about things (by hiding the thread stuff under the hood somewhat) is to use POE.

Beware, threads can be tricky creatures that bite in nasty and unusual ways. On the other hand, there are a few Perl Monks around with a fair amount of experience taming them, so pleas for help will certainly be answered. Good luck ;).

DWIM is Perl's answer to Gödel
Re^2: How to break up a long running process by BrowserUk (Patriarch) on Oct 25, 2006 at 08:19 UTC
perlthrtut is almost completely useless as a starting point on how to use iThreads.

`Things you need to know before programming Perl ithreads` (which I refuse to link) describes the situation circa v5.8.0 (the first ever iThreads build) and v5.8.1 (the buggiest threaded perl build ever, which lasted a whole 41 days before being superseded), from the perspective of someone who attempted to emulate iThreads with forks--and failed. And, in the process, succeeded in littering the entire CPAN Thread::* and (as I recently discovered) thread::* namespaces with over-ambitious, unsupported (and almost unsupportable), mostly defunct, do-nothing modules that are in great part the cause of the difficulties that people have trying to use iThreads. (Now sadly billed as the "greatest authority on Perl threading"!)

Update: Are you going to refer newbies to Things you should need to know before using Perl regexes. (Humour, with a serious point) when they ask about regexes?

Finally, how about you try writing a POE solution before you recommend it to others on the basis of hearsay.
Re: How to break up a long running process by ailivac (Sexton) on Oct 25, 2006 at 00:54 UTC
It depends on how the long running function runs and why it takes so long to run. If it's CPU-bound and something you can break up into smaller pieces, POE would be a good option, and you could use delay events to periodically report status. If it's IO-bound or for some other reason you can't otherwise subdivide it, then you might have to run another thread to send status updates.
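The "break it into smaller pieces" case can be illustrated without POE. The sketch below is plain Perl, not a POE program: it works through a list one item at a time and calls a status callback whenever enough time has passed since the last report. The name `process_in_chunks` and its arguments are invented for the example, and it only helps if the work really can be split into items.

```perl
use strict;
use warnings;

# Hypothetical helper: apply $step to each item and call $report with
# (items done, total items) whenever at least $interval seconds have
# passed since the previous report.
sub process_in_chunks {
    my ($items, $step, $report, $interval) = @_;

    my @results;
    my $count       = 0;
    my $last_report = time;

    for my $item (@$items) {
        push @results, $step->($item);
        $count++;
        if (time - $last_report >= $interval) {
            $report->($count, scalar @$items);
            $last_report = time;
        }
    }
    return \@results;
}
```

Here `$report` could write a progress line to the socket, for example `sub { print $sock "still here: $_[0] of $_[1]\n" }`.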
Re: How to break up a long running process by markh (Scribe) on Oct 26, 2006 at 01:00 UTC
Want to thank everyone for their ideas and warnings. I'm going to try to implement this with threads, as suggested by BrowserUk. I do have some concerns, though, regarding error situations and what kind of curveballs Windows will throw at me. Most of my coding is on FreeBSD, so I cringe anytime I have to write code for the Win32 platform. I guess we will see what happens.

Also, I figured I'd mention that my "long_running_function()" is using OLE to launch an external application (Quickbooks, to be exact), and is performing a query to Quickbooks that can take up to an hour to return a response. (And this is on reasonably high-end hardware, btw.) I'm trying to work with the client to redesign a couple of things to break up this Quickbooks query into several smaller queries, but the client isn't wanting to break the query up yet.